CVE-2024-0116
📋 TL;DR
NVIDIA Triton Inference Server has an out-of-bounds read vulnerability where users can release shared memory regions while they're in use. This could allow attackers to cause denial of service by crashing the server. Organizations using Triton Inference Server for AI/ML inference are affected.
💻 Affected Systems
- NVIDIA Triton Inference Server
⚠️ Risk & Real-World Impact
Worst Case
Complete denial of service causing Triton server crashes and disrupting AI inference services
Likely Case
Service disruption through server crashes requiring restart, potentially affecting inference workloads
If Mitigated
Limited impact with proper access controls and monitoring in place
🎯 Exploit Status
Requires user access to trigger shared memory operations
🛠️ Fix & Mitigation
✅ Official Fix
Patch Version: 24.01 or later
Vendor Advisory: https://nvidia.custhelp.com/app/answers/detail/a_id/5565
Restart Required: Yes
Instructions:
1. Download Triton Inference Server 24.01 or later from NVIDIA NGC.
2. Stop the current Triton server.
3. Deploy the updated version.
4. Restart the Triton server with the updated configuration.
🔧 Temporary Workarounds
Disable shared memory
Configure Triton to not use shared memory regions
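Disable shared memory in the Triton startup configuration. The option is listed here as '--allow-shared-memory' set to 'false'; confirm the exact option name against the documentation for your Triton release before relying on it.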
🧯 If You Can't Patch
- Implement strict access controls to limit who can interact with Triton inference endpoints
- Monitor Triton server logs for abnormal termination and implement automated restart procedures
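The automated restart procedure mentioned above could be sketched roughly as follows. This is a minimal illustration, not an NVIDIA-provided tool: the function name supervise and its parameters are hypothetical, and a production deployment would more likely rely on its orchestrator's restart policy (systemd, Docker, or Kubernetes).

```python
import subprocess
import time


def supervise(command, max_restarts=3, delay_seconds=5.0):
    """Run `command`, restarting it after each abnormal exit.

    Returns the list of exit codes observed, oldest first. On POSIX a
    process killed by a signal reports a negative code (for example -11
    for SIGSEGV), which this loop also treats as abnormal.
    """
    exit_codes = []
    while True:
        code = subprocess.run(command).returncode
        exit_codes.append(code)
        if code == 0:
            break  # clean shutdown: do not restart
        if len(exit_codes) > max_restarts:
            break  # restart budget exhausted: stop and let an operator look
        time.sleep(delay_seconds)  # brief pause to avoid a tight crash loop
    return exit_codes
```

A call such as supervise(["tritonserver", "--model-repository=/models"]) would keep the server running through up to three crashes; the /models path is a placeholder. Capping the number of restarts is deliberate, since a server that crashes repeatedly may be under active attack and should be investigated rather than restarted indefinitely.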
🔍 How to Verify
Check if Vulnerable:
Check the Triton server version with: tritonserver --version. Versions earlier than the patch version listed above (24.01) should be treated as vulnerable.
Check Version:
tritonserver --version
Verify Fix Applied:
Confirm the reported version is 24.01 or later, then exercise shared memory operations (registering and releasing regions) and confirm the server stays up.
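The version comparison can be automated across many hosts. The sketch below assumes the release appears as a YY.MM tag somewhere in the string being checked, as it does in NGC image tags such as nvcr.io/nvidia/tritonserver:24.01-py3; the helper names parse_release and is_patched are hypothetical. It does not parse the separate 2.x.y server version scheme.

```python
import re

# Patch version listed in this advisory, as (year, month).
PATCHED = (24, 1)


def parse_release(text):
    """Extract the first YY.MM release tag (e.g. '24.01') from `text`."""
    match = re.search(r"\b(\d{2})\.(\d{2})\b", text)
    if match is None:
        raise ValueError(f"no YY.MM release found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def is_patched(text, patched=PATCHED):
    """Return True if the release found in `text` is at or above `patched`."""
    return parse_release(text) >= patched
```

For example, is_patched("nvcr.io/nvidia/tritonserver:23.12-py3") evaluates to False, flagging that image for upgrade. Tuples compare element by element, so the year is checked before the month.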
📡 Detection & Monitoring
Log Indicators:
- Unexpected Triton server crashes
- Segmentation fault errors in logs
- Abnormal termination of inference processes
Network Indicators:
- Sudden drop in inference request responses
- Connection resets to Triton endpoints
SIEM Query:
source="triton" AND ("segmentation fault" OR "crash" OR "abnormal termination")
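Where no SIEM is available, the same keyword match can be run directly over a log file. The following is a rough equivalent of the query above, assuming plain-text logs with one event per line; the function name find_indicators is hypothetical.

```python
import re

# Same three phrases as the SIEM query, matched case-insensitively.
INDICATORS = re.compile(
    r"segmentation fault|crash|abnormal termination", re.IGNORECASE
)


def find_indicators(lines):
    """Return (line_number, line) pairs for lines matching any indicator."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if INDICATORS.search(line)
    ]
```

Passing an open file object, as in find_indicators(open(path)), scans the file line by line. Keyword matches like these are noisy, so hits are best treated as a prompt to check whether the server actually terminated rather than as confirmation of exploitation.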