CVE-2025-33201

7.5 HIGH

📋 TL;DR

NVIDIA Triton Inference Server does not properly check for unusual or exceptional conditions when it receives excessively large payloads. An attacker who sends such payloads can cause a denial of service. This affects organizations that use Triton Inference Server for AI model deployment and inference workloads.

💻 Affected Systems

Products:
  • NVIDIA Triton Inference Server
Versions: All versions prior to 24.09
Operating Systems: Linux, Windows
Default Config Vulnerable: ⚠️ Yes
Notes: Affects all deployments using the vulnerable versions regardless of configuration.

⚠️ Risk & Real-World Impact

🔴 Worst Case: Complete service disruption of Triton Inference Server, affecting all AI inference workloads and dependent applications.
🟠 Likely Case: Temporary service degradation or crashes that require a server restart, reducing inference availability.
🟢 If Mitigated: Minimal impact with proper input validation and monitoring in place.

🌐 Internet-Facing: HIGH - Attackers can send malicious payloads directly to exposed endpoints.
🏢 Internal Only: MEDIUM - Internal attackers or compromised systems could still exploit this vulnerability.

🎯 Exploit Status

Public PoC: ✅ No
Weaponized: UNKNOWN
Unauthenticated Exploit: ⚠️ Yes
Complexity: LOW

Exploitation requires sending specially crafted large payloads to Triton endpoints.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: 24.09 and later

Vendor Advisory: https://nvidia.custhelp.com/app/answers/detail/a_id/5734

Restart Required: Yes

Instructions:

1. Download Triton Inference Server version 24.09 or later from NVIDIA NGC.
2. Stop the current Triton service.
3. Deploy the updated version.
4. Restart the service.

🔧 Temporary Workarounds

Payload Size Limiting

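Platform: all

Configure a reverse proxy or load balancer in front of Triton to cap the maximum request size. Choose a limit that still admits your largest legitimate inference request.

# Example nginx configuration:
client_max_body_size 10M;
# Example Apache configuration:
LimitRequestBody 10485760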
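
Where no reverse proxy is available, the same size limit can be enforced in a thin gateway placed in front of Triton. The following is a minimal sketch under stated assumptions and is not part of Triton: `MAX_BODY_BYTES` and `check_request_size` are hypothetical names, and the 10 MiB limit only mirrors the proxy examples above. It inspects the declared `Content-Length` and refuses requests that declare no length at all, so that a chunked upload cannot bypass the limit.

```python
from typing import Mapping, Tuple

# Hypothetical limit mirroring the 10 MiB proxy examples; tune it to the
# largest legitimate inference request in your deployment.
MAX_BODY_BYTES = 10 * 1024 * 1024


def check_request_size(headers: Mapping[str, str],
                       max_bytes: int = MAX_BODY_BYTES) -> Tuple[int, str]:
    """Return an (HTTP status, reason) pair for a request's declared body size."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    declared = lowered.get("content-length")
    if declared is None:
        # No declared length (for example chunked transfer): refuse rather
        # than buffer a body of unknown size.
        return 411, "Length Required"
    text = declared.strip()
    if not (text.isascii() and text.isdigit()):
        # Rejects negative, empty, and non-numeric values.
        return 400, "Bad Request"
    if int(text) > max_bytes:
        return 413, "Payload Too Large"
    return 200, "OK"
```

A gateway would call `check_request_size(request_headers)` before reading the body and forward the request to Triton only when the status is 200. The gateway must also stop reading once `max_bytes` have arrived, because a client can send more data than it declared.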

Network Segmentation

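Platform: linux

Restrict access to Triton endpoints to trusted networks only.

# Example iptables rules. Replace <trusted_network> with your own CIDR range.
# An ACCEPT rule alone blocks nothing, so follow it with a DROP rule for all other sources.
# Repeat for every Triton port you expose (for example 8001 and 8002).
iptables -A INPUT -p tcp --dport 8000 -s <trusted_network> -j ACCEPT
iptables -A INPUT -p tcp --dport 8000 -j DROP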

🧯 If You Can't Patch

  • Implement strict network access controls to limit who can send requests to Triton endpoints
  • Deploy rate limiting and request size validation at the network perimeter

🔍 How to Verify

Check if Vulnerable:

Check the Triton version with tritonserver --version and compare it with 24.09. Any earlier release is vulnerable.

Check Version:

tritonserver --version

Verify Fix Applied:

Confirm that the version is 24.09 or later, then send normal inference requests to verify that the server still responds as expected.
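
The version comparison can be automated across many deployments. The sketch below assumes that the release is available as a `YY.MM` string, either on its own or as the tag of an NGC container image such as `nvcr.io/nvidia/tritonserver:24.08-py3`. The names `FIXED_RELEASE`, `parse_release`, and `is_patched` are hypothetical helpers, and the fixed release is taken from this advisory.

```python
import re
from typing import Tuple

# First fixed release according to this advisory.
FIXED_RELEASE = (24, 9)


def parse_release(text: str) -> Tuple[int, int]:
    """Extract a (year, month) pair from "24.08" or an image reference
    such as "nvcr.io/nvidia/tritonserver:24.08-py3"."""
    # For an image reference, keep only the tag after the last colon.
    tag = text.strip().rsplit(":", 1)[-1]
    match = re.match(r"(\d{2})\.(\d{2})(?!\d)", tag)
    if match is None:
        raise ValueError(f"no YY.MM release found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def is_patched(text: str, fixed: Tuple[int, int] = FIXED_RELEASE) -> bool:
    """Return True when the release is at or after the fixed release."""
    # Tuples compare element by element, so (25, 1) >= (24, 9) holds.
    return parse_release(text) >= fixed
```

For example, `is_patched("nvcr.io/nvidia/tritonserver:24.08-py3")` returns `False` and `is_patched("24.09")` returns `True`. Strings that do not start with a `YY.MM` release, such as a bare `2.x.y` server version, raise `ValueError` rather than being guessed at.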

📡 Detection & Monitoring

Log Indicators:

  • Unusually large request sizes in access logs
  • Triton service crashes or restarts
  • Error messages related to payload processing

Network Indicators:

  • Large HTTP/gRPC requests to Triton endpoints (by default port 8000 for HTTP, 8001 for gRPC, and 8002 for metrics)
  • Sudden spikes in request sizes

SIEM Query:

source="triton" AND (message="*large*" OR message="*payload*" OR message="*crash*")
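
Where no SIEM is available, the "unusually large request" indicator can be checked directly against proxy access logs. The sketch below makes a loud assumption: the proxy in front of Triton writes one JSON object per line and records the request size in a field named `request_length` (for example, populated from the nginx `$request_length` variable in a custom log format). The function name, the field names, and the 10 MiB threshold are all hypothetical.

```python
import json
from typing import Iterable, List


def find_oversized_requests(lines: Iterable[str],
                            threshold_bytes: int = 10 * 1024 * 1024) -> List[dict]:
    """Return log entries whose request_length exceeds threshold_bytes."""
    flagged = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Skip lines that are not in the assumed JSON format.
            continue
        if not isinstance(entry, dict):
            continue
        size = entry.get("request_length")
        if isinstance(size, (int, float)) and size > threshold_bytes:
            flagged.append(entry)
    return flagged
```

Passing an open log file works as expected, for example `find_oversized_requests(open("access.json.log"))`, where the file name is a placeholder. Correlating the flagged entries with Triton crash or restart times helps separate attack traffic from legitimately large inference requests.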
