CVE-2025-33201
📋 TL;DR
NVIDIA Triton Inference Server contains an improper condition check that can be triggered by sending excessively large payloads, potentially causing a denial of service. This affects organizations using Triton Inference Server for AI model deployment and inference workloads.
💻 Affected Systems
- NVIDIA Triton Inference Server
📦 What is this software?
NVIDIA Triton Inference Server is open-source inference serving software for deploying trained AI models from multiple frameworks (TensorRT, TensorFlow, PyTorch, ONNX, and others) behind HTTP and gRPC endpoints.
⚠️ Risk & Real-World Impact
Worst Case
Complete service disruption of Triton Inference Server, affecting all AI inference workloads and dependent applications.
Likely Case
Temporary service degradation or crashes requiring server restart, impacting inference availability.
If Mitigated
Minimal impact with proper input validation and monitoring in place.
🎯 Exploit Status
Exploitation requires sending specially crafted large payloads to Triton endpoints.
🛠️ Fix & Mitigation
✅ Official Fix
Patch Version: 24.09 and later
Vendor Advisory: https://nvidia.custhelp.com/app/answers/detail/a_id/5734
Restart Required: Yes
Instructions:
1. Download Triton Inference Server version 24.09 or later from NVIDIA NGC.
2. Stop the current Triton service.
3. Deploy the updated version.
4. Restart the service.
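For containerized deployments, the steps above can be captured in a compose file that pins the patched release. This is a hypothetical sketch: the image tag follows NVIDIA's NGC naming convention (`YY.MM-py3`), and the model-repository path and port mappings are assumptions to adjust for your environment.

```yaml
# Hypothetical docker-compose sketch pinning the patched Triton release.
services:
  triton:
    image: nvcr.io/nvidia/tritonserver:24.09-py3   # 24.09 or later; tag format is an assumption
    command: tritonserver --model-repository=/models
    volumes:
      - ./models:/models        # adjust to your model repository
    ports:
      - "8000:8000"             # HTTP
      - "8001:8001"             # gRPC
      - "8002:8002"             # metrics
```

Pinning an explicit version tag (rather than `latest`) makes it auditable that the patched build is what actually runs after the restart.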
🔧 Temporary Workarounds
Payload Size Limiting
Configure a reverse proxy or load balancer to limit the maximum request size.
# Example nginx configuration: client_max_body_size 10M;
# Example Apache configuration: LimitRequestBody 10485760
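The nginx directive above can be placed in a minimal reverse-proxy server block in front of Triton's HTTP endpoint. A hedged sketch, assuming Triton listens locally on its default HTTP port 8000 and that a 10 MB cap suits your models:

```nginx
# Sketch: nginx reverse proxy capping request size in front of Triton.
# Upstream address and the 10M limit are assumptions -- tune per deployment.
server {
    listen 80;
    client_max_body_size 10M;              # oversized requests get HTTP 413

    location / {
        proxy_pass http://127.0.0.1:8000;  # Triton HTTP endpoint
    }
}
```

Requests above the cap are rejected at the proxy with `413 Request Entity Too Large` before they ever reach Triton.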
Network Segmentation
Restrict access to Triton endpoints to trusted networks only.
# Example iptables rule: iptables -A INPUT -p tcp --dport 8000 -s trusted_network -j ACCEPT
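To cover all three default Triton ports with a default-deny, the single rule above can be extended into an iptables-restore fragment. A sketch under stated assumptions: `10.0.0.0/24` is a placeholder for your trusted subnet, and the fragment is meant to be loaded with `iptables-restore -n` so it appends to existing chains rather than replacing them.

```
# Hedged iptables-restore fragment: allow Triton ports (HTTP 8000, gRPC 8001,
# metrics 8002) only from a trusted subnet, drop all other sources.
*filter
-A INPUT -p tcp -m multiport --dports 8000,8001,8002 -s 10.0.0.0/24 -j ACCEPT
-A INPUT -p tcp -m multiport --dports 8000,8001,8002 -j DROP
COMMIT
```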
🧯 If You Can't Patch
- Implement strict network access controls to limit who can send requests to Triton endpoints
- Deploy rate limiting and request size validation at the network perimeter
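Rate limiting and size validation can both be enforced at an nginx perimeter in one place. A minimal sketch, assuming a local Triton HTTP endpoint; the zone name, rate, and burst values are placeholders to tune against your legitimate traffic:

```nginx
# Hedged sketch: per-client rate limit plus payload cap at the perimeter.
limit_req_zone $binary_remote_addr zone=triton:10m rate=10r/s;

server {
    listen 80;
    client_max_body_size 10M;                  # payload size validation

    location / {
        limit_req zone=triton burst=20 nodelay;  # rate limiting per source IP
        proxy_pass http://127.0.0.1:8000;        # Triton HTTP endpoint
    }
}
```

Clients exceeding the rate receive `503`, and oversized bodies are rejected with `413`, so a single misbehaving source cannot monopolize the inference server.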
🔍 How to Verify
Check if Vulnerable:
tritonserver --version
If the reported release is earlier than 24.09, the server is vulnerable.
Verify Fix Applied:
Confirm the version is 24.09 or later, then exercise the server with normal inference requests to confirm service health.
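The "earlier than 24.09" comparison can be scripted with a version-aware sort. A minimal sketch: the `current` value below is a hard-coded placeholder; in practice you would capture it from the `tritonserver --version` output (the exact output format may vary by release, so the `grep` pattern is an assumption).

```shell
# Sketch: compare the running Triton release against the patched version.
required="24.09"
current="24.08"   # placeholder; e.g. current="$(tritonserver --version | grep -oE '[0-9]+\.[0-9]+' | head -n1)"

# sort -V orders version strings numerically; if the required version sorts
# first, the current version is >= required.
lowest="$(printf '%s\n' "$required" "$current" | sort -V | head -n1)"
if [ "$lowest" = "$required" ]; then
  status="patched"
else
  status="vulnerable"
fi
echo "$status"
```

With the placeholder value `24.08` the script prints `vulnerable`; substituting a real `24.09`-or-later version string flips the result to `patched`.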
📡 Detection & Monitoring
Log Indicators:
- Unusually large request sizes in access logs
- Triton service crashes or restarts
- Error messages related to payload processing
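Unusually large requests can be surfaced from proxy access logs with a one-line filter. A hedged sketch: the here-doc holds hypothetical sample lines, and the field position assumes an nginx-style format whose 10th field is a byte count (in the default "combined" format that field is the response size, `$body_bytes_sent`; to flag large inbound payloads specifically, log `$request_length` in a custom `log_format` and adjust the field index).

```shell
# Sketch: print source IP, request path, and size for entries over 10 MB.
# Replace the here-doc with your real access log, e.g.:
#   awk '$10 > 10485760 { print $1, $7, $10 }' /var/log/nginx/access.log
matches="$(awk '$10 > 10485760 { print $1, $7, $10 }' <<'EOF'
10.0.0.5 - - [01/Jan/2025:00:00:00 +0000] "POST /v2/models/resnet/infer HTTP/1.1" 200 512 "-" "curl"
10.0.0.9 - - [01/Jan/2025:00:00:01 +0000] "POST /v2/models/resnet/infer HTTP/1.1" 500 52428800 "-" "curl"
EOF
)"
echo "$matches"
```

Only the second sample line exceeds the 10 MB threshold, so the filter reports just that source IP and path, which is the kind of entry worth correlating with Triton crash or restart events.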
Network Indicators:
- Large HTTP/GRPC requests to Triton endpoints (typically ports 8000, 8001, 8002)
- Sudden spikes in request sizes
SIEM Query:
source="triton" AND (message="*large*" OR message="*payload*" OR message="*crash*")