CVE-2025-23321

7.5 HIGH

📋 TL;DR

NVIDIA Triton Inference Server contains a divide-by-zero vulnerability in request processing that could cause denial of service. Attackers can exploit this by sending specially crafted invalid requests to crash the server. This affects all deployments using vulnerable versions of NVIDIA Triton Inference Server.

💻 Affected Systems

Products:
  • NVIDIA Triton Inference Server
Versions: All versions prior to 24.09
Operating Systems: Windows, Linux
Default Config Vulnerable: ⚠️ Yes
Notes: All deployments using vulnerable versions are affected regardless of configuration.

⚠️ Risk & Real-World Impact

🔴 Worst Case

Complete service disruption requiring manual restart of Triton Inference Server, potentially affecting all inference workloads and dependent applications.

🟠 Likely Case

Temporary denial of service until the Triton process is restarted. Requests that are in flight when the server crashes fail and must be retried by the client.

🟢 If Mitigated

Minimal impact: request validation blocks malformed requests before they reach Triton, and automatic restart limits downtime if a crash still occurs.

🌐 Internet-Facing: HIGH - Publicly accessible Triton servers are directly exposed to attack from any internet source.
🏢 Internal Only: MEDIUM - Internal attackers or compromised internal systems could still exploit this vulnerability.

🎯 Exploit Status

Public PoC: ✅ No
Weaponized: UNKNOWN
Unauthenticated Exploit: ⚠️ Yes
Complexity: LOW

Exploitation requires only the ability to send a malformed request to the server; no authentication is needed.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: 24.09 or later

Vendor Advisory: https://nvidia.custhelp.com/app/answers/detail/a_id/5687

Restart Required: Yes

Instructions:

1. Download Triton Inference Server version 24.09 or later from NVIDIA NGC.
2. Stop the current Triton service.
3. Install the updated version.
4. Restart the Triton service.

🔧 Temporary Workarounds

Request Validation Proxy

Applies to: all platforms

Deploy a reverse proxy or API gateway to validate and filter incoming requests before they reach Triton.
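
As an illustration of what such a validation layer might check, the sketch below inspects the JSON body of a KServe-v2-style inference request before it is forwarded. The advisory does not disclose which field triggers the divide-by-zero, so the rules here (required fields, a dimension cap, rejecting zero or negative dimensions) are assumptions chosen for illustration. The names `validate_infer_request` and `MAX_DIMS` are hypothetical, and some models legitimately accept zero-sized tensors, so adjust the rules to your models.

```python
from typing import Any

# Assumption: an illustrative cap on tensor rank, not a value from the advisory.
MAX_DIMS = 8


def validate_infer_request(body: Any) -> list[str]:
    """Return a list of problems found in a decoded inference request body.

    An empty list means the request passed these (assumed) checks.
    """
    if not isinstance(body, dict):
        return ["body must be a JSON object"]

    inputs = body.get("inputs")
    if not isinstance(inputs, list) or not inputs:
        return ["'inputs' must be a non-empty list"]

    problems: list[str] = []
    for i, tensor in enumerate(inputs):
        if not isinstance(tensor, dict):
            problems.append(f"inputs[{i}] must be an object")
            continue
        for field in ("name", "datatype"):
            value = tensor.get(field)
            if not isinstance(value, str) or not value:
                problems.append(f"inputs[{i}].{field} must be a non-empty string")
        shape = tensor.get("shape")
        if not isinstance(shape, list) or not shape or len(shape) > MAX_DIMS:
            problems.append(f"inputs[{i}].shape must be a list of 1 to {MAX_DIMS} dimensions")
            continue
        for dim in shape:
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(dim, bool) or not isinstance(dim, int) or dim <= 0:
                problems.append(f"inputs[{i}].shape contains an invalid dimension: {dim!r}")
                break
    return problems
```

A proxy could reject the request with an HTTP 400 response whenever the returned list is non-empty and forward it to Triton otherwise.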

Network Segmentation

Applies to: all platforms

Restrict access to Triton Inference Server to only trusted clients and networks.

🧯 If You Can't Patch

  • Implement strict network access controls to limit Triton server exposure
  • Deploy monitoring and automatic restart mechanisms for Triton service
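
The monitoring and restart item above could be sketched roughly as follows. The probe uses the `/v2/health/live` endpoint from the KServe v2 protocol that Triton implements. The base URL, thresholds, and the `restart` callback are assumptions to adapt to your deployment, and `is_live` and `watchdog` are hypothetical names. In many environments a container restart policy or a service manager does the same job without custom code.

```python
import http.client
import time
import urllib.request
from typing import Callable, Optional


def is_live(base_url: str, timeout: float = 2.0) -> bool:
    """Probe the liveness endpoint; any connection or HTTP error counts as down."""
    try:
        with urllib.request.urlopen(f"{base_url}/v2/health/live", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


def watchdog(
    probe: Callable[[], bool],
    restart: Callable[[], None],
    max_failures: int = 3,
    interval: float = 5.0,
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call restart() after max_failures consecutive failed probes.

    Runs forever when max_checks is None. Returns the number of restarts issued.
    """
    failures = restarts = checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if probe():
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                restart()
                restarts += 1
                failures = 0
        sleep(interval)
    return restarts
```

For example, `watchdog(lambda: is_live("http://localhost:8000"), restart_triton)` would poll a server on Triton's default HTTP port, where `restart_triton` is a placeholder for whatever restart command your environment uses.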

🔍 How to Verify

Check if Vulnerable:

Check Triton version using 'tritonserver --version' or examine container image tags. Versions before 24.09 are vulnerable.

Check Version:

tritonserver --version
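
For container deployments, the image tag comparison could be automated along these lines. This is a minimal sketch that assumes NGC-style tags of the form `YY.MM-py3` (for example `nvcr.io/nvidia/tritonserver:24.08-py3`). The function names are hypothetical, and the default threshold of 24.09 is the fixed version stated on this page, so confirm it against the vendor advisory before relying on the result.

```python
import re
from typing import Optional, Tuple


def release_from_tag(image_ref: str) -> Optional[Tuple[int, int]]:
    """Extract (year, month) from an image reference or bare release string.

    Returns None when the tag does not start with a YY.MM release number.
    """
    tag = image_ref.rsplit(":", 1)[-1]
    match = re.match(r"^(\d{2})\.(\d{2})(?:\D|$)", tag)
    if not match:
        return None
    return int(match.group(1)), int(match.group(2))


def is_vulnerable(image_ref: str, fixed: str = "24.09") -> Optional[bool]:
    """Return True if the release is older than `fixed`, False if not,
    and None if the release cannot be determined from the tag."""
    release = release_from_tag(image_ref)
    fixed_release = release_from_tag(fixed)
    if release is None or fixed_release is None:
        return None
    # Tuples compare element by element: year first, then month.
    return release < fixed_release
```

A `None` result (for example with a `latest` tag or a digest reference) means the tag alone is not enough and the running server should be checked directly.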

Verify Fix Applied:

Confirm the version is 24.09 or later, then send malformed test requests in a non-production environment and confirm that the service stays up.

📡 Detection & Monitoring

Log Indicators:

  • Unexpected server crashes
  • Divide-by-zero errors in logs
  • Abnormal termination of Triton process

Network Indicators:

  • Spike in malformed HTTP/gRPC requests to Triton endpoints
  • Unusual request patterns from single sources

SIEM Query:

source="triton" AND ("divide by zero" OR "segmentation fault" OR "unexpected termination")
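
Where no SIEM is available, the same three phrases can be searched for directly in captured server output. The sketch below mirrors the query above; the exact strings Triton or the operating system emits for this crash are not documented on this page, so treat the pattern as a starting point. On Linux, for instance, an integer divide-by-zero is commonly reported as "Floating point exception", which you may want to add. `INDICATORS` and `find_indicators` are hypothetical names.

```python
import re
from typing import Iterable, List, Tuple

# Same phrases as the SIEM query, matched case-insensitively.
INDICATORS = re.compile(
    r"divide by zero|segmentation fault|unexpected termination",
    re.IGNORECASE,
)


def find_indicators(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line containing an indicator phrase."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if INDICATORS.search(line)
    ]
```

Because the function accepts any iterable of strings, an open log file can be passed to it directly, for example `with open(path) as fh: hits = find_indicators(fh)`.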
