CVE-2025-30165

8.0 HIGH

📋 TL;DR

This vulnerability allows remote code execution in multi-node vLLM deployments using the V0 engine. Attackers can exploit unsafe pickle deserialization in ZeroMQ communication to execute arbitrary code on secondary hosts. Only users running vLLM with V0 engine enabled in multi-host tensor parallelism configurations are affected.

💻 Affected Systems

Products:
  • vLLM
Versions: All versions with V0 engine enabled
Operating Systems: All platforms running vLLM
Default Config Vulnerable: ✅ No
Notes: Only affects V0 engine (off by default since v0.8.0) in multi-host tensor parallelism deployments.

📦 What is this software?

vLLM is an open-source, high-throughput inference and serving engine for large language models. In multi-node deployments it splits a model across hosts via tensor parallelism, with the primary and secondary hosts coordinating over ZeroMQ sockets.

⚠️ Risk & Real-World Impact

🔴

Worst Case

Complete compromise of all secondary hosts in the vLLM deployment, leading to data theft, model manipulation, and lateral movement within the network.

🟠

Likely Case

Limited exploitation requiring network access to the vLLM cluster, potentially leading to compromise of secondary nodes if primary is already breached.

🟢

If Mitigated

No impact if V0 engine is disabled or deployment uses single host or V1 engine.

🌐 Internet-Facing: MEDIUM - Requires specific multi-node configuration and network access, but exploit could be chained with other attacks.
🏢 Internal Only: HIGH - In internal multi-node deployments, this provides a critical escalation path if primary host is compromised.

🎯 Exploit Status

Public PoC: ✅ No
Weaponized: UNKNOWN
Unauthenticated Exploit: ⚠️ Yes
Complexity: MEDIUM

Exploitation requires network access to vLLM cluster and knowledge of pickle deserialization attacks. Could be combined with ARP poisoning or other MITM techniques.
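The root cause is that `pickle.loads` on attacker-controlled bytes can execute arbitrary code during deserialization, before any application logic runs. A minimal, harmless illustration of the mechanism (not vLLM code; the payload here just calls `print`):

```python
import pickle

class Payload:
    # __reduce__ tells pickle how to reconstruct the object.
    # An attacker controls this in crafted bytes, so they can make
    # deserialization call any importable callable with chosen arguments.
    def __reduce__(self):
        return (print, ("code executed during unpickling",))

data = pickle.dumps(Payload())
pickle.loads(data)  # calls print(...) as a side effect of deserialization
```

A real exploit would substitute something like `os.system` for `print`, which is why unpickling data from an untrusted socket is equivalent to remote code execution.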

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: None - maintainers decided not to fix

Vendor Advisory: https://github.com/vllm-project/vllm/security/advisories/GHSA-9pcc-gvx5-r5wm

Restart Required: No

Instructions:

No official patch. Migrate to V1 engine or implement workarounds.

🔧 Temporary Workarounds

Disable V0 Engine (applies to: all)

Switch to the V1 engine, which is not vulnerable:

  • Set the environment variable VLLM_USE_V1=1
  • Or configure vLLM settings to use the V1 engine
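As a sketch, the toggle can be set from Python before vLLM is imported (assumption: VLLM_USE_V1 is the engine-selection variable for your vLLM version; the engine choice is read at import/startup time, so setting it later has no effect):

```python
import os

# Must be set before vLLM is imported or the server is started,
# so the engine-selection logic sees it.
os.environ["VLLM_USE_V1"] = "1"

# Equivalent at launch time from a shell:
#   VLLM_USE_V1=1 vllm serve <model> ...
```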

Network Segmentation (applies to: all)

Isolate the vLLM cluster on a secure network:

  • Implement firewall rules to restrict access to vLLM ports
  • Use VLANs or private subnets for vLLM inter-node communication

🧯 If You Can't Patch

  • Disable multi-host tensor parallelism and use single-host deployments only
  • Implement strict network controls and monitor for ARP poisoning or MITM attacks

🔍 How to Verify

Check if Vulnerable:

Check if V0 engine is enabled and deployment uses multi-host tensor parallelism. Review vLLM configuration files and environment variables.
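A quick heuristic check can be scripted (assumption: VLLM_USE_V1 governs engine selection, and an unset variable falls back to a version-dependent default, V1 since v0.8.0, so the unset case should be investigated rather than assumed safe):

```python
import os

def v0_engine_forced(env=os.environ) -> bool:
    """Return True when the V0 engine is explicitly selected.

    Heuristic only: VLLM_USE_V1 unset means the default applies,
    which is V0 on vLLM < 0.8.0 and V1 on >= 0.8.0, so an unset
    value on older versions still warrants manual review.
    """
    return env.get("VLLM_USE_V1", "") == "0"

print(v0_engine_forced({"VLLM_USE_V1": "0"}))  # True
print(v0_engine_forced({"VLLM_USE_V1": "1"}))  # False
```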

Check Version:

python -c "import vllm; print(vllm.__version__)"

Verify Fix Applied:

Confirm V1 engine is active or multi-host tensor parallelism is disabled. Verify network segmentation is in place.

📡 Detection & Monitoring

Log Indicators:

  • Unusual pickle deserialization errors
  • Suspicious ZeroMQ connection attempts
  • Unexpected process execution on secondary hosts

Network Indicators:

  • ARP cache poisoning attempts
  • Unexpected traffic to vLLM ZeroMQ ports
  • Malformed pickle data in network captures

SIEM Query:

Search for 'pickle.loads' errors in vLLM logs combined with network anomalies
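A starting point for log triage might look like the following sketch (the patterns and log format are assumptions; adapt them to your actual vLLM logging configuration and SIEM):

```python
import re

# Hypothetical indicators: stack traces or errors mentioning pickle
# deserialization, which CPython surfaces as _pickle.UnpicklingError.
SUSPICIOUS = re.compile(r"pickle\.(loads|Unpickler)|UnpicklingError")

def flag_lines(log_lines):
    """Return log lines that mention pickle deserialization failures."""
    return [line for line in log_lines if SUSPICIOUS.search(line)]

sample = [
    "INFO engine started",
    "ERROR _pickle.UnpicklingError: invalid load key, 'x'.",
]
print(flag_lines(sample))
```

Correlate any hits with the network indicators above (unexpected connections to vLLM's ZeroMQ ports) before escalating.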

🔗 References

  • Vendor advisory: https://github.com/vllm-project/vllm/security/advisories/GHSA-9pcc-gvx5-r5wm