CVE-2025-30165
📋 TL;DR
This vulnerability allows remote code execution in multi-node vLLM deployments using the V0 engine. Attackers can exploit unsafe pickle deserialization in ZeroMQ communication to execute arbitrary code on secondary hosts. Only users running vLLM with V0 engine enabled in multi-host tensor parallelism configurations are affected.
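To see why unsafe deserialization is remotely exploitable, the sketch below is a generic illustration of the pickle attack class, not vLLM's actual wire format or code: any object whose `__reduce__` returns a callable is executed by `pickle.loads`, so a host that unpickles attacker-controlled bytes runs attacker-chosen code.

```python
import pickle

class Evil:
    # __reduce__ tells pickle how to reconstruct an object.
    # An attacker controls it, so it can name ANY callable.
    def __reduce__(self):
        return (print, ("arbitrary code ran during unpickling",))

payload = pickle.dumps(Evil())

# The receiver never needs the Evil class: pickle.loads
# invokes the callable from __reduce__ unconditionally.
pickle.loads(payload)  # prints the message during unpickling
```

In a real attack the callable would be something like `os.system`, which is why the advisory treats pickle over an untrusted ZeroMQ channel as remote code execution.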
💻 Affected Systems
- vLLM
📦 What is this software?
vLLM, by the vLLM project: an open-source, high-throughput inference and serving engine for large language models. Multi-node deployments use distributed communication (including ZeroMQ) to coordinate tensor-parallel workers across hosts.
⚠️ Risk & Real-World Impact
Worst Case
Complete compromise of all secondary hosts in the vLLM deployment, leading to data theft, model manipulation, and lateral movement within the network.
Likely Case
Limited exploitation requiring network access to the vLLM cluster, potentially leading to compromise of secondary nodes if primary is already breached.
If Mitigated
No impact if V0 engine is disabled or deployment uses single host or V1 engine.
🎯 Exploit Status
Exploitation requires network access to vLLM cluster and knowledge of pickle deserialization attacks. Could be combined with ARP poisoning or other MITM techniques.
🛠️ Fix & Mitigation
✅ Official Fix
Patch Version: None - maintainers decided not to fix
Vendor Advisory: https://github.com/vllm-project/vllm/security/advisories/GHSA-9pcc-gvx5-r5wm
Restart Required: No
Instructions:
No official patch is available. Migrate to the V1 engine or apply the workarounds below.
🔧 Temporary Workarounds
Disable V0 Engine
Switch to the V1 engine, which is not vulnerable
Set environment variable: VLLM_USE_V1=1
Or configure in vLLM settings to use V1 engine
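A minimal sketch of the environment-variable workaround; the launch command is illustrative and the model name is a placeholder:

```shell
# Select the V1 engine before launching vLLM. The V1 engine does
# not use the vulnerable pickle-over-ZeroMQ path described in the
# advisory.
export VLLM_USE_V1=1

# Then launch as usual, e.g. (model name is a placeholder):
# vllm serve meta-llama/Llama-2-7b-hf --tensor-parallel-size 2
```

Set the variable in the service unit or container environment so it survives restarts, not just in an interactive shell.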
Network Segmentation
Isolate the vLLM cluster on a secure network
Implement firewall rules to restrict access to vLLM ports
Use VLANs or private subnets for vLLM communication
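A hypothetical firewall sketch for the segmentation above. The subnet (10.0.0.0/24) and port (5555) are assumptions for illustration only: vLLM's inter-node ports depend on your configuration, so substitute the actual addresses and ports of your deployment.

```shell
# Allow inter-node traffic only from the assumed cluster subnet
# (10.0.0.0/24) on the assumed coordination port (5555) ...
iptables -A INPUT -p tcp -s 10.0.0.0/24 --dport 5555 -j ACCEPT

# ... and drop that port for everyone else.
iptables -A INPUT -p tcp --dport 5555 -j DROP
```

The goal is that only known cluster nodes can reach the ZeroMQ sockets, since the vulnerability requires network access to them.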
🧯 If You Can't Patch
- Disable multi-host tensor parallelism and use single-host deployments only
- Implement strict network controls and monitor for ARP poisoning or MITM attacks
🔍 How to Verify
Check if Vulnerable:
Check if V0 engine is enabled and deployment uses multi-host tensor parallelism. Review vLLM configuration files and environment variables.
Check Version:
python -c "import vllm; print(vllm.__version__)"
Verify Fix Applied:
Confirm V1 engine is active or multi-host tensor parallelism is disabled. Verify network segmentation is in place.
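A minimal sketch for checking the engine selection on a host; the `VLLM_USE_V1` variable name comes from the workaround above, and the exact default engine varies by vLLM version:

```python
import os

# If VLLM_USE_V1 is unset or not "1", affected versions may fall
# back to the vulnerable V0 engine.
if os.environ.get("VLLM_USE_V1") != "1":
    print("WARNING: VLLM_USE_V1 is not set to 1; V0 engine may be active")
else:
    print("V1 engine explicitly selected")
```

Run this in the same environment (service unit, container) that launches vLLM, since the variable must be visible to the server process itself.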
📡 Detection & Monitoring
Log Indicators:
- Unusual pickle deserialization errors
- Suspicious ZeroMQ connection attempts
- Unexpected process execution on secondary hosts
Network Indicators:
- ARP cache poisoning attempts
- Unexpected traffic to vLLM ZeroMQ ports
- Malformed pickle data in network captures
SIEM Query:
Search for 'pickle.loads' errors in vLLM logs combined with network anomalies
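A starting point for the log search above; the log path is an assumption and should be adjusted to wherever your deployment writes vLLM output:

```shell
# Surface unpickling failures in vLLM logs (path is hypothetical).
# UnpicklingError noise can indicate malformed or tampered payloads
# hitting the deserialization path.
grep -iE "pickle|UnpicklingError" /var/log/vllm/*.log
```

Correlate any hits with the network indicators above (unexpected connections to the cluster's ZeroMQ ports) before treating them as exploitation attempts.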
🔗 References
- https://github.com/vllm-project/vllm/blob/c21b99b91241409c2fdf9f3f8c542e8748b317be/vllm/distributed/device_communicators/shm_broadcast.py#L295-L301
- https://github.com/vllm-project/vllm/blob/c21b99b91241409c2fdf9f3f8c542e8748b317be/vllm/distributed/device_communicators/shm_broadcast.py#L468-L470
- https://github.com/vllm-project/vllm/security/advisories/GHSA-9pcc-gvx5-r5wm