CVE-2025-32444
📋 TL;DR
This vulnerability allows remote code execution on vLLM instances using mooncake integration via insecure pickle deserialization over ZeroMQ sockets. Attackers can execute arbitrary code on affected systems by sending malicious payloads to the vulnerable sockets. Only vLLM deployments with mooncake integration enabled are affected.
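As a minimal sketch of why unpickling untrusted bytes amounts to code execution, the example below builds a harmless payload whose `__reduce__` method makes `pickle.loads` call a function chosen by the sender. The class name `Payload` and the choice of `pow` are illustrative assumptions; this is not vLLM or mooncake code, and a real attacker would substitute something like `os.system`.

```python
import pickle


class Payload:
    """Hypothetical object that tells pickle how to 'rebuild' it on load."""

    def __reduce__(self):
        # On load, pickle calls pow(2, 10) instead of reconstructing a Payload.
        # An attacker would return e.g. (os.system, ("<command>",)) here.
        return (pow, (2, 10))


def demo_unsafe_load() -> int:
    data = pickle.dumps(Payload())  # bytes a sender could push over a socket
    return pickle.loads(data)       # the receiver runs the embedded callable


if __name__ == "__main__":
    # The receiver gets the callable's return value, not a Payload instance.
    print(demo_unsafe_load())
```

The receiver never has to use the resulting object: the call happens inside `pickle.loads` itself, which is why pickle data should only be loaded from trusted sources.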
💻 Affected Systems
- vLLM versions before 0.8.5 with mooncake integration enabled
📦 What is this software?
vLLM by vLLM
⚠️ Risk & Real-World Impact
Worst Case
Complete system compromise allowing attackers to execute arbitrary commands, steal data, install malware, or pivot to other systems in the network.
Likely Case
Remote code execution leading to data exfiltration, cryptocurrency mining, or system disruption.
If Mitigated
Limited impact if network segmentation and access controls prevent external access to ZeroMQ sockets.
🎯 Exploit Status
Exploitation requires sending malicious pickle payloads to ZeroMQ sockets, which is straightforward for attackers familiar with pickle deserialization attacks.
🛠️ Fix & Mitigation
✅ Official Fix
Patch Version: vLLM 0.8.5
Vendor Advisory: https://github.com/vllm-project/vllm/security/advisories/GHSA-hj4w-hm2g-p6w5
Restart Required: Yes
Instructions:
1. Upgrade vLLM to version 0.8.5 or later using pip: pip install 'vllm>=0.8.5' (quote the requirement so the shell does not treat '>' as a redirect).
2. Restart all vLLM services.
3. Verify mooncake integration is using secure serialization.
🔧 Temporary Workarounds
Disable mooncake integration
Remove or disable mooncake integration if not required for your deployment (all platforms).
Modify vLLM configuration to disable mooncake features
Remove mooncake-related imports and configurations
Network isolation
Restrict network access to ZeroMQ sockets using firewall rules (Linux).
iptables -A INPUT -p tcp --dport [ZMQ_PORT] -s [TRUSTED_IP] -j ACCEPT
iptables -A INPUT -p tcp --dport [ZMQ_PORT] -j DROP
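After applying firewall rules like the ones above, it can help to confirm from an untrusted host that the port is no longer reachable. The following is a minimal sketch using only the Python standard library; the function name `port_is_reachable` is hypothetical, and the host and port passed to it must come from your own deployment.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    import sys

    # Usage: python check_port.py <vllm-host> <zmq-port>
    host, port = sys.argv[1], int(sys.argv[2])
    state = "REACHABLE" if port_is_reachable(host, port) else "blocked or closed"
    print(f"{host}:{port} is {state}")
```

Run from a machine outside the trusted set, the expected result after isolation is "blocked or closed". A successful TCP connection only shows the port is exposed; it does not by itself prove the service is exploitable.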
🧯 If You Can't Patch
- Implement strict network segmentation to isolate vLLM instances from untrusted networks
- Deploy application-level firewalls to monitor and block suspicious pickle payloads
🔍 How to Verify
Check if Vulnerable:
Check vLLM version and mooncake usage:
1. Run 'pip show vllm' to check version.
2. Review configuration files for mooncake integration references.
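The configuration review can be partly automated. The sketch below walks a directory and reports lines that mention "mooncake" (case-insensitive). The function name `find_mooncake_references`, the file patterns, and the directory argument are assumptions for illustration; where your deployment keeps its launch scripts and configuration is specific to your environment.

```python
from pathlib import Path

# Hypothetical set of file types worth scanning; adjust for your deployment.
DEFAULT_PATTERNS = ("*.json", "*.yaml", "*.yml", "*.sh", "*.py", "*.env")


def find_mooncake_references(root: str, patterns=DEFAULT_PATTERNS) -> list:
    """Return (path, line_number, line) tuples for lines mentioning mooncake."""
    hits = []
    for pattern in patterns:
        for path in sorted(Path(root).rglob(pattern)):
            try:
                text = path.read_text(errors="ignore")
            except OSError:
                continue  # unreadable file or a directory matching the pattern
            for lineno, line in enumerate(text.splitlines(), start=1):
                if "mooncake" in line.lower():
                    hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    import sys

    for path, lineno, line in find_mooncake_references(sys.argv[1]):
        print(f"{path}:{lineno}: {line}")
```

An empty result is only a hint, not proof, that the integration is unused: settings may also arrive through command-line arguments or environment variables that live outside the scanned directory.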
Check Version:
pip show vllm | grep Version
Verify Fix Applied:
1. Confirm vLLM version is 0.8.5 or newer.
2. Verify mooncake integration uses secure serialization methods instead of pickle.
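The version check can be scripted, for example in a deployment health check. This is a minimal sketch that reads the installed version with `importlib.metadata` and compares its leading numeric components against 0.8.5. The helper names `parse_release` and `is_patched` are hypothetical, and the parsing is deliberately simple: it ignores suffixes, so a pre-release such as `0.8.5rc1` would be reported as patched.

```python
import re
from importlib import metadata

PATCHED = (0, 8, 5)  # first fixed release according to the advisory


def parse_release(version: str) -> tuple:
    """Extract leading numeric components, e.g. '0.8.5.post1' -> (0, 8, 5)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_patched(version: str) -> bool:
    """True if the release components are at or above the patched version."""
    return parse_release(version) >= PATCHED


if __name__ == "__main__":
    try:
        installed = metadata.version("vllm")
    except metadata.PackageNotFoundError:
        print("vllm is not installed in this environment")
    else:
        status = "patched" if is_patched(installed) else "VULNERABLE, upgrade"
        print(f"vllm {installed}: {status}")
```

Run the script with the same Python interpreter that launches the vLLM service, since several environments on one host may have different versions installed.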
📡 Detection & Monitoring
Log Indicators:
- Unusual process spawns from vLLM services
- Errors related to pickle deserialization
- Suspicious network connections to ZeroMQ ports
Network Indicators:
- Unexpected pickle serialization data sent to vLLM ports
- Malformed or unusually large pickle payloads
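One way to inspect a captured message body for suspicious pickle content without executing it is the standard library's `pickletools.genops`, which walks the opcode stream instead of unpickling. The sketch below flags opcodes that import names or invoke callables. The function name `risky_pickle_opcodes` and the chosen opcode set are assumptions for illustration, and extracting message bodies from ZeroMQ traffic is out of scope here.

```python
import pickle
import pickletools

# Opcodes that import names or invoke callables during unpickling.
RISKY_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST",
    "OBJ", "NEWOBJ", "NEWOBJ_EX", "BUILD",
}


def risky_pickle_opcodes(data: bytes) -> list:
    """List risky opcode names found in data, without unpickling it."""
    found = []
    try:
        for opcode, _arg, _pos in pickletools.genops(data):
            if opcode.name in RISKY_OPCODES:
                found.append(opcode.name)
    except Exception:
        # Not a well-formed pickle stream; keep whatever was seen so far.
        pass
    return found


if __name__ == "__main__":
    plain = pickle.dumps({"a": [1, 2]})  # containers and primitives only
    by_ref = pickle.dumps(pow)           # pickled as a reference to a callable
    print("plain :", risky_pickle_opcodes(plain))
    print("by_ref:", risky_pickle_opcodes(by_ref))
```

This is a heuristic for triage, not a filter to rely on: legitimate pickles of custom classes also contain these opcodes, so any hit needs context before it is treated as an attack.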
SIEM Query:
source="vllm.log" AND ("pickle" OR "deserialization" OR "mooncake") AND severity=ERROR
🔗 References
- https://github.com/vllm-project/vllm/blob/32b14baf8a1f7195ca09484de3008063569b43c5/vllm/distributed/kv_transfer/kv_pipe/mooncake_pipe.py#L179
- https://github.com/vllm-project/vllm/commit/a5450f11c95847cf51a17207af9a3ca5ab569b2c
- https://github.com/vllm-project/vllm/security/advisories/GHSA-hj4w-hm2g-p6w5
- https://github.com/vllm-project/vllm/security/advisories/GHSA-x3m8-f7g5-qhm7