CVE-2025-32444

CVSS 10.0 CRITICAL

📋 TL;DR

This vulnerability allows remote code execution on vLLM instances using mooncake integration via insecure pickle deserialization over ZeroMQ sockets. Attackers can execute arbitrary code on affected systems by sending malicious payloads to the vulnerable sockets. Only vLLM deployments with mooncake integration enabled are affected.

💻 Affected Systems

Products:
  • vLLM with mooncake integration
Versions: vLLM versions 0.6.5 through 0.8.4
Operating Systems: All operating systems running vLLM
Default Config Vulnerable: ⚠️ Yes
Notes: Only affects vLLM instances with mooncake integration enabled. Standard vLLM deployments without mooncake are not vulnerable.

⚠️ Risk & Real-World Impact

🔴 Worst Case

Complete system compromise allowing attackers to execute arbitrary commands, steal data, install malware, or pivot to other systems in the network.

🟠 Likely Case

Remote code execution leading to data exfiltration, cryptocurrency mining, or system disruption.

🟢 If Mitigated

Limited impact if network segmentation and access controls prevent external access to ZeroMQ sockets.

🌐 Internet-Facing: HIGH - Vulnerable sockets listen on all network interfaces by default, making internet-exposed instances easily exploitable.
🏢 Internal Only: HIGH - Even internally, attackers with network access can exploit this without authentication.

🎯 Exploit Status

Public PoC: ✅ No
Weaponized: LIKELY
Unauthenticated Exploit: ⚠️ Yes
Complexity: LOW

Exploitation requires sending malicious pickle payloads to ZeroMQ sockets, which is straightforward for attackers familiar with pickle deserialization attacks.
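To make the risk concrete, here is a minimal, benign sketch of why unpickling untrusted bytes is unsafe, together with the restricted-unpickler pattern described in the Python documentation. It is not taken from vLLM's code. The names `Probe`, `NoGlobalsUnpickler`, and `restricted_loads` are hypothetical, and the payload only calls the harmless built-in `len`.

```python
import io
import pickle


class Probe:
    """Benign stand-in for attacker-controlled data (hypothetical name)."""

    def __reduce__(self):
        # pickle.loads() will call len("abc") while deserializing.
        # A real attacker would substitute a dangerous callable here.
        return (len, ("abc",))


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global (class or function)."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Deserialize plain containers and scalars only."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


payload = pickle.dumps(Probe())

# Plain pickle.loads() runs the embedded callable and returns its result.
unsafe_result = pickle.loads(payload)

# The restricted loader rejects the same bytes before any call happens.
try:
    restricted_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Restricting globals narrows the attack surface but is not a complete defense. For data received over a network socket, a format that cannot encode callables is generally the safer choice.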

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: vLLM 0.8.5

Vendor Advisory: https://github.com/vllm-project/vllm/security/advisories/GHSA-hj4w-hm2g-p6w5

Restart Required: Yes

Instructions:

1. Upgrade vLLM to version 0.8.5 or later using pip: pip install "vllm>=0.8.5" (quote the requirement so the shell does not treat > as output redirection).
2. Restart all vLLM services.
3. Verify mooncake integration is using secure serialization.

🔧 Temporary Workarounds

Disable mooncake integration

Platform: all

Remove or disable mooncake integration if not required for your deployment.

  • Modify vLLM configuration to disable mooncake features
  • Remove mooncake-related imports and configurations

Network isolation

Platform: Linux

Restrict network access to ZeroMQ sockets using firewall rules.

iptables -A INPUT -p tcp --dport [ZMQ_PORT] -s [TRUSTED_IP] -j ACCEPT
iptables -A INPUT -p tcp --dport [ZMQ_PORT] -j DROP
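After applying the rules, it can help to confirm from an untrusted host that the port is no longer reachable. The following is a small sketch using only the Python standard library. The function name `is_port_reachable` is hypothetical, and the host and port you pass in are whatever your deployment uses for [ZMQ_PORT].

```python
import socket


def is_port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts
        # (a DROP rule typically shows up as a timeout).
        return False
```

Run it from a machine that is not in the trusted list; a result of False suggests the DROP rule is taking effect, while True means the socket is still exposed to that host.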

🧯 If You Can't Patch

  • Implement strict network segmentation to isolate vLLM instances from untrusted networks
  • Deploy application-level firewalls to monitor and block suspicious pickle payloads

🔍 How to Verify

Check if Vulnerable:

Check the vLLM version and mooncake usage:

1. Run 'pip show vllm' to check the version.
2. Review configuration files for mooncake integration references.

Check Version:

pip show vllm | grep Version

Verify Fix Applied:

1. Confirm vLLM version is 0.8.5 or newer.
2. Verify mooncake integration uses secure serialization methods instead of pickle.
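The version check can also be scripted. The sketch below compares an installed version string against the affected range stated above (0.6.5 through 0.8.4). The helper names are hypothetical, and the parser only reads the leading numeric release components, so pre-release or local version suffixes are ignored.

```python
import re
from importlib import metadata

AFFECTED_MIN = (0, 6, 5)  # first affected release, per this advisory
PATCHED = (0, 8, 5)       # first fixed release, per this advisory


def parse_release(version: str) -> tuple:
    """Extract (major, minor, patch) from the start of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_affected(version: str) -> bool:
    """True if the version falls in the affected range."""
    return AFFECTED_MIN <= parse_release(version) < PATCHED


def installed_vllm_version():
    """Return the installed vLLM version string, or None if not installed."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None
```

A version inside the range only indicates exposure if mooncake integration is also enabled, so this check complements the configuration review rather than replacing it.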

📡 Detection & Monitoring

Log Indicators:

  • Unusual process spawns from vLLM services
  • Errors related to pickle deserialization
  • Suspicious network connections to ZeroMQ ports

Network Indicators:

  • Unexpected pickle serialization data sent to vLLM ports
  • Malformed or unusually large pickle payloads

SIEM Query:

source="vllm.log" AND ("pickle" OR "deserialization" OR "mooncake") AND severity=ERROR
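Where no SIEM is available, the same filter can be approximated with a short script. This is a sketch only: it assumes a plain-text log in which the severity appears as the literal string ERROR on each line, which may not match your logging configuration, and it treats the keywords as case-insensitive. The function names are hypothetical.

```python
import re

# Same keywords as the query above, matched case-insensitively (assumption).
KEYWORDS = re.compile(r"pickle|deserialization|mooncake", re.IGNORECASE)


def is_suspicious(line: str) -> bool:
    """True for ERROR-level lines that mention any of the keywords."""
    return "ERROR" in line and KEYWORDS.search(line) is not None


def scan(lines):
    """Return the matching lines with trailing newlines removed."""
    return [line.rstrip("\n") for line in lines if is_suspicious(line)]
```

For example, `scan(open("vllm.log"))` would return the matching lines from a file named as in the query above.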
