CVE-2025-24357

CVSS: 7.5 (HIGH)

📋 TL;DR

This vulnerability in vLLM allows remote code execution when loading malicious model checkpoints from Hugging Face. Attackers can execute arbitrary code during unpickling when torch.load processes untrusted data. All vLLM users loading external model weights are affected.
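
To illustrate the root cause: torch.load defaults to Python's pickle, and unpickling can invoke arbitrary callables. A benign, torch-free sketch of the mechanism (the Payload class and record helper are illustrative names, not from vLLM):

```python
import pickle

# Benign demonstration of why unpickling untrusted bytes is dangerous:
# pickle restores objects via __reduce__, which may name ANY callable.
# torch.load without weights_only=True runs this same machinery on
# attacker-controlled checkpoint files.
executed = []

def record(message):          # stands in for os.system in a real attack
    executed.append(message)
    return message

class Payload:
    def __reduce__(self):
        return (record, ("code ran during unpickling",))

blob = pickle.dumps(Payload())   # the "malicious checkpoint" bytes
pickle.loads(blob)               # merely loading the bytes runs record()
print(executed)                  # → ['code ran during unpickling']
```

No attribute access or method call on the loaded object is needed; deserialization alone executes the callable, which is why loading an untrusted checkpoint is sufficient for compromise.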

💻 Affected Systems

Products:
  • vLLM
Versions: All versions before v0.7.0
Operating Systems: All platforms running vLLM
Default Config Vulnerable: ⚠️ Yes
Notes: Vulnerability triggers when loading model checkpoints via huggingface_hub or similar sources.

📦 What is this software?

vLLM is a high-throughput, memory-efficient inference and serving engine for large language models. It is widely used to serve model weights downloaded from Hugging Face and similar model hubs, which is exactly the code path this vulnerability affects.

⚠️ Risk & Real-World Impact

🔴 Worst Case

Full system compromise with the attacker gaining shell access, data exfiltration, and persistent backdoor installation.

🟠 Likely Case

Arbitrary code execution in the vLLM process context, potentially leading to model theft, data leakage, or service disruption.

🟢 If Mitigated

No impact if loading only trusted, verified model checkpoints from controlled sources.

🌐 Internet-Facing: HIGH - If vLLM loads models from untrusted internet sources like public Hugging Face repositories.
🏢 Internal Only: MEDIUM - Risk exists if loading models from internal repositories without proper verification.

🎯 Exploit Status

Public PoC: ✅ No
Weaponized: UNKNOWN
Unauthenticated Exploit: ✅ No
Complexity: MEDIUM

Requires attacker to supply malicious model checkpoint file to victim's vLLM instance.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: v0.7.0 and later

Vendor Advisory: https://github.com/vllm-project/vllm/security/advisories/GHSA-rh4j-5rhw-hr54

Restart Required: No system reboot, but running vLLM processes must be restarted to pick up the upgraded package.

Instructions:

1. Update vLLM: pip install --upgrade "vllm>=0.7.0" (quote the specifier so the shell does not interpret >= as a redirect)
2. Verify installation: python -c "import vllm; print(vllm.__version__)"
3. Ensure all model loading operations use the patched version.

🔧 Temporary Workarounds

Use the weights_only parameter (all platforms)

Manually set weights_only=True when calling torch.load in custom code:

torch.load(model_path, weights_only=True)

Restrict model sources (all platforms)

Only load models from trusted, verified sources with integrity checks.
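
The source-restriction workaround can be sketched as a simple allowlist gate applied before any repository ID reaches vLLM (the repo names below are illustrative examples, not an endorsed list):

```python
# Illustrative allowlist of Hugging Face repo IDs this deployment trusts.
# Maintain your own list; these entries are examples only.
TRUSTED_REPOS = {
    "meta-llama/Llama-3.1-8B-Instruct",
    "mistralai/Mistral-7B-Instruct-v0.3",
}

def is_trusted_repo(repo_id: str) -> bool:
    """Exact-match gate applied before a repo ID is passed to vLLM."""
    return repo_id in TRUSTED_REPOS

repo_id = "attacker/evil-model"
if not is_trusted_repo(repo_id):
    print(f"Refusing to load untrusted repo: {repo_id}")
```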

🧯 If You Can't Patch

  • Implement strict model verification: Use cryptographic signatures or checksums for all model files before loading
  • Isolate vLLM in container with minimal privileges and network restrictions

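A minimal sketch of the checksum verification step, using only the standard library (the expected hash must come from your own trusted, out-of-band record, never from the same source as the download):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so multi-GB checkpoints never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_checkpoint(path: str, expected_sha256: str) -> bool:
    """Compare against a hash recorded from a trusted source; load only on True."""
    return sha256_of(path) == expected_sha256.lower()
```

Cryptographic signatures (e.g. sigstore or GPG-signed manifests) are stronger than bare checksums, since a checksum stored next to the file can be replaced along with it.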
🔍 How to Verify

Check if Vulnerable:

Check the installed vLLM version with the command below - any version below 0.7.0 is vulnerable.

Check Version:

python -c "import vllm; print(vllm.__version__)"

Verify Fix Applied:

Verify version >= 0.7.0 and check that hf_model_weights_iterator uses weights_only=True
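
The version comparison can also be done programmatically. The parse helper below is a deliberate simplification for plain numeric versions; a production check should use packaging.version for full PEP 440 handling:

```python
# Simplified comparison against the patched release v0.7.0.
# NOTE: parse() handles versions like "0.6.6" or "0.6.3.post1" by keeping
# only the numeric dotted parts; it is an illustration, not a full parser.
def parse(version: str) -> tuple:
    return tuple(
        int(part) for part in version.split("+")[0].split(".") if part.isdigit()
    )

def is_vulnerable(installed_version: str, patched: str = "0.7.0") -> bool:
    return parse(installed_version) < parse(patched)
```

In practice, feed it the string printed by python -c "import vllm; print(vllm.__version__)".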

📡 Detection & Monitoring

Log Indicators:

  • Unexpected process spawns from vLLM
  • Model loading errors with pickle exceptions
  • Unusual network connections during model loading

Network Indicators:

  • Downloads from unexpected Hugging Face repositories
  • Outbound connections from vLLM process to unknown IPs

SIEM Query:

process_name:vllm AND (process_spawn:* OR network_connection:* WHERE destination_ip NOT IN trusted_ips)

🔗 References

  • Vendor advisory: https://github.com/vllm-project/vllm/security/advisories/GHSA-rh4j-5rhw-hr54
  • NVD entry: https://nvd.nist.gov/vuln/detail/CVE-2025-24357
