CVE-2025-62164
📋 TL;DR
A memory corruption vulnerability in vLLM's Completions API endpoint allows attackers to send malicious prompt embeddings that bypass bounds checks and trigger out-of-bounds memory writes. This can cause denial-of-service crashes and potentially remote code execution on servers running vulnerable vLLM versions. Organizations using vLLM versions 0.10.2 through 0.11.0 for LLM inference are affected.
💻 Affected Systems
- vLLM
📦 What is this software?
vLLM, maintained by the vLLM project, is an open-source, high-throughput inference and serving engine for large language models. It exposes OpenAI-compatible API endpoints, including the Completions API affected by this vulnerability.
⚠️ Risk & Real-World Impact
Worst Case
Remote code execution on the vLLM server, allowing complete system compromise and data exfiltration.
Likely Case
Denial-of-service crashes disrupting LLM inference services and potentially corrupting model states.
If Mitigated
Service disruption limited to the affected vLLM instance if proper network segmentation and monitoring are in place.
🎯 Exploit Status
Exploitation requires crafting malicious serialized sparse tensors, but it leverages a known PyTorch behavior change: as of PyTorch 2.8.0, torch.load no longer runs sparse tensor integrity checks by default, so a malicious sparse tensor submitted as prompt embeddings can trigger out-of-bounds writes when densified.
🛠️ Fix & Mitigation
✅ Official Fix
Patch Version: 0.11.1
Vendor Advisory: https://github.com/vllm-project/vllm/security/advisories/GHSA-mrw7-hf4f-83pf
Restart Required: Yes
Instructions:
1. Update vLLM to version 0.11.1 or later using pip: pip install --upgrade vllm==0.11.1
2. Restart all vLLM services.
3. Verify the patch is applied by checking the version.
🔧 Temporary Workarounds
Downgrade PyTorch
Revert to a PyTorch version before 2.8.0, where sparse tensor integrity checks are enabled by default.
pip install torch==2.7.1
Disable Completions API
Temporarily disable the vulnerable Completions API endpoint if it is not required.
🧯 If You Can't Patch
- Implement strict network access controls to limit Completions API access to trusted sources only.
- Deploy vLLM instances in isolated containers with minimal privileges to limit potential RCE impact.
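As one concrete sketch of the network access control above, the helper below uses Python's stdlib ipaddress module to decide whether a client IP falls inside a trusted subnet. The subnet list and function name are illustrative assumptions, not part of vLLM; in practice this check would live in a reverse proxy or API gateway in front of the Completions endpoint.

```python
import ipaddress

# Hypothetical allowlist of trusted client subnets (adjust for your network).
TRUSTED_SUBNETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.1.0/24"),
]

def is_trusted(client_ip: str) -> bool:
    """Return True if the client IP belongs to any trusted subnet."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in TRUSTED_SUBNETS)
```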
🔍 How to Verify
Check if Vulnerable:
Check both the vLLM and PyTorch versions: if vLLM is 0.10.2 through 0.11.0 (inclusive) and PyTorch is 2.8.0 or later, the system is vulnerable.
Check Version:
python -c "import vllm, torch; print(vllm.__version__, torch.__version__)"
Verify Fix Applied:
Confirm vLLM version is 0.11.1 or later and test the Completions API with valid embeddings to ensure normal operation.
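The version check above can be automated. The helper below encodes the vulnerable window (vLLM 0.10.2 through 0.11.0 combined with PyTorch 2.8.0 or later) as a pure function; it is a minimal sketch assuming plain numeric version strings without pre-release suffixes.

```python
def parse(version: str) -> tuple:
    """Turn '0.11.0' into (0, 11, 0) for tuple comparison."""
    return tuple(int(part) for part in version.split("."))

def is_vulnerable(vllm_version: str, torch_version: str) -> bool:
    """True if this vLLM/PyTorch pair falls in the CVE-2025-62164 window."""
    v = parse(vllm_version)
    return (0, 10, 2) <= v <= (0, 11, 0) and parse(torch_version) >= (2, 8, 0)
```

Feed it the output of the one-liner above, e.g. is_vulnerable("0.11.0", "2.8.0") should report vulnerable, while "0.11.1" on any PyTorch should not.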
📡 Detection & Monitoring
Log Indicators:
- Unexpected crashes or segmentation faults in vLLM logs
- Errors related to torch.load() or tensor processing in Completions API requests
Network Indicators:
- Unusual spikes in requests to the /v1/completions endpoint
- Requests with abnormally large or malformed payloads
SIEM Query:
source="vllm.logs" AND ("segmentation fault" OR "torch.load" OR "to_dense")
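The log indicators above can also be scanned outside a SIEM. The snippet below filters log lines for the same keywords as the query; the log format is an assumption, and the keyword list mirrors the indicators listed in this section.

```python
# Keywords drawn from the log indicators and SIEM query above.
INDICATORS = ("segmentation fault", "torch.load", "to_dense")

def suspicious_lines(log_lines):
    """Return log lines containing any CVE-2025-62164 indicator keyword."""
    return [line for line in log_lines
            if any(keyword in line.lower() for keyword in INDICATORS)]

sample = [
    "INFO 200 POST /v1/completions",
    "ERROR worker died: segmentation fault (core dumped)",
    "WARN torch.load failed on prompt_embeds payload",
]
print(suspicious_lines(sample))
```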