CVE-2025-62372
📋 TL;DR
This vulnerability allows users to crash the vLLM inference engine by passing malformed multimodal embedding inputs that have the correct total dimensionality but an incorrect shape. It affects vLLM deployments serving multimodal models from version 0.5.5 up to, but not including, 0.11.1.
💻 Affected Systems
- vLLM
📦 What is this software?
vLLM is an open-source, high-throughput inference and serving engine for large language models, maintained by the vLLM project.
⚠️ Risk & Real-World Impact
Worst Case
Denial of service causing complete unavailability of vLLM inference services, disrupting LLM-based applications.
Likely Case
Service crashes requiring manual restart, causing temporary service disruption.
If Mitigated
No impact if proper input validation is implemented or patched version is used.
🎯 Exploit Status
Exploitation requires sending specifically crafted multimodal embedding inputs to the vLLM API.
🛠️ Fix & Mitigation
✅ Official Fix
Patch Version: 0.11.1
Vendor Advisory: https://github.com/vllm-project/vllm/security/advisories/GHSA-pmqf-x6x8-p7qw
Restart Required: Yes
Instructions:
1. Update vLLM to version 0.11.1 or later using pip: pip install "vllm>=0.11.1" (quote the specifier so the shell does not interpret >= as a redirect)
2. Restart the vLLM service
3. Verify the update with: python -c "import vllm; print(vllm.__version__)"
🔧 Temporary Workarounds
Input validation wrapper
Implement custom input validation that checks multimodal embedding shapes before passing them to vLLM.
API gateway filtering
Configure an API gateway or proxy to filter or validate multimodal embedding inputs before they reach vLLM.
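The validation-wrapper workaround above can be sketched as a small pre-check run before an embedding is handed to vLLM. This is a minimal sketch, not vLLM's own validation logic: `EXPECTED_HIDDEN_SIZE` is a hypothetical placeholder that must be set to the hidden dimension of the specific model being served.

```python
# Hypothetical pre-validation sketch: reject multimodal embedding tensors
# whose shape cannot match the model before they reach vLLM.
from typing import Sequence

EXPECTED_HIDDEN_SIZE = 4096  # assumption: replace with your model's hidden size

def validate_embedding_shape(shape: Sequence[int]) -> None:
    """Raise ValueError for embedding shapes the model cannot consume."""
    # Expect a 2-D (num_tokens, hidden_size) embedding tensor.
    if len(shape) != 2:
        raise ValueError(f"expected a 2-D embedding, got {len(shape)}-D: {shape}")
    num_tokens, hidden = shape
    if num_tokens <= 0:
        raise ValueError(f"embedding contains no tokens: {shape}")
    if hidden != EXPECTED_HIDDEN_SIZE:
        raise ValueError(
            f"hidden size {hidden} does not match expected {EXPECTED_HIDDEN_SIZE}"
        )
```

A gateway or middleware layer could call this on the declared shape of each incoming embedding payload and return HTTP 400 instead of forwarding malformed input.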
🧯 If You Can't Patch
- Implement strict input validation for multimodal embedding dimensions
- Restrict API access to trusted users only and monitor for abnormal input patterns
🔍 How to Verify
Check if Vulnerable:
Check the vLLM version with: python -c "import vllm; print(vllm.__version__)". If the version is between 0.5.5 and 0.11.0 inclusive and the deployment serves multimodal models, the system is vulnerable.
Check Version:
python -c "import vllm; print(vllm.__version__)"
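The affected range above can be checked programmatically. A minimal sketch, assuming vLLM's plain X.Y.Z version strings (for pre-release or local version suffixes, a full parser such as packaging.version would be more robust):

```python
# Minimal version-range check for the affected range [0.5.5, 0.11.1).
def parse_version(v: str) -> tuple:
    """Split a plain X.Y.Z version string into a comparable integer tuple."""
    return tuple(int(part) for part in v.split("."))

def is_vulnerable(version: str) -> bool:
    """Return True if the given vLLM version falls in the affected range."""
    ver = parse_version(version)
    return parse_version("0.5.5") <= ver < parse_version("0.11.1")
```

On a live deployment, the installed version could be obtained with importlib.metadata.version("vllm") and passed to is_vulnerable.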
Verify Fix Applied:
After updating to 0.11.1+, attempt to send malformed multimodal embedding input and verify service remains stable.
📡 Detection & Monitoring
Log Indicators:
- Service crash logs
- Unexpected termination of vLLM process
- Error messages related to tensor shape mismatches
Network Indicators:
- Unusual multimodal embedding requests with non-standard dimensions
- Sudden drop in successful API responses
SIEM Query:
source="vllm.logs" AND ("crash" OR "segmentation fault" OR "tensor shape" OR "dimension mismatch")
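For environments without a SIEM, the log indicators above can be scanned directly. A minimal sketch; the indicator patterns mirror the SIEM query and may need tuning for your log format:

```python
# Sketch: scan vLLM log lines for the crash indicators listed above.
import re

INDICATOR = re.compile(
    r"crash|segmentation fault|tensor shape|dimension mismatch",
    re.IGNORECASE,
)

def find_indicators(lines):
    """Return (line_number, line) pairs matching any indicator pattern."""
    return [(i, ln) for i, ln in enumerate(lines, 1) if INDICATOR.search(ln)]
```

This could run periodically over the vLLM service log (e.g. via tail or a journald reader) and alert when any matches are found.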