CVE-2025-62372

6.5 MEDIUM

📋 TL;DR

This vulnerability allows users to crash the vLLM inference engine by passing malformed multimodal embedding inputs that have the correct number of dimensions but an incorrect shape. It affects vLLM deployments serving multimodal models on versions from 0.5.5 up to, but not including, 0.11.1.

💻 Affected Systems

Products:
  • vLLM
Versions: 0.5.5 to before 0.11.1
Operating Systems: All
Default Config Vulnerable: ⚠️ Yes
Notes: Only affects deployments serving multimodal models. Text-only models are not vulnerable.

📦 What is this software?

vLLM is an open-source, high-throughput inference and serving engine for large language models, commonly deployed behind an OpenAI-compatible HTTP API to serve both text-only and multimodal models.

⚠️ Risk & Real-World Impact

🔴 Worst Case: Denial of service causing complete unavailability of vLLM inference services, disrupting LLM-based applications.

🟠 Likely Case: Service crashes requiring manual restart, causing temporary service disruption.

🟢 If Mitigated: No impact if proper input validation is implemented or a patched version is used.

🌐 Internet-Facing: MEDIUM - Exploitable by any user with API access, but requires specific malformed input.
🏢 Internal Only: MEDIUM - Internal users or automated systems could accidentally or intentionally trigger the crash.

🎯 Exploit Status

Public PoC: ✅ No
Weaponized: UNKNOWN
Unauthenticated Exploit: ⚠️ Yes
Complexity: LOW

Exploitation requires sending specifically crafted multimodal embedding inputs to the vLLM API.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: 0.11.1

Vendor Advisory: https://github.com/vllm-project/vllm/security/advisories/GHSA-pmqf-x6x8-p7qw

Restart Required: Yes

Instructions:

1. Update vLLM to version 0.11.1 or later using pip: pip install "vllm>=0.11.1" (quote the specifier so the shell does not treat >= as a redirection)
2. Restart the vLLM service
3. Verify the update with: python -c "import vllm; print(vllm.__version__)"

🔧 Temporary Workarounds

  • Input validation wrapper (applies to: all): Implement custom input validation to check multimodal embedding shapes before passing them to vLLM.
  • API gateway filtering (applies to: all): Configure an API gateway or proxy to filter or validate multimodal embedding inputs.
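The input validation wrapper can be sketched as a thin shape check in front of the engine. This is a minimal sketch only: EXPECTED_HIDDEN_SIZE and the (num_tokens, hidden) layout are assumptions about the served model, not vLLM specifics, so adjust them to your deployment.

```python
import numpy as np

# Deployment-specific assumption: set this to the hidden size of the
# multimodal model you actually serve.
EXPECTED_HIDDEN_SIZE = 4096

def validate_image_embeds(embeds: np.ndarray) -> np.ndarray:
    """Reject embedding tensors whose shape is wrong even when the
    dimensionality looks plausible, before they ever reach vLLM."""
    if embeds.ndim != 2:
        raise ValueError(
            f"expected a 2-D (num_tokens, hidden) tensor, got {embeds.ndim}-D"
        )
    if embeds.shape[1] != EXPECTED_HIDDEN_SIZE:
        # Catches e.g. a transposed (hidden, num_tokens) tensor, which has the
        # right element count and dimensionality but the wrong shape.
        raise ValueError(
            f"expected hidden size {EXPECTED_HIDDEN_SIZE}, got shape {embeds.shape}"
        )
    return embeds
```

Run this check in the wrapper or gateway layer and return an HTTP 400 on failure, so malformed requests never reach the engine process.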

🧯 If You Can't Patch

  • Implement strict input validation for multimodal embedding dimensions
  • Restrict API access to trusted users only and monitor for abnormal input patterns

🔍 How to Verify

Check if Vulnerable:

Check the vLLM version with python -c "import vllm; print(vllm.__version__)". If the version is between 0.5.5 and 0.11.0 inclusive and the deployment serves multimodal models, the system is vulnerable.
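The version check can be automated; a minimal sketch, assuming plain numeric X.Y.Z version strings (pre-release or local-version suffixes would need extra parsing):

```python
def in_affected_range(ver: str) -> bool:
    """True if a vLLM version string falls in the affected range [0.5.5, 0.11.1)."""
    # Assumes a plain numeric version; strings like "0.11.1rc1" would raise
    # here and need a real parser (e.g. packaging.version).
    parsed = tuple(int(p) for p in ver.split(".")[:3])
    return (0, 5, 5) <= parsed < (0, 11, 1)
```

Feed it the output of `vllm.__version__` to flag hosts that still need the upgrade.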

Check Version:

python -c "import vllm; print(vllm.__version__)"

Verify Fix Applied:

After updating to 0.11.1 or later, send a malformed multimodal embedding input and verify that the service rejects it gracefully and remains stable.
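As a concrete probe, one can build a transposed embedding payload and confirm the patched server answers with a validation error (HTTP 4xx) rather than crashing. This is only a sketch: the field names ("multi_modal_data", "image_embeds"), the model name, and the element type are assumptions for illustration, not vLLM's documented request schema, so adapt the payload to your server's actual API.

```python
import json

def build_malformed_probe(num_tokens: int = 3, hidden: int = 4096) -> str:
    """Build a JSON payload with a transposed (hidden, num_tokens) embedding:
    the right element count and dimensionality, but the wrong shape."""
    bad_embeds = [[0.0] * num_tokens for _ in range(hidden)]
    return json.dumps({
        "model": "my-multimodal-model",  # placeholder model name
        "prompt": "describe the image",
        "multi_modal_data": {"image_embeds": bad_embeds},
    })
```

POST the payload to the inference endpoint of a staging instance: a pre-0.11.1 engine may crash, while a patched one should reject the request and keep serving.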

📡 Detection & Monitoring

Log Indicators:

  • Service crash logs
  • Unexpected termination of vLLM process
  • Error messages related to tensor shape mismatches

Network Indicators:

  • Unusual multimodal embedding requests with non-standard dimensions
  • Sudden drop in successful API responses

SIEM Query:

source="vllm.logs" AND ("crash" OR "segmentation fault" OR "tensor shape" OR "dimension mismatch")
