CVE-2024-12704

7.5 HIGH

📋 TL;DR

A vulnerability in the LangChainLLM class of llama_index v0.12.5 allows denial-of-service attacks: when a worker thread terminates abnormally, the stream_complete method can enter an infinite loop. The condition is triggered by calling stream_complete with an incorrect input type, so any application on the affected version that passes unvalidated input to this method is at risk.

💻 Affected Systems

Products:
  • run-llama/llama_index
Versions: v0.12.5 (specifically mentioned, check earlier versions for similar issues)
Operating Systems: All platforms running Python
Default Config Vulnerable: ⚠️ Yes
Notes: Only affects usage of the stream_complete method with LangChainLLM class. Requires incorrect input type to trigger.

📦 What is this software?

LlamaIndex (run-llama/llama_index) is an open-source Python framework for connecting large language models to external data sources, widely used to build retrieval-augmented generation (RAG) applications. The LangChainLLM class is an adapter that lets models hosted through LangChain be used inside LlamaIndex.

⚠️ Risk & Real-World Impact

🔴 Worst Case: Complete service unavailability with indefinite resource consumption, potentially affecting all users of the application.

🟠 Likely Case: Service degradation or crashes on the specific endpoints that use the vulnerable stream_complete method.

🟢 If Mitigated: Graceful error handling with proper thread termination and resource cleanup.

🌐 Internet-Facing: HIGH - If the vulnerable endpoint is exposed publicly, attackers can easily trigger DoS.
🏢 Internal Only: MEDIUM - Internal users could still trigger the vulnerability, but attack surface is smaller.

🎯 Exploit Status

Public PoC: ❌ No
Weaponized: UNKNOWN
Unauthenticated Exploit: ❌ No
Complexity: MEDIUM

Requires knowledge of the API and ability to send malformed input. No authentication bypass needed if endpoint is accessible.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: Fixed in commit d1ecfb77578d089cbe66728f18f635c09aa32a05

Vendor Advisory: https://github.com/run-llama/llama_index/commit/d1ecfb77578d089cbe66728f18f635c09aa32a05

Restart Required: No

Instructions:

1. Update to the latest llama_index version.
2. If pinned to v0.12.5, apply the fix commit (d1ecfb77578d089cbe66728f18f635c09aa32a05) directly.
3. Test stream_complete functionality with various input types.

🔧 Temporary Workarounds

Input Validation (all platforms): Add strict input type validation before calling the stream_complete method.

Disable Streaming (all platforms): Use non-streaming alternatives if streaming is not required.
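
A minimal sketch of the input-validation workaround, assuming the caller controls the arguments that reach stream_complete. The wrapper name `validated_stream_complete` is illustrative and not part of the llama_index API:

```python
def validated_stream_complete(llm, prompt, **kwargs):
    """Reject non-string prompts before they reach the vulnerable method.

    `llm` is assumed to be a LangChainLLM instance; this wrapper is a
    hypothetical guard, not part of llama_index itself.
    """
    if not isinstance(prompt, str):
        raise TypeError(
            f"stream_complete expects a str prompt, got {type(prompt).__name__}"
        )
    return llm.stream_complete(prompt, **kwargs)
```

Raising a TypeError at the boundary means malformed input fails fast in the caller's code path instead of reaching the loop inside LangChainLLM.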

🧯 If You Can't Patch

  • Implement rate limiting on endpoints using stream_complete
  • Add monitoring for abnormal thread termination and restart processes

🔍 How to Verify

Check if Vulnerable:

Confirm whether the application runs llama_index v0.12.5 and calls the LangChainLLM.stream_complete method.

Check Version:

pip show llama_index | grep Version
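
Equivalently, the installed version can be checked from Python with the standard library (the distribution name `llama-index` is the PyPI package name; this helper is illustrative):

```python
from importlib.metadata import version, PackageNotFoundError


def is_affected(dist="llama-index"):
    """Return True if the installed llama-index distribution is the
    affected 0.12.5 release; False if another version or not installed."""
    try:
        return version(dist) == "0.12.5"
    except PackageNotFoundError:
        return False
```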

Verify Fix Applied:

Call stream_complete with an incorrect input type: a patched version should raise an error promptly instead of entering an infinite loop. Run this check under a timeout so an unpatched install cannot hang the test process.

📡 Detection & Monitoring

Log Indicators:

  • Thread termination errors in LangChainLLM
  • Unusually long processing times for stream_complete
  • High CPU usage without completion

Network Indicators:

  • Requests to stream_complete endpoints timing out
  • Increased error rates on LLM endpoints

SIEM Query:

source="application.logs" AND ("stream_complete" OR "LangChainLLM") AND ("thread" OR "infinite" OR "timeout")
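
Where no SIEM is available, the same indicators can be scanned for in application logs directly. A rough sketch (the regex mirrors the query above but requires the method mention to precede the keyword, so treat it as a starting point, not a complete detection rule):

```python
import re

# Illustrative pattern: a line mentioning the vulnerable method/class
# followed by a thread/infinite/timeout keyword
INDICATORS = re.compile(
    r"(stream_complete|LangChainLLM).*?(thread|infinite|timeout)",
    re.IGNORECASE,
)


def suspicious_lines(lines):
    """Return log lines matching the indicator pattern."""
    return [line for line in lines if INDICATORS.search(line)]
```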
