CVE-2024-2965
📋 TL;DR
CVE-2024-2965 is a Denial-of-Service vulnerability in LangChain's SitemapLoader class: the parse_sitemap method recurses without bound when a sitemap references its own URL. The Python process then fails with a RecursionError (maximum recursion depth exceeded) and crashes unless the error is handled, so any service that parses sitemaps through LangChain can be taken down. Every build of the langchain-ai/langchain repository that predates the fix commit is affected.
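The failure mode can be reproduced without LangChain. The sketch below is a toy model, not the library's code: `naive_parse`, `FAKE_WEB`, and the example.com URL are made up for illustration, and the standard-library XML parser stands in for whatever parser the loader uses. It shows how following nested sitemap entries with no depth limit ends in `RecursionError` when a sitemap index lists itself.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# Hypothetical in-memory "web": a single sitemap index that lists itself.
SELF_URL = "https://example.com/sitemap.xml"
FAKE_WEB = {
    SELF_URL: (
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        f"<sitemap><loc>{SELF_URL}</loc></sitemap>"
        "</sitemapindex>"
    )
}


def naive_parse(url):
    """Follow nested sitemaps with no depth limit and no visited set."""
    root = ET.fromstring(FAKE_WEB[url])
    pages = [loc.text for loc in root.iterfind(f"{NS}url/{NS}loc")]
    for loc in root.iterfind(f"{NS}sitemap/{NS}loc"):
        # A self-referencing index re-enters this function with the same URL.
        pages.extend(naive_parse(loc.text))
    return pages


def crashes_on_self_reference():
    """Return True if the naive walker hits Python's recursion limit."""
    try:
        naive_parse(SELF_URL)
    except RecursionError:
        return True
    return False
```

In a real service the `RecursionError` is usually not caught, which is why the symptom is a crashed worker rather than a clean error response.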
💻 Affected Systems
- langchain-ai/langchain
📦 What is this software?
Langchain by Langchain
⚠️ Risk & Real-World Impact
Worst Case
Complete service outage as Python processes crash due to recursion depth exhaustion, potentially affecting multiple services if they share resources or dependencies.
Likely Case
Targeted DoS attacks causing intermittent service disruptions, increased resource consumption, and potential cascading failures in dependent systems.
If Mitigated
Minimal impact with proper input validation and recursion depth limits in place, potentially causing slower processing but no crashes.
🎯 Exploit Status
Exploitation requires ability to control sitemap input, which could be through user-supplied URLs or compromised sitemap sources.
🛠️ Fix & Mitigation
✅ Official Fix
Patch Version: Any version that includes commit 73c42306745b0831aa6fe7fe4eeb70d2c2d87a82
Vendor Advisory: https://github.com/langchain-ai/langchain/commit/73c42306745b0831aa6fe7fe4eeb70d2c2d87a82
Restart Required: Yes
Instructions:
1. Update LangChain to the latest version.
2. Verify that commit 73c42306745b0831aa6fe7fe4eeb70d2c2d87a82 is included.
3. Restart all services that use LangChain.
🔧 Temporary Workarounds
Implement recursion depth limit
All platforms: add manual recursion depth checks wherever SitemapLoader is used
# In Python code using SitemapLoader:
# Add recursion depth tracking and limit
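One way to read the comments above is a depth counter threaded through the recursion. The following is a minimal standalone sketch of that technique, not SitemapLoader's API: `collect_page_urls`, `SitemapDepthExceeded`, the `fetch` callable, and the default `max_depth=5` are all hypothetical names and values chosen for illustration. How to attach such a limit to SitemapLoader itself depends on the installed version.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


class SitemapDepthExceeded(Exception):
    """Raised when nested sitemaps go deeper than the configured limit."""


def collect_page_urls(url, fetch, max_depth=5, _depth=0):
    """Collect page URLs from a sitemap, following nested sitemaps up to max_depth.

    `fetch` is any callable that takes a URL and returns the sitemap XML.
    """
    if _depth > max_depth:
        raise SitemapDepthExceeded(
            f"sitemap nesting deeper than {max_depth} levels at {url}"
        )

    root = ET.fromstring(fetch(url))
    pages = []

    # Entries of a sitemap index: recurse, but with the depth counter advanced.
    for loc in root.iterfind(f"{NS}sitemap/{NS}loc"):
        nested_url = (loc.text or "").strip()
        if nested_url:
            pages.extend(collect_page_urls(nested_url, fetch, max_depth, _depth + 1))

    # Ordinary page entries of a urlset.
    for loc in root.iterfind(f"{NS}url/{NS}loc"):
        page_url = (loc.text or "").strip()
        if page_url:
            pages.append(page_url)

    return pages
```

A self-referencing sitemap now ends in a `SitemapDepthExceeded` that the caller can handle, instead of exhausting the interpreter's recursion limit.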
Input validation for sitemap URLs
All platforms: validate sitemap URLs before passing them to SitemapLoader
# Validate sitemap URLs don't reference themselves
# Check URL != current sitemap URL before parsing
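The self-reference check described in these comments could look roughly like the sketch below. The helper names (`normalize`, `nested_sitemap_urls`, `references_itself`) are hypothetical, and the normalization is deliberately light: it lowercases the scheme and network location and drops the fragment, nothing more. It assumes the sitemap XML has already been fetched by the caller.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit, urlunsplit

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def normalize(url):
    """Light normalization so trivially different spellings compare equal."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


def nested_sitemap_urls(sitemap_xml):
    """Return the <loc> values of nested <sitemap> entries in a sitemap index."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.iterfind(f"{NS}sitemap/{NS}loc")
        if loc.text and loc.text.strip()
    ]


def references_itself(sitemap_url, sitemap_xml):
    """True if the sitemap lists its own URL as a nested sitemap."""
    own = normalize(sitemap_url)
    return any(normalize(url) == own for url in nested_sitemap_urls(sitemap_xml))
```

This only catches the direct self-reference named in the advisory. A longer loop (sitemap A lists B, and B lists A) passes this check, so it is best combined with a depth limit or a set of already-visited URLs.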
🧯 If You Can't Patch
- Implement strict input validation for all sitemap URLs to prevent self-references
- Alert on RecursionError crashes in processes that parse sitemaps, and configure automatic restarts for those processes
🔍 How to Verify
Check if Vulnerable:
Check if your LangChain version includes commit 73c42306745b0831aa6fe7fe4eeb70d2c2d87a82. If not, you are vulnerable.
Check Version:
pip show langchain langchain-community | grep -E "^(Name|Version)"
Verify Fix Applied:
Test SitemapLoader with a self-referencing sitemap URL and verify it doesn't cause infinite recursion.
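A test along these lines needs a sitemap that really does point at itself. The sketch below serves one from a throwaway local HTTP server and wraps the loader call so that a `RecursionError` is reported rather than crashing the test. The helper names are hypothetical, and the SitemapLoader call in the trailing comment is an assumed usage that is not executed here; it requires langchain-community and its parsing dependencies to be installed.

```python
import http.server
import threading


def self_referencing_sitemap(own_url):
    """Build a sitemap index whose only entry is the sitemap's own URL."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        f"<sitemap><loc>{own_url}</loc></sitemap>"
        "</sitemapindex>"
    ).encode("utf-8")


def serve_self_referencing_sitemap():
    """Start a local server on a free port; return (server, sitemap_url)."""

    class Handler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            host, port = self.server.server_address[:2]
            body = self_referencing_sitemap(f"http://{host}:{port}/sitemap.xml")
            self.send_response(200)
            self.send_header("Content-Type", "application/xml")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep test output quiet

    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address[:2]
    return server, f"http://{host}:{port}/sitemap.xml"


def survives_self_reference(load_fn, url):
    """Return False if load_fn(url) dies with RecursionError, True otherwise."""
    try:
        load_fn(url)
    except RecursionError:
        return False
    return True


# Assumed usage with langchain-community installed (not run here):
#   from langchain_community.document_loaders import SitemapLoader
#   server, url = serve_self_referencing_sitemap()
#   print(survives_self_reference(lambda u: SitemapLoader(web_path=u).load(), url))
#   server.shutdown()
```

Errors other than `RecursionError` are left to propagate so that an unrelated failure, such as a missing parser dependency, is not mistaken for a passing check.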
📡 Detection & Monitoring
Log Indicators:
- Python recursion depth errors
- Process crashes with maximum recursion depth exceeded
- Abnormal CPU/memory spikes in sitemap parsing processes
Network Indicators:
- Repeated requests to same sitemap URL
- Unusual patterns in sitemap fetching
SIEM Query:
process.name: "python" AND (error_message: "maximum recursion depth" OR error_message: "RecursionError")
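Where no SIEM is available, the same indicators can be searched for directly in application logs. This is a small sketch under the assumption that tracebacks are written to a plain-text log; the function name and the `app.log` path in the usage note are placeholders.

```python
import re

# Matches the two indicators listed above, case-insensitively.
RECURSION_PATTERN = re.compile(r"RecursionError|maximum recursion depth", re.IGNORECASE)


def find_recursion_errors(lines):
    """Yield (line_number, line) for log lines that mention a recursion failure."""
    for number, line in enumerate(lines, start=1):
        if RECURSION_PATTERN.search(line):
            yield number, line.rstrip("\n")
```

Because the function accepts any iterable of strings, an open file can be passed directly, for example `with open("app.log") as log: hits = list(find_recursion_errors(log))`.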