CVE-2024-2965

4.7 MEDIUM

📋 TL;DR

This CVE describes a denial-of-service vulnerability in LangChain's SitemapLoader class: the parse_sitemap method recurses without limit when a sitemap references itself. The recursion stops only when Python's recursion limit is exceeded and a RecursionError is raised, which crashes the process unless the caller handles it. Any service that uses LangChain's sitemap parsing on sitemaps an attacker can influence is affected; code that never uses SitemapLoader is not.
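The failure mode can be illustrated with a toy model. This is a minimal sketch, not LangChain's actual code: `SITEMAPS` is a hypothetical in-memory stand-in for fetched sitemap indexes, and `parse_sitemap` here only mimics the unbounded recursion.

```python
# Toy model of the failure mode; NOT LangChain's implementation.
# SITEMAPS is a hypothetical stand-in for documents fetched over HTTP:
# each key is a sitemap index URL, each value the URLs it lists.
SITEMAPS = {
    "https://example.com/sitemap.xml": ["https://example.com/sitemap.xml"],
}


def parse_sitemap(url: str) -> list:
    """Recursively collect page URLs with no depth limit or cycle check."""
    pages = []
    for child in SITEMAPS.get(url, []):
        if child in SITEMAPS:
            # Nested sitemap: recurse. A self-reference never terminates.
            pages.extend(parse_sitemap(child))
        else:
            pages.append(child)
    return pages


def demo() -> str:
    """Return the name of the exception raised by the self-referencing sitemap."""
    try:
        parse_sitemap("https://example.com/sitemap.xml")
    except RecursionError as exc:
        return type(exc).__name__
    return "ok"
```

In this model the recursion is purely in memory; in a real loader each level would also involve fetching the sitemap again before the limit is reached.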

💻 Affected Systems

Products:
  • langchain-ai/langchain
Versions: All versions prior to the fix (commit 73c42306745b0831aa6fe7fe4eeb70d2c2d87a82)
Operating Systems: All operating systems running Python
Default Config Vulnerable: ⚠️ Yes
Notes: Only systems that parse sitemaps with the SitemapLoader class are affected.

⚠️ Risk & Real-World Impact

🔴 Worst Case

Complete service outage as Python processes crash due to recursion depth exhaustion, potentially affecting multiple services if they share resources or dependencies.

🟠 Likely Case

Targeted DoS attacks causing intermittent service disruptions, increased resource consumption, and potential cascading failures in dependent systems.

🟢 If Mitigated

Minimal impact with proper input validation and recursion depth limits in place, potentially causing slower processing but no crashes.

🌐 Internet-Facing: MEDIUM
🏢 Internal Only: LOW

🎯 Exploit Status

Public PoC: ⚠️ Yes
Weaponized: LIKELY
Unauthenticated Exploit: ⚠️ Yes
Complexity: LOW

Exploitation requires the ability to control the sitemap being parsed, for example through user-supplied URLs or a compromised sitemap source.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: Any release that includes commit 73c42306745b0831aa6fe7fe4eeb70d2c2d87a82

Vendor Advisory: https://github.com/langchain-ai/langchain/commit/73c42306745b0831aa6fe7fe4eeb70d2c2d87a82

Restart Required: Yes

Instructions:

1. Update LangChain to the latest version.
2. Verify the commit 73c42306745b0831aa6fe7fe4eeb70d2c2d87a82 is included.
3. Restart all services using LangChain.

🔧 Temporary Workarounds

Implement recursion depth limit

Platform: all

Add manual recursion depth checking in SitemapLoader usage

# In Python code using SitemapLoader:
# Add recursion depth tracking and limit
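A depth limit could look roughly like the sketch below. It is a standalone walker, not a patch to SitemapLoader: `collect_page_urls`, `fetch`, and `max_depth` are hypothetical names, and `fetch` is assumed to be any callable that returns the sitemap XML for a URL.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List, Optional


def _local(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://...}loc'."""
    return tag.rsplit("}", 1)[-1]


def _loc_of(entry: ET.Element) -> Optional[str]:
    """Return the text of the <loc> child of a <sitemap> or <url> entry."""
    for child in entry:
        if _local(child.tag) == "loc" and child.text:
            return child.text.strip()
    return None


def collect_page_urls(
    url: str,
    fetch: Callable[[str], str],  # hypothetical: returns sitemap XML as text
    max_depth: int = 5,           # hypothetical limit; tune for your sites
    _depth: int = 0,
) -> List[str]:
    """Collect page URLs, refusing to nest deeper than max_depth."""
    if _depth > max_depth:
        raise ValueError(f"sitemap nesting exceeds max_depth={max_depth} at {url}")
    # Note: ElementTree is not hardened against malicious XML; this sketch
    # only demonstrates the depth limit.
    root = ET.fromstring(fetch(url))
    pages: List[str] = []
    for entry in root:
        loc = _loc_of(entry)
        if loc is None:
            continue
        if _local(entry.tag) == "sitemap":
            # Nested sitemap index: recurse with an incremented depth.
            pages.extend(collect_page_urls(loc, fetch, max_depth, _depth + 1))
        elif _local(entry.tag) == "url":
            pages.append(loc)
    return pages
```

Raising a controlled ValueError lets the caller reject the sitemap and continue, instead of running into the interpreter's recursion limit.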

Input validation for sitemap URLs

Platform: all

Validate sitemap URLs before passing to SitemapLoader

# Validate sitemap URLs don't reference themselves
# Check URL != current sitemap URL before parsing
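The self-reference check might be sketched as follows. The helper names (`normalize_url`, `is_self_reference`, `unvisited_children`) are hypothetical, and the normalization is deliberately simple: it lowercases scheme and host, drops default ports and fragments, and ignores userinfo.

```python
from urllib.parse import urlsplit, urlunsplit

_DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Reduce trivially different spellings of a URL to one form."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if ":" in host:  # IPv6 literal: restore the brackets
        host = f"[{host}]"
    port = parts.port
    netloc = host if port in (None, _DEFAULT_PORTS.get(scheme)) else f"{host}:{port}"
    # Fragments never reach the server, so they are dropped.
    return urlunsplit((scheme, netloc, parts.path or "/", parts.query, ""))


def is_self_reference(parent_url: str, child_url: str) -> bool:
    """True if a sitemap lists itself as a nested sitemap."""
    return normalize_url(parent_url) == normalize_url(child_url)


def unvisited_children(child_urls, visited: set) -> list:
    """Return children not seen before, recording them in `visited`.

    A shared `visited` set also catches longer cycles (A -> B -> A),
    which a direct parent/child comparison would miss.
    """
    fresh = []
    for child in child_urls:
        key = normalize_url(child)
        if key not in visited:
            visited.add(key)
            fresh.append(child)
    return fresh
```

Comparing raw strings alone is easy to bypass with a different host case, an explicit default port, or a fragment, which is why the comparison runs on normalized URLs.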

🧯 If You Can't Patch

  • Implement strict input validation for all sitemap URLs to prevent self-references
  • Catch RecursionError around sitemap loading, and run workers under a supervisor that restarts them automatically if they crash
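One way to contain the crash without patching is to catch RecursionError around the load call. The sketch below assumes a hypothetical helper, `load_or_empty`, that wraps any zero-argument loading callable; it limits the damage to one failed load but does not stop the repeated fetches that happen before the limit is hit.

```python
import logging
from typing import Callable, List, TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)


def load_or_empty(load: Callable[[], List[T]], source: str) -> List[T]:
    """Run `load`; if it exhausts the recursion limit, log and return []."""
    try:
        return load()
    except RecursionError:
        # The stack has unwound by the time this handler runs, so it is
        # safe to log here.
        logger.error("Sitemap load for %s hit the recursion limit; skipping", source)
        return []
```

With a SitemapLoader instance named `loader`, usage could be `docs = load_or_empty(loader.load, sitemap_url)`.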

🔍 How to Verify

Check if Vulnerable:

Check if your LangChain version includes commit 73c42306745b0831aa6fe7fe4eeb70d2c2d87a82. If not, you are vulnerable.

Check Version:

pip show langchain langchain-community | grep -E '^(Name|Version):'
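The same check can be done from Python, which is handy inside a running service. This sketch assumes the loader may be provided by either `langchain` or `langchain-community`, depending on how your installation is laid out; `installed_versions` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Sequence


def installed_versions(
    packages: Sequence[str] = ("langchain", "langchain-community"),
) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found
```

A version number alone does not prove the fix commit is present; compare it against the release that contains the commit.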

Verify Fix Applied:

Test SitemapLoader with a self-referencing sitemap URL and verify it doesn't cause infinite recursion.
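A self-referencing sitemap for such a test can be served locally. The sketch below uses only the standard library; the names `self_referencing_sitemap`, `SelfReferencingSitemapHandler`, and `start_server` are hypothetical. Run it only in an isolated test environment.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def self_referencing_sitemap(url: str) -> bytes:
    """Build a sitemap index whose only nested sitemap is itself."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        f"<sitemap><loc>{url}</loc></sitemap>"
        "</sitemapindex>"
    ).encode("utf-8")


class SelfReferencingSitemapHandler(BaseHTTPRequestHandler):
    hits = 0  # number of requests served, shared across handler instances

    def do_GET(self):
        type(self).hits += 1
        host, port = self.server.server_address[:2]
        body = self_referencing_sitemap(f"http://{host}:{port}/sitemap.xml")
        self.send_response(200)
        self.send_header("Content-Type", "application/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet


def start_server():
    """Start the server on a free local port; return (server, sitemap_url)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), SelfReferencingSitemapHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address[:2]
    return server, f"http://{host}:{port}/sitemap.xml"
```

After `server, url = start_server()`, point the loader at `url` (for example `SitemapLoader(web_path=url).load()`) and confirm that no RecursionError is raised. `SelfReferencingSitemapHandler.hits` shows how many times the sitemap was fetched; call `server.shutdown()` when done.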

📡 Detection & Monitoring

Log Indicators:

  • Python recursion depth errors
  • Process crashes with maximum recursion depth exceeded
  • Abnormal CPU/memory spikes in sitemap parsing processes
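Where no SIEM is available, application logs can be scanned directly for the first two indicators. A minimal sketch, assuming plain-text logs that contain Python tracebacks; `find_recursion_errors` is a hypothetical helper name.

```python
import re
from typing import Iterable, List, Tuple

# Matches the exception name and the standard CPython message text.
RECURSION_PATTERN = re.compile(r"RecursionError|maximum recursion depth exceeded")


def find_recursion_errors(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs that mention a recursion failure."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if RECURSION_PATTERN.search(line)
    ]
```

It accepts any iterable of lines, so an open file object can be passed in directly.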

Network Indicators:

  • Repeated requests to same sitemap URL
  • Unusual patterns in sitemap fetching

SIEM Query:

process.name: "python" AND (error_message: "maximum recursion depth" OR error_message: "RecursionError")
