CVE-2024-0243
📋 TL;DR
This CVE describes a Server-Side Request Forgery (SSRF) vulnerability in LangChain's RecursiveUrlLoader where an attacker controlling the initial crawled website can trick the crawler into fetching content from arbitrary external domains despite the prevent_outside=True setting. This affects any application using vulnerable versions of LangChain's recursive URL loader functionality.
💻 Affected Systems
- langchain-ai/langchain
📦 What is this software?
LangChain is an open-source framework for building applications powered by large language models, developed by LangChain (langchain-ai). The affected component, RecursiveUrlLoader, is a document loader that recursively crawls a website starting from a root URL.
⚠️ Risk & Real-World Impact
Worst Case
Attackers could use the crawler as a proxy to access internal network resources, perform port scanning, or retrieve sensitive data from systems that trust the crawler's IP address.
Likely Case
Data exfiltration from internal services, unauthorized access to cloud metadata endpoints, or fetching malicious content that could lead to further exploitation.
If Mitigated
Limited to accessing only publicly available external resources, still potentially enabling information gathering or content injection.
🎯 Exploit Status
Exploitation requires control over the content of the initially crawled website. The vulnerability is well documented, with a public proof of concept in the huntr bug bounty report.
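The bypass is typical of prefix-based URL checks. As an illustrative sketch (not LangChain's actual code), a naive startswith comparison accepts attacker-controlled hosts whose URL merely begins with the trusted base URL, while comparing parsed hostnames does not:

```python
from urllib.parse import urlparse

base_url = "https://docs.example.com"

def naive_check(link):
    # Naive guard: accepts any URL whose text starts with the base URL
    return link.startswith(base_url)

def host_check(link):
    # Safer guard: compare the parsed hostname exactly
    return urlparse(link).hostname == urlparse(base_url).hostname

malicious = "https://docs.example.com.attacker.net/exfil"
print(naive_check(malicious))  # True  -- the prefix check is fooled
print(host_check(malicious))   # False -- hostname comparison rejects it
```

The hostnames `docs.example.com` and `base_url` here are placeholders; the point is that string-prefix checks on URLs are not a domain boundary.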
🛠️ Fix & Mitigation
✅ Official Fix
Patch Version: Commit bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22 or later
Vendor Advisory: https://github.com/langchain-ai/langchain/pull/15559
Restart Required: No system restart, but long-running Python processes must be restarted to load the patched module
Instructions:
1. Update langchain-community to a release containing commit bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22 or later
2. Confirm that libs/community/langchain_community/document_loaders/recursive_url_loader.py matches the patched version
3. Restart any long-running Python processes so they pick up the updated code
🔧 Temporary Workarounds
Implement custom URL validation

Add URL validation before passing URLs to RecursiveUrlLoader to ensure that only expected domains are crawled:

```python
# Validate a URL against an allowlist of domains before crawling
from urllib.parse import urlparse

def validate_url(url, allowed_domains):
    host = urlparse(url).hostname or ""
    # Match the exact domain or a subdomain; a bare endswith() check would
    # also accept look-alike hosts such as "evilexample.com"
    return any(host == d or host.endswith("." + d) for d in allowed_domains)
```
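Domain validation alone does not stop an attacker-controlled DNS name from resolving to an internal address. A complementary sketch (a hypothetical helper, not part of LangChain) rejects URLs that resolve to private, loopback, or link-local ranges such as the cloud metadata endpoint:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def resolves_to_public_ip(url):
    """Return True only if every resolved address is public (hypothetical helper)."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        # Reject RFC 1918, loopback, and link-local (e.g. 169.254.169.254)
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            return False
    return True
```

Note that this check is advisory: DNS rebinding can still change the answer between validation and fetch, so network-level egress controls remain the stronger defense.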
Use an allowlist approach

Maintain an explicit allowlist of domains that may be crawled, rather than relying solely on the prevent_outside parameter:

```python
# Example allowlist implementation: filter candidate URLs before handing
# them to the loader (candidate_urls is whatever URL set you plan to crawl)
allowed_domains = ['example.com', 'trusted.org']
urls_to_crawl = [u for u in candidate_urls if validate_url(u, allowed_domains)]
```
🧯 If You Can't Patch
- Implement network-level restrictions to limit crawler's outbound connections
- Monitor crawler activity for unexpected external domain requests
🔍 How to Verify
Check if Vulnerable:
Check if your langchain_community/document_loaders/recursive_url_loader.py file contains the vulnerable URL parsing logic from before commit bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22
Check Version:
pip show langchain-community | grep Version
Verify Fix Applied:
Verify the recursive_url_loader.py file includes the fix from PR #15559 with proper URL domain validation
📡 Detection & Monitoring
Log Indicators:
- Crawler accessing unexpected external domains
- URLs with mismatched domains in crawl logs
Network Indicators:
- Outbound HTTP requests from crawler to unexpected domains
- Requests to internal IP ranges from crawler
SIEM Query (pseudo-syntax; adapt to your SIEM's query language):
source="crawler_logs" AND (url NOT CONTAINS "expected-domain.com" OR url CONTAINS "internal-ip")
🔗 References
- https://github.com/langchain-ai/langchain/commit/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22
- https://github.com/langchain-ai/langchain/pull/15559
- https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861