CVE-2023-46229

8.8 HIGH

📋 TL;DR

This Server-Side Request Forgery (SSRF) vulnerability in LangChain lets a crawl that starts on an external server proceed to internal servers, so an attacker can make the application send requests to internal network resources. It affects any system using LangChain's recursive_url_loader.py for web crawling before version 0.0.317. Attackers can potentially reach internal services that should not be exposed.

💻 Affected Systems

Products:
  • LangChain
Versions: All versions before 0.0.317
Operating Systems: All
Default Config Vulnerable: ⚠️ Yes
Notes: Only affects systems using the recursive_url_loader.py component for web crawling functionality.

⚠️ Risk & Real-World Impact

🔴 Worst Case

Complete internal network compromise allowing data exfiltration, lateral movement to sensitive systems, and potential credential theft from internal services.

🟠 Likely Case

Unauthorized access to internal APIs, metadata services, or internal web applications leading to data leakage.

🟢 If Mitigated

Impact is limited to resources the application server is already permitted to reach; requests to other internal hosts are blocked by network segmentation.

🌐 Internet-Facing: HIGH
🏢 Internal Only: MEDIUM

🎯 Exploit Status

Public PoC: ✅ No
Weaponized: UNKNOWN
Unauthenticated Exploit: ⚠️ Yes
Complexity: LOW

Exploitation requires control over the initial URL provided to the recursive_url_loader, but no authentication is needed once the vulnerable component is invoked.
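
To illustrate the underlying issue, the sketch below shows one way a crawler could restrict the links it follows to the host of the starting URL. It is not LangChain's actual patch; the function name same_host_links and the example URLs are hypothetical, and only the Python standard library is used.

```python
from urllib.parse import urljoin, urlparse


def same_host_links(root_url: str, hrefs: list[str]) -> list[str]:
    """Resolve each href against root_url and keep only http(s) links
    whose host matches the host of root_url."""
    root_host = urlparse(root_url).hostname
    kept = []
    for href in hrefs:
        absolute = urljoin(root_url, href)
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.hostname == root_host:
            kept.append(absolute)
    return kept


# Hypothetical page content: two legitimate links and two internal targets.
found = same_host_links(
    "https://docs.example.com/start",
    [
        "/page2",
        "http://169.254.169.254/latest/meta-data/",
        "https://docs.example.com/a",
        "http://10.0.0.5/admin",
    ],
)
```

A host comparison like this does not cover HTTP redirects or DNS names that resolve to internal addresses, so it complements egress filtering rather than replacing it.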

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: 0.0.317 and later

Vendor Advisory: https://github.com/langchain-ai/langchain/commit/9ecb7240a480720ec9d739b3877a52f76098a2b8

Restart Required: Yes

Instructions:

1. Update LangChain to version 0.0.317 or later using pip: pip install --upgrade "langchain>=0.0.317"
2. Verify the update completed successfully
3. Restart any running Python processes that import LangChain; a running process keeps the old module in memory until it is restarted

🔧 Temporary Workarounds

Disable recursive_url_loader

Platform: all

Temporarily disable or remove usage of the vulnerable recursive_url_loader.py component until patching is possible.

# Modify code to use alternative document loaders or disable web crawling functionality

Network segmentation controls

Platform: linux

Implement egress filtering to block outbound requests from application servers to internal network ranges.

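# Configure firewall rules to block application server access to RFC1918 addresses and the link-local metadata range
# Add ACCEPT rules ahead of this one for any internal services the application legitimately needs
iptables -A OUTPUT -d 10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16 -j DROP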

🧯 If You Can't Patch

  • Implement strict input validation and URL allowlisting for recursive_url_loader inputs
  • Deploy network monitoring and egress filtering to detect and block internal network access attempts
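
A minimal sketch of the URL allowlisting idea, assuming the application can check each starting URL before passing it to the loader. The ALLOWED_HOSTS set and the is_allowed_url helper are hypothetical names for illustration, not part of LangChain.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the hosts the crawler is meant to reach.
ALLOWED_HOSTS = {"docs.example.com", "blog.example.com"}


def is_allowed_url(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Return True only for http(s) URLs whose host is exactly on the allowlist."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    return host in allowed_hosts
```

Exact host matching is used on purpose: suffix or substring checks would accept look-alike names such as docs.example.com.evil.test.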

🔍 How to Verify

Check if Vulnerable:

Run python -c "import langchain; print(langchain.__version__)"; the installation is vulnerable if the reported version is below 0.0.317.

Check Version:

python -c "import langchain; print(langchain.__version__)"
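
The version comparison can also be scripted. The sketch below is one possible approach using only the standard library; parse_version and is_vulnerable are hypothetical helper names, and pre-release suffixes such as "rc1" are ignored as a simplification.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# First fixed release according to this advisory.
FIXED = (0, 0, 317)


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the leading dotted integers, e.g. '0.0.316' -> (0, 0, 316)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_vulnerable(version_text: str) -> bool:
    """True when the version sorts below the first fixed release."""
    return parse_version(version_text) < FIXED


if __name__ == "__main__":
    try:
        installed = version("langchain")
    except PackageNotFoundError:
        print("langchain is not installed")
    else:
        status = "VULNERABLE" if is_vulnerable(installed) else "patched"
        print(f"langchain {installed}: {status}")
```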

Verify Fix Applied:

After updating, run the same command to confirm the version is 0.0.317 or higher, then test that recursive_url_loader no longer follows links from an external site to internal hosts.

📡 Detection & Monitoring

Log Indicators:

  • Unusual outbound HTTP requests from application to internal IP addresses
  • Failed connection attempts to internal network ranges from application servers

Network Indicators:

  • HTTP traffic from application servers to RFC1918 addresses
  • Requests to internal metadata services (169.254.169.254, etc.)

SIEM Query:

source="application_logs" AND (dest_ip=10.0.0.0/8 OR dest_ip=172.16.0.0/12 OR dest_ip=192.168.0.0/16) AND http_request=*
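
Where no SIEM is available, the same check can be approximated offline. The sketch below scans log lines for destination addresses in internal ranges; the dest_ip= field format is an assumption modeled on the query above, and internal_destinations is a hypothetical helper name.

```python
import ipaddress
import re

# RFC1918 ranges plus the link-local range used by cloud metadata services.
INTERNAL_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "169.254.0.0/16")
]

# Assumed log field format: dest_ip=<IPv4 address>
DEST_IP = re.compile(r"dest_ip=(\d{1,3}(?:\.\d{1,3}){3})")


def internal_destinations(lines):
    """Return the log lines whose dest_ip falls inside an internal range."""
    hits = []
    for line in lines:
        match = DEST_IP.search(line)
        if match is None:
            continue
        try:
            address = ipaddress.ip_address(match.group(1))
        except ValueError:
            continue  # skip malformed addresses such as 999.1.1.1
        if any(address in network for network in INTERNAL_NETWORKS):
            hits.append(line)
    return hits


# Hypothetical sample log lines.
sample = [
    "ts=1 dest_ip=93.184.216.34 http_request=GET /",
    "ts=2 dest_ip=169.254.169.254 http_request=GET /latest/meta-data/",
    "ts=3 dest_ip=172.20.1.9 http_request=GET /admin",
    "ts=4 dest_ip=999.1.1.1 http_request=GET /",
    "ts=5 no destination recorded",
]
flagged = internal_destinations(sample)
```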
