CVE-2023-46229
📋 TL;DR
This Server-Side Request Forgery (SSRF) vulnerability in LangChain lets an attacker cause the application to crawl from an external server to internal network resources. It affects any system that uses LangChain's recursive_url_loader.py for web crawling on versions before 0.0.317. Attackers can potentially reach internal services that should not be exposed.
💻 Affected Systems
- LangChain (versions before 0.0.317)
📦 What is this software?
Langchain by Langchain
⚠️ Risk & Real-World Impact
Worst Case
Complete internal network compromise allowing data exfiltration, lateral movement to sensitive systems, and potential credential theft from internal services.
Likely Case
Unauthorized access to internal APIs, metadata services, or internal web applications leading to data leakage.
If Mitigated
Requests reach only resources that are already publicly accessible, or are blocked outright by network segmentation.
🎯 Exploit Status
Exploitation requires control over the initial URL provided to the recursive_url_loader, but no authentication is needed once the vulnerable component is invoked.
🛠️ Fix & Mitigation
✅ Official Fix
Patch Version: 0.0.317 and later
Vendor Advisory: https://github.com/langchain-ai/langchain/commit/9ecb7240a480720ec9d739b3877a52f76098a2b8
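Restart Required: No system reboot; running Python processes that import LangChain must be restarted to load the updated library
Instructions:
1. Update LangChain to version 0.0.317 or later using pip: pip install --upgrade "langchain>=0.0.317"
2. Verify the update completed successfully
3. Restart any running Python processes that import LangChain so they load the patched code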
🔧 Temporary Workarounds
Disable recursive_url_loader
All platforms: Temporarily disable or remove usage of the vulnerable recursive_url_loader.py component until patching is possible.
# Modify code to use alternative document loaders or disable web crawling functionality
Network segmentation controls
Linux: Implement egress filtering to block outbound requests from application servers to internal network ranges.
# Configure firewall rules to block application server access to RFC1918 addresses
# iptables -A OUTPUT -d 10.0.0.0/8,172.16.0.0/12,192.168.0.0/16 -j DROP
🧯 If You Can't Patch
- Implement strict input validation and URL allowlisting for recursive_url_loader inputs
- Deploy network monitoring and egress filtering to detect and block internal network access attempts
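The input validation and allowlisting suggested above could look roughly like the following. This is a minimal sketch, not LangChain's own fix: the names ALLOWED_HOSTS, is_internal_address, and is_url_allowed are hypothetical, and the example hosts are placeholders. It only screens a URL before it is handed to the loader, so it does not by itself cover links discovered during the crawl, HTTP redirects, or DNS rebinding.

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Hypothetical allowlist: replace with the hosts your crawler is meant to reach.
ALLOWED_HOSTS = {"docs.example.com", "example.com"}


def is_internal_address(ip_text: str) -> bool:
    """Return True for private, loopback, link-local, and other non-public IPs."""
    ip = ipaddress.ip_address(ip_text)
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )


def is_url_allowed(url: str, resolve_dns: bool = True) -> bool:
    """Accept only http(s) URLs whose host is allowlisted and not internal."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = parsed.hostname  # already lowercased by urlparse
    if host is None or host not in ALLOWED_HOSTS:
        return False
    if not resolve_dns:
        return True
    # Reject allowlisted names that currently resolve to an internal address.
    try:
        results = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    for result in results:
        address = result[4][0].split("%")[0]  # drop any IPv6 scope id
        if is_internal_address(address):
            return False
    return True
```

Calling is_url_allowed(url) before constructing the loader rejects the request early; the resolve_dns=False switch exists only so the check can be exercised without network access.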
🔍 How to Verify
Check if Vulnerable:
Check the installed LangChain version with python -c "import langchain; print(langchain.__version__)"; any version below 0.0.317 is vulnerable.
Check Version:
python -c "import langchain; print(langchain.__version__)"
Verify Fix Applied:
After updating, confirm the version is 0.0.317 or higher using the same command, and test that recursive_url_loader properly validates URLs.
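The version check can also be scripted, for example in a CI job. The sketch below is one possible approach under stated assumptions: parse_version, is_vulnerable, and check_installed are hypothetical helper names, and the parser only handles plain X.Y.Z version strings (any trailing suffix on a component is ignored).

```python
from importlib.metadata import PackageNotFoundError, version

# First patched release according to this advisory.
FIXED_VERSION = (0, 0, 317)


def parse_version(text: str) -> tuple:
    """Parse 'X.Y.Z' into a tuple of ints, ignoring non-digit suffixes."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_vulnerable(installed: str) -> bool:
    """True when the given version string is older than the fixed release."""
    return parse_version(installed) < FIXED_VERSION


def check_installed() -> str:
    """Report the status of the langchain package in the current environment."""
    try:
        installed = version("langchain")
    except PackageNotFoundError:
        return "langchain is not installed"
    status = "VULNERABLE" if is_vulnerable(installed) else "patched"
    return f"langchain {installed}: {status}"


if __name__ == "__main__":
    print(check_installed())
```

Run it with the same interpreter or virtual environment the application uses, since each environment can hold a different LangChain version.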
📡 Detection & Monitoring
Log Indicators:
- Unusual outbound HTTP requests from application to internal IP addresses
- Failed connection attempts to internal network ranges from application servers
Network Indicators:
- HTTP traffic from application servers to RFC1918 addresses
- Requests to internal metadata services (169.254.169.254, etc.)
SIEM Query:
source="application_logs" AND (dest_ip=10.0.0.0/8 OR dest_ip=172.16.0.0/12 OR dest_ip=192.168.0.0/16) AND http_request=*
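Where no SIEM is available, the same indicators can be approximated with a small log scan. This is an illustrative sketch only: it assumes a hypothetical log format in which each line carries a dest_ip=<IPv4 address> field, and the names INTERNAL_NETWORKS, is_internal, and find_internal_requests are invented for the example. The link-local range is included to catch the metadata address listed under Network Indicators.

```python
import ipaddress
import re

# RFC1918 ranges from the query above, plus link-local for metadata services.
INTERNAL_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "169.254.0.0/16")
]

# Assumed log field; adjust the pattern to match your real log format.
DEST_IP_PATTERN = re.compile(r"dest_ip=(\d{1,3}(?:\.\d{1,3}){3})")


def is_internal(ip_text: str) -> bool:
    """True when the address falls inside one of the internal ranges."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return False  # not a valid IPv4 address, e.g. 999.1.1.1
    return any(ip in network for network in INTERNAL_NETWORKS)


def find_internal_requests(lines):
    """Return the log lines whose destination is an internal address."""
    hits = []
    for line in lines:
        match = DEST_IP_PATTERN.search(line)
        if match and is_internal(match.group(1)):
            hits.append(line.rstrip("\n"))
    return hits
```

Passing an open log file to find_internal_requests returns the matching lines, which can then be reviewed or forwarded to alerting.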