CVE-2024-46455

9.8 CRITICAL

📋 TL;DR

CVE-2024-46455 is an XML External Entity (XXE) vulnerability in unstructured's XMLParser that allows attackers to read arbitrary files, perform server-side request forgery (SSRF), or cause denial of service by processing malicious XML input. This affects all users of unstructured v0.14.2 and earlier who process untrusted XML data. The vulnerability is particularly dangerous because it can be exploited remotely without authentication.

💻 Affected Systems

Products:
  • unstructured
Versions: v0.14.2 and earlier
Operating Systems: All platforms running affected unstructured versions
Default Config Vulnerable: ⚠️ Yes
Notes: All installations using XMLParser functionality are vulnerable when processing untrusted XML input. The vulnerability exists in the core XML parsing component.

⚠️ Manual Verification Required

The database entry for this CVE carries no machine-readable version ranges (none were provided by the vendor or NVD), so automatic vulnerability detection cannot determine whether your system is affected. Use the affected versions listed above (v0.14.2 and earlier) for a manual check.


Recommended Actions:
  1. Review the CVE details at NVD
  2. Check vendor security advisories for your specific version
  3. Test if the vulnerability is exploitable in your environment
  4. Consider updating to the latest version as a precaution

⚠️ Risk & Real-World Impact

🔴 Worst Case

Disclosure of sensitive files (passwords, keys, configs) that can lead to complete system compromise, SSRF attacks against internal services, or denial of service via entity-expansion attacks such as billion laughs.

🟠 Likely Case

Arbitrary file read from the server filesystem, potentially exposing configuration files, credentials, or sensitive application data.

🟢 If Mitigated

Limited impact if XML parsing is disabled or only trusted XML sources are processed with proper input validation.

🌐 Internet-Facing: HIGH - The vulnerability can be exploited remotely without authentication via XML input to affected endpoints.
🏢 Internal Only: HIGH - Even internal applications processing XML from untrusted sources are vulnerable to file disclosure and SSRF.

🎯 Exploit Status

Public PoC: ⚠️ Yes
Weaponized: LIKELY
Unauthenticated Exploit: ⚠️ Yes
Complexity: LOW

XXE vulnerabilities are well-understood with readily available exploit techniques. The vulnerability requires XML input to be processed by the affected parser.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: v0.14.3 or later

Vendor Advisory: https://github.com/Unstructured-IO/unstructured/security/advisories

Restart Required: Yes

Instructions:

  1. Update unstructured to v0.14.3 or later using pip: 'pip install --upgrade "unstructured>=0.14.3"' (the inner quotes stop the shell from treating '>' as a redirect)
  2. Restart all services using unstructured
  3. Verify the update with 'pip show unstructured'

🔧 Temporary Workarounds

Disable external entity processing

Platforms: all

Configure XML parser to disable external entity resolution before processing untrusted XML

Construct the XMLParser with resolve_entities=False in Python code
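
A minimal sketch of this workaround, assuming the parser being configured is lxml's etree.XMLParser (this page does not say which library provides XMLParser, so treat that as an assumption). resolve_entities, no_network and load_dtd are lxml parser options; the helper names make_hardened_parser and parse_untrusted are hypothetical.

```python
from lxml import etree


def make_hardened_parser() -> etree.XMLParser:
    """Build an lxml parser that leaves entity references unexpanded."""
    return etree.XMLParser(
        resolve_entities=False,  # do not substitute entity references
        no_network=True,         # refuse network access for related files
        load_dtd=False,          # do not load external DTDs
    )


def parse_untrusted(xml_bytes: bytes):
    """Parse untrusted XML with the hardened parser (hypothetical helper)."""
    return etree.fromstring(xml_bytes, parser=make_hardened_parser())
```

With these options an external entity reference such as &xxe; is expected to stay in the tree as an unresolved reference instead of being replaced by file contents. This only protects code paths that actually pass the hardened parser, so upgrading remains the primary fix.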

Input validation and sanitization

Platforms: all

Reject XML input containing DOCTYPE declarations or external entity references

Implement XML validation to block DOCTYPE and ENTITY declarations
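
One way to implement this check is sketched below using only the Python standard library. It is an illustration, not the vendor's fix: the expat parser is asked to call a handler as soon as a DOCTYPE declaration starts, and the handler aborts parsing. Because ENTITY declarations can only appear inside a DOCTYPE, rejecting the DOCTYPE covers both. The names ForbiddenXMLError and reject_doctype are hypothetical.

```python
import xml.parsers.expat


class ForbiddenXMLError(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def reject_doctype(xml_bytes: bytes) -> None:
    """Raise ForbiddenXMLError if the document declares a DOCTYPE.

    Malformed XML raises xml.parsers.expat.ExpatError instead.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Abort at the first DOCTYPE; entity declarations live inside it.
        raise ForbiddenXMLError(f"DOCTYPE declaration not allowed: {name!r}")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk
```

Calling reject_doctype on the raw input before handing it to the real parser keeps the check independent of how that parser is configured.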

🧯 If You Can't Patch

  • Implement network segmentation to isolate vulnerable systems and restrict outbound connections
  • Deploy web application firewall (WAF) rules to block XML containing DOCTYPE or external entity references

🔍 How to Verify

Check if Vulnerable:

Check if unstructured version is 0.14.2 or earlier and if XML parsing functionality is used with untrusted input

Check Version:

pip show unstructured | grep Version
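
The same check can be scripted from Python, for example in a CI job. The sketch below assumes the fixed release is 0.14.3, as stated on this page, and compares only the numeric major.minor.patch prefix. The helper names are hypothetical.

```python
import re
from importlib import metadata

PATCHED = (0, 14, 3)  # first fixed release according to this advisory


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from a version string like '0.14.2'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version: str) -> bool:
    """True if the version is at or above the patched release.

    Suffixes such as '.dev1' are ignored, so pre-releases of 0.14.3
    are treated as patched; check those by hand.
    """
    return parse_version(version) >= PATCHED


def installed_unstructured_is_patched():
    """Return True/False for the installed package, or None if absent."""
    try:
        return is_patched(metadata.version("unstructured"))
    except metadata.PackageNotFoundError:
        return None
```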

Verify Fix Applied:

Verify unstructured version is 0.14.3 or later and test with XXE payloads that should be rejected
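
A harmless way to run such a test is to point the external entity at a temporary canary file instead of a real system file. The sketch below is a generic harness: parse_to_text is a hypothetical callable that you supply, wrapping whatever XML entry point your application uses and returning the extracted text. No particular unstructured API is assumed.

```python
import pathlib
import tempfile


def build_canary_payload(canary_file: pathlib.Path) -> bytes:
    """Build an XXE payload whose entity points at a local canary file."""
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE doc [<!ENTITY xxe SYSTEM "{canary_file.as_uri()}">]>\n'
        "<doc>&xxe;</doc>"
    ).encode("utf-8")


def leaks_canary(parse_to_text) -> bool:
    """Return True if the parser under test exposes the canary contents."""
    marker = "XXE-CANARY-7f3a"
    with tempfile.TemporaryDirectory() as tmp:
        canary = pathlib.Path(tmp) / "canary.txt"
        canary.write_text(marker, encoding="utf-8")
        try:
            output = parse_to_text(build_canary_payload(canary))
        except Exception:
            # The parser rejected the payload, which is the desired outcome.
            return False
    return marker in str(output)
```

A result of True means the external entity was resolved and the installation should be treated as vulnerable. False means the payload was either rejected or parsed without expanding the entity.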

📡 Detection & Monitoring

Log Indicators:

  • XML parsing errors
  • File access attempts via XML entities
  • Unusual outbound connections from the XML parser

Network Indicators:

  • HTTP requests with XML payloads containing DOCTYPE or ENTITY declarations
  • Outbound connections to internal services from the XML parser

SIEM Query:

source="application_logs" AND ("DOCTYPE" OR "ENTITY" OR "SYSTEM") AND "unstructured"
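
Where no SIEM is available, a similar filter can be run directly over log files. The sketch below is an assumption-laden illustration: it expects request bodies to appear in plain-text log lines, and it is slightly stricter than the query above because it only matches SYSTEM when a quoted identifier follows, which reduces false positives on ordinary words. The names XXE_PATTERN and suspicious_lines are hypothetical.

```python
import re

# DOCTYPE / ENTITY declarations, or SYSTEM followed by a quoted identifier.
XXE_PATTERN = re.compile(
    r"<!DOCTYPE\b|<!ENTITY\b|\bSYSTEM\s+[\"']",
    re.IGNORECASE,
)


def suspicious_lines(lines):
    """Yield (line_number, line) for log lines that look like XXE attempts."""
    for number, line in enumerate(lines, start=1):
        if XXE_PATTERN.search(line):
            yield number, line.rstrip("\n")
```

Because it accepts any iterable of strings, an open log file can be passed in directly, for example suspicious_lines(open("app.log", encoding="utf-8")), where app.log is a placeholder path.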
