CVE-2021-3869

7.5 HIGH

📋 TL;DR

CVE-2021-3869 is an XXE (XML External Entity) vulnerability in Stanford CoreNLP that allows attackers to read arbitrary files from the server filesystem or conduct server-side request forgery attacks. This affects any system running vulnerable versions of CoreNLP that processes untrusted XML input. The vulnerability is particularly dangerous when CoreNLP is exposed to user-controlled XML data.

💻 Affected Systems

Products:
  • Stanford CoreNLP
Versions: Prior to 4.3.0
Operating Systems: All
Default Config Vulnerable: ⚠️ Yes
Notes: Only vulnerable when processing XML input. Systems that don't process XML or only process trusted XML are not affected.

⚠️ Risk & Real-World Impact

🔴 Worst Case

Disclosure of sensitive files (passwords, keys, configuration files) that enables further compromise of the server, or SSRF that leads to internal network reconnaissance and potential lateral movement.

🟠 Likely Case

Unauthorized file read access to server files, potentially exposing sensitive configuration data, credentials, or application source code.

🟢 If Mitigated

Limited impact when input is validated and the XML parser is hardened. If external entity processing is disabled, exploitation may not be possible at all.

🌐 Internet-Facing: HIGH
🏢 Internal Only: MEDIUM

🎯 Exploit Status

Public PoC: ⚠️ Yes
Weaponized: LIKELY
Unauthenticated Exploit: ⚠️ Yes
Complexity: LOW

XXE vulnerabilities are well understood, and many public exploit examples exist. Exploitation requires only that the attacker can submit XML that CoreNLP parses; no authentication is needed.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: 4.3.0 and later

Vendor Advisory: https://github.com/stanfordnlp/corenlp/commit/5d83f1e8482ca304db8be726cad89554c88f136a

Restart Required: Yes

Instructions:

1. Update CoreNLP to version 4.3.0 or later.
2. Replace the corenlp.jar file with the patched version.
3. Restart any services using CoreNLP.

🔧 Temporary Workarounds

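Disable XXE in XML parser

Platforms: all

Configure the XML parser to reject DOCTYPE declarations and disable external entity processing.

Set these XML parser features (JAXP names):
  • XMLConstants.FEATURE_SECURE_PROCESSING = true
  • http://apache.org/xml/features/disallow-doctype-decl = true
  • http://xml.org/sax/features/external-general-entities = false
  • http://xml.org/sax/features/external-parameter-entities = false
  • http://apache.org/xml/features/nonvalidating/load-external-dtd = false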
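As an illustration of the settings above, here is a minimal sketch of a hardened JAXP `DocumentBuilderFactory`. It is not taken from the CoreNLP patch; the class name `HardenedXml` and its methods are hypothetical, and the `apache.org` feature URIs assume the Xerces-based parser bundled with the JDK (other implementations may throw `ParserConfigurationException` for unknown features).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper, for illustration only.
final class HardenedXml {
    private HardenedXml() {}

    /** Builds a factory that rejects DOCTYPE declarations and never loads external entities. */
    static DocumentBuilderFactory newHardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Rejecting DOCTYPE outright is the strongest single setting against XXE.
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // The remaining settings are defense in depth in case DOCTYPE must be re-enabled.
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf;
    }

    /** Returns true if the hardened parser accepts the XML, false if it rejects it. */
    static boolean accepts(String xml) {
        try {
            newHardenedFactory().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException rejected) {
            return false; // malformed XML or a disallowed DOCTYPE
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("XML parser could not be configured or read", e);
        }
    }

    public static void main(String[] args) {
        String xxe = "<!DOCTYPE doc [<!ENTITY x SYSTEM \"file:///etc/hostname\">]><doc>&x;</doc>";
        System.out.println("plain XML accepted: " + accepts("<doc>hello</doc>"));
        System.out.println("XXE payload accepted: " + accepts(xxe));
    }
}
```

This only protects XML that your own code parses. It does not change how an unpatched CoreNLP jar configures its internal parsers, so upgrading remains the actual fix.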

Input validation and sanitization

Platforms: all

Validate and sanitize XML input before processing:

  • Implement XML schema validation
  • Remove DOCTYPE declarations from input, or reject input that contains them
  • Allow only an explicit list of expected XML elements
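The DOCTYPE check can be sketched as a simple pre-filter that runs before any XML reaches the parser. The class `XmlInputGuard` and its method names are hypothetical. The sketch rejects input rather than stripping declarations, since stripping is easier to get wrong, and it assumes the input has already been decoded to a `String`.

```java
import java.util.Locale;

// Hypothetical pre-filter, for illustration only.
final class XmlInputGuard {
    private XmlInputGuard() {}

    /** Returns true if the text contains DTD markup that could declare entities. */
    static boolean containsDtdMarkup(String xml) {
        // XML keywords are case-sensitive; matching case-insensitively is
        // deliberately stricter than the specification requires.
        String upper = xml.toUpperCase(Locale.ROOT);
        return upper.contains("<!DOCTYPE") || upper.contains("<!ENTITY");
    }

    /** Returns the input unchanged, or throws if it contains DTD markup. */
    static String requireNoDtd(String xml) {
        if (containsDtdMarkup(xml)) {
            throw new IllegalArgumentException("XML input with DOCTYPE or ENTITY declarations is not accepted");
        }
        return xml;
    }

    public static void main(String[] args) {
        System.out.println(containsDtdMarkup("<doc>plain</doc>"));
        System.out.println(containsDtdMarkup("<!DOCTYPE d [<!ENTITY x SYSTEM \"file:///etc/passwd\">]><d>&x;</d>"));
    }
}
```

A string check like this is defense in depth only. It can be bypassed if the check and the parser disagree about the input's encoding, so it should complement, not replace, a hardened parser configuration.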

🧯 If You Can't Patch

  • Implement network segmentation to isolate CoreNLP instances from sensitive systems
  • Deploy WAF rules to block XML containing DOCTYPE declarations or external entity references

🔍 How to Verify

Check if Vulnerable:

Check CoreNLP version: java -cp corenlp.jar edu.stanford.nlp.util.SystemUtils

Check Version:

java -cp corenlp.jar edu.stanford.nlp.util.SystemUtils | grep 'Stanford CoreNLP version'

Verify Fix Applied:

Verify that the version is 4.3.0 or later, then submit a test XXE payload and confirm that it is rejected.
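One harmless way to build such a test payload is to point the external entity at a temporary "canary" file and check whether its content shows up in the parsed output. The sketch below does this against a JAXP `DocumentBuilderFactory`; the class `XxeCanaryCheck` and the canary string are hypothetical. It does not exercise CoreNLP's own code paths. To test a deployment, the same payload shape would be sent to whatever entry point accepts XML there.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical test harness, for illustration only.
final class XxeCanaryCheck {
    static final String CANARY = "xxe-canary-7f3a";

    private XxeCanaryCheck() {}

    /** Returns true if parsing an XXE payload with this factory exposes the canary file's content. */
    static boolean leaksCanary(DocumentBuilderFactory factory)
            throws IOException, ParserConfigurationException {
        Path canaryFile = Files.createTempFile("xxe-canary", ".txt");
        try {
            Files.writeString(canaryFile, CANARY);
            // The entity points at the canary file instead of a real sensitive file.
            String payload = "<?xml version=\"1.0\"?>"
                + "<!DOCTYPE doc [<!ENTITY xxe SYSTEM \"" + canaryFile.toUri() + "\">]>"
                + "<doc>&xxe;</doc>";
            try {
                String text = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(payload.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement()
                    .getTextContent();
                return text.contains(CANARY);
            } catch (SAXException rejected) {
                return false; // the parser refused the payload
            }
        } finally {
            Files.deleteIfExists(canaryFile);
        }
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory hardened = DocumentBuilderFactory.newInstance();
        hardened.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        System.out.println("hardened factory leaks canary: " + leaksCanary(hardened));

        // An unconfigured factory may well leak, depending on JDK version and settings.
        System.out.println("default factory leaks canary: "
            + leaksCanary(DocumentBuilderFactory.newInstance()));
    }
}
```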

📡 Detection & Monitoring

Log Indicators:

  • XML parsing errors related to external entities
  • Unexpected file read operations from CoreNLP process
  • HTTP requests to internal resources from CoreNLP

Network Indicators:

  • Outbound requests from CoreNLP to unexpected internal systems
  • Large XML payloads containing DOCTYPE declarations

SIEM Query:

source="corenlp" AND (message="*DOCTYPE*" OR message="*ENTITY*" OR message="*external*" OR error="*XML*" OR error="*parse*")
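Where no SIEM is available, the message terms from the query above can be approximated with a plain scan over log lines. This is a rough sketch with a hypothetical class name, `XxeLogScan`. It treats each log entry as a single string, so the field-specific `error=` conditions from the query are not modeled.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical log filter, for illustration only.
final class XxeLogScan {
    // Mirrors the message terms of the SIEM query: DOCTYPE, ENTITY, external.
    private static final Pattern SUSPICIOUS =
        Pattern.compile("DOCTYPE|ENTITY|external", Pattern.CASE_INSENSITIVE);

    private XxeLogScan() {}

    /** Returns the log lines that mention DTD or external-entity handling. */
    static List<String> suspiciousLines(List<String> logLines) {
        return logLines.stream()
            .filter(line -> SUSPICIOUS.matcher(line).find())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "INFO annotated 3 sentences",
            "ERROR DOCTYPE is disallowed by the parser configuration",
            "WARN failed to resolve external entity");
        suspiciousLines(sample).forEach(System.out::println);
    }
}
```

Broad terms such as "external" will produce false positives, so matches are a starting point for review rather than proof of an attack.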
