CVE-2022-0239 – CVE-2022-0239 is an XXE (XML External Entity) v... (How to Fix)

Q: What is the severity of CVE-2022-0239?

CVE-2022-0239 has a CVSS v3 score of 9.8 (CRITICAL).

📋 TL;DR

CVE-2022-0239 is an XXE (XML External Entity) vulnerability in Stanford CoreNLP that allows attackers to read arbitrary files from the server filesystem or conduct server-side request forgery attacks. This affects any system running vulnerable versions of CoreNLP that processes untrusted XML input. The vulnerability is particularly dangerous because it can be exploited remotely without authentication.

💻 Affected Systems

Products:

Stanford CoreNLP

Versions: All versions before 4.4.0

Operating Systems: All operating systems running Java

Default Config Vulnerable: ⚠️ Yes

Notes: Any CoreNLP installation that processes XML input is vulnerable. The vulnerability is in the XML parsing functionality.

📦 What is this software?

Corenlp by Stanford

⚠️ Risk & Real-World Impact

🔴 Worst Case

Complete server compromise including sensitive file disclosure (passwords, keys, configuration files), SSRF attacks to internal services, and potential remote code execution through file inclusion.

🟠 Likely Case

Arbitrary file read from the server filesystem, potentially exposing sensitive configuration files, credentials, or application data.

🟢 If Mitigated

Limited impact with proper input validation and XML parser configuration that disables external entity processing.

🌐 Internet-Facing: HIGH - The vulnerability can be exploited remotely without authentication when CoreNLP services are exposed to the internet.

🏢 Internal Only: MEDIUM - Internal attackers or compromised internal systems could exploit this to escalate privileges or access sensitive data.

🎯 Exploit Status

Public PoC: ⚠️ Yes

Weaponized: LIKELY

Unauthenticated Exploit: ⚠️ Yes

Complexity: LOW

Exploitation requires sending malicious XML payloads to CoreNLP endpoints. Public proof-of-concept examples demonstrate file disclosure attacks.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: 4.4.0 and later

Vendor Advisory: https://github.com/stanfordnlp/corenlp/commit/1940ffb938dc4f3f5bc5f2a2fd8b35aabbbae3dd

Restart Required: Yes

Instructions:

1. Upgrade to CoreNLP version 4.4.0 or later.
2. Download it from the official GitHub releases.
3. Replace the existing CoreNLP installation.
4. Restart any services that use CoreNLP.

🔧 Temporary Workarounds

Disable XXE in XML parser

all

Configure XML parser to disable external entity processing before parsing untrusted XML

// Java code:
import javax.xml.parsers.DocumentBuilderFactory;

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Reject DOCTYPE declarations outright; the two entity features below are defense in depth
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
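
The parser settings above can be exercised end to end with a small standalone program. The following is a minimal sketch, assuming the JDK's built-in JAXP parser; the class name `HardenedXmlParser`, its helper methods, and the sample payload are hypothetical and are not part of CoreNLP. It shows that a benign document parses while a document carrying a DOCTYPE with an external entity is rejected.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration; not CoreNLP code.
class HardenedXmlParser {

    // Builds a DocumentBuilder with DOCTYPE declarations and external entities disabled.
    static DocumentBuilder newHardenedBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf.newDocumentBuilder();
    }

    // Returns true if the XML parses under the hardened settings, false if the parser rejects it.
    static boolean parses(String xml) {
        try {
            DocumentBuilder builder = newHardenedBuilder();
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            return false; // includes the fatal error raised for a disallowed DOCTYPE
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Parser does not support the hardening features", e);
        }
    }

    public static void main(String[] args) {
        String benign = "<doc>hello</doc>";
        // Sample XXE payload; the file path is only an example target.
        String xxe = "<!DOCTYPE d [<!ENTITY x SYSTEM \"file:///etc/hostname\">]><d>&x;</d>";
        System.out.println("benign parses: " + parses(benign));
        System.out.println("xxe parses:    " + parses(xxe));
    }
}
```

The JDK's default error handler may also print a "DOCTYPE is disallowed" message to standard error when the second document is rejected.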

Input validation and sanitization

all

Validate and sanitize XML input before processing, rejecting suspicious content

// Implement input validation to reject XML containing DOCTYPE declarations
// or external entity references before passing to CoreNLP
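
One way to sketch the validation step described above is a pre-filter that refuses input containing DOCTYPE or ENTITY declarations before it reaches CoreNLP. The class and method names here (`XmlPrefilter`, `containsDoctypeOrEntity`, `requireNoDtd`) are hypothetical. The check works on an already-decoded string, and string matching of this kind can miss payloads delivered in unusual encodings, so treat it as a complement to parser hardening rather than a replacement.

```java
import java.util.Locale;

// Hypothetical pre-filter for illustration; not CoreNLP code.
class XmlPrefilter {

    // Returns true if the decoded XML text contains a DOCTYPE or ENTITY declaration.
    static boolean containsDoctypeOrEntity(String xml) {
        // XML keywords are case-sensitive, but matching case-insensitively is the conservative choice.
        String upper = xml.toUpperCase(Locale.ROOT);
        return upper.contains("<!DOCTYPE") || upper.contains("<!ENTITY");
    }

    // Returns the input unchanged, or throws if it carries a DTD.
    static String requireNoDtd(String xml) {
        if (containsDoctypeOrEntity(xml)) {
            throw new IllegalArgumentException("XML with DOCTYPE or ENTITY declarations is not accepted");
        }
        return xml;
    }

    public static void main(String[] args) {
        System.out.println(containsDoctypeOrEntity("<doc>hello</doc>"));
        System.out.println(containsDoctypeOrEntity(
                "<!DOCTYPE d [<!ENTITY x SYSTEM \"file:///etc/hostname\">]><d>&x;</d>"));
    }
}
```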

🧯 If You Can't Patch

Implement network segmentation to isolate CoreNLP instances from sensitive systems
Deploy WAF rules to block XML payloads containing DOCTYPE declarations or external entity references

🔍 How to Verify

Check if Vulnerable:

Check CoreNLP version: java -cp "stanford-corenlp-*.jar" edu.stanford.nlp.util.SystemUtils --version

Check Version:

java -cp "stanford-corenlp-*.jar" edu.stanford.nlp.util.SystemUtils --version

Verify Fix Applied:

Verify version is 4.4.0 or higher and test with XXE payloads that should be rejected
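
The version half of this check can be automated once the installed version string is known. The following is a rough sketch, assuming a plain dotted numeric version such as "4.3.2"; the class `CoreNlpVersionCheck` and method `isPatched` are hypothetical names, and suffixes such as "-SNAPSHOT" are not handled.

```java
// Hypothetical helper for illustration; not CoreNLP code.
class CoreNlpVersionCheck {

    // First fixed release according to this advisory.
    private static final int[] FIXED = {4, 4, 0};

    // Returns true if the dotted numeric version is 4.4.0 or later.
    // Throws NumberFormatException for non-numeric components.
    static boolean isPatched(String version) {
        String[] parts = version.trim().split("\\.");
        for (int i = 0; i < FIXED.length; i++) {
            // Missing components count as zero, so "4.4" is treated as "4.4.0".
            int component = i < parts.length ? Integer.parseInt(parts[i]) : 0;
            if (component != FIXED[i]) {
                return component > FIXED[i];
            }
        }
        return true; // exactly 4.4.0
    }

    public static void main(String[] args) {
        for (String v : new String[] {"3.9.2", "4.3.2", "4.4.0", "4.5.10"}) {
            System.out.println(v + " patched: " + isPatched(v));
        }
    }
}
```

Comparing components numerically rather than as strings avoids ranking "4.10.0" below "4.4.0".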

📡 Detection & Monitoring

Log Indicators:

XML parsing errors related to external entities
Unexpected file access patterns from CoreNLP process
HTTP requests containing XML with DOCTYPE declarations

Network Indicators:

HTTP POST requests with XML content to CoreNLP endpoints
Outbound connections from CoreNLP server to internal services

SIEM Query:

source="*corenlp*" AND ("DOCTYPE" OR "ENTITY" OR "SYSTEM")
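
Where no SIEM is available, the same matching logic can be approximated in application code. This is a minimal sketch of the query above, assuming log records arrive as a source name plus a message line; `XxeLogMatcher` and its method are hypothetical names. Note that a bare "SYSTEM" match is broad and will likely produce false positives on ordinary log text.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical matcher for illustration; mirrors the SIEM query shown above.
class XxeLogMatcher {

    // Keywords from the query: DOCTYPE, ENTITY, or SYSTEM (case-sensitive, as in DTD syntax).
    private static final Pattern MARKERS = Pattern.compile("DOCTYPE|ENTITY|SYSTEM");

    // Returns true if the source name contains "corenlp" and the line contains a marker keyword.
    static boolean matches(String source, String line) {
        boolean fromCoreNlp = source.toLowerCase(Locale.ROOT).contains("corenlp");
        return fromCoreNlp && MARKERS.matcher(line).find();
    }

    public static void main(String[] args) {
        System.out.println(matches("app-corenlp-01", "POST body: <!DOCTYPE d [<!ENTITY x SYSTEM \"file:///etc/passwd\">]>"));
        System.out.println(matches("app-corenlp-01", "annotated 12 sentences"));
        System.out.println(matches("nginx", "<!DOCTYPE html>"));
    }
}
```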

📊 Metadata

CVE ID: CVE-2022-0239

CVSS v3 Score: 9.8 (CRITICAL)

CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

CWE: CWE-611

Published: January 17, 2022

Last Updated: November 21, 2024
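
The 9.8 score listed above follows from the CVSS vector. As a worked illustration, the sketch below applies the CVSS v3.1 base score formula for an unchanged scope to this specific vector, using the specification's metric weights (AV:N = 0.85, AC:L = 0.77, PR:N = 0.85, UI:N = 0.85, C/I/A:H = 0.56). The class name `Cvss31Example` is hypothetical, and the code handles only this vector rather than parsing arbitrary ones.

```java
// Worked example for CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H only.
class Cvss31Example {

    // Roundup as defined in the CVSS v3.1 specification: smallest one-decimal value >= input.
    static double roundUp(double value) {
        long scaled = Math.round(value * 100000);
        if (scaled % 10000 == 0) {
            return scaled / 100000.0;
        }
        return (Math.floor(scaled / 10000.0) + 1) / 10.0;
    }

    static double baseScore() {
        double av = 0.85, ac = 0.77, pr = 0.85, ui = 0.85; // exploitability weights
        double c = 0.56, i = 0.56, a = 0.56;               // impact weights (all High)

        double iss = 1 - (1 - c) * (1 - i) * (1 - a);      // impact sub-score, about 0.915
        double impact = 6.42 * iss;                        // scope unchanged, about 5.87
        double exploitability = 8.22 * av * ac * pr * ui;  // about 3.89

        if (impact <= 0) {
            return 0.0;
        }
        return roundUp(Math.min(impact + exploitability, 10.0));
    }

    public static void main(String[] args) {
        System.out.println("Base score: " + baseScore());
    }
}
```

Impact plus exploitability comes to roughly 9.76, which the specification's round-up rule turns into 9.8.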

🔗 References

https://github.com/stanfordnlp/corenlp/commit/1940ffb938dc4f3f5bc5f2a2fd8b35aabbbae3dd Patch,Third Party Advisory
https://huntr.dev/bounties/a717aec2-5646-4a5f-ade0-dadc25736ae3 Exploit,Third Party Advisory
