CVE-2023-41886

7.5 HIGH

📋 TL;DR

CVE-2023-41886 is an arbitrary file read vulnerability in OpenRefine that allows unauthenticated attackers to read files on the server filesystem. It affects all OpenRefine instances prior to version 3.7.5 that are reachable from untrusted networks. The vendor advisory describes the flaw as an arbitrary file read in project import, triggered through a crafted MySQL JDBC URL.

💻 Affected Systems

Products:
  • OpenRefine
Versions: All versions prior to 3.7.5
Operating Systems: All operating systems running OpenRefine
Default Config Vulnerable: ⚠️ Yes
Notes: All default installations of OpenRefine prior to 3.7.5 are vulnerable when exposed to network access.

⚠️ Risk & Real-World Impact

🔴 Worst Case

Attackers could read sensitive files including configuration files, credentials, database files, and system files, potentially leading to complete system compromise.

🟠 Likely Case

Unauthenticated attackers read application configuration files, source code, or sensitive data files stored on the server.

🟢 If Mitigated

Limited impact if proper network segmentation and access controls prevent untrusted users from reaching the OpenRefine instance.

🌐 Internet-Facing: HIGH - Unauthenticated exploitation allows any internet user to read files on exposed servers.
🏢 Internal Only: MEDIUM - Internal attackers or compromised internal systems could exploit this, but requires network access to the OpenRefine instance.

🎯 Exploit Status

Public PoC: ⚠️ Yes
Weaponized: LIKELY
Unauthenticated Exploit: ⚠️ Yes
Complexity: LOW

The vulnerability is simple to exploit with publicly available proof-of-concept code. No authentication required.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: 3.7.5

Vendor Advisory: https://github.com/OpenRefine/OpenRefine/security/advisories/GHSA-qqh2-wvmv-h72m

Restart Required: Yes

Instructions:

1. Download OpenRefine 3.7.5 or newer from the official repository.
2. Stop the running OpenRefine instance.
3. Replace the installation with the patched version.
4. Restart the OpenRefine service.

🔧 Temporary Workarounds

Network Access Restriction

Platform: Linux

Restrict network access to OpenRefine instances using firewall rules or network segmentation.

iptables -A INPUT -p tcp --dport 3333 -s trusted_ip_range -j ACCEPT
iptables -A INPUT -p tcp --dport 3333 -j DROP
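
When more than one trusted range is involved, rule order matters: every ACCEPT rule has to be appended before the final DROP. The following is a minimal sketch that only prints the rules for review and does not apply them. The function name `print_openrefine_rules` and the example CIDR ranges are hypothetical; port 3333 is taken from the commands above.

```shell
#!/usr/bin/env bash
# Sketch: print (do not execute) iptables rules for an OpenRefine port.
# One ACCEPT rule per trusted CIDR range, followed by a single DROP rule.
# Function name and arguments are illustrative assumptions.
print_openrefine_rules() {
  local port="$1"
  shift
  local cidr
  for cidr in "$@"; do
    printf 'iptables -A INPUT -p tcp --dport %s -s %s -j ACCEPT\n' "$port" "$cidr"
  done
  # The DROP rule comes last so it only catches traffic no ACCEPT rule matched.
  printf 'iptables -A INPUT -p tcp --dport %s -j DROP\n' "$port"
}
```

For example, `print_openrefine_rules 3333 10.0.0.0/8 192.168.1.0/24` prints two ACCEPT rules and one DROP rule, which can be reviewed before being run as root.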

Reverse Proxy with Authentication

Platform: All

Place OpenRefine behind a reverse proxy with authentication requirements.

🧯 If You Can't Patch

  • Implement strict network access controls to limit which IP addresses can reach the OpenRefine instance
  • Place the OpenRefine instance behind a web application firewall (WAF) with file path traversal protection rules

🔍 How to Verify

Check if Vulnerable:

Check if OpenRefine version is below 3.7.5 by accessing the web interface and viewing the version in the footer or about page.

Check Version:

curl -s http://openrefine-server:3333/command/core/get-version || echo 'Check web interface footer'

Verify Fix Applied:

Verify the version shows 3.7.5 or higher after patching. Test file read attempts should return access denied errors.
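
Comparing version strings by eye is error-prone once minor numbers reach two digits (3.10.0 is newer than 3.7.5). The sketch below shows one way to automate the comparison, assuming GNU `sort -V` is available. The function name `is_vulnerable_version` is hypothetical; the fixed version 3.7.5 comes from the advisory.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a version string is older than the fixed release.
# FIXED_VERSION is taken from the advisory; the function name is illustrative.
FIXED_VERSION="3.7.5"

# Returns 0 (true) if $1 sorts strictly before FIXED_VERSION, 1 otherwise.
is_vulnerable_version() {
  local version="$1"
  # An exact match with the fixed release is not vulnerable.
  [ "$version" = "$FIXED_VERSION" ] && return 1
  # sort -V orders dotted version strings field by field (GNU coreutils).
  local lowest
  lowest=$(printf '%s\n%s\n' "$version" "$FIXED_VERSION" | sort -V | head -n 1)
  [ "$lowest" = "$version" ]
}
```

Usage: `is_vulnerable_version "3.7.4" && echo "upgrade required"`. The sketch only handles plain dotted release numbers; pre-release or snapshot suffixes are not considered.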

📡 Detection & Monitoring

Log Indicators:

  • Unusual file path access patterns in OpenRefine logs
  • Multiple failed file read attempts with path traversal patterns

Network Indicators:

  • HTTP requests containing '../' patterns or absolute file paths to OpenRefine port

SIEM Query:

source="openrefine.log" AND ("../" OR "/etc/" OR "/root/") AND status=200
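
Where no SIEM is available, a similar check can be approximated with `grep` on a log file. This is a rough sketch only: the function name `scan_openrefine_log` is hypothetical, the log location and format are assumptions, and the patterns are the generic indicator strings from the query above. Unlike the query, it does not filter on response status.

```shell
#!/usr/bin/env bash
# Sketch: count log lines containing any of the indicator strings listed above.
# The log path is passed in by the caller; its location and format are assumptions.
scan_openrefine_log() {
  local log_file="$1"
  # -F: treat patterns as fixed strings, -c: print the number of matching lines.
  # grep exits non-zero when the count is 0.
  grep -c -F -e '../' -e '/etc/' -e '/root/' -- "$log_file"
}
```

Usage: `scan_openrefine_log /path/to/openrefine.log`. A non-zero count is a prompt for manual review, not proof of exploitation.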
