CVE-2025-14009

10.0 CRITICAL

📋 TL;DR

A critical vulnerability in NLTK's downloader component allows remote code execution when a user downloads a malicious zip package. The downloader extracts archive contents without validation, so an attacker can craft a zip file that leads to execution of arbitrary Python code when it is unpacked. Anyone who downloads NLTK packages through the vulnerable downloader is affected.

💻 Affected Systems

Products:
  • Natural Language Toolkit (NLTK)
Versions: All versions before the patched release
Operating Systems: All operating systems running Python with NLTK
Default Config Vulnerable: ⚠️ Yes
Notes: Any call to nltk.download(), or to other code that invokes the downloader, is vulnerable regardless of where the package comes from.

⚠️ Risk & Real-World Impact

🔴 Worst Case

Full system compromise including file system access, network access, persistence mechanisms, and complete control of the affected system.

🟠 Likely Case

Arbitrary code execution with the privileges of the NLTK user, potentially leading to data theft, system manipulation, or lateral movement.

🟢 If Mitigated

Limited impact if proper network controls, sandboxing, and least privilege principles are implemented.

🌐 Internet-Facing: HIGH - The vulnerability can be exploited through normal NLTK package downloads from any source.
🏢 Internal Only: MEDIUM - Risk exists if internal systems download NLTK packages from untrusted sources.

🎯 Exploit Status

Public PoC: ⚠️ Yes
Weaponized: LIKELY
Unauthenticated Exploit: ⚠️ Yes
Complexity: LOW

Exploitation requires a user to download a malicious package, which an attacker could arrange through social engineering or a compromised repository.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: Check the NLTK GitHub repository for the latest patched version

Vendor Advisory: https://github.com/nltk/nltk/security/advisories

Restart Required: No

Instructions:

1. Update NLTK to the latest patched version using pip: pip install --upgrade nltk
2. Verify the update with: pip show nltk
3. Repeat the update in every virtual environment, container image, and Python interpreter that has NLTK installed, so that no vulnerable copy remains

🔧 Temporary Workarounds

Disable automatic downloads

Platform: all

Prevent NLTK from downloading packages automatically by using only pre-downloaded corpora

# Manually download required corpora before use
# Avoid nltk.download() in production code
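
One way to enforce this workaround is to fail fast when a resource is missing instead of falling back to a download. The sketch below uses only the standard library and assumes the usual NLTK data layout (for example corpora/stopwords or corpora/stopwords.zip under a data directory). The function name and the /opt/nltk_data path are hypothetical. With NLTK installed, nltk.data.find() performs the real lookup and raises LookupError when a resource is absent, and the search directories come from nltk.data.path or the NLTK_DATA environment variable.

```python
import os


def find_local_resource(resource, data_dirs):
    """Return the path of a pre-downloaded resource, or raise LookupError.

    `resource` is a slash-separated name such as "corpora/stopwords".
    Nothing is ever downloaded; a missing resource is an error.
    """
    parts = resource.split("/")
    for base in data_dirs:
        candidate = os.path.join(base, *parts)
        # Accept either an unpacked directory/file or the zipped package.
        for path in (candidate, candidate + ".zip"):
            if os.path.exists(path):
                return path
    raise LookupError(
        f"{resource!r} not found in {list(data_dirs)!r}; "
        "install it out of band instead of calling nltk.download()"
    )


if __name__ == "__main__":
    # Hypothetical data directory populated at build time.
    try:
        print(find_local_resource("corpora/stopwords", ["/opt/nltk_data"]))
    except LookupError as exc:
        print(f"missing resource: {exc}")
```

Calling such a check at start-up makes a missing corpus a deployment error rather than a trigger for a network download.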

Sandbox execution

Platform: Linux

Run NLTK in a container or sandbox with limited permissions

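# Assumes nltk_script.py is in /safe/data; "nltk-app" is a placeholder for an image with Python and NLTK installed
docker run --rm --read-only -v /safe/data:/data:ro nltk-app python /data/nltk_script.py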

🧯 If You Can't Patch

  • Restrict NLTK to download only from trusted, verified sources
  • Implement strict network controls to prevent downloads from untrusted repositories
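
One way to make "trusted, verified sources" concrete is to pin a SHA-256 digest for each archive you have reviewed and refuse anything that does not match before it is unpacked. This is a minimal sketch of that idea, not a feature of NLTK. The function names are hypothetical, and the pinned digest is a value you record yourself from a known-good copy.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected_sha256):
    """Return `path` if its digest matches the pinned value, else raise ValueError."""
    actual = sha256_of(path)
    if actual != expected_sha256.strip().lower():
        raise ValueError(
            f"digest mismatch for {path}: expected {expected_sha256}, got {actual}"
        )
    return path
```

An archive that passes this check is only as trustworthy as the review behind the pinned digest, so the check complements the network controls listed above rather than replacing them.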

🔍 How to Verify

Check if Vulnerable:

Check whether the _unzip_iter function in nltk/downloader.py of your installed NLTK calls ZipFile.extractall() without first validating the archive members.
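
This inspection can be scripted. The sketch below is a rough heuristic that looks for the text "extractall(" in the source of _unzip_iter, the function named in this advisory. The helper names are hypothetical. A hit does not prove the version is vulnerable, since a patched release could still call extractall() after validating members, and a miss does not prove it is safe, so treat the result as a prompt to read the code.

```python
import inspect


def calls_extractall(source):
    """Heuristic: True if the given source text contains a call to extractall()."""
    return "extractall(" in source


def check_installed_nltk():
    """Return True/False for the installed NLTK, or None if it cannot be inspected."""
    try:
        import nltk.downloader as downloader
    except ImportError:
        return None  # NLTK is not installed in this interpreter
    func = getattr(downloader, "_unzip_iter", None)
    if func is None:
        return None  # the function named in the advisory is absent in this version
    try:
        return calls_extractall(inspect.getsource(func))
    except (OSError, TypeError):
        return None  # source is unavailable, e.g. bytecode-only installs


if __name__ == "__main__":
    print(check_installed_nltk())
```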

Check Version:

python -c "import nltk; print(nltk.__version__)"

Verify Fix Applied:

Confirm that the installed version validates each archive member's path before extraction, for example by checking the path and then extracting members individually with ZipFile.extract().
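
For comparison when reading the patched code, this is a minimal sketch of what path validation before extraction can look like using only the standard library. It is an illustration under stated assumptions, not the NLTK patch. The function name safe_extract is hypothetical. It resolves each member's destination and rejects the whole archive if any member would land outside the target directory.

```python
import os
import zipfile


def safe_extract(zip_path, dest_dir):
    """Extract zip_path into dest_dir, refusing members that would land outside it."""
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(zip_path) as zf:
        members = zf.infolist()
        # Validate every member first so that a bad archive extracts nothing.
        for info in members:
            target = os.path.realpath(os.path.join(dest_root, info.filename))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"unsafe path in archive: {info.filename!r}")
        # Only after all members pass are they written to disk, one by one.
        for info in members:
            zf.extract(info, dest_root)
```

Path checks of this kind limit where files are written. They do not judge whether the extracted files themselves are safe to load or import.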

📡 Detection & Monitoring

Log Indicators:

  • Unusual file extraction patterns in NLTK download logs
  • Execution of unexpected Python files from NLTK directories
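
The second indicator can be approximated offline by listing Python-related files under the NLTK data directories. The sketch below assumes that data packages are expected to hold data rather than code, so every hit is a lead to review and not proof of compromise. The function name and the suffix list are hypothetical choices.

```python
import os

# File types that can carry executable Python code (illustrative, not exhaustive).
SUSPECT_SUFFIXES = (".py", ".pyc", ".pth")


def find_python_files(data_dirs):
    """Return a sorted list of Python-related files found under the given directories."""
    hits = []
    for base in data_dirs:
        # os.walk yields nothing for directories that do not exist.
        for root, _dirs, files in os.walk(base):
            for name in files:
                if name.lower().endswith(SUSPECT_SUFFIXES):
                    hits.append(os.path.join(root, name))
    return sorted(hits)
```

With NLTK installed, passing nltk.data.path as data_dirs scans the directories NLTK actually searches.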

Network Indicators:

  • Downloads from unusual or untrusted repositories to NLTK systems

SIEM Query:

source="nltk" AND (event="download" OR event="extract") AND (url NOT IN trusted_sources)
