CVE-2024-0520

8.8 HIGH

📋 TL;DR

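This vulnerability, classified as command injection, allows remote code execution in MLflow versions before 2.9.0. When MLflow loads a dataset from an HTTP source, the filename taken from the Content-Disposition header or the URL path is used to build the destination path without sanitization. An attacker who controls the response can therefore write files to arbitrary locations and potentially compromise the system. Anyone loading datasets over HTTP with a vulnerable MLflow version is affected.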

💻 Affected Systems

Products:
  • MLflow
Versions: All versions before 2.9.0 (the advisory specifically mentions 8.2.1)
Operating Systems: All
Default Config Vulnerable: ⚠️ Yes
Notes: Requires HTTP dataset loading functionality to be used.

⚠️ Risk & Real-World Impact

🔴 Worst Case

Full system compromise with attacker gaining shell access, stealing sensitive data/models, and pivoting to other systems.

🟠 Likely Case

Arbitrary file write leading to data corruption, denial of service, or limited command execution depending on permissions.

🟢 If Mitigated

File write limited to specific directories with proper sandboxing and permissions.

🌐 Internet-Facing: HIGH
🏢 Internal Only: MEDIUM

🎯 Exploit Status

Public PoC: ⚠️ Yes
Weaponized: LIKELY
Unauthenticated Exploit: ⚠️ Yes
Complexity: LOW

Exploit details available in public bounty reports with path traversal examples.
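
The path traversal mentioned above comes down to how path joining treats an untrusted filename. The following is a minimal sketch of that behavior, not MLflow's actual code: the function names, the download directory, and the example filenames are all illustrative assumptions. It uses posixpath so the result is the same on any platform.

```python
import posixpath


def naive_destination(dst_dir: str, filename: str) -> str:
    """Join a download directory and an untrusted filename with no checks."""
    return posixpath.normpath(posixpath.join(dst_dir, filename))


def escapes(dst_dir: str, filename: str) -> bool:
    """Return True if the joined path lands outside dst_dir."""
    base = posixpath.normpath(dst_dir)
    target = naive_destination(dst_dir, filename)
    return posixpath.commonpath([base, target]) != base


if __name__ == "__main__":
    downloads = "/srv/mlflow/downloads"  # hypothetical download directory
    for name in ("data.csv", "../../tmp/poc.txt", "/tmp/poc.txt"):
        print(name, "->", naive_destination(downloads, name),
              "| escapes:", escapes(downloads, name))
```

An absolute filename replaces the base directory entirely, and each `..` segment climbs one level, so both forms leave the intended directory unless the filename is reduced to a bare basename first.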

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: 2.9.0

Vendor Advisory: https://github.com/mlflow/mlflow/commit/400c226953b4568f4361bc0a0c223511652c2b9d

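Restart Required: Yes (running MLflow processes keep the old code loaded until they are restarted)

Instructions:

1. Update MLflow to version 2.9.0 or later using pip: pip install --upgrade "mlflow>=2.9.0"
2. Verify that the update completed successfully by checking the installed version
3. Restart any running MLflow servers, notebooks, or jobs so that they load the patched package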

🔧 Temporary Workarounds

Disable HTTP dataset loading (all platforms)

Prevent loading datasets from HTTP sources to block the attack vector. Load datasets only from local or trusted sources.

Input validation wrapper (all platforms)

Add custom validation for Content-Disposition headers and URL paths. Implement sanitization for filename extraction in custom dataset loaders.
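
One way to approach the input validation workaround is sketched below. It is a hedged example for a custom loader, not the upstream patch: safe_basename and filename_from_response are hypothetical helpers, and the allowed character set is an assumption you may need to widen. The header parsing is deliberately simple and ignores the extended `filename*=` form.

```python
import posixpath
import re
from typing import Optional
from urllib.parse import unquote, urlparse

# Assumption: only plain ASCII names are acceptable for downloaded datasets.
_SAFE_NAME = re.compile(r"^[A-Za-z0-9._-]+$")


def safe_basename(candidate: str) -> str:
    """Reduce an untrusted filename to a bare basename, or raise ValueError."""
    # Decode percent-escapes and treat backslashes as separators as well.
    decoded = unquote(candidate).replace("\\", "/")
    name = posixpath.basename(decoded)
    if name in ("", ".", "..") or not _SAFE_NAME.match(name):
        raise ValueError(f"unsafe filename: {candidate!r}")
    return name


def filename_from_response(url: str, content_disposition: Optional[str]) -> str:
    """Prefer the Content-Disposition filename, else fall back to the URL path."""
    if content_disposition:
        match = re.search(r'filename="?([^";]+)"?', content_disposition)
        if match:
            return safe_basename(match.group(1))
    return safe_basename(urlparse(url).path)
```

The returned name contains no directory components, so joining it to a download directory cannot leave that directory. Rejecting unexpected characters outright, rather than trying to repair them, keeps the check easy to reason about.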

🧯 If You Can't Patch

  • Implement strict network controls to limit HTTP dataset sources to trusted domains only
  • Run MLflow with minimal permissions and in isolated containers/namespaces
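
Network controls are normally enforced at a proxy or firewall. As a complementary application-level check, a loader can refuse URLs whose host is not on an allowlist. The sketch below assumes a hypothetical host name and helper function; neither is part of MLflow.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the hosts your team actually trusts.
TRUSTED_HOSTS = frozenset({"datasets.example.com"})
ALLOWED_SCHEMES = frozenset({"http", "https"})


def is_trusted_source(url: str, trusted_hosts=TRUSTED_HOSTS) -> bool:
    """Return True only if the URL uses an allowed scheme and an allowlisted host."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        return False
    # hostname strips credentials and port, and is already lower-cased.
    host = parsed.hostname or ""
    return host in trusted_hosts
```

Comparing the parsed hostname exactly, rather than using a substring or prefix match on the raw URL, avoids lookalike hosts and tricks based on embedded credentials.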

🔍 How to Verify

Check if Vulnerable:

An installation is vulnerable if its MLflow version is below 2.9.0 and it loads datasets from HTTP sources.

Check Version:

python -c "import mlflow; print(mlflow.__version__)"
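
To automate the comparison against the fixed release, a small helper can parse the version string. This is a minimal sketch with hypothetical function names. It ignores pre-release suffixes such as rc1, so for stricter comparisons a full version parser such as the packaging library is a better fit.

```python
import re

FIXED_RELEASE = (2, 9, 0)  # first patched MLflow release per the advisory


def parse_release(version: str) -> tuple:
    """Extract the leading numeric release, e.g. '2.8.1rc1' -> (2, 8, 1)."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version.strip())
    if not match:
        raise ValueError(f"unrecognized version string: {version!r}")
    # Missing minor or patch components are treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_vulnerable(version: str) -> bool:
    """Return True if the version is older than the fixed release."""
    return parse_release(version) < FIXED_RELEASE
```

With MLflow installed, calling is_vulnerable(mlflow.__version__) after import mlflow reports whether the running environment still needs the upgrade.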

Verify Fix Applied:

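Confirm that the MLflow version is 2.9.0 or higher. Then, in an isolated test environment, check that HTTP dataset loading no longer writes outside the download directory when given filenames that contain traversal sequences or absolute paths.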

📡 Detection & Monitoring

Log Indicators:

  • Unusual file paths in dataset loading operations
  • HTTP responses with suspicious Content-Disposition headers
  • Failed file write attempts to system directories

Network Indicators:

  • HTTP responses received by MLflow with crafted Content-Disposition headers
  • Unusual outbound connections from MLflow process

SIEM Query:

source="mlflow.logs" AND ("Content-Disposition" OR "dataset_source") AND (".." OR "/tmp/" OR "/etc/")
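
Where no SIEM is available, the same matching logic can be applied to plain log files. The sketch below mirrors the query's terms. The log format and function names are assumptions, and the matching is case-sensitive, which may differ from your SIEM's behavior.

```python
from typing import Iterable, List

# Terms taken from the SIEM query above.
KEYWORDS = ("Content-Disposition", "dataset_source")
PATH_MARKERS = ("..", "/tmp/", "/etc/")


def is_suspicious(line: str) -> bool:
    """True if the line has at least one keyword and at least one path marker."""
    has_keyword = any(keyword in line for keyword in KEYWORDS)
    has_marker = any(marker in line for marker in PATH_MARKERS)
    return has_keyword and has_marker


def scan(lines: Iterable[str]) -> List[str]:
    """Return the matching lines with trailing newlines removed."""
    return [line.rstrip("\n") for line in lines if is_suspicious(line)]
```

Expect false positives: the ".." marker also matches ellipses, and "/tmp/" is common in legitimate temporary-file paths, so matches are leads for review and not confirmed attacks.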
