CVE-2024-12720

7.5 HIGH

📋 TL;DR

A Regular Expression Denial of Service (ReDoS) vulnerability exists in the huggingface/transformers library's tokenization_nougat_fast.py file. The post_process_single() function uses a regex that can backtrack exponentially on specially crafted input, leading to high CPU consumption and potential application downtime. Deployments running huggingface/transformers version 4.46.3 that use the Nougat tokenizer are affected.

💻 Affected Systems

Products:
  • huggingface/transformers
Versions: v4.46.3
Operating Systems: All
Default Config Vulnerable: ⚠️ Yes
Notes: Only affects systems using the Nougat tokenizer functionality. The vulnerability is present in default configurations.

⚠️ Risk & Real-World Impact

🔴 Worst Case

Complete application unavailability due to CPU exhaustion, causing extended service downtime and potential cascading failures in dependent systems.

🟠 Likely Case

Degraded application performance with high CPU spikes, leading to slow response times and potential temporary unavailability under load.

🟢 If Mitigated

Minimal performance impact with proper input validation and rate limiting in place.

🌐 Internet-Facing: HIGH - Publicly accessible endpoints processing user input could be targeted to cause DoS.
🏢 Internal Only: MEDIUM - Internal users could still trigger the vulnerability, but attack surface is more limited.

🎯 Exploit Status

Public PoC: ✅ No
Weaponized: UNKNOWN
Unauthenticated Exploit: ✅ No
Complexity: MEDIUM

Exploitation requires crafting input that triggers regex backtracking in post_process_single(). No authentication bypass is needed if an endpoint that reaches this code is accessible to the attacker.
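
To illustrate the kind of backtracking involved, the following is a minimal sketch using a textbook nested-quantifier pattern, (a+)+$. This is an assumption chosen for demonstration only: it is not the regex from tokenization_nougat_fast.py, and the helper name time_match is hypothetical. Each extra character roughly doubles the work needed before the match fails.

```python
import re
import time

# Textbook nested-quantifier pattern used as a stand-in.
# This is NOT the regex used by huggingface/transformers.
EVIL = re.compile(r"(a+)+$")


def time_match(n: int) -> float:
    """Return the seconds spent failing to match n 'a' characters plus '!'."""
    text = "a" * n + "!"  # the trailing '!' forces the overall match to fail
    start = time.perf_counter()
    result = EVIL.match(text)
    elapsed = time.perf_counter() - start
    assert result is None  # the pattern can never match this input
    return elapsed


if __name__ == "__main__":
    # Keep n small: the running time grows exponentially with n.
    for n in (8, 12, 16, 20):
        print(f"n={n:2d}  {time_match(n):.4f}s")
```

Inputs only a few dozen characters long can already occupy a CPU core for a long time with a pattern of this shape, which is why short payloads are enough for a denial of service.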

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: v4.47.0 or apply the specific commit

Vendor Advisory: https://github.com/huggingface/transformers/commit/deac971c469bcbb182c2e52da0b82fb3bf54cccf

Restart Required: No

Instructions:

1. Update huggingface/transformers to version 4.47.0 or later using pip install --upgrade transformers.
2. Alternatively, apply the specific commit deac971c469bcbb182c2e52da0b82fb3bf54cccf if using a custom build.

🔧 Temporary Workarounds

Input Validation and Sanitization

Platforms: all

Implement strict input validation to reject or sanitize potentially malicious patterns before they reach the vulnerable regex.
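
One possible shape for such a check is sketched below. It only enforces size limits before text is handed to the tokenizer's post-processing. The limits, the RejectedInput exception, and the check_text helper are all hypothetical names and values, not part of transformers. Because an exponential pattern can be expensive even on short input, a size cap reduces exposure but should be treated as a stopgap rather than a replacement for the patch.

```python
MAX_CHARS = 50_000      # hypothetical cap on total input size
MAX_LINE_CHARS = 2_000  # hypothetical cap on any single line


class RejectedInput(ValueError):
    """Raised when input falls outside the accepted size limits."""


def check_text(text, max_chars=MAX_CHARS, max_line_chars=MAX_LINE_CHARS):
    """Return text unchanged if it passes the size checks, else raise."""
    if not isinstance(text, str):
        raise RejectedInput("input must be a string")
    if len(text) > max_chars:
        raise RejectedInput(f"input longer than {max_chars} characters")
    # Also bound the longest line, since a single long line can dominate cost.
    longest = max((len(line) for line in text.splitlines()), default=0)
    if longest > max_line_chars:
        raise RejectedInput(f"line longer than {max_line_chars} characters")
    return text
```

The check would run before any call that ends up in post-processing, with rejected requests answered by an error instead of being processed.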

Rate Limiting

Platforms: all

Implement rate limiting on endpoints that process user input to prevent mass exploitation attempts.
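
As a rough illustration, a per-client token bucket is one common way to rate limit. The sketch below is an in-process example under stated assumptions: the TokenBucket class and its parameters are hypothetical, and a production deployment would more likely rely on the rate-limiting features of its gateway or web framework.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` per second."""

    rate: float      # tokens added per second
    capacity: float  # maximum number of stored tokens
    tokens: float = field(init=False)
    updated: float = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = self.capacity       # start with a full bucket
        self.updated = time.monotonic()

    def allow(self, now: Optional[float] = None) -> bool:
        """Consume one token if available and report whether the request may run."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill according to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Keeping one bucket per client identifier (for example in a dictionary keyed by API key) and rejecting requests when allow() returns False limits how many expensive tokenization calls a single client can trigger.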

🧯 If You Can't Patch

  • Disable or restrict access to endpoints using the Nougat tokenizer functionality.
  • Implement Web Application Firewall (WAF) rules to detect and block patterns that could trigger ReDoS.

🔍 How to Verify

Check if Vulnerable:

Check if transformers version is 4.46.3 and if the application uses Nougat tokenizer functionality.

Check Version:

python -c "import transformers; print(transformers.__version__)"

Verify Fix Applied:

Verify transformers version is 4.47.0 or later, or confirm the commit deac971c469bcbb182c2e52da0b82fb3bf54cccf is applied.
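
The version comparison can be scripted. The sketch below uses only the standard library and takes 4.47.0 as the fixed release because that is the patch version named in this advisory. The helper names parse_release and is_patched are hypothetical. It compares release numbers only, so pre-release or custom builds should still be checked against the commit.

```python
from importlib.metadata import PackageNotFoundError, version

FIXED = (4, 47, 0)  # patch version as stated in this advisory


def parse_release(v: str) -> tuple:
    """Turn '4.46.3' or '4.47.0.dev0' into a (major, minor, patch) tuple."""
    parts = []
    for piece in v.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_patched(installed: str, fixed: tuple = FIXED) -> bool:
    """Return True if the installed release number is at or above the fix."""
    return parse_release(installed) >= fixed


if __name__ == "__main__":
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        print("transformers is not installed")
    else:
        status = "at or above" if is_patched(installed) else "below"
        print(f"transformers {installed} is {status} the fixed release")
```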

📡 Detection & Monitoring

Log Indicators:

  • Unusually high CPU usage spikes
  • Increased processing time for tokenization requests
  • Application timeouts or crashes
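
If the application does not already log processing time, a small timing wrapper can produce the second indicator. The following is a sketch under stated assumptions: the decorator name log_if_slow, the logger name, and the one-second threshold are hypothetical. It records the duration only after the call returns, so it will not by itself stop a runaway regex.

```python
import functools
import logging
import time

logger = logging.getLogger("tokenization.timing")  # hypothetical logger name


def log_if_slow(threshold_s: float = 1.0):
    """Decorator that logs a warning when the wrapped call exceeds threshold_s."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs whether the call returned normally or raised.
                elapsed = time.perf_counter() - start
                if elapsed > threshold_s:
                    logger.warning(
                        "slow tokenization call: %s took %.2fs",
                        func.__name__,
                        elapsed,
                    )

        return wrapper

    return decorator
```

Wrapping the function that invokes the Nougat tokenizer's post-processing with @log_if_slow() gives log lines that monitoring can alert on.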

Network Indicators:

  • Multiple requests with similar patterns to the same endpoint
  • Unusual traffic spikes to tokenization endpoints

SIEM Query:

source="application_logs" AND (message="*CPU spike*" OR message="*timeout*" OR (message="*tokenization*" AND message="*slow*"))
