CVE-2025-52566

CVSS: 8.6 (HIGH)

📋 TL;DR

A signed/unsigned integer conversion flaw in llama.cpp's tokenizer can cause a heap buffer overflow when crafted text is tokenized. All builds prior to b5721 are affected. An attacker able to supply input to the tokenizer could crash the inference engine or potentially achieve arbitrary code execution.
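
The bug class can be shown numerically: when a negative 32-bit signed value (for example, an error code or miscomputed length) is reinterpreted as unsigned, it becomes an enormous count, which can drive an undersized allocation followed by an out-of-bounds write. A minimal sketch of the conversion itself (a generic illustration of the bug class, not the actual llama.cpp code):

```shell
#!/bin/sh
# Generic illustration of the signed-vs-unsigned bug class; this is NOT
# the llama.cpp code, just the arithmetic that makes it dangerous.

# Reinterpret a signed 32-bit value as unsigned, as a C cast from
# int32_t to uint32_t would.
signed_to_unsigned32() {
  echo $(( $1 & 0xFFFFFFFF ))
}

signed_to_unsigned32 -1   # prints 4294967295: a "length" of -1 becomes ~4 GiB
```

Any bounds check or buffer sizing performed after such a conversion operates on the huge unsigned value, which is why a single malformed length can corrupt the heap.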

💻 Affected Systems

Products:
  • llama.cpp
Versions: All versions prior to b5721
Operating Systems: All platforms running llama.cpp
Default Config Vulnerable: ⚠️ Yes
Notes: Vulnerability is in core tokenization code, so all configurations using affected versions are vulnerable.

📦 What is this software?

llama.cpp is an open-source C/C++ inference engine for running LLaMA-family and other large language models locally, maintained by the ggml-org project and widely embedded in desktop tools, servers, and language bindings.

⚠️ Risk & Real-World Impact

🔴 Worst Case: Remote code execution leading to complete system compromise, data exfiltration, or lateral movement within the environment.

🟠 Likely Case: Application crash (denial of service) or memory corruption leading to unstable behavior during text processing.

🟢 If Mitigated: Limited impact with proper input validation and memory protection mechanisms in place.

🌐 Internet-Facing: HIGH - If llama.cpp is exposed to untrusted user input via web interfaces or APIs.
🏢 Internal Only: MEDIUM - Lower exposure but still vulnerable to malicious internal users or compromised systems.

🎯 Exploit Status

Public PoC: ❌ No
Weaponized: UNKNOWN
Unauthenticated Exploit: ⚠️ Yes
Complexity: MEDIUM

Exploitation requires crafting specific text input to trigger the integer overflow during tokenization.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: b5721 and later

Vendor Advisory: https://github.com/ggml-org/llama.cpp/security/advisories/GHSA-7rxv-5jhh-j6xx

Restart Required: Yes

Instructions:

1. Check current llama.cpp version
2. Update to release b5721 or later: git fetch --tags && git checkout b5721
3. Rebuild: make clean && make
4. Restart any running llama.cpp processes
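
The steps above can be sketched as a script. The b5721 tag comes from the advisory; the helper function and the exact git/build commands are assumptions about a typical source checkout:

```shell
#!/bin/sh
# Sketch of the update-and-verify flow for a source checkout of llama.cpp.
set -e

# True if a release tag like "b5721" is at or past the first patched build.
is_patched() {
  [ "${1#b}" -ge 5721 ] 2>/dev/null
}

# Typical update flow (run inside the llama.cpp checkout):
#   git fetch --tags origin
#   git checkout b5721          # or any later release tag
#   make clean && make          # rebuild; restart running processes after
```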

🔧 Temporary Workarounds

Input validation and sanitization (Platform: all)

Implement strict input validation to reject suspicious or malformed text inputs before tokenization.
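
One way to apply this workaround at the process boundary is a small pre-screening step that rejects oversized or malformed input before it reaches the tokenizer. A sketch, where the 64 KiB cap and the UTF-8 check are illustrative choices rather than values from the advisory:

```shell
#!/bin/sh
# Screen an input file before passing it to llama.cpp. The byte cap is an
# illustrative threshold; tune it to your application's real input sizes.
MAX_BYTES=65536

validate_input() {
  f="$1"
  # Reject oversized inputs outright.
  [ "$(wc -c < "$f")" -le "$MAX_BYTES" ] || return 1
  # iconv exits nonzero on byte sequences that are not valid UTF-8.
  iconv -f UTF-8 -t UTF-8 "$f" > /dev/null 2>&1
}
```

A wrapper would call validate_input on each request body and only invoke the inference process when it succeeds.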

Memory protection hardening (Platform: Linux)

Enable ASLR, DEP, and other memory protection mechanisms at the OS level; the sysctl below enables full address-space randomization:

sysctl -w kernel.randomize_va_space=2
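
A companion check can confirm the setting took effect; this sketch reads the standard Linux procfs path and accepts an explicit value for testing:

```shell
#!/bin/sh
# Report whether full ASLR is enabled (2 = full randomization on Linux).
aslr_ok() {
  val="${1:-$(cat /proc/sys/kernel/randomize_va_space 2>/dev/null)}"
  [ "$val" = "2" ]
}

if aslr_ok; then
  echo "ASLR: full randomization enabled"
else
  echo "ASLR: not fully enabled; run: sysctl -w kernel.randomize_va_space=2"
fi
```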

🧯 If You Can't Patch

  • Isolate llama.cpp instances in restricted containers or VMs with minimal privileges
  • Implement network segmentation to limit access to vulnerable instances

🔍 How to Verify

Check if Vulnerable:

Check the llama.cpp version with ./main --version (named llama-cli --version in newer builds), or inspect the git history: builds that do not contain fix commit dd6e6d0b6a4bbe3ebfc931d1eb14db2f2b1d70af are vulnerable.

Check Version:

./main --version 2>&1 | grep -i version || git log --oneline -1

Verify Fix Applied:

Confirm the build reports version b5721 or later and that the git history contains fix commit dd6e6d0b6a4bbe3ebfc931d1eb14db2f2b1d70af.

📡 Detection & Monitoring

Log Indicators:

  • Segmentation faults in llama.cpp processes
  • Unexpected process termination during text processing
  • Memory allocation errors in system logs

Network Indicators:

  • Unusual patterns of text input to llama.cpp endpoints
  • Repeated connection attempts followed by service crashes

SIEM Query:

process_name:"llama" AND (event_type:"crash" OR event_type:"segfault")
