CVE-2024-52595

7.7 HIGH

📋 TL;DR

This vulnerability in lxml_html_clean lets attackers bypass HTML sanitization by exploiting differences between how browsers and the library parse special tags such as <svg>, <math>, and <noscript>. Scripts hidden in CSS comments inside these tags can survive cleaning and execute in the browser, resulting in cross-site scripting (XSS). Anyone who processes untrusted HTML with lxml_html_clean versions before 0.4.0 is affected.

💻 Affected Systems

Products:
  • lxml_html_clean
Versions: All versions before 0.4.0
Operating Systems: All
Default Config Vulnerable: ⚠️ Yes
Notes: Only affects users who process untrusted HTML input with the default cleaner configuration.

⚠️ Risk & Real-World Impact

🔴 Worst Case

Successful XSS attacks leading to session hijacking, credential theft, or complete compromise of user accounts in web applications.

🟠 Likely Case

Limited XSS attacks affecting users who interact with malicious content, potentially stealing cookies or performing actions on their behalf.

🟢 If Mitigated

No impact if proper sanitization controls are implemented or the library is upgraded.

🌐 Internet-Facing: HIGH
🏢 Internal Only: MEDIUM

🎯 Exploit Status

Public PoC: ✅ No
Weaponized: LIKELY
Unauthenticated Exploit: ⚠️ Yes
Complexity: LOW

Exploitation requires crafting HTML with malicious CSS comments in special tags, which is straightforward for attackers familiar with XSS techniques.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: 0.4.0

Vendor Advisory: https://github.com/fedora-python/lxml_html_clean/security/advisories/GHSA-5jfw-gq64-q45f

Restart Required: No

Instructions:

1. Upgrade lxml_html_clean to version 0.4.0 or later using pip: pip install --upgrade lxml_html_clean
2. Verify the upgrade with: pip show lxml_html_clean

🔧 Temporary Workarounds

Configure tag restrictions

all

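Configure the Cleaner to drop the context-switching tags <svg>, <math>, and <noscript>. kill_tags removes each listed element together with its content, whereas remove_tags removes only the tag and keeps the content in place. An allow_tags list is an alternative, but it cannot be combined with the default remove_unknown_tags=True.

from lxml_html_clean import Cleaner
cleaner = Cleaner(kill_tags=['svg', 'math', 'noscript'])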
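The allow-list route for this workaround can be sketched as follows. This is a minimal sketch under stated assumptions, not a vendor-recommended configuration: SAFE_TAGS and build_allowlist_cleaner are hypothetical names, and the tag list is only a placeholder to adapt. It relies on the Cleaner options allow_tags and remove_unknown_tags; the latter is set to False because Cleaner rejects allow_tags while remove_unknown_tags keeps its default of True.

```python
# Hypothetical allow-list; adjust it to the markup your application needs.
# It deliberately omits svg, math, noscript, and style.
SAFE_TAGS = {"div", "p", "span", "a", "b", "i", "em", "strong", "ul", "ol", "li", "br"}


def build_allowlist_cleaner():
    """Return a Cleaner that keeps only elements named in SAFE_TAGS."""
    # Imported lazily so this module still loads where the package is absent.
    from lxml_html_clean import Cleaner

    # allow_tags cannot be used together with remove_unknown_tags=True
    # (the default), so that option is switched off explicitly.
    # A copy of the set is passed so SAFE_TAGS itself is never modified.
    return Cleaner(allow_tags=set(SAFE_TAGS), remove_unknown_tags=False)
```

Calling build_allowlist_cleaner().clean_html(untrusted_html) then returns the cleaned markup. Elements outside the allow-list appear to be unwrapped rather than deleted, so their text stays in the output; prefer kill_tags when the content itself must go.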

🧯 If You Can't Patch

  • Implement additional HTML sanitization layers using alternative libraries or custom validation.
  • Disable processing of untrusted HTML input until the patch can be applied.

🔍 How to Verify

Check if Vulnerable:

Check the installed version of lxml_html_clean; if it's below 0.4.0, the system is vulnerable.

Check Version:

pip show lxml_html_clean | grep Version
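Where grep is unavailable, or the check has to run inside an application, the same comparison can be sketched in Python using only the standard library. The helper names (parse_release, is_vulnerable, check_installed) are hypothetical, and the parser assumes plain dotted numeric versions such as 0.3.1; anything after the first non-numeric component is ignored.

```python
from importlib.metadata import PackageNotFoundError, version

# First patched release according to the advisory.
FIXED_RELEASE = (0, 4, 0)


def parse_release(text):
    """Turn '0.3.1' into (0, 3, 1), padded to three parts.

    Parsing stops at the first component that is not purely numeric.
    """
    parts = []
    for piece in text.strip().split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_vulnerable(version_text):
    """Return True if the given version predates the fixed release."""
    return parse_release(version_text) < FIXED_RELEASE


def check_installed(package="lxml_html_clean"):
    """Return True/False for the installed package, or None if it is absent."""
    try:
        found = version(package)
    except PackageNotFoundError:
        return None
    return is_vulnerable(found)
```

check_installed() returns None when the package is not installed in the current environment, so callers can tell "not present" apart from "patched".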

Verify Fix Applied:

Confirm the version is 0.4.0 or higher and test with sample HTML containing CSS comments in <svg>, <math>, or <noscript> tags to ensure sanitization works.

📡 Detection & Monitoring

Log Indicators:

  • Unusual HTML input patterns with CSS comments in special tags
  • Errors or warnings from HTML sanitization processes

Network Indicators:

  • HTTP requests containing crafted HTML with CSS comments in <svg>, <math>, or <noscript> tags

SIEM Query:

source="web_logs" AND (html_content CONTAINS "/*" AND html_content CONTAINS "*/" AND (html_content CONTAINS "<svg" OR html_content CONTAINS "<math" OR html_content CONTAINS "<noscript"))
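For environments without a SIEM, the same heuristic can be approximated in application code. The following is a rough sketch, not a reliable detector: looks_suspicious is a hypothetical helper, it flags any input that contains both a special tag and a CSS comment, and it will produce false positives on legitimate SVG or MathML content.

```python
import re

# Opening special tags, with or without attributes, in any letter case.
SPECIAL_TAG = re.compile(r"<(svg|math|noscript)\b", re.IGNORECASE)

# A CSS-style comment; DOTALL lets it span several lines.
CSS_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def looks_suspicious(html_content):
    """Return True if the input has both a special tag and a CSS comment."""
    has_tag = SPECIAL_TAG.search(html_content) is not None
    has_comment = CSS_COMMENT.search(html_content) is not None
    return has_tag and has_comment
```

Such a check is best used to raise an alert for review, not to block input outright.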
