CVE-2021-29421
📋 TL;DR
This vulnerability in pikepdf allows XML External Entity (XXE) attacks when parsing XMP metadata in PDF files. Attackers can exploit this to read arbitrary files from the server filesystem or conduct server-side request forgery. Any Python application using vulnerable versions of pikepdf to process untrusted PDF files is affected.
💻 Affected Systems
- pikepdf
📦 What is this software?
Fedora by Fedoraproject
Pikepdf by Pikepdf Project
⚠️ Risk & Real-World Impact
Worst Case
Disclosure of any file readable by the PDF-processing process, which can expose credentials and lead to wider server compromise; SSRF attacks on internal services; or denial of service via resource exhaustion.
Likely Case
Unauthorized file disclosure from the server filesystem, potentially exposing sensitive configuration files, credentials, or application source code.
If Mitigated
With input validation and a restricted processing environment in place, impact is limited, potentially to partial disclosure of the files the processing account can read.
🎯 Exploit Status
Exploitation requires only a malicious PDF file with crafted XMP metadata. XXE attacks are well-documented and easily weaponized.
🛠️ Fix & Mitigation
✅ Official Fix
Patch Version: 2.10.0 and later
Vendor Advisory: https://github.com/pikepdf/pikepdf/blob/v2.10.0/docs/release_notes.rst#v2100
Restart Required: No
Instructions:
1. Upgrade pikepdf to version 2.10.0 or later using pip: pip install --upgrade "pikepdf>=2.10.0" (quote the requirement; unquoted, the shell treats > as output redirection)
2. Verify the upgrade with: pip show pikepdf
3. Test PDF processing functionality after upgrade.
🔧 Temporary Workarounds
Disable XML external entity processing
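Configure the XML parser to disable external entity resolution (all platforms). These settings only protect code that extracts and parses the XMP packet itself; pikepdf constructs its own parser internally, so they do not change how a vulnerable pikepdf version reads metadata.
# In Python code, configure a hardened lxml parser:
from lxml import etree
# resolve_entities=False leaves entity references unexpanded; no_network=True blocks network fetches
parser = etree.XMLParser(resolve_entities=False, no_network=True)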
Input validation and sanitization
Validate PDF files before processing and reject files with XMP metadata (all platforms)
# Check for XMP metadata before processing
import pikepdf
with pikepdf.open('file.pdf') as pdf:
    # The document catalog (Root) references the XMP packet under /Metadata
    if hasattr(pdf, 'Root') and '/Metadata' in pdf.Root:
        raise ValueError('PDF contains XMP metadata - reject processing')
🧯 If You Can't Patch
- Implement strict file upload validation to reject PDFs with XMP metadata
- Run pikepdf in isolated containers with minimal filesystem access and network restrictions
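Rejecting every PDF that carries XMP metadata can be too strict for some workflows. A narrower heuristic, sketched below, is to reject only XMP packets that contain a DTD or entity declaration, since ordinary XMP does not need one. The function name `xmp_declares_dtd` is hypothetical, and the sketch assumes the caller has already extracted the decoded XMP bytes without parsing them as XML. It is a best-effort filter (a packet in an unusual text encoding could slip past it), not a substitute for upgrading.

```python
import re

# ASCII markers that introduce a DTD or an entity declaration in XML.
_DECLARATION = re.compile(rb"<!(?:DOCTYPE|ENTITY)")


def xmp_declares_dtd(xmp_bytes: bytes) -> bool:
    """Return True if the XMP packet appears to declare a DTD or entity.

    Deliberately conservative: a marker inside a comment or CDATA section
    also counts, because the packet is never parsed here.
    """
    # Dropping NUL bytes makes the ASCII markers visible in UTF-16 and
    # UTF-32 encoded packets as well as UTF-8 ones.
    flattened = xmp_bytes.replace(b"\x00", b"")
    return _DECLARATION.search(flattened) is not None


# Assumed usage with pikepdf (reads the raw stream, does not parse XML):
#     raw = pdf.Root.Metadata.read_bytes()
#     if xmp_declares_dtd(raw):
#         raise ValueError("XMP metadata declares a DTD or entity")
```

Scanning raw bytes keeps the check independent of any XML parser, so the check itself cannot trigger entity resolution.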
🔍 How to Verify
Check if Vulnerable:
Check pikepdf version: pip show pikepdf | grep Version. If version is between 1.3.0 and 2.9.2 inclusive, the system is vulnerable.
Check Version:
pip show pikepdf | grep Version
Verify Fix Applied:
Verify pikepdf version is 2.10.0 or later: pip show pikepdf | grep Version. Test with a PDF containing XMP metadata to ensure proper handling.
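The same version check can run inside the application, for example at startup. The following is a minimal sketch rather than anything pikepdf provides: the helper names are hypothetical, the affected range is the one stated in this advisory (1.3.0 through 2.9.2), and pre-release or local version suffixes are ignored.

```python
import re
from importlib import metadata

# Affected range as stated in this advisory (inclusive on both ends).
FIRST_AFFECTED = (1, 3, 0)
LAST_AFFECTED = (2, 9, 2)


def release_tuple(version: str) -> tuple:
    """Parse the leading numeric release, e.g. '2.9.2.post1' -> (2, 9, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")][:3]
    # Pad to three components so '2.10' compares as (2, 10, 0).
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_vulnerable(version: str) -> bool:
    """True if the version falls inside the affected range."""
    return FIRST_AFFECTED <= release_tuple(version) <= LAST_AFFECTED


def installed_pikepdf_is_vulnerable() -> bool:
    """Check the pikepdf distribution installed in this environment.

    Raises importlib.metadata.PackageNotFoundError if pikepdf is absent.
    """
    return is_vulnerable(metadata.version("pikepdf"))
```

Comparing integer tuples avoids the string-comparison trap where "2.10.0" sorts before "2.9.2".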
📡 Detection & Monitoring
Log Indicators:
- Unusual file access patterns from PDF processing service
- Spikes in XML parsing errors in application logs
- Outbound network connections initiated during PDF processing
Network Indicators:
- HTTP requests to internal services from PDF processing host
- DNS queries for internal hostnames during file processing
SIEM Query:
source="application.log" AND "pikepdf" AND ("XML" OR "XMP" OR "metadata") AND ("error" OR "exception" OR "failed")
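Where no SIEM is available, the same condition can be applied to a log file directly. The sketch below mirrors the query's three AND-ed groups; it assumes case-insensitive matching (typical of SIEM keyword search, but an assumption here), and the function names are hypothetical.

```python
import re

# One pattern per AND-ed group in the SIEM query above.
_COMPONENT = re.compile(r"pikepdf", re.IGNORECASE)
_TOPIC = re.compile(r"XML|XMP|metadata", re.IGNORECASE)
_FAILURE = re.compile(r"error|exception|failed", re.IGNORECASE)


def matches_query(line: str) -> bool:
    """True if the line satisfies all three groups of the query."""
    return all(p.search(line) for p in (_COMPONENT, _TOPIC, _FAILURE))


def suspicious_lines(lines):
    """Return the log lines worth reviewing, preserving their order."""
    return [line for line in lines if matches_query(line)]


# Assumed usage against the log source named in the query:
#     with open("application.log", encoding="utf-8", errors="replace") as fh:
#         for hit in suspicious_lines(fh):
#             print(hit, end="")
```

Matches are only leads for review: an XML parsing error from pikepdf is not by itself evidence of an XXE attempt.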
🔗 References
- https://github.com/pikepdf/pikepdf/blob/v2.10.0/docs/release_notes.rst#v2100
- https://github.com/pikepdf/pikepdf/commit/3f38f73218e5e782fe411ccbb3b44a793c0b343a
- https://lists.fedoraproject.org/archives/list/package-announce%40lists.fedoraproject.org/message/36P4HTLBJPO524WMQWW57N3QRF4RFSJG/
- https://lists.fedoraproject.org/archives/list/package-announce%40lists.fedoraproject.org/message/3QFLBBYGEDNXJ7FS6PIWTVI4T4BUPGEQ/