CVE-2023-32758
📋 TL;DR
CVE-2023-32758 is a Regular Expression Denial of Service (ReDoS) vulnerability in the giturlparse library (PyPI package git-url-parse), versions through 1.2.2. When the library parses a maliciously crafted URL, its vulnerable regex patterns can consume excessive CPU and degrade service. This primarily affects Semgrep users who analyze untrusted packages containing specially crafted Git URLs.
💻 Affected Systems
- git-url-parse (giturlparse)
- Semgrep
⚠️ Risk & Real-World Impact
Worst Case
Complete service unavailability due to CPU exhaustion, potentially affecting all Semgrep scanning operations and dependent CI/CD pipelines.
Likely Case
Degraded performance and timeouts during code analysis when processing packages with malicious URLs, slowing down security scanning workflows.
If Mitigated
Minimal impact with proper input validation and updated libraries, maintaining normal scanning operations.
🎯 Exploit Status
ReDoS attacks require minimal technical skill - attackers can craft malicious URLs that trigger exponential regex backtracking.
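To make the backtracking behaviour concrete, the sketch below times Python's `re` module on a failing match. The pattern `(a+)+$` is a textbook example of nested quantifiers chosen purely for illustration; it is an assumption here and is not the expression used in giturlparse. The helper name `time_failed_match` is likewise hypothetical.

```python
import re
import time

# Textbook pattern with nested quantifiers; NOT the regex from giturlparse.
NESTED = re.compile(r"(a+)+$")


def time_failed_match(n: int) -> float:
    """Return the seconds spent failing to match n 'a' characters plus '!'."""
    subject = "a" * n + "!"  # the trailing '!' forces the overall match to fail
    start = time.perf_counter()
    NESTED.match(subject)  # the engine tries many ways to split the run of 'a's
    return time.perf_counter() - start


if __name__ == "__main__":
    # Sizes are kept small on purpose; each extra character roughly doubles the work.
    for n in (10, 14, 18):
        print(f"n={n:2d}  {time_failed_match(n):.4f}s")
```

The input grows by only a few characters between runs while the elapsed time grows far faster, which is why a single short URL can be enough to stall a parser built on a pattern of this shape.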
🛠️ Fix & Mitigation
✅ Official Fix
Patch Version: git-url-parse 1.2.3; Semgrep 1.25.0+
Vendor Advisory: https://github.com/coala/git-url-parse/security/advisories
Restart Required: No
Instructions:
1. Update git-url-parse: pip install --upgrade "git-url-parse>=1.2.3"
2. Update Semgrep: pip install --upgrade "semgrep>=1.25.0"
3. Verify dependencies in requirements.txt/pyproject.toml
4. Test scanning functionality
🔧 Temporary Workarounds
Input Validation Filter
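Validate URLs before passing them to giturlparse and reject over-long or suspicious input (all platforms).
# Python example: validate URL length and pattern
import re
MAX_URL_LENGTH = 2048
def check_url(url):
    # Heuristic only: reject over-long URLs, or a literal '^' followed by two '+' characters
    if len(url) > MAX_URL_LENGTH or re.search(r'\^.*\+.*\+', url):
        raise ValueError('Suspicious URL rejected')
    return url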
Rate Limiting
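Enforce a timeout on URL parsing so that a malicious URL cannot occupy the process indefinitely (Linux; relies on SIGALRM).
# Abort parsing via SIGALRM if it runs too long (main thread only)
import signal
class TimeoutException(Exception): pass
def timeout_handler(signum, frame):
    raise TimeoutException()
signal.signal(signal.SIGALRM, timeout_handler)
signal.alarm(5)  # 5 second timeout
try:
    pass  # Parse URL here
finally:
    signal.alarm(0)  # always cancel the pending alarm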
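The same SIGALRM idea can be packaged so that the alarm is always cancelled and the previous handler restored. The following is a minimal sketch, not a drop-in fix: `time_limit` and `parse_with_limit` are hypothetical names, and `parse` stands for whatever callable performs the URL parsing in your code. It only works on Unix-like systems and only from the main thread.

```python
import signal
from contextlib import contextmanager


@contextmanager
def time_limit(seconds: float):
    """Raise TimeoutError if the enclosed block runs longer than `seconds`."""
    def _on_alarm(signum, frame):
        raise TimeoutError(f"operation exceeded {seconds} s")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # allows sub-second limits
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the pending timer
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


def parse_with_limit(parse, url: str, seconds: float = 5.0):
    """Call the supplied `parse` callable on `url` under a time limit."""
    with time_limit(seconds):
        return parse(url)
```

A caller would catch `TimeoutError` around `parse_with_limit(...)` and treat the URL as rejected. Restoring the previous handler matters when other code in the same process also uses SIGALRM.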
🧯 If You Can't Patch
- Implement strict input validation to reject URLs with repeating patterns or excessive length
- Isolate Semgrep scanning to dedicated containers with resource limits and monitoring
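Where a container runtime is not available, a process-level CPU cap gives a similar safety net. The sketch below uses the standard-library `resource` and `subprocess` modules to start a child process with an `RLIMIT_CPU` limit; it is a different mechanism from container resource limits and is Unix only. `run_with_cpu_limit` is a hypothetical helper name, and the spinning child stands in for a scan that has run away.

```python
import resource
import subprocess
import sys


def run_with_cpu_limit(cmd, cpu_seconds: int) -> subprocess.CompletedProcess:
    """Run `cmd` with a cap on the CPU time the child may consume (Unix only)."""
    def _limit_cpu():
        # Runs in the child just before exec; the kernel signals the
        # process once it has used `cpu_seconds` of CPU time.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    return subprocess.run(cmd, preexec_fn=_limit_cpu, capture_output=True)


if __name__ == "__main__":
    # Stand-in for a runaway scan: a child that spins forever.
    done = run_with_cpu_limit([sys.executable, "-c", "while True: pass"], 1)
    print("exit status:", done.returncode)  # negative means killed by a signal
```

In practice the command list would be the scanner invocation used in your pipeline, and the limit would be set well above the CPU time a normal scan needs.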
🔍 How to Verify
Check if Vulnerable:
Check installed version: pip show git-url-parse | grep Version && pip show semgrep | grep Version
Check Version:
python -c "from importlib.metadata import version; print('git-url-parse:', version('git-url-parse'), 'semgrep:', version('semgrep'))"
Verify Fix Applied:
Confirm versions: git-url-parse >=1.2.3 and semgrep >=1.25.0
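The version check can also be scripted. This is a minimal sketch that compares installed versions against the minimums stated in this advisory using `importlib.metadata` (Python 3.8+). The helpers `as_tuple` and `is_patched` are hypothetical, and the comparison assumes plain dotted-integer versions such as `1.25.0`; it will raise on pre-release suffixes.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions as stated in this advisory.
MINIMUMS = {"git-url-parse": "1.2.3", "semgrep": "1.25.0"}


def as_tuple(ver: str) -> tuple:
    """Convert '1.25.0' to (1, 25, 0); assumes plain dotted integers."""
    return tuple(int(part) for part in ver.split(".")[:3])


def is_patched(installed: str, minimum: str) -> bool:
    """Compare numerically, so that 1.100.0 counts as newer than 1.25.0."""
    return as_tuple(installed) >= as_tuple(minimum)


if __name__ == "__main__":
    for name, minimum in MINIMUMS.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            print(f"{name}: not installed")
            continue
        status = "ok" if is_patched(installed, minimum) else "VULNERABLE"
        print(f"{name}: {installed} (need >= {minimum}) {status}")
```

Comparing tuples of integers avoids the string-comparison trap where `"1.100.0" < "1.25.0"`. For versions with pre-release or post-release tags, the third-party `packaging` library handles parsing more robustly.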
📡 Detection & Monitoring
Log Indicators:
- High CPU usage spikes during URL parsing
- Semgrep process timeouts
- Failed scans with timeout errors
Network Indicators:
- Unusually long URLs in package metadata
- Repeated scanning attempts with similar payloads
SIEM Query:
process.name:semgrep AND (cpu.usage>90 OR duration>300s)
🔗 References
- https://github.com/coala/git-url-parse/blob/master/giturlparse/parser.py#L53
- https://github.com/returntocorp/semgrep/pull/7611
- https://github.com/returntocorp/semgrep/pull/7943
- https://github.com/returntocorp/semgrep/pull/7955
- https://pypi.org/project/git-url-parse