CVE-2024-38557
📋 TL;DR
A locking flaw in the Linux kernel's mlx5 driver can cause a deadlock when link aggregation (LAG) is disabled or enabled. This affects systems using Mellanox network adapters with the mlx5 driver and can lead to system hangs or crashes. Triggering it requires administrative privileges.
💻 Affected Systems
- Linux kernel with mlx5 driver
📦 What is this software?
Linux Kernel by Linux
The Linux Kernel is the core component of the Linux operating system, serving as the critical interface between computer hardware and software processes. As the heart of millions of servers, cloud infrastructure, embedded systems, Android devices, and IoT deployments worldwide, the Linux Kernel mana...
⚠️ Risk & Real-World Impact
Worst Case
System deadlock requiring hard reboot, causing service disruption and potential data loss.
Likely Case
System hang or crash when network administrators modify LAG configurations, requiring reboot to recover.
If Mitigated
No impact if proper kernel patches are applied or if LAG functionality is not used.
🎯 Exploit Status
Requires administrative privileges to trigger via devlink commands. Not easily weaponized for remote exploitation.
🛠️ Fix & Mitigation
✅ Official Fix
Patch Version: Kernel commits: 0f06228d4a2dcc1fca5b3ddb0eefa09c05b102c4, 0f320f28f54b1b269a755be2e3fb3695e0b80b07, e93fc8d959e56092e2eca1e5511c2d2f0ad6807a, f03c714a0fdd1f93101a929d0e727c28a66383fc
Vendor Advisory: https://git.kernel.org/stable/c/0f06228d4a2dcc1fca5b3ddb0eefa09c05b102c4
Restart Required: Yes
Instructions:
1. Update the Linux kernel to a version containing the fix commits.
2. Reboot the system to load the new kernel.
3. Verify that the mlx5 driver loads correctly.
🔧 Temporary Workarounds
Disable LAG functionality
Prevent triggering the deadlock by avoiding LAG configuration changes
# Avoid using devlink commands to change eswitch mode
# Do not run: devlink dev eswitch set pci/0000:xx:xx.x mode switchdev
🧯 If You Can't Patch
- Avoid modifying LAG configurations via devlink commands
- Implement change control procedures for network configuration changes
🔍 How to Verify
Check if Vulnerable:
Check the kernel version and whether the mlx5 driver is loaded: uname -r && lsmod | grep mlx5
Check Version:
uname -r
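The two checks above can be combined into one small script. This is a minimal sketch, not an official detection tool: `has_mlx5` is a hypothetical helper name, and the script only reports whether an mlx5 module is loaded. It cannot tell whether the running kernel already contains the fix.

```shell
# Sketch: report whether any mlx5 module appears in a module listing.
# has_mlx5 is a hypothetical helper; it reads lsmod-style text on stdin,
# so it accepts either `lsmod` output or the contents of /proc/modules.
has_mlx5() {
    # The first column of both formats is the module name (mlx5_core, mlx5_ib).
    awk '$1 ~ /^mlx5_/ { found = 1 } END { exit !found }'
}

if [ -r /proc/modules ] && has_mlx5 < /proc/modules; then
    echo "mlx5 module loaded on kernel $(uname -r); compare this version with your distribution's fixed release"
else
    # A driver compiled into the kernel image would not be listed here.
    echo "no mlx5 module found in /proc/modules"
fi
```

Reading from stdin keeps the helper usable on saved `lsmod` output collected from other hosts.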
Verify Fix Applied:
Verify kernel version includes fix commits and test LAG disable/enable operations
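One way to check for the fix without reading kernel source is to search the installed kernel package's changelog. The sketch below assumes the vendor changelog records either the CVE ID or one of the stable commit hashes listed in this advisory. Many distributions cite only the CVE ID or an upstream hash, so a missing match does not prove the fix is absent. `mentions_fix` is a hypothetical helper name.

```shell
# Sketch: succeed if changelog text on stdin references this CVE or a fix commit.
# mentions_fix is a hypothetical helper; the hashes are 12-character prefixes
# of the stable commits listed in this advisory.
mentions_fix() {
    grep -qE 'CVE-2024-38557|0f06228d4a2d|0f320f28f54b|e93fc8d959e5|f03c714a0fdd'
}

# Example on an RPM-based system (assumes the package name matches kernel-$(uname -r)):
#   rpm -q --changelog "kernel-$(uname -r)" | mentions_fix && echo "fix referenced in changelog"
```

Treat a match as supporting evidence only; the distribution's own security tracker remains the authoritative source for fixed versions.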
📡 Detection & Monitoring
Log Indicators:
- Kernel panic messages
- System hang/crash during network configuration
- Deadlock warnings in dmesg
Network Indicators:
- Network interface failures after LAG configuration changes
SIEM Query:
search 'kernel panic' OR 'deadlock' OR 'mlx5' in system logs during network maintenance windows
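On a single host without a SIEM, the same indicators can be approximated with a log filter. This is a rough sketch under stated assumptions: `scan_kernel_log` is a hypothetical helper, and the `blocked for more than` pattern (the kernel's hung-task warning text) is an addition not present in the query above. Matches on `mlx5` alone are common during normal operation and need manual review.

```shell
# Sketch: print kernel log lines that match the indicators listed above.
# scan_kernel_log is a hypothetical helper; it reads log text on stdin.
scan_kernel_log() {
    # 'blocked for more than' is an added pattern (hung-task warning),
    # not part of the advisory's original query.
    grep -iE 'kernel panic|deadlock|blocked for more than|mlx5'
}

# Example usage (reading the kernel log may require root):
#   dmesg | scan_kernel_log
#   journalctl -k | scan_kernel_log
```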
🔗 References
- https://git.kernel.org/stable/c/0f06228d4a2dcc1fca5b3ddb0eefa09c05b102c4
- https://git.kernel.org/stable/c/0f320f28f54b1b269a755be2e3fb3695e0b80b07
- https://git.kernel.org/stable/c/e93fc8d959e56092e2eca1e5511c2d2f0ad6807a
- https://git.kernel.org/stable/c/f03c714a0fdd1f93101a929d0e727c28a66383fc