CVE-2025-70999

7.5 HIGH

📋 TL;DR

A device-ID validation flaw in OneFlow's CUDA component lets an attacker cause a Denial of Service (DoS) by supplying a crafted device ID to flow.cuda.get_device_capability(). OneFlow v0.9.0 installations that use GPU acceleration are affected; the application can crash when it processes the malicious input.

💻 Affected Systems

Products:
  • OneFlow
Versions: v0.9.0
Operating Systems: Linux, Windows, macOS - any OS where OneFlow with CUDA support runs
Default Config Vulnerable: ⚠️ Yes
Notes: Only affects installations with CUDA/GPU support enabled. CPU-only installations are not vulnerable.

⚠️ Risk & Real-World Impact

🔴

Worst Case

Complete service disruption where the OneFlow application crashes and becomes unavailable, potentially affecting dependent services or workflows.

🟠

Likely Case

Application crash or hang when processing malicious device IDs, requiring manual restart and causing temporary service interruption.

🟢

If Mitigated

Minimal impact if input validation is implemented externally or if the vulnerable function is not exposed to untrusted sources.

🌐 Internet-Facing: MEDIUM - Exploitation requires access to the vulnerable function, which may be exposed through APIs or user inputs in web-facing applications.
🏢 Internal Only: MEDIUM - Internal users or automated processes could trigger the DoS, affecting availability of machine learning workflows.

🎯 Exploit Status

Public PoC: ✅ No
Weaponized: UNKNOWN
Unauthenticated Exploit: ✅ No
Complexity: LOW - Requires the ability to pass a crafted device ID to the vulnerable function.

Exploitation requires access to call flow.cuda.get_device_capability() with malicious input. This could be through user-controlled parameters, API calls, or data processing pipelines.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: Check GitHub issue #10660 for the latest patched version

Vendor Advisory: https://github.com/Oneflow-Inc/oneflow/issues/10660

Restart Required: Yes

Instructions:

1. Check the GitHub issue #10660 for patch availability
2. Update OneFlow to the latest patched version
3. Restart any running OneFlow applications or services

🔧 Temporary Workarounds

Input Validation Wrapper

Platform: all

Implement input validation before calling flow.cuda.get_device_capability() to ensure device IDs are within valid range.

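# Python example wrapper
import oneflow as flow

def safe_get_device_capability(device_id):
    # Reject non-integers (including bool) before the range comparison.
    if isinstance(device_id, bool) or not isinstance(device_id, int):
        raise TypeError('Device ID must be an integer')
    # Only indices in [0, device_count) refer to a visible GPU.
    if device_id < 0 or device_id >= flow.cuda.device_count():
        raise ValueError('Invalid device ID')
    return flow.cuda.get_device_capability(device_id)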

Disable GPU Support

Platform: all

Run OneFlow in CPU-only mode if GPU acceleration is not required.

# Hide all GPUs from CUDA; set this before oneflow is imported
import os
os.environ['CUDA_VISIBLE_DEVICES'] = ''

🧯 If You Can't Patch

  • Implement strict input validation for all device ID parameters before passing them to CUDA functions
  • Isolate OneFlow services in containers or VMs to limit the blast radius if a DoS occurs
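
The first point above can be sketched as a small parser for device IDs that arrive from untrusted sources such as request parameters. This is a minimal sketch, not part of OneFlow: the name parse_device_id is hypothetical, and the caller is assumed to pass in the device count (for example, the value returned by flow.cuda.device_count()), so the helper itself has no OneFlow dependency.

```python
def parse_device_id(raw, device_count):
    """Convert untrusted input to a validated device index, or raise ValueError.

    `device_count` is supplied by the caller (assumed to come from
    flow.cuda.device_count()); this helper is hypothetical, not a OneFlow API.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(raw, bool):
        raise ValueError("device ID must be an integer, not a bool")
    if isinstance(raw, int):
        value = raw
    elif isinstance(raw, str):
        text = raw.strip()
        # Accept a short run of ASCII decimal digits only: no sign, no float.
        if not (text.isascii() and text.isdigit() and len(text) <= 9):
            raise ValueError(f"device ID is not a small non-negative integer: {raw!r}")
        value = int(text)
    else:
        raise ValueError(f"unsupported device ID type: {type(raw).__name__}")
    # Only indices in [0, device_count) are considered valid.
    if not 0 <= value < device_count:
        raise ValueError(f"device ID {value} is outside [0, {device_count})")
    return value
```

Only the value returned by this helper would then be passed on to flow.cuda.get_device_capability(); every rejected input surfaces as an ordinary ValueError that the calling service can handle.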

🔍 How to Verify

Check if Vulnerable:

Check whether you are running OneFlow v0.9.0 with CUDA support:

python -c "import oneflow; print(oneflow.__version__); print(oneflow.cuda.is_available())"

Check Version:

python -c "import oneflow; print(oneflow.__version__)"
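
To automate the comparison, the printed version string can be reduced to its numeric release and checked against v0.9.0. The sketch below assumes the version string begins with a dotted numeric release, optionally followed by a suffix such as a build tag; the function names are hypothetical. Whether suffixed or development builds of 0.9.0 are affected is not stated in this advisory, so the helper simply flags any build whose numeric release is 0.9.0.

```python
import re

# The only release this advisory lists as affected.
AFFECTED_RELEASE = (0, 9, 0)

def release_tuple(version):
    """Return the leading numeric release of a version string as a tuple of ints."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))

def is_affected_release(version):
    """True if the numeric release equals 0.9.0 (any suffix is ignored)."""
    parts = release_tuple(version)
    # Pad short releases so that "0.9" compares equal to "0.9.0".
    parts = parts + (0,) * (3 - len(parts))
    return parts[:3] == AFFECTED_RELEASE
```

In use, the argument would be oneflow.__version__; combine the result with oneflow.cuda.is_available() to decide whether an installation falls under this advisory.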

Verify Fix Applied:

After updating, confirm that the installed version is newer than v0.9.0, then call the function with invalid device IDs and confirm that it raises an error instead of crashing.
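
Because the unpatched behavior is a process crash, it is safer to run the invalid-ID test in a child interpreter than in the process doing the checking. The following is a minimal sketch of such a probe; the helper name, the outcome labels, and the out-of-range value 10**6 are all illustrative assumptions, and the exact exception a patched build raises is not specified in this advisory.

```python
import subprocess
import sys

def probe_in_subprocess(snippet, timeout=60):
    """Run a Python snippet in a child interpreter and classify the outcome."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "hang"
    if result.returncode == 0:
        return "ok"
    # An uncaught Python exception exits with status 1 and prints a traceback.
    if result.returncode == 1 and "Traceback" in result.stderr:
        return "python-exception"
    # Anything else (for example, death by signal) is treated as a crash.
    return "crash"

# Hypothetical probe for this CVE; 10**6 is an arbitrary out-of-range device ID.
ONEFLOW_PROBE = (
    "import oneflow as flow\n"
    "flow.cuda.get_device_capability(10**6)\n"
)
```

Calling probe_in_subprocess(ONEFLOW_PROBE) on a GPU host returns one of the four labels. A result of "crash" or "hang" suggests the installation is still vulnerable, while "python-exception" is the outcome one would expect from a build that validates the device ID.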

📡 Detection & Monitoring

Log Indicators:

  • Application crashes or segmentation faults related to CUDA/GPU operations
  • Error messages mentioning device ID validation or out-of-bounds access

Network Indicators:

  • Unusual patterns of requests to GPU-related API endpoints
  • Sudden service unavailability after device ID parameter changes

SIEM Query:

source='application.logs' AND ("segmentation fault" OR "CUDA error" OR "device ID")
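
Where no SIEM is available, the same three search terms can be applied to a log file directly. This is a rough sketch: the function name is hypothetical, and case-insensitive matching is an assumption, since the query above does not say how the SIEM treats case.

```python
import re

# The three terms from the SIEM query above; case-insensitivity is an assumption.
INDICATOR = re.compile(r"segmentation fault|CUDA error|device ID", re.IGNORECASE)

def find_indicator_lines(lines):
    """Return (line_number, text) pairs for lines containing any indicator term."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if INDICATOR.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

The function accepts any iterable of strings, so an open file handle can be passed in directly. Matches are only a starting point for triage, because terms such as "CUDA error" also appear in logs for unrelated failures.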
