CVE-2025-65891

7.5 HIGH

📋 TL;DR

A GPU device-ID validation flaw in OneFlow v0.9.0 allows attackers to trigger a Denial of Service (DoS) by calling flow.cuda.get_device_properties() with an invalid or negative device index. This affects systems running OneFlow v0.9.0 with CUDA GPU support enabled. The vulnerability can crash the application or cause GPU-related instability.

💻 Affected Systems

Products:
  • OneFlow
Versions: v0.9.0
Operating Systems: Linux, Windows, macOS
Default Config Vulnerable: ⚠️ Yes
Notes: Only affects installations with CUDA GPU support enabled. CPU-only installations are not vulnerable.

⚠️ Risk & Real-World Impact

🔴 Worst Case

Complete application crash leading to service disruption, potential data loss in active processing, and GPU driver instability requiring system reboot.

🟠 Likely Case

Application crash or hang when malicious input triggers the vulnerability, requiring process restart and interrupting GPU-accelerated workloads.

🟢 If Mitigated

Minimal impact if proper input validation is implemented or vulnerable function calls are restricted.

🌐 Internet-Facing: MEDIUM - Exploitation requires the ability to execute code or to pass attacker-controlled input to the vulnerable call. Internet-facing ML services built on OneFlow are exposed only if they forward user input to it.
🏢 Internal Only: MEDIUM - Internal ML development and inference systems could be disrupted, affecting productivity and model training pipelines.

🎯 Exploit Status

Public PoC: ✅ No
Weaponized: UNKNOWN
Unauthenticated Exploit: ✅ No
Complexity: LOW

Exploitation requires ability to call the vulnerable function with malicious input, typically through code execution or API access.

🛠️ Fix & Mitigation

✅ Official Fix

Patch Version: v0.9.1 or later

Vendor Advisory: https://github.com/Oneflow-Inc/oneflow/issues/10661

Restart Required: Yes

Instructions:

1. Check the current version with 'pip show oneflow'.
2. Upgrade with 'pip install --upgrade oneflow'.
3. Restart all OneFlow applications and services.

🔧 Temporary Workarounds

Input Validation Wrapper

Platform: all

Wrap calls to flow.cuda.get_device_properties() with validation to ensure device index is non-negative and within valid range.

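# Python wrapper example:
import oneflow as flow

def safe_get_device_properties(device_id):
    # Reject non-integer, negative, and out-of-range indices before calling OneFlow.
    if not isinstance(device_id, int):
        raise TypeError('Device ID must be an integer')
    if not 0 <= device_id < flow.cuda.device_count():
        raise ValueError('Invalid device ID')
    return flow.cuda.get_device_properties(device_id)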

Disable GPU Support

Platform: linux

Run OneFlow in CPU-only mode if GPU acceleration is not required.

export CUDA_VISIBLE_DEVICES=''

Set this variable in the shell (or in the service's environment) before launching OneFlow applications.
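When the OneFlow workload is started from a Python launcher rather than a shell, the same variable can be set on the child process. The sketch below is one way to do that; the helper names `cpu_only_env` and `run_cpu_only` are hypothetical and not part of OneFlow.

```python
import os
import subprocess


def cpu_only_env(base=None):
    """Return a copy of the environment with all CUDA devices hidden."""
    env = dict(os.environ if base is None else base)
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


def run_cpu_only(argv):
    """Run a command (for example a OneFlow script) with GPUs hidden from CUDA."""
    return subprocess.run(argv, env=cpu_only_env(), check=False)
```

For example, `run_cpu_only([sys.executable, "train.py"])` would start a (hypothetical) `train.py` with no visible CUDA devices. The variable must be in place before the process initializes CUDA, which is why it is applied to the child's environment instead of being changed after import.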

🧯 If You Can't Patch

  • Implement strict input validation for all user-controlled inputs that could reach the vulnerable function.
  • Monitor application logs for crashes or errors related to flow.cuda.get_device_properties() calls, and rate-limit requests that can reach that call.
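As an illustration of the input-validation point, the sketch below converts an untrusted value (for example a query parameter) into a device index and rejects anything outside the valid range. The function name `parse_device_index` is hypothetical; the `device_count` argument is assumed to come from something like `flow.cuda.device_count()` in a real service.

```python
def parse_device_index(raw, device_count):
    """Convert untrusted input to a device index, or raise ValueError.

    `device_count` is the number of visible GPUs (assumed to be obtained
    elsewhere, e.g. from flow.cuda.device_count()).
    """
    try:
        index = int(str(raw).strip())
    except ValueError:
        raise ValueError(f"device index is not an integer: {raw!r}") from None
    if not 0 <= index < device_count:
        raise ValueError(f"device index {index} not in range [0, {device_count})")
    return index
```

Validating at the boundary where user input enters the application means the index is already known to be safe by the time it reaches any GPU call.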

🔍 How to Verify

Check if Vulnerable:

Check if running OneFlow v0.9.0 with 'pip show oneflow' and verify CUDA is available with 'python -c "import oneflow; print(oneflow.cuda.is_available())"'.

Check Version:

pip show oneflow | grep Version
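The version check can also be scripted, which is convenient for fleet-wide audits. This is a minimal sketch: the affected set contains only v0.9.0 as stated in this advisory, and the handling of a local build suffix such as '+cu118' is an assumption about how OneFlow wheels may be versioned.

```python
from importlib import metadata

AFFECTED_VERSIONS = {"0.9.0"}  # per this advisory


def is_affected(version):
    """True if a version string such as '0.9.0' or '0.9.0+cu118' is affected."""
    # Drop an optional leading 'v' and any local build suffix after '+'.
    public = version.strip().lstrip("v").split("+", 1)[0]
    return public in AFFECTED_VERSIONS


def installed_oneflow_version():
    """Return the installed oneflow version, or None if it is not installed."""
    try:
        return metadata.version("oneflow")
    except metadata.PackageNotFoundError:
        return None
```

A host is a candidate for remediation when `installed_oneflow_version()` is not None and `is_affected()` returns True for it; CUDA availability still has to be confirmed separately.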

Verify Fix Applied:

After upgrading, test with 'python -c "import oneflow; oneflow.cuda.get_device_properties(-1)"' which should raise a proper error instead of crashing.
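Because the probe may crash or hang the interpreter on an unpatched install, it is safer to run it in a child process and inspect the exit status. The sketch below assumes the usual CPython behavior that an uncaught exception exits with status 1; the names `classify` and `run_probe` are hypothetical.

```python
import subprocess
import sys

PROBE = "import oneflow; oneflow.cuda.get_device_properties(-1)"


def classify(returncode):
    """Interpret the exit status of the probe process."""
    if returncode == 0:
        return "no error raised"
    if returncode == 1:
        return "python exception"  # uncaught exception in the child
    return "abnormal termination"  # e.g. killed by a signal or native abort


def run_probe(code=PROBE, timeout=60):
    """Run the probe in a child interpreter; return (verdict, stderr text)."""
    try:
        proc = subprocess.run([sys.executable, "-c", code],
                              capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "hang (timed out)", ""
    return classify(proc.returncode), proc.stderr
```

A "python exception" verdict is the expected result on a fixed install, but a missing `oneflow` package produces the same verdict, so the returned stderr text should be checked to confirm which exception was raised.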

📡 Detection & Monitoring

Log Indicators:

  • Application crashes with GPU-related errors
  • Stack traces containing 'flow.cuda.get_device_properties'
  • Unexpected process termination

Network Indicators:

  • Sudden drop in ML inference/training throughput
  • Unusual API calls to GPU-related endpoints

SIEM Query:

source='application.logs' AND ("flow.cuda.get_device_properties" OR "GPU error" OR "cuda illegal memory access")
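Where no SIEM is available, the same indicators can be matched directly against log lines. This is a rough sketch that reuses the three search terms from the query above; the function name `find_indicators` is hypothetical, and matching is case-insensitive by choice.

```python
import re

# Same terms as the SIEM query, matched case-insensitively.
INDICATORS = re.compile(
    r"flow\.cuda\.get_device_properties|GPU error|cuda illegal memory access",
    re.IGNORECASE,
)


def find_indicators(lines):
    """Yield (line_number, line) for each log line matching an indicator."""
    for number, line in enumerate(lines, start=1):
        if INDICATORS.search(line):
            yield number, line.rstrip("\n")
```

It accepts any iterable of strings, so an open log file can be passed in directly, for example `list(find_indicators(open(path)))` with a path of your choosing.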
