A powerful, native Rust-based diagnostic tool designed to test NVIDIA GPU VRAM for hardware faults.
This project was born out of necessity: dealing with a graphics card that had malfunctioning VRAM, and needing a reliable, low-level testing tool during the repair/servicing process. Dedicated hardware diagnostics (like NVIDIA MATS/MODS) are often proprietary, hard to obtain, or require booting into specific Linux environments.
This tool provides a native Windows alternative, allowing you to run intensive memory pattern tests directly on your GPU to identify bad memory chips, all while keeping the OS stable.
- Multiple Test Patterns: Uses industry-standard memory diagnostic patterns to catch different types of hardware faults (All Zero, All One, Walking 1, Walking 0, Checkerboard, Inverse Checkerboard, Random).
- Custom CUDA Kernels: Direct PTX execution for blazing-fast memory filling and verification, maximizing memory bandwidth.
- Sliding Window Architecture: Tests VRAM in manageable chunks to bypass Windows Display Driver Model (WDDM) allocation limits and prevent OS crashes or TDR (Timeout Detection and Recovery) triggers.
- Graphical User Interface (GUI): A clean, easy-to-use GUI built with `egui` to select GPUs, adjust chunk sizes, set test passes, and monitor progress in real time.
- Command-Line Interface (CLI): For automated testing, logging, and headless environments.
- Chip Mapping Analysis: Attempts to estimate which physical memory chip might be faulty based on the stride (distance) between memory errors.
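
To make the pattern and sliding-window ideas above concrete, here is a minimal host-side sketch in Rust. It is not the tool's CUDA implementation: it fills and verifies an ordinary `u32` buffer on the CPU, and the names `Pattern`, `expected_word`, `fill`, `verify`, and `chunk_ranges` are hypothetical. The specific bit values (for example `0xAAAAAAAA` for the checkerboard) and the per-word walking-bit layout are assumptions chosen for illustration.

```rust
/// Test patterns named in the feature list (hypothetical host-side model).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Pattern {
    AllZero,
    AllOne,
    Walking1,
    Walking0,
    Checkerboard,
    InverseCheckerboard,
    Random { seed: u64 },
}

/// Expected 32-bit word at global word index `i`.
/// The exact bit layouts are assumptions, not the tool's actual kernels.
pub fn expected_word(p: Pattern, i: u64) -> u32 {
    let bit = (i % 32) as u32;
    match p {
        Pattern::AllZero => 0x0000_0000,
        Pattern::AllOne => 0xFFFF_FFFF,
        Pattern::Walking1 => 1u32 << bit,
        Pattern::Walking0 => !(1u32 << bit),
        Pattern::Checkerboard => 0xAAAA_AAAA,
        Pattern::InverseCheckerboard => 0x5555_5555,
        Pattern::Random { seed } => {
            // splitmix64-style mixing: reproducible from (seed, index),
            // so verification needs no stored copy of the data.
            let mut z = seed.wrapping_add(i.wrapping_mul(0x9E37_79B9_7F4A_7C15));
            z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
            z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
            (z ^ (z >> 31)) as u32
        }
    }
}

/// Write the pattern into one chunk; `base` is the chunk's first word index.
pub fn fill(buf: &mut [u32], base: u64, p: Pattern) {
    for (k, w) in buf.iter_mut().enumerate() {
        *w = expected_word(p, base + k as u64);
    }
}

/// Re-read the chunk and return the byte offsets of mismatching words.
pub fn verify(buf: &[u32], base: u64, p: Pattern) -> Vec<u64> {
    buf.iter()
        .enumerate()
        .filter(|(k, w)| **w != expected_word(p, base + *k as u64))
        .map(|(k, _)| (base + k as u64) * 4)
        .collect()
}

/// Split `total_words` into (start, len) windows of at most `chunk_words`.
pub fn chunk_ranges(total_words: u64, chunk_words: u64) -> Vec<(u64, u64)> {
    assert!(chunk_words > 0, "chunk size must be non-zero");
    let mut out = Vec::new();
    let mut start = 0;
    while start < total_words {
        let len = chunk_words.min(total_words - start);
        out.push((start, len));
        start += len;
    }
    out
}
```

In this model, a full pass would iterate over `chunk_ranges`, call `fill` and then `verify` on each window, and collect the returned offsets. Keeping each window small is what bounds the time spent in any single operation.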
- OS: Windows 10 / 11 (64-bit)
- GPU: NVIDIA Graphics Card with CUDA support (Compute Capability 5.0+)
- Drivers: NVIDIA Display Driver installed (provides the required `nvcuda.dll`)
You can run the application normally to launch the Graphical Interface:

    cargo run --release

- GPU: Select the target NVIDIA GPU from the dropdown menu.
- Chunk: Select the VRAM allocation chunk size (e.g., 512MB, 1024MB). Note: Larger chunks are faster but may trigger driver timeouts or out-of-memory errors on Windows if the VRAM is heavily fragmented.
- Passes: Choose how many full-memory passes to run.
- Click START and let the diagnostic run. If the text flashes red and the error count goes up, your VRAM has physical faults.
If you prefer running tests from the terminal or saving automated reports, you can use the --cli flag:

    cargo run --release -- --cli --chunk-size 1024 --passes 3 --device 0

To export the diagnostic results as a JSON file:

    cargo run --release -- --cli --output report.json

- WDDM Allocation Limits: Because this tool runs in standard Windows, it is bound by WDDM (Windows Display Driver Model). Windows reserves a significant portion of VRAM for desktop rendering and system processes. You can never test 100% of the VRAM in Windows (typically only ~80-90% is accessible).
- TDR (Timeout Detection and Recovery): If the chunk size is too large or the GPU is too slow, a single CUDA kernel launch may run longer than the default TDR timeout of 2 seconds, causing Windows to reset the display driver (the screen flashes black). If this happens, lower the Chunk Size.
- Chip Identification Accuracy: The chip mapping feature (`Chip Map`) makes an educated guess based on error strides (e.g., 32-byte or 64-byte gaps). Because NVIDIA's memory controllers interleave addresses across multiple physical chips in complex (and proprietary) ways, the exact faulty chip (e.g., "Chip U43") cannot be guaranteed with 100% accuracy. Use it as a guiding hint, not absolute truth.
- NVIDIA Only: Due to the reliance on the CUDA Driver API, this tool currently only supports NVIDIA GPUs.
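
As a rough illustration of the stride idea behind the chip mapping hint, the sketch below finds the most common gap between error addresses and then maps an address to a channel under a naive round-robin interleave. Both function names are hypothetical, and the round-robin model is a deliberate simplification: real NVIDIA memory controllers use proprietary address hashing, so this is not how the tool (or the hardware) actually resolves chips.

```rust
use std::collections::HashMap;

/// Most common gap between consecutive error addresses, with its count.
/// Returns `None` when fewer than two distinct addresses are given.
pub fn dominant_stride(errors: &[u64]) -> Option<(u64, usize)> {
    let mut addrs = errors.to_vec();
    addrs.sort_unstable();
    addrs.dedup();

    let mut counts: HashMap<u64, usize> = HashMap::new();
    for pair in addrs.windows(2) {
        *counts.entry(pair[1] - pair[0]).or_insert(0) += 1;
    }

    // Highest count wins; ties go to the smaller stride so the result
    // does not depend on hash map iteration order.
    counts
        .into_iter()
        .max_by(|a, b| a.1.cmp(&b.1).then(b.0.cmp(&a.0)))
}

/// ASSUMPTION: addresses rotate across `channels` in blocks of
/// `interleave_bytes`. Real controllers do not follow this simple model.
pub fn guess_channel(addr: u64, interleave_bytes: u64, channels: u64) -> u64 {
    assert!(interleave_bytes > 0 && channels > 0);
    (addr / interleave_bytes) % channels
}
```

If most errors share one stride and land on the same guessed channel, that points toward a single chip or data lane. Errors scattered across many strides and channels suggest a broader problem that a per-chip guess cannot localize.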
- Install Rust.
- Install the CUDA Toolkit (required for `nvcc` to compile the `.cu` kernel files).
- Ensure Microsoft Visual Studio Build Tools with C++ workloads are installed.
- Run:

    git clone <repository_url>
    cd VramDiagnostics
    cargo build --release

Use this tool at your own risk. Subjecting failing hardware to intense, repeated stress tests can sometimes accelerate degradation. This software is provided "as is", without warranty of any kind.