NVIDIA repositories

DALI

Public

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

python machine-learning deep-learning neural-network mxnet gpu image-processing pytorch gpu-tensorflow data-processing

C++

•

Apache License 2.0

•658•5.6k•225•33•Updated

Jan 26, 2026

Megatron-Energon

Public

Megatron's multi-modal data loader

Python

•

Other

•38•310•14•4•Updated

Jan 26, 2026

cuEquivariance

cuEquivariance is a math library that is a collection of low-level primitives and tensor ops to accelerate widely-used models, like DiffDock, MACE, Allegro and NEQUIP, based on equivariant neural networks. Also includes kernels for accelerated structure prediction.

Python

•24•349•13•3•Updated

Jan 26, 2026

cccl

Public

CUDA Core Compute Libraries

cpp hpc gpu modern-cpp parallel-computing cuda nvidia gpu-acceleration cuda-kernels gpu-computing

C++

•

Other

•326•2.1k•1.2k•207•Updated

Jan 26, 2026

cuda-quantum

Public

C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows

python cpp quantum quantum-computing hacktoberfest quantum-programming-language quantum-algorithms quantum-machine-learning unitaryhack

C++

•

Other

•324•897•423•87•Updated

Jan 26, 2026

Megatron-LM

Public

Ongoing research training transformer models at scale

transformers model-para large-language-models

Python

•

Other

•3.5k•15k•306•273•Updated

Jan 26, 2026

makani

Public

Massively parallel training of machine-learning based weather and climate models

Python

•

Other

•63•351•4•4•Updated

Jan 26, 2026

TensorRT-LLM

Public

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

cuda pytorch moe blackwell llm-serving

Python

•

Other

•2k•13k•519•469•Updated

Jan 26, 2026

NVSentinel

Public

NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated computing environments

Go

•

Apache License 2.0

•37•165•33•15•Updated

Jan 26, 2026

doca-platform

Public

DOCA Platform manages provisioning and service orchestration for BlueField DPUs

Go

•

Apache License 2.0

•18•75•0•0•Updated

Jan 26, 2026

spark-rapids

Public

Spark RAPIDS plugin - accelerate Apache Spark with GPUs

big-data spark gpu rapids

Scala

•

Apache License 2.0

•271•959•1.8k•36•Updated

Jan 26, 2026

k8s-device-plugin

Public

NVIDIA device plugin for Kubernetes

kubernetes

Go

•

Apache License 2.0

•780•3.6k•72•39•Updated

Jan 26, 2026

OSMO

Public

The developer-first platform for scaling complex Physical AI workloads across heterogeneous compute—unifying training GPUs, simulation clusters, and edge devices in a simple YAML

Python

•

Apache License 2.0

•6•82•42•12•Updated

Jan 26, 2026

gpu-operator

Public

NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes

kubernetes gpu cuda nvidia

Go

•

Apache License 2.0

•442•2.5k•91•65•Updated

Jan 26, 2026

bionemo-framework

Public

BioNeMo Framework: For building and adapting AI models in drug discovery at scale

machine-learning gpu pytorch drug-discovery

Jupyter Notebook

•115•641•61•116•Updated

Jan 26, 2026

cuopt

Public

GPU-accelerated decision optimization

gpu optimization cuda linear-programming

Cuda

•

Apache License 2.0

•116•680•84•18•Updated

Jan 26, 2026

TileGym

Public

Helpful kernel tutorials and examples for tile-based GPU programming

Python

•

Other

•36•609•2•1•Updated

Jan 26, 2026

sandbox-device-plugin

Public

Kubernetes Device Plugin to help cold plug vfio/iommufd GPUs in Kata VMs for Confidential Containers

Go

•

BSD 3-Clause "New" or "Revised" License

•3•2•1•7•Updated

Jan 26, 2026

stdexec

Public

`std::execution`, the proposed C++ framework for asynchronous and parallel programming.

C++

•

Apache License 2.0

•225•2.2k•122•15•Updated

Jan 26, 2026

Model-Optimizer

Public

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.

Python

•

Apache License 2.0

•242•1.9k•64•70•Updated

Jan 26, 2026

NeMo-Agent-Toolkit

Public

The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.

Python

•

Apache License 2.0

•494•1.8k•66•28•Updated

Jan 26, 2026

nvidia-resiliency-ext

Public

NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves effective training time by minimizing downtime due to failures and interruptions.

Python

•

Other

•42•253•2•19•Updated

Jan 26, 2026

recsys-examples

Public

Examples for Recommenders - easy to train and deploy on accelerated infrastructure.

pytorch recommender-system recommenders generative-recommenders

Python

•

Other

•43•211•43•11•Updated

Jan 26, 2026

Fuser

Public

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C++

•

Other

•75•375•211•203•Updated

Jan 26, 2026

spark-rapids-examples

Public

A repo for all Spark examples using the RAPIDS Accelerator, including ETL, ML/DL, etc.

Jupyter Notebook

•

Apache License 2.0

•62•166•21•3•Updated

Jan 26, 2026

accelerated-computing-hub

Public

NVIDIA curated collection of educational resources related to general purpose GPU programming.

Jupyter Notebook

•

Other

•197•1.1k•14•4•Updated

Jan 26, 2026

NV-Kernels

Public

Ubuntu kernels which are optimized for NVIDIA server systems

C

•53•87•0•14•Updated

Jan 26, 2026

aistore

Public

AIStore: scalable storage for AI applications

kubernetes high-performance distributed-storage high-availability object-storage multi-cloud batch-jobs s3-compatible multipart-upload ml-training

Go

•

MIT License

•232•1.7k•2•0•Updated

Jan 26, 2026

k8s-dra-driver-gpu

Public

NVIDIA DRA Driver for GPUs

Go

•

Apache License 2.0

•113•551•89•23•Updated

Jan 25, 2026

KAI-Scheduler

Public

KAI Scheduler is an open source Kubernetes Native scheduler for AI workloads at large scale

Go

•

Apache License 2.0

•138•1.1k•31•61•Updated

Jan 25, 2026

NVIDIA Corporation

655 repositories
