Pinned
-
nsa-from-scratch (Public)
From-scratch reimplementation of DeepSeek's Native Sparse Attention (arXiv:2502.11089) in Triton + CUDA Hopper WGMMA. 7.4x faster than FlashAttention-3 at 64k context. Five-model training fleet, pe…
Python 5
-
tinycompress (Public)
Implemented and benchmarked LLM inference compression: int4/int8 quantization, GPTQ-like calibration, int8 KV cache, pruning, distillation, speculative decoding, torch.compile, and ONNX. Every numb…
Python
-
whitenoise (Public)
Numerics for white noise and the stochastic heat equation on the torus. Python API, C++/pybind11 hot kernels, analytic Monte Carlo tests.
Python
-
NanoExchange (Public)
Matching engine, UDP multicast feed, TCP order gateway, and a React dashboard with order book, depth, heatmap, OHLC chart, and live simulator.
Java
-
pytorch (Public, forked from pytorch/pytorch)
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Python
-
transformers (Public, forked from huggingface/transformers)
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Python



