qflen

Kimon N. qflen

trying to be 100x

8 followers · 2 following

ASML
Netherlands
kimon.space

Highlights

Pro

Pinned

nsa-from-scratch Public

From-scratch reimplementation of DeepSeek's Native Sparse Attention (arXiv:2502.11089) in Triton + CUDA Hopper WGMMA. 7.4x faster than FlashAttention-3 at 64k context. Five-model training fleet, pe…

Python 5

tinycompress Public

Implemented and benchmarked LLM inference compression: int4/int8 quantization, GPTQ-like calibration, int8 KV cache, pruning, distillation, speculative decoding, torch.compile, and ONNX. Every numb…

Python

whitenoise Public

Numerics for white noise and the stochastic heat equation on the torus. Python API, C++/pybind11 hot kernels, analytic Monte Carlo tests.

Python

NanoExchange Public

Matching engine, UDP multicast feed, TCP order gateway, and a React dashboard with order book, depth, heatmap, OHLC chart, and live simulator.

Java

pytorch Public

Forked from pytorch/pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python

transformers Public

Forked from huggingface/transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python