hamzaqureshi5/README.md

Hi, I'm Hamza! 👋👨🏾‍💻

AI Compiler Engineer — I turn high‑level models (PyTorch · TensorFlow · JAX · ONNX) into optimized inference artifacts across CPU, GPU, FPGA, and custom accelerators, working across the TVM · MLIR · LLVM · IREE stack to cut build time, memory footprint, and on‑device latency.

🔧 Tech Stack

Languages

Coding

C++ Python

Compilers & IR

TVM MLIR LLVM XLA Triton CUTLASS

Frameworks & Serving

PyTorch TensorFlow JAX ONNX vLLM SGLang TensorRT-LLM

Hardware & Acceleration

CUDA cuDNN cuBLAS TensorRT Jetson Thor


🚀 What I Do

  • 🔭 I’m currently an AI Software Engineer at DreamBig Semiconductor, optimizing LLM inference through graph- and kernel-level transformations in TVM and IREE for GPU, FPGA, and custom accelerators.

  • 🧱 I architect compiler passes and analyses that cut build time, memory footprint, and runtime latency while preserving model fidelity. I work across the TVM, LLVM, MLIR, and IREE stack, targeting CUDA, ROCm, TensorRT, OpenVINO, Metal, Vulkan, and custom silicon.

  • 🚗 I deploy and optimize models for edge / embedded AI accelerators, including NVIDIA Jetson Thor, for low‑latency on‑device inference.

  • 🏅 I’m a Certified Artificial Intelligence Developer.

  • 💬 Ask me about Compilers, LLM inference, Transformers, GPU/CUDA, and HPC.

  • 📧 Contact me at: hamza7771.861@gmail.com.

🌐 Connect with me

LinkedIn

📊 GitHub Stats

Hamza's GitHub stats Top Langs

Pinned

  1. aes_256 (Public)

    AES-256 Encryption and Decryption

    C++

  2. tvm (Public)

    Forked from apache/tvm

    Open deep learning compiler stack for cpu, gpu and specialized accelerators

    Python 1

  3. gsm-data-generation_lib (Public)

    Forked from open-etsi/gsm-data-generator

    A Python-based GSM data generator for creating synthetic telecom datasets. It supports operator-specific logic, customizable templates, and SIM card personalization data.

    Python 1 1

  4. llvm (Public)