[NVIDIA] Add DSR1 TensorRT Support and Enhanced Plotting by kedarpotdar-nv · Pull Request #7 · SemiAnalysisAI/InferenceX

kedarpotdar-nv · 2025-09-08T02:10:27Z

Summary

This PR adds TensorRT-LLM FP4 support for the DSR1 model and enhances the plotting system to better handle multiple model variants and precision types.

Changes Made

1. DSR1 Template Enhancements

✅ Added precision parameter to DSR1 template with fp8 default
✅ Added b200-trt job for DSR1 with TensorRT support
✅ Set b200-trt to use fp4 precision specifically
✅ Updated collect-results to include b200-trt job
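The precision handling described above (an fp8 default, with b200-trt pinned to fp4) could be sketched roughly as follows. This is only an illustration in Python: the PR's actual template is not shown here, and `DEFAULT_PRECISION`, `JOB_PRECISION_OVERRIDES`, `resolve_precision`, and the plain `b200` job name are hypothetical names introduced for the example.

```python
# Hypothetical sketch of per-job precision resolution; not the PR's template code.
DEFAULT_PRECISION = "fp8"  # template default stated in the PR description

# Assumed override table: only b200-trt is named in the PR as using fp4.
JOB_PRECISION_OVERRIDES = {"b200-trt": "fp4"}


def resolve_precision(job_name, overrides=None, default=DEFAULT_PRECISION):
    """Return the precision for a job, falling back to the template default."""
    if overrides is None:
        overrides = JOB_PRECISION_OVERRIDES
    return overrides.get(job_name, default)


if __name__ == "__main__":
    # "b200" is a placeholder job name used only to show the fallback path.
    for job in ("b200-trt", "b200"):
        print(job, resolve_precision(job))
```

Keeping the override in one lookup table means a job that does not name a precision keeps the fp8 default without further changes.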

2. TRT-LLM Configuration

✅ Created dsr1_b200_trt_slurm.sh benchmark script
✅ Uses TensorRT-LLM with trtllm-serve command
✅ Configured for DSR1 FP4 model (nvidia/DeepSeek-R1-0528-FP4)
✅ MTP support

3. Plotting System Improvements

✅ Enhanced model grouping - groups by model family (70b, dsr1) instead of full model names
✅ Added precision distinction - different markers for fp8 (circles) vs fp4 (squares)
✅ Improved legend labels - shows precision in labels (e.g., "B200-TRT (fp4)")
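The three plotting changes above can be illustrated with a small Python sketch. It is not the PR's plotting code: the keyword matching used to derive a model family, the helper names, and the example 70b model name are assumptions made for illustration. The marker codes `"o"` and `"s"` are matplotlib's codes for circles and squares.

```python
# Hypothetical sketch of family grouping, precision markers, and legend labels.

# Assumed keyword table mapping a model family to substrings of the model name.
FAMILY_KEYWORDS = {
    "dsr1": ("deepseek-r1", "dsr1"),
    "70b": ("70b",),
}

# matplotlib marker codes: "o" draws circles, "s" draws squares.
PRECISION_MARKERS = {"fp8": "o", "fp4": "s"}


def model_family(model_name):
    """Map a full model name to its family; unknown names map to themselves."""
    lowered = model_name.lower()
    for family, keywords in FAMILY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return family
    return lowered


def legend_label(hardware, precision):
    """Build a label such as 'B200-TRT (fp4)'."""
    return f"{hardware.upper()} ({precision})"


def group_results(results):
    """Group result dicts by model family instead of full model name."""
    groups = {}
    for result in results:
        groups.setdefault(model_family(result["model"]), []).append(result)
    return groups


if __name__ == "__main__":
    results = [
        {"model": "nvidia/DeepSeek-R1-0528-FP4", "hw": "b200-trt", "precision": "fp4"},
        {"model": "example-org/example-70B", "hw": "b200", "precision": "fp8"},  # placeholder
    ]
    for family, rows in group_results(results).items():
        for row in rows:
            marker = PRECISION_MARKERS[row["precision"]]
            print(family, legend_label(row["hw"], row["precision"]), marker)
```

Each `(label, marker)` pair could then be passed to a scatter call, so fp8 and fp4 runs of the same family appear on one chart and remain visually distinct.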

New Features

DSR1 TensorRT benchmarking with fp4 precision
Visual precision distinction in plots (circles vs squares)
Improved model grouping for better chart organization

kimbochen · 2025-09-08T04:27:36Z

Thank you for the PR. Can you revert the tp-list to full sweep?

kedarpotdar-nv · 2025-09-08T04:31:04Z

Done.

Add summarize.py (compact NCCL/DeepEP results table, printed at end of every job) and make it the result gate. Fix review findings: benchmark failures/skipped-deepep now fail the job instead of reporting green (#1); DeepEP nodes from SLURM_NNODES not world_size//8 (#3); apply Buffer.set_num_sms so num_comm_sms is real (#8); nccl-tests -c 1 with a missing check footer is now invalid (#7); use context managers for file reads (#4,#5); launchers export COLLECTIVEX_IMAGE/_DIGEST for provenance (#9); trim workflow_dispatch sku options to launcher-backed pools (#2). Artifact-path finding (#6) already fixed via cx_collect_results.

… rate, run links

Addresses review #3 frontend critiques (backward-compatible with v2 docs):

- Percentile selector p50/p90/p99 (p99 default); reads pooled-trial percentiles.
- Suite selector backend-default vs resource-constrained; kept distinct, never read as one fair contest (#5). dtype/mode/resource/contract are all in the per-line label + hover; lines are uniquely colored (SKU family) + dashed-fp8 (#10).
- Bandwidth axis renamed "Logical routed payload rate" using SEPARATE dispatch/combine bytes; serial bandwidth removed; serial relabeled "Σ isolated medians" (#6,#7).
- Hover shows p50/p90/p99, contract, suite, and the WORKFLOW RUN (run id + sha) that produced the point (#1).

Provenance text no longer claims a single dtype (the "bf16 while fp8 shown" bug); states routing-identity-proven, pooled-sample count, logical-rate caveat, suite-separation, and correctness-is-smoke (#9 fix).

kedarpotdar-nv added 5 commits September 7, 2025 17:10

first commit

d9298e8

remove 8k tests

4966b0a

remove concurrency lock

bc87167

typo in concurrecny lock

b8e567c

enable other tests

759c0f6

kedarpotdar-nv requested a review from kimbochen September 8, 2025 02:10

remove MTP

294fe5f

update tp-list to include full list

a5df329

kimbochen merged commit e4e60be into main Sep 8, 2025

kedarpotdar-nv deleted the kepotdar-fix-chart-add-dsr1-trt branch September 18, 2025 00:57

claude-code-infmax Bot mentioned this pull request Jan 21, 2026

[NV] Update DSR1 GB200 FP4 Disagg Submission #510

Merged

cquil11 added the NVIDIA label Apr 8, 2026

cquil11 changed the title ~~Add DSR1 TensorRT Support and Enhanced Plotting~~ [NVIDIA] Add DSR1 TensorRT Support and Enhanced Plotting Apr 8, 2026

kimbochen merged 7 commits into main from kepotdar-fix-chart-add-dsr1-trt
