[NV] Update MiniMax M3 B200/B300 MTP settings by jasonlizhengjian · Pull Request #1784 · SemiAnalysisAI/InferenceX

jasonlizhengjian · 2026-06-15T19:32:07Z

Updates MiniMax M3 B200/B300 EAGLE3 MTP recipes with the serving settings and TP4+EP4 coverage used for the non-MTP recipes.

Validation:

bash -n benchmarks/single_node/fixed_seq_len/minimaxm3_fp8_b200_mtp.sh
bash -n benchmarks/single_node/fixed_seq_len/minimaxm3_fp8_b300_mtp.sh
git diff --check

Note: local matrix generation was not run because pydantic is not installed in this environment.
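
The validation steps above can be wrapped in a small helper. This is a minimal sketch, not part of the PR: the function name `check_scripts` is hypothetical, and it only parses each script with `bash -n` without executing it.

```shell
# Hypothetical helper (not in the PR): parse each script with `bash -n`,
# which checks syntax without running any commands.
check_scripts() {
    status=0
    for script in "$@"; do
        if bash -n "$script"; then
            echo "ok: $script"
        else
            echo "syntax error: $script" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage, from the repository root:
#   check_scripts \
#       benchmarks/single_node/fixed_seq_len/minimaxm3_fp8_b200_mtp.sh \
#       benchmarks/single_node/fixed_seq_len/minimaxm3_fp8_b300_mtp.sh
#   git diff --check   # warns about whitespace errors and conflict markers
```

The helper returns non-zero if any script fails to parse, so it can gate a local pre-push check.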

Note

Low Risk
Benchmark config and shell script tuning only; no application logic, auth, or data paths.

Overview
Aligns MiniMax-M3 FP8 EAGLE3 MTP B200/B300 benchmark recipes with the non-MTP serving setup and broadens the fixed-seq-len matrix in nvidia-master.yaml.

The B200 and B300 MTP runner scripts now set VLLM_FLOAT32_MATMUL_PRECISION=high and use a fixed --max-cudagraph-capture-size 2048 instead of computing capture size from concurrency and speculative token count.
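
As a rough illustration of the fixed setting described above, the following sketch shows the environment variable and the flag with their stated values. The variable `MAX_CUDAGRAPH_CAPTURE_SIZE` and the function `serve_flags` are hypothetical names for this example; the actual runner scripts may assemble their server command differently.

```shell
# Stated in the PR: the float32 matmul precision is set to "high".
export VLLM_FLOAT32_MATMUL_PRECISION=high

# Stated in the PR: the capture size is a fixed 2048 rather than a value
# derived from concurrency and speculative token count.
# The variable and function names below are hypothetical.
MAX_CUDAGRAPH_CAPTURE_SIZE=2048

serve_flags() {
    echo "--max-cudagraph-capture-size ${MAX_CUDAGRAPH_CAPTURE_SIZE}"
}

# Print the flag fragment that would be appended to the server command.
serve_flags
```

Fixing the value makes the capture size independent of the benchmark's concurrency setting, which matches how the PR describes the non-MTP recipes.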

For minimaxm3-fp8-b200-vllm-mtp and minimaxm3-fp8-b300-vllm-mtp, the search space adds TP4+EP4 rows: DP-attention sweeps on 1k1k (conc 128–512) and 8k1k (conc 64–128), plus the missing non–DP-attention TP4+EP4 row for 8k1k (conc 64–256). perf-changelog.yaml documents these changes.

Reviewed by Cursor Bugbot for commit 52d37c0. Bugbot is set up for automated code reviews on this repo.

github-actions · 2026-06-15T19:32:23Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

claude · 2026-06-15T19:52:44Z

Claude finished @jasonlizhengjian's task in 1m 44s

PR Review: Update MiniMax M3 B200/B300 MTP settings

Gather context and read changed files
Validate master config + perf-changelog
Validate perf-changelog entry position
Check benchmark script code style and MTP requirements
Post review findings

LGTM — no blocking issues found.

Both B200 and B300 MTP scripts now align with the non-MTP B200 serving settings (VLLM_FLOAT32_MATMUL_PRECISION=high, hardcoded --max-cudagraph-capture-size 2048), and the new TP4+EP4 config rows mirror the coverage pattern from the STP recipes. All required validations pass: --use-chat-template is present, expert parallelism is conditional on EP_SIZE, server commands use multi-line formatting, and the perf-changelog entry is correctly appended to the end.
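
The "expert parallelism is conditional on EP_SIZE" check mentioned above could look roughly like the sketch below. The function name `ep_args` and the exact condition (`EP_SIZE` greater than 1) are assumptions for illustration; `--enable-expert-parallel` is the vLLM flag for expert parallelism, but the PR text does not show how the scripts pass it.

```shell
# Hypothetical helper: emit the expert-parallel flag only when the
# requested EP size is greater than 1 (assumed condition, not from the PR).
ep_args() {
    ep_size="${1:-1}"
    if [ "$ep_size" -gt 1 ]; then
        echo "--enable-expert-parallel"
    fi
}

# Example: a TP4+EP4 row would set EP_SIZE=4 and get the flag;
# a row with EP_SIZE unset or 1 would get nothing.
ep_args "${EP_SIZE:-1}"
```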

github-actions · 2026-06-15T21:57:14Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=27572276756
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=27572276756

functionstackx · 2026-06-16T16:15:08Z

/reuse-sweep-run

Update MiniMax M3 B200 B300 MTP settings

16ccdff

github-project-automation Bot added this to InferenceMAX Board Jun 15, 2026

Update MiniMax M3 MTP changelog link

389037e

jasonlizhengjian marked this pull request as ready for review June 15, 2026 19:52

jasonlizhengjian requested a review from a team June 15, 2026 19:52

jasonlizhengjian requested review from jgangani and kedarpotdar-nv as code owners June 15, 2026 19:52

jasonlizhengjian added the full-sweep-enabled label Jun 15, 2026

jasonlizhengjian changed the title ~~[WIP][NV] Update MiniMax M3 B200/B300 MTP settings~~ [NV] Update MiniMax M3 B200/B300 MTP settings Jun 15, 2026

xinli-sw mentioned this pull request Jun 16, 2026

[NV] Update MiniMax-M3 B200 MTP, B300, and B300 MTP vLLM serving settings #1802

Closed

3 tasks

This was referenced Jun 16, 2026

[WIP][NV] Use Marlin for MiniMax M3 TP-only configs #1807

Closed

[NV] Use Marlin for MiniMax M3 TP-only configs #1809

Merged

kedarpotdar-nv approved these changes Jun 16, 2026

Merge branch 'main' into nv/jasonli/minimaxm3-b200-b300-mtp-serving-s…

52d37c0

…ettings

functionstackx merged commit e4fcda4 into main Jun 16, 2026
3 checks passed

functionstackx deleted the nv/jasonli/minimaxm3-b200-b300-mtp-serving-settings branch June 16, 2026 16:15

github-project-automation Bot moved this to Done in InferenceMAX Board Jun 16, 2026

functionstackx merged 3 commits into main from nv/jasonli/minimaxm3-b200-b300-mtp-serving-settings


3 participants
