Merged
2 changes: 1 addition & 1 deletion .github/configs/amd-master.yaml
@@ -2562,7 +2562,7 @@
 # acceptance dilutes in big batches, and the draft weights + draft KV shave
 # headroom — tp2-ep2 is dropped since its KV headroom was already thin.
 minimaxm3-fp8-mi355x-vllm-mtp:
-  image: vllm/vllm-openai-rocm:minimax-m3
+  image: vllm/vllm-openai-rocm:nightly-3f5a1e1733200760169ff31ebe60a271072b199e

Check failure on line 2565 in .github/configs/amd-master.yaml

Claude / Claude Code Review

Image bump leaves stale in-place EAGLE3 patch in FP8 MTP recipe

Image bumps minimaxm3-fp8-mi355x-vllm-mtp to nightly-3f5a1e17... but benchmarks/single_node/fixed_seq_len/minimaxm3_fp8_mi355x_mtp.sh:18-25,89-167 was not updated — it still hard-codes the old vllm/vllm-openai-rocm:minimax-m3 image in the docstring and runs an in-place EAGLE3 patch with a sys.exit hard-fail if any anchor count != 1. The FP4 sibling minimaxm3_fp4_mi355x_vllm_mtp.sh:6-7 uses the IDENTICAL nightly and explicitly states 'The pinned nightly includes upstream AMD MiniMax-M3 SupportsEa
functionstackx marked this conversation as resolved.
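
The failure mode the review describes can be sketched as follows. This is a hypothetical illustration, not the contents of the benchmark script: `replace_unique_anchor`, the sample class text, and `ExtraMixin` are invented for the example, and it raises `ValueError` where the review says the script calls `sys.exit`. It shows why an in-place patch that requires each anchor to occur exactly once would hard-fail against an image whose source already contains the change.

```python
def replace_unique_anchor(source: str, anchor: str, replacement: str) -> str:
    """Replace `anchor` in `source`, refusing unless it occurs exactly once."""
    count = source.count(anchor)
    if count != 1:
        # The review describes a sys.exit hard-fail here; an exception is
        # used in this sketch so the behaviour is easy to observe.
        raise ValueError(f"expected exactly 1 occurrence of anchor, found {count}")
    return source.replace(anchor, replacement, 1)


# Hypothetical file contents, not taken from vLLM.
anchor = "class Model(Base):"
replacement = "class Model(Base, ExtraMixin):"
old_image_source = "class Model(Base):\n    pass\n"
new_image_source = "class Model(Base, ExtraMixin):\n    pass\n"

# Against the old image the anchor is present once, so the patch applies.
patched = replace_unique_anchor(old_image_source, anchor, replacement)

# Against an image that already ships the change, the anchor is gone,
# so the same patch step fails instead of silently doing nothing.
try:
    replace_unique_anchor(new_image_source, anchor, replacement)
    stale_patch_error = None
except ValueError as err:
    stale_patch_error = str(err)
```

Under these assumptions, bumping the image without removing the patch step turns a previously working recipe into one that aborts before the benchmark starts.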
   model: MiniMaxAI/MiniMax-M3-MXFP8
   model-prefix: minimaxm3
   runner: mi355x
7 changes: 7 additions & 0 deletions perf-changelog.yaml
@@ -4229,3 +4229,10 @@
     - "Reuse the pinned vllm/vllm-openai-rocm:nightly-3f5a1e1733200760169ff31ebe60a271072b199e image, text-only target path, TRITON_ATTN, automatic tool choice, MiniMax-M3 parsers, VLLM_USE_BREAKABLE_CUDAGRAPH=0, default KV-cache dtype, and automatic MoE backend selection."
     - "Pass --use-chat-template for MTP acceptance and mirror the existing MiniMax-M3 MXFP8 MI355X MTP TP/EP/DP-attention search space at 1k1k and 8k1k."
   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1939
+
+- config-keys:
+    - minimaxm3-fp8-mi355x-vllm-mtp
+  description:
+    - "Update the MiniMax-M3 MXFP8 MI355X vLLM EAGLE3 benchmark image from vllm/vllm-openai-rocm:minimax-m3 to vllm/vllm-openai-rocm:nightly-3f5a1e1733200760169ff31ebe60a271072b199e."
+    - "Benchmark configuration, EAGLE3 draft model, serving flags, and search space are unchanged."
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1941