[AMD] Add MiniMax-M3-FP8 MI355X ATOMESH #1865
Merged
34 commits
e0b81f4 (seungrokj) [AMD] server_atom: improve config print and cleanup
2ecfe19 (seungrokj) update perf-changelog for dsv4-fp4-mi355x-atom-disagg-mtp
68ff385 (seungrokj) [AMD] fix DECODE_MTP_SIZE and BENCH_REQUEST_RATE propagation in atom-…
f0c64d8 (seungrokj) [AMD] server_atom: pass SPEC_ARGS to prefill server
53722ef (seungrokj) [AMD] amd-master: fix comment for 1P1D TP8+DPA+TBO+MTP1 config
09f0d18 (seungrokj) [AMD] dsv4_atom-disagg: remove DECODE_MTP_SIZE from check_env_vars
7643da7 (seungrokj) [AMD] bench: use --dsv4 flag for DeepSeek-V4-Pro MTP benchmarks
f9c69d3 (seungrokj) [AMD] server_atom: export IS_MTP=true when SPEC_DECODING=mtp for benc…
290eb53 (seungrokj) [AMD] server_atom: fix hf-overrides JSON quoting
82ce90f (seungrokj) fix: inline --hf-overrides to avoid eval word-splitting, remove OPT_ARGS
af235c9 (seungrokj) refactor: extract --hf-overrides into HF_OVERRIDES_ARG variable
e264c4e (seungrokj) fix: enable --hf-overrides only for DeepSeek-V4-Pro
2cea307 (seungrokj) fix: add HF_OVERRIDES_ARG to INFO config print block
74c7a5a (seungrokj) fix: replace broken-quote array splice with ${ARRAY[*]} in CMD strings
95a730e (seungrokj) fix: remove ${CUDAGRAPH_OPT} from decode CMD
4d2cf04 (seungrokj) feat: add MiniMax-M3 ATOM disagg CI script and server_atom.sh support
14eb7f2 (seungrokj) feat: add minimaxm3-fp4-mi355x-atom-disagg recipe and AITER_QUICK_RED…
c2cce71 (seungrokj) feat: export AITER_QUICK_REDUCE_QUANTIZATION=INT4 for non-DSv4 models
aeb73dc (seungrokj) fix: server_atom.sh and minimaxm3 disagg cleanup
697f26f (seungrokj) fix: dsv4_fp4_mi355x_atom-disagg cleanup
f2b89c6 (seungrokj) fix: set BLOCK_SIZE=128 for MiniMax-M3 in minimaxm3_fp4_mi355x_atom-d…
48b3daf (seungrokj) fix: use KV_CACHE_DTYPE=fp8 for MiniMax-M3 disagg (matches atom serve…
d06a44c (seungrokj) feat: update minimaxm3-fp4-mi355x-atom-disagg search space and disabl…
38e82c3 (seungrokj) feat: add MiniMax-M3-MXFP4/MXFP8 to models_atom.yaml; set KV_CACHE_DT…
69e4be7 (seungrokj) fix: set mi355x-disagg runner and add dynamic cudagraph sizes for dec…
4b57fab (seungrokj) fix: gate ATOM_MOE_GU_ITLV and AITER_BF16_FP8_MOE_BOUND on DeepSeek-V…
1e0bb1e (seungrokj) fix: preserve empty KV_CACHE_DTYPE to skip --kv-cache-dtype flag
a933e35 (seungrokj) fix: use KV_CACHE_DTYPE=auto for minimaxm3 disagg to skip --kv-cache-…
41f23a1 (seungrokj) fix: align minimaxm3 disagg settings with slurm reference script
ae82c60 (seungrokj) fix: rename minimaxm3-fp4-mi355x-atom-disagg to minimaxm3-fp8 and rem…
bcbda4f (seungrokj) feat: add minimaxm3_fp8_mi355x_atom-disagg multi-node benchmark script
5e82263 (seungrokj) benchmarks: rename minimaxm3 to dsv4 atom-disagg script and generaliz…
c0a813b (seungrokj) fix: bump minimaxm3-fp8-mi355x-atom-disagg image and pin MAX_MODEL_LEN
a2e7439 (seungrokj) Merge branch 'main' into amd/atom_mesh_0619_m3_fp8