Add Qwen3.5 FP4 to vLLM-ATOM nightly accuracy check and benchmark by wuhuikx · Pull Request #593 · ROCm/ATOM

wuhuikx · 2026-04-17T04:33:49Z

Add Qwen3.5 FP4 for vLLM-ATOM, including:

- nightly accuracy test
- performance benchmark candidate list
- recipe update

…itations in v0.1.3 (#1061)

* docs(release-notes): fix misattributed plugin PR citations in v0.1.3

Four citations in the vLLM-ATOM sections referenced PRs that actually belong to SGLang-ATOM or the native ATOM path. Verified each PR title against GitHub before correcting.

- Model Support / DeepSeek V4 / R1 FP4: dropped the bullet. #650 is the native DeepSeek V4 triton-MoE path (already cited under ATOM Server) and #614 is a SGLang-ATOM R1 FP4 PR (already cited under SGLang-ATOM); neither supports a vLLM-ATOM V4 / R1 FP4 claim.
- Model Support / Qwen3.5 / Qwen3-Next: dropped #532 (it adds Qwen3.5 / Qwen3-Next to SGLang, not vLLM); keep #772 (Qwen3-Next MTP for vLLM).
- H&P / vLLM-ATOM: dropped #528 + the "Q/K norm-quant fusion" claim; #528 is the SGLang+ATOM qk-norm fusion PR (already cited under SGLang).
- H&P / vLLM-ATOM: dropped #614 from the DeepSeek FP4 validation bullet (SGLang-ATOM PR), leaving the genuine #639 TP8/EP8 case.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(release-notes): correct Qwen3.5 vLLM citation and drop nonexistent SGLang V3.2

Follow-up to the citation audit, two more verified corrections in the plugin sections:

- vLLM-ATOM / Qwen3.5: the prior pass dropped Qwen3.5 along with the misattributed #532, but Qwen3.5 does have real vLLM-plugin support. Restore it with the correct PRs: #448 (fp8 functionality/accuracy, touches atom/plugin/vllm/model_wrapper.py) and #593 (Qwen3.5 FP4 nightly + benchmark, recipes/atom_vllm/Qwen3.5.md), keeping #772 (Qwen3-Next MTP).
- SGLang-ATOM: dropped "V3.2" from the DeepSeek model list. No SGLang DeepSeek V3.2 PR landed in v0.1.2..v0.1.3 (V3 MTP=#643, R1 FP4=#614, FP4 MTP=#834/#846); the cited PRs only cover V3 and R1 FP4.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(release-notes): fix 3 more cross-section PR misattributions

Verified each PR's changed files to confirm which engine path it belongs to:

- vLLM-ATOM Engine Core: drop #793 + "handles scalar KV scales". #793 only touches atom/model_ops/{attention_mha,base_attention}.py (native, no plugin files) and is already cited correctly in the native section.
- vLLM-ATOM H&P: drop the DeepSeek FP4 TP8/EP8 bullet and move #639 to SGLang-ATOM H&P. #639 only touches sglang_benchmark_models.json and atom-sglang-benchmark.yaml -> it is a SGLang benchmark PR, not vLLM.
- vLLM-ATOM H&P: drop "V4 DP benchmark coverage (#949)". #949 touches the native benchmark (.github/benchmark/models.json, atom-benchmark.yaml), already cited under ATOM Server; it is not a plugin PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

wuhuikx added 2 commits April 16, 2026 23:32

Add Qwen3.5 FP4 to vLLM-ATOM nightly accuracy check and benchmark

87f5869

Change the config

450d912

wuhuikx commented Apr 17, 2026

Comment thread .github/benchmark/oot_benchmark_models.json

Comment thread .github/benchmark/oot_models_accuracy.json Outdated

Comment thread .github/benchmark/oot_models_accuracy.json Outdated

wuhuikx marked this pull request as draft April 17, 2026 04:48

wuhuikx added 3 commits April 17, 2026 00:21

Update the Qwen3.5 FP4 recipe

bcfb98d

Update the accuracy threshold

1f4e59c

Update the recipe

fa36b79

wuhuikx marked this pull request as ready for review April 17, 2026 09:01

ganyi1996ppo approved these changes Apr 17, 2026

valarLip approved these changes Apr 17, 2026

zejunchen-zejun approved these changes Apr 17, 2026

zejunchen-zejun merged commit 173f3ee into main Apr 17, 2026
33 of 44 checks passed

zejunchen-zejun deleted the hattie/add_qwen3.5_fp4 branch April 17, 2026 10:07

zejunchen-zejun merged 5 commits into main from hattie/add_qwen3.5_fp4

4 participants