Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/configs/amd-master.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2135,7 +2135,7 @@ dsv4-fp4-mi355x-sglang:
# gpu-mem-util=0.6. TP8 sweeps conc 4-64; DEP8 has a single conc=64
# probe to validate the ROCm DP+EP path.
dsv4-fp4-mi355x-vllm:
image: vllm/vllm-openai-rocm:nightly-4f940896a32c9e2a0eba7f50d521bf5f6b4de458
image: vllm/vllm-openai-rocm:v0.22.0

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Nit: the comment block above this entry (lines 2118-2127) describes the image as a 'nightly image' and contains an 'IMPORTANT' block instructing maintainers to pin to a digest-suffixed nightly rather than the floating :nightly (because launch_mi355x-amds.sh caches enroot squashfs files by image string). After the bump to v0.22.0 (a stable versioned tag), both the 'nightly image' wording and the digest-vs-floating rationale are stale and will mislead future maintainers. Consider updating the comment block to reflect the stable-tag pin, or dropping the nightly-pinning paragraph entirely.

Extended reasoning...

What's stale

The diff changes the image at line 2138 from vllm/vllm-openai-rocm:nightly-4f940896a32c9e2a0eba7f50d521bf5f6b4de458 to vllm/vllm-openai-rocm:v0.22.0. The 10-line comment block immediately above the entry (lines 2118-2127), which was written to explain why the previous nightly pin was needed, now contradicts the actual image:

  • Line 2118-2119: "DSv4 on MI355X via vLLM, using the official vllm/vllm-openai-rocm nightly image."v0.22.0 is a stable release tag, not a nightly.
  • Lines 2123-2127: "IMPORTANT: pin to a digest-suffixed nightly tag rather than the floating :nightly. launch_mi355x-amds.sh caches enroot squashfs files keyed on the image string and short-circuits re-import if the file already exists, so the floating tag silently keeps a stale build even after Docker Hub updates :nightly." — the digest-vs-floating concern is specific to mutable tags. v0.22.0 is an immutable stable release, so this entire rationale is moot for the currently pinned image.

Why this matters (and why it's still just a nit)

The staleness is introduced directly by this PR — the comment was accurate when the image was nightly-4f940896... and becomes inaccurate the moment the image becomes v0.22.0. A future maintainer looking at this entry to understand why the tag is structured this way will read an IMPORTANT block that doesn't describe what they're seeing, which is the canonical recipe for stale-comment confusion.

That said, there is no functional impact: the sweep config itself is correct, and the enroot squashfs caching behavior described in the comment is still true in general (it's just no longer the reason this particular tag is pinned the way it is). For that reason this is filed as a nit, not a blocking issue.

Addressing the refutation

One verifier refuted this as a non-egregious nit not worth posting. I think it's worth flagging because (a) the comment block is unusually detailed — it isn't a one-line label, it's an IMPORTANT: paragraph with a specific operational rationale; (b) the staleness was created by this PR rather than being pre-existing drift, so it's directly in scope for an image-bump cleanup; and (c) the wording is actively misdirecting ("nightly image" / "digest-suffixed nightly tag") rather than merely incomplete. The IMPORTANT-block cache-keying guidance is still generally accurate as a reference, but its placement above a stable-tag entry implies it explains this pin, which it no longer does.

Suggested fix

Either:

  1. Rewrite the IMPORTANT block to be tag-scheme-agnostic, e.g. note that launch_mi355x-amds.sh caches enroot squashfs files by image string and so any tag (nightly or stable) must be replaced rather than re-pushed; or
  2. Drop the nightly-specific paragraph and the "nightly image" phrasing, since the stable v0.22.0 tag is immutable on Docker Hub and the caching concern doesn't apply to it in practice.

model: deepseek-ai/DeepSeek-V4-Pro
model-prefix: dsv4
runner: mi355x
Expand Down
6 changes: 6 additions & 0 deletions perf-changelog.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -3323,3 +3323,9 @@
description:
- "Update vLLM image from v0.21.0 to v0.22.0"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1602

- config-keys:
- dsv4-fp4-mi355x-vllm
description:
- "Update vLLM ROCm image from nightly-4f940896a32c9e2a0eba7f50d521bf5f6b4de458 to v0.22.0"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1624
Loading