gpt-oss model support sink_sliding_window #637

Open
kyle-256 wants to merge 14 commits into main from feature/gpt-oss
@kyle-256
Contributor

This should wait until the aiter PR ROCm/aiter#2505 is merged into main.

JohnQinAMD and others added 14 commits January 20, 2026 02:48
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Gene Der Su <e870252314@gmail.com>
This reverts commit 84065bd.
Signed-off-by: Gene Der Su <e870252314@gmail.com>
Signed-off-by: Gene Der Su <e870252314@gmail.com>
Bring the latest mainline Primus updates into the GPT-OSS branch while
keeping the sink sliding-window GPT-OSS configs enabled.

Made-with: Cursor
Copilot AI review requested due to automatic review settings March 30, 2026 05:34
Contributor

Copilot AI left a comment
Pull request overview

This PR updates the Megatron MI355X GPT-OSS pretrain example configs to enable sink-attention sliding window using the GPT-OSS default window size.

Changes:

  • Set sink_sliding_window from 0 (disabled) to 128 (enabled) in MI355X GPT-OSS pretrain configs.
  • Remove the inline note saying sliding window was not yet supported, which enabling the feature makes obsolete.
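
For readers unfamiliar with the setting, here is a minimal sketch of what a causal sliding window of 128 typically restricts. It is an illustration under stated assumptions, not the aiter or Primus kernel: the helper name `can_attend` is hypothetical, the window is assumed to count the query position itself, and the attention-sink part of the feature is not modeled. The only behavior taken from the configs is that 0 means "disabled".

```python
def can_attend(query_pos: int, key_pos: int, window: int) -> bool:
    """Return True if the query at query_pos may attend to the key at key_pos.

    Hypothetical helper for illustration only.
    window == 0 is treated as "sliding window disabled" (plain causal mask),
    matching the "Set to 0 to disable" comment in the original configs.
    Assumption: a window of W covers the query position and the W - 1
    positions before it; the real backend may define the boundary differently.
    """
    if window < 0:
        raise ValueError("window must be non-negative")
    if key_pos > query_pos:
        # Causal constraint: never look at future positions.
        return False
    if window == 0:
        # Disabled: every past position is visible.
        return True
    return query_pos - key_pos < window


# With window=128, position 200 sees keys 73..200 under the assumption above.
visible = [k for k in range(201) if can_attend(200, k, 128)]
```

Under these assumptions, switching the config from 0 to 128 changes which keys each query sees once the sequence is longer than 128 tokens, which is why backend support matters.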

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| examples/megatron/configs/MI355X/gpt_oss_20B-FP8-pretrain.yaml | Enable sink sliding window (128) for 20B FP8 MI355X pretrain config. |
| examples/megatron/configs/MI355X/gpt_oss_20B-BF16-pretrain.yaml | Enable sink sliding window (128) for 20B BF16 MI355X pretrain config. |
| examples/megatron/configs/MI355X/gpt_oss_120B-FP8-pretrain.yaml | Enable sink sliding window (128) for 120B FP8 MI355X pretrain config. |
| examples/megatron/configs/MI355X/gpt_oss_120B-BF16-pretrain.yaml | Enable sink sliding window (128) for 120B BF16 MI355X pretrain config. |

-# Note: sliding window not yet supported by aiter Triton backend
-# Set to 0 to disable, or wait for backend support
-sink_sliding_window: 0 # gpt-oss default is 128, but disabled for now
+sink_sliding_window: 128 # gpt-oss default
Copilot AI Mar 30, 2026

PR description says this change should wait for ROCm/aiter PR #2505 to merge into main first. With this config now enabling sink_sliding_window=128, runs using the current pinned AITer commit in CI may fail if that backend support isn't available yet. Consider keeping this at 0 (or gating it) until the dependency commit is updated to include the required upstream change, and/or update the pinned AITer commit alongside this config change.
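
One way to gate the setting, as suggested above, could look like the following sketch. It is a rough illustration only: `resolve_sink_sliding_window` and the `backend_supports_window` flag are hypothetical names, and how Primus would actually detect backend support (for example, by checking the pinned aiter commit) is not specified in this PR.

```python
import warnings


def resolve_sink_sliding_window(requested: int, backend_supports_window: bool) -> int:
    """Pick the effective sink_sliding_window value.

    Hypothetical helper, not part of Primus or aiter.
    requested: value from the YAML config (0 means disabled, 128 is the
        gpt-oss default per the config comments).
    backend_supports_window: assumed to be computed elsewhere, e.g. from
        the installed aiter version or commit.
    """
    if requested < 0:
        raise ValueError("sink_sliding_window must be non-negative")
    if requested > 0 and not backend_supports_window:
        # Fall back to the previous behavior instead of failing at runtime.
        warnings.warn(
            "sink_sliding_window requested but backend support is missing; "
            "falling back to 0 (disabled)"
        )
        return 0
    return requested
```

A silent fallback changes the attention pattern being trained, so raising an error instead of warning may be the safer choice for real runs; the warning here is just one option.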

5 participants