Skip to content

[DispatchCreation] Fuse reshape op chains along with set_encoding ops#21365

Merged
Max191 merged 2 commits into
iree-org:mainfrom
Max191-agents:fuse-encoding-reshape-chains
Jul 21, 2025
Merged

[DispatchCreation] Fuse reshape op chains along with set_encoding ops#21365
Max191 merged 2 commits into
iree-org:mainfrom
Max191-agents:fuse-encoding-reshape-chains

Conversation

@Max191

@Max191 Max191 commented Jul 14, 2025

Copy link
Copy Markdown
Contributor

There can be reshape ops in between set_encoding ops and their producer dispatch region ops when we fuse encoding ops into dispatches, so this PR pulls any reshape op chains into the producer dispatch along with the set_encoding op. This enables more producer fusions for set_encoding in the data tiling fusion path.

@jtuyls jtuyls left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment thread compiler/src/iree/compiler/Dialect/Flow/Transforms/RegionOpUtils.cpp Outdated
@Max191 Max191 force-pushed the fuse-encoding-reshape-chains branch from 39b8de7 to 2bd5675 Compare July 15, 2025 13:58
@Max191

Max191 commented Jul 15, 2025

Copy link
Copy Markdown
Contributor Author

I'm going to convert this to a draft now, because @jtuyls found some issues with llama.

@Max191 Max191 marked this pull request as draft July 15, 2025 16:43

@hanhanW hanhanW left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me put a blocker, just in case.

@jtuyls

jtuyls commented Jul 16, 2025

Copy link
Copy Markdown
Contributor

Just a quick update on the Llama issue: it's caused by a dominance problem, but after fixing that, one of the new dispatches still fails somewhere in the LLVM backend. I think I have a fix/workaround for that as well, but it's in LLVM and I need to look at it a bit more to see whether it makes sense and create an upstream PR.

@Max191

Max191 commented Jul 16, 2025

Copy link
Copy Markdown
Contributor Author

Just a quick update on the Llama issue: it's caused by a dominance problem, but after fixing that, one of the new dispatches still fails somewhere in the LLVM backend. I think I have a fix/workaround for that as well, but it's in LLVM and I need to look at it a bit more to see whether it makes sense and create an upstream PR.

Nice! What is the dispatch that's failing?

@jtuyls

jtuyls commented Jul 16, 2025

Copy link
Copy Markdown
Contributor

Nice! What is the dispatch that's failing?

Here it is: https://gist.github.com/jtuyls/5590340b86fedbbd83f310f45904eee4. It doesn't look too special to me but it fails with:

iree/third_party/llvm-project/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:3628: const llvm::TargetRegisterClass* llvm::SIRegisterInfo::getEquivalentSGPRClass(const llvm::TargetRegisterClass*) const: Assertion `SRC && "Invalid register class size"' failed.

The repro command is:

iree-compile --compile-from=executable-sources --compile-to=hal module_prefill_bs4\$async_dispatch_3.mlir --iree-llvmgpu-test-combine-layout-transformation

@jtuyls

jtuyls commented Jul 17, 2025

Copy link
Copy Markdown
Contributor

@Max191

Max191 commented Jul 18, 2025

Copy link
Copy Markdown
Contributor Author

So I think we could land this PR?

Sure, I think we can just land it, since this is on an experimental path still, and your PR is already up. I think it will be easier that way.

EDIT: Alternatively, I could cherry-pick your fix onto this PR and list you as a coauthor if we want to avoid landing this without the fix. I'll leave it up to you which you prefer.

@jtuyls

jtuyls commented Jul 19, 2025

Copy link
Copy Markdown
Contributor

So I think we could land this PR?

Sure, I think we can just land it, since this is on an experimental path still, and your PR is already up. I think it will be easier that way.

EDIT: Alternatively, I could cherry-pick your fix onto this PR and list you as a coauthor if we want to avoid landing this without the fix. I'll leave it up to you which you prefer.

I am ok landing this as is and then I can rebase my PR and add a test for that as well. Llama isn't working on main yet anyway, I need an integrate as well to get the fix from llvm/llvm-project#149181 and until we put in some e2e llama-like tests I think it will likely be broken a couple more times.

@Max191 Max191 marked this pull request as ready for review July 21, 2025 13:41
@Max191

Max191 commented Jul 21, 2025

Copy link
Copy Markdown
Contributor Author

@hanhanW looks like we are ready to land this now then.

@Max191 Max191 requested a review from hanhanW July 21, 2025 14:16
Max191 added 2 commits July 21, 2025 16:37
Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
@Max191 Max191 force-pushed the fuse-encoding-reshape-chains branch from 2bd5675 to 3c9e0c8 Compare July 21, 2025 16:37
@Max191 Max191 enabled auto-merge (squash) July 21, 2025 16:37
@Max191 Max191 merged commit d17d103 into iree-org:main Jul 21, 2025
44 checks passed
AWoloszyn pushed a commit that referenced this pull request Dec 1, 2025
…#21365)

There can be reshape ops in between set_encoding ops and their producer
dispatch region ops when we fuse encoding ops into dispatches, so this
PR pulls any reshape op chains into the producer dispatch along with the
set_encoding op. This enables more producer fusions for set_encoding in
the data tiling fusion path.

---------

Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants