[Codegen] Only use arith.select in padding on linalg.generic #21716

Closed
newling wants to merge 1 commit into iree-org:main from newling:dont_pad_named_as_if_generic

newling commented Aug 15, 2025

Contributor

This PR updates the padding logic so that the block rewrite only happens on linalg.generic. Previously, the block-rewrite logic was applied to all linalg ops, including named ops like linalg.matmul. I was hitting:

error: expected add/mul op in the body
  %0 =  linalg.matmul {lowering_config = #iree_gpu.lowering_config<{partial_reduction = [0, 0, 32]}>}
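
To make the intended guard concrete, here is a minimal sketch using toy types rather than the real MLIR classes. `ToyLinalgOp`, `OpKind`, `rewriteBodyWithSelect`, and `padOp` are all hypothetical names invented for illustration, and the string returned by the rewrite only stands in for whatever the actual pass emits. The point is just the shape of the check: the body rewrite runs for generic ops and is skipped for named ops.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for linalg ops. These are NOT the MLIR classes.
enum class OpKind { Generic, Matmul, Conv2D };

struct ToyLinalgOp {
  OpKind kind;
  std::string body; // textual stand-in for the op's region
};

// Hypothetical body rewrite: wraps the body in a select. The exact form of
// the real rewrite is not shown in the thread, so this string is an assumption.
std::string rewriteBodyWithSelect(const std::string &body) {
  assert(!body.empty() && "toy rewrite expects a non-empty body");
  return "select(inBounds, " + body + ", neutral)";
}

// Pads the op. Only generic ops get their body rewritten; named ops keep
// their implicit body untouched.
ToyLinalgOp padOp(const ToyLinalgOp &op) {
  ToyLinalgOp padded = op;
  if (op.kind == OpKind::Generic)
    padded.body = rewriteBodyWithSelect(op.body);
  return padded;
}
```

With this guard, calling `padOp` on a toy matmul returns its body unchanged instead of attempting a rewrite that assumes an editable region.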

newling commented Aug 15, 2025

Contributor Author

@Groverkss / @kuhar this will help in deprecating warp reduction

@Groverkss

Contributor

Do you know where these linalg.matmul ops are coming from? We generalize all named ops other than contractions in vector distribute, I think.

if (auto linalgOp =
        dyn_cast<linalg::LinalgOp>(tilingInterfaceOp.getOperation())) {
  reductionDimInfo = getReductionInfo(linalgOp);
  if (linalg::GenericOp genericOp =

Member

Suggested change:
- if (linalg::GenericOp genericOp =
+ if (auto genericOp =

kuhar commented Aug 15, 2025

Member

We generalize all named ops other than contractions in vector distribute, I think.

+1, I think this was the case. The tuning infra makes this assumption too.

@Groverkss

Contributor

I have a feeling this might be crashing because of linalg.conv_2d ops, which makes sense, but we should probably add a test for conv operations then. Let's check what exactly are the ops that are failing.

newling commented Aug 16, 2025

Contributor Author

Do you know where these linalg.matmul ops are coming from? We generalize all named ops other than contractions in vector distribute, I think.

With a change like #21718, one of the e2e tests compiled as

iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=rocm --iree-hip-target=gfx942 --iree-opt-data-tiling=false --iree-dispatch-creation-experimental-data-tiling=true --iree-dispatch-creation-set-encoding-strategy=padding --iree-hip-encoding-layout-resolver=pad e2e_matmul_cdna3_pad_i8_rocm_hip_matmul.mlir -o abc.vmfb 

ends up with:

e2e_matmul_cdna3_pad_i8_rocm_hip_matmul.mlir:2:13: error: expected add/mul op in the body
  %result = linalg.matmul ins(%lhs, %rhs: tensor<?x?xi8>, tensor<?x?xi8>) outs(%acc: tensor<?x?xi32>) -> tensor<?x?xi32>
            ^
e2e_matmul_cdna3_pad_i8_rocm_hip_matmul.mlir:1:1: note: called from
util.func @matmul_accumulate_DYNxDYNxi8_times_DYNxDYNxi8_into_DYNxDYNxi32(%lhs: tensor<?x?xi8>, %rhs: tensor<?x?xi8>, %acc: tensor<?x?xi32>) -> tensor<?x?xi32> {

[...]

This error comes from the padding pass this PR touches. The failing pass is here in the pipeline:

funcPassManager.addPass(createGPUApplyPaddingLevelPass(padOptions));

In the pass pipeline run in this failing case, I see

...
GPUGeneralizeNamedOpsPass
...
LLVMGPUSelectLoweringStrategyPass
...
GPUApplyPaddingLevelPass (CRASH!!!!!!!)

The obvious question, then, is why it is still a named op in the pass that crashes when a generalization pass clearly runs before it. The answer is that the generalization pass does not generalize matmul:

if (isa<linalg::BatchMatmulTransposeBOp, linalg::MatmulTransposeBOp,

which seems suspicious, but I assume it was done like that for some undocumented reason?

There is another generalization pass that gets run later, over here

funcPassManager.addPass(createLinalgGeneralizeNamedOpsPass());

But we're not getting to that.
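
The ordering problem described above can be replayed with a small toy model. Everything here is a sketch under stated assumptions: `ToyOp`, `kGeneralized`, `generalizeNamedOps`, `opsStillNamed`, and `replayFailingCase` are hypothetical names, not IREE or MLIR APIs, and the allow-list contains only the two ops visible in the truncated `isa<...>` snippet (the rest of that list is elided in the thread and is not guessed at here).

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy op: only the name matters for this illustration.
struct ToyOp {
  std::string name; // e.g. "linalg.matmul" or "linalg.generic"
};

// Assumed allow-list, limited to the two ops visible in the truncated snippet.
// Note that "linalg.matmul" is absent.
const std::set<std::string> kGeneralized = {
    "linalg.batch_matmul_transpose_b", "linalg.matmul_transpose_b"};

// Toy version of an early generalization step: rewrites only listed ops.
void generalizeNamedOps(std::vector<ToyOp> &ops) {
  for (ToyOp &op : ops)
    if (kGeneralized.count(op.name) > 0)
      op.name = "linalg.generic";
}

// Returns the names of ops that would reach a later generic-only step
// while still being named ops.
std::vector<std::string> opsStillNamed(const std::vector<ToyOp> &ops) {
  std::vector<std::string> named;
  for (const ToyOp &op : ops)
    if (op.name != "linalg.generic")
      named.push_back(op.name);
  return named;
}

// Replays the scenario from the thread with two toy ops.
std::vector<std::string> replayFailingCase() {
  std::vector<ToyOp> ops = {ToyOp{"linalg.matmul"},
                            ToyOp{"linalg.matmul_transpose_b"}};
  generalizeNamedOps(ops);
  return opsStillNamed(ops);
}
```

In this model the transposed variant is generalized, while `linalg.matmul` passes through unchanged and is the only op left for the later step to trip over.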

newling commented Aug 16, 2025

Contributor Author

Maybe we need APIs like isProjectingNamedLinalgOp / isNamedLinalgOp or something, and then passes should be required to use these to make their expectations clear with assertions.
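
As a rough illustration of that idea, the sketch below shows a predicate plus a precondition check that a pass could call on entry. It is a toy model only: `ToyOp`, `kNamedOps`, and `requireGeneric` are hypothetical, `isNamedLinalgOp` here is not an existing MLIR or IREE function, and the set of named ops is limited to the two mentioned in this thread. An exception is used instead of `assert` purely so the failure is observable in a test.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

// Toy op: only the name matters for this illustration.
struct ToyOp {
  std::string name;
};

// Assumed set of named ops for the toy model (only those mentioned in the
// thread); a real predicate would inspect the op class instead.
const std::set<std::string> kNamedOps = {"linalg.matmul", "linalg.conv_2d"};

// Hypothetical predicate in the spirit of the proposed API.
bool isNamedLinalgOp(const ToyOp &op) { return kNamedOps.count(op.name) > 0; }

// Hypothetical entry check: a step that only handles generic ops states
// that expectation up front instead of failing deep inside a rewrite.
void requireGeneric(const ToyOp &op) {
  if (isNamedLinalgOp(op))
    throw std::invalid_argument("expected linalg.generic, got " + op.name);
}
```

A step guarded this way would reject `linalg.matmul` with a message naming the offending op, rather than surfacing a less direct error such as "expected add/mul op in the body".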

I have a feeling this might be crashing because of linalg.conv_2d ops, which makes sense, but we should probably add a test for conv operations then. Let's check what exactly are the ops that are failing.

So... it makes sense because convolutions aren't easy to specialize? So having this pass work for (named) conv but not matmul makes sense to you? Please spell out your thinking from first principles for me

kuhar commented Aug 16, 2025

Member

Named matmul op is new, probably fell through the cracks. We should generalize it instead of checking for transposed variants.

kuhar commented Aug 16, 2025

Member

The named transpose ops are going away in O(days)

newling commented Aug 16, 2025

Contributor Author

Named matmul op is new, probably fell through the cracks.

That's surprising. There was a linalg::BatchMatmulTransposeBOp before a linalg::MatmulOp?

We should generalize it instead of checking for transposed variants.

What does this mean? I'm assuming it's not just a matter of adding it at https://github.com/iree-org/iree/blob/main/compiler/src/iree/compiler/Codegen/Common/GPU/GPUGeneralizeNamedOps.cpp#L56 ?

newling commented Aug 16, 2025

Contributor Author

The named transpose ops are going away in O(days)

Ah ok, I hadn't been following that

newling commented Aug 18, 2025

Contributor Author

Generalize matmul: #21720

I still think this PR is good to have. Either this, or an assertion/failure ensuring that linalg.matmul never enters the LLVM kernel configuration pass.

newling force-pushed the dont_pad_named_as_if_generic branch 2 times, most recently from cdb8fbb to 205426b on October 8, 2025 23:01
… of named op's blocks are bad)

Signed-off-by: James Newling <james.newling@gmail.com>
newling closed this May 14, 2026