[SelectionDAG] Add f16 soft promotion for lrint and lround take 2 #172733

Closed
folkertdev wants to merge 3 commits into llvm:main from folkertdev:lrint-f16-fix-updated

@folkertdev

Contributor

An attempt to fix the remaining test failures in #152684.

tgross35 and others added 3 commits December 16, 2025 20:55
On platforms that soft promote `half`, using `lrint` intrinsics crashes
with the following:

    SoftPromoteHalfOperand Op #0: t5: i32 = lrint t4

    LLVM ERROR: Do not know how to soft promote this operator's operand!
    PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
    Stack dump:
    0.      Program arguments: /Users/tmgross/Documents/projects/llvm/llvm-build/bin/llc -mtriple=riscv32
    1.      Running pass 'Function Pass Manager' on module '<stdin>'.
    2.      Running pass 'RISC-V DAG->DAG Pattern Instruction Selection' on function '@test_lrint_ixx_f16'

Resolve this by adding a soft promotion.

`SoftPromoteHalfOp_FP_TO_XINT` is reused here since it provides the
correct input and output types. It is renamed `SoftPromoteHalfOp_UnaryOp`
to match `PromoteFloatOp_UnaryOp` and similar functions that are used to
handle the same sets of intrinsics.
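
As a rough illustration of what soft promotion means for `lrint` on `half`, the sketch below models it in plain C++: the 16-bit pattern is widened to `float` and the rounding is then done at `float` precision, which mirrors the `__extendhfsf2` followed by `lrintf` call sequence checked in the tests of this patch. The helper names `halfBitsToFloat` and `lrintHalf` are hypothetical and exist only for this example; this is not the legalizer code itself.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: widen IEEE 754 binary16 bits to float. This stands in
// for the __extendhfsf2 libcall that a soft-promoting target would emit.
float halfBitsToFloat(std::uint16_t Bits) {
  const std::uint32_t Sign = (Bits >> 15) & 0x1u;
  const std::uint32_t Exp = (Bits >> 10) & 0x1fu;
  const std::uint32_t Frac = Bits & 0x3ffu;

  float Mag;
  if (Exp == 0) {
    // Zero or subnormal: value is Frac * 2^-24.
    Mag = std::ldexp(static_cast<float>(Frac), -24);
  } else if (Exp == 31) {
    // Infinity or NaN.
    Mag = Frac ? std::numeric_limits<float>::quiet_NaN()
               : std::numeric_limits<float>::infinity();
  } else {
    // Normal: (1024 + Frac) * 2^(Exp - 25), i.e. 1.Frac * 2^(Exp - 15).
    Mag = std::ldexp(static_cast<float>(Frac | 0x400u),
                     static_cast<int>(Exp) - 25);
  }
  return Sign ? -Mag : Mag;
}

// Hypothetical model of the promoted operation: extend the half operand to
// float, then perform lrint on the wider type.
long lrintHalf(std::uint16_t Bits) { return std::lrint(halfBitsToFloat(Bits)); }
```

For instance, `lrintHalf(0x3C00)` converts the encoding of 1.0 and returns 1. Every `half` value is exactly representable as a `float`, so the extension step itself does not change the value being rounded.
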
@llvmbot

llvmbot commented Dec 17, 2025

Member

@llvm/pr-subscribers-llvm-selectiondag
@llvm/pr-subscribers-backend-mips
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-backend-risc-v
@llvm/pr-subscribers-backend-loongarch

Author: Folkert de Vries (folkertdev)

Changes

An attempt to fix the remaining test failures in #152684.


Full diff: https://github.com/llvm/llvm-project/pull/172733.diff

14 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (+23)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp (+8-4)
  • (modified) llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp (+13-6)
  • (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+13)
  • (modified) llvm/test/CodeGen/LoongArch/lrint-conv.ll (+24-9)
  • (modified) llvm/test/CodeGen/Mips/llrint-conv.ll (+11-12)
  • (modified) llvm/test/CodeGen/Mips/lrint-conv.ll (+15-12)
  • (modified) llvm/test/CodeGen/RISCV/lrint-conv.ll (+18-7)
  • (modified) llvm/test/CodeGen/X86/llrint-conv.ll (+43-12)
  • (modified) llvm/test/CodeGen/X86/llround-conv.ll (+8-10)
  • (modified) llvm/test/CodeGen/X86/lrint-conv-i32.ll (+8-11)
  • (modified) llvm/test/CodeGen/X86/lrint-conv-i64.ll (+8-10)
  • (modified) llvm/test/CodeGen/X86/lround-conv-i32.ll (+4-5)
  • (modified) llvm/test/CodeGen/X86/lround-conv-i64.ll (+5-6)
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
index 99d14a60c6ed1..40fee688ce341 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
@@ -5392,6 +5392,10 @@ void SelectionDAGLegalize::PromoteNode(SDNode *Node) {
       Node->getOpcode() == ISD::STRICT_SINT_TO_FP ||
       Node->getOpcode() == ISD::STRICT_FSETCC ||
       Node->getOpcode() == ISD::STRICT_FSETCCS ||
+      Node->getOpcode() == ISD::STRICT_LRINT ||
+      Node->getOpcode() == ISD::STRICT_LLRINT ||
+      Node->getOpcode() == ISD::STRICT_LROUND ||
+      Node->getOpcode() == ISD::STRICT_LLROUND ||
       Node->getOpcode() == ISD::VP_REDUCE_FADD ||
       Node->getOpcode() == ISD::VP_REDUCE_FMUL ||
       Node->getOpcode() == ISD::VP_REDUCE_FMAX ||
@@ -5937,6 +5941,25 @@ void SelectionDAGLegalize::PromoteNode(SDNode *Node) {
     Results.push_back(Tmp3);
     Results.push_back(Tmp3.getValue(1));
     break;
+  case ISD::LLROUND:
+  case ISD::LROUND:
+  case ISD::LRINT:
+  case ISD::LLRINT:
+    Tmp1 = DAG.getNode(ISD::FP_EXTEND, dl, NVT, Node->getOperand(0));
+    Tmp2 = DAG.getNode(Node->getOpcode(), dl, Node->getValueType(0), Tmp1);
+    Results.push_back(Tmp2);
+    break;
+  case ISD::STRICT_LLROUND:
+  case ISD::STRICT_LROUND:
+  case ISD::STRICT_LRINT:
+  case ISD::STRICT_LLRINT:
+    Tmp1 = DAG.getNode(ISD::STRICT_FP_EXTEND, dl, {NVT, MVT::Other},
+                       {Node->getOperand(0), Node->getOperand(1)});
+    Tmp2 = DAG.getNode(Node->getOpcode(), dl, Node->getVTList(),
+                       {Tmp1.getValue(1), Tmp1});
+    Results.push_back(Tmp2);
+    Results.push_back(Tmp2.getValue(1));
+    break;
   case ISD::BUILD_VECTOR: {
     MVT EltVT = OVT.getVectorElementType();
     MVT NewEltVT = NVT.getVectorElementType();
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
index 383a025a4d916..60817a400d747 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
@@ -3779,14 +3779,18 @@ bool DAGTypeLegalizer::SoftPromoteHalfOperand(SDNode *N, unsigned OpNo) {
     Res = SoftPromoteHalfOp_FAKE_USE(N, OpNo);
     break;
   case ISD::FCOPYSIGN:  Res = SoftPromoteHalfOp_FCOPYSIGN(N, OpNo); break;
-  case ISD::STRICT_FP_TO_SINT:
-  case ISD::STRICT_FP_TO_UINT:
   case ISD::FP_TO_SINT:
   case ISD::FP_TO_UINT:
-  case ISD::LRINT:
+  case ISD::STRICT_FP_TO_SINT:
+  case ISD::STRICT_FP_TO_UINT:
   case ISD::LLRINT:
-  case ISD::LROUND:
   case ISD::LLROUND:
+  case ISD::LRINT:
+  case ISD::LROUND:
+  case ISD::STRICT_LLRINT:
+  case ISD::STRICT_LLROUND:
+  case ISD::STRICT_LRINT:
+  case ISD::STRICT_LROUND:
     Res = SoftPromoteHalfOp_Op0WithStrict(N);
     break;
   case ISD::FP_TO_SINT_SAT:
diff --git a/llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp b/llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp
index e792b1bce3c5c..a82ce8f526c73 100644
--- a/llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp
+++ b/llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp
@@ -119,11 +119,17 @@ X86LegalizerInfo::X86LegalizerInfo(const X86Subtarget &STI,
       .widenScalarToNextPow2(0, /*Min=*/8)
       .clampScalar(0, s8, sMaxScalar);
 
-  getActionDefinitionsBuilder({G_LROUND,  G_LLROUND, G_FCOS,  G_FCOSH,  G_FACOS,
-                               G_FSIN,    G_FSINH,   G_FASIN, G_FTAN,   G_FTANH,
-                               G_FATAN,   G_FATAN2,  G_FPOW,  G_FEXP,   G_FEXP2,
-                               G_FEXP10,  G_FLOG,    G_FLOG2, G_FLOG10, G_FPOWI,
-                               G_FSINCOS, G_FCEIL,   G_FFLOOR})
+  getActionDefinitionsBuilder({G_LROUND, G_LLROUND})
+      .widenScalarIf(typeIs(1, s16),
+                     [=](const LegalityQuery &) {
+                       return std::pair<unsigned, LLT>(1, s32);
+                     })
+      .libcall();
+
+  getActionDefinitionsBuilder(
+      {G_FCOS,  G_FCOSH, G_FACOS,  G_FSIN,  G_FSINH,   G_FASIN, G_FTAN,
+       G_FTANH, G_FATAN, G_FATAN2, G_FPOW,  G_FEXP,    G_FEXP2, G_FEXP10,
+       G_FLOG,  G_FLOG2, G_FLOG10, G_FPOWI, G_FSINCOS, G_FCEIL, G_FFLOOR})
       .libcall();
 
   getActionDefinitionsBuilder(G_FSQRT)
@@ -446,7 +452,8 @@ X86LegalizerInfo::X86LegalizerInfo(const X86Subtarget &STI,
   getActionDefinitionsBuilder(G_FPEXT)
       .legalFor(HasSSE2, {{s64, s32}})
       .legalFor(HasAVX, {{v4s64, v4s32}})
-      .legalFor(HasAVX512, {{v8s64, v8s32}});
+      .legalFor(HasAVX512, {{v8s64, v8s32}})
+      .libcall();
 
   getActionDefinitionsBuilder(G_FPTRUNC)
       .legalFor(HasSSE2, {{s32, s64}})
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 50df19b3e6e47..93a678219f175 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -717,9 +717,22 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
     setOperationAction(ISD::FCANONICALIZE, MVT::f16, Custom);
     setOperationAction(ISD::STRICT_FP_EXTEND, MVT::f32, Custom);
     setOperationAction(ISD::STRICT_FP_EXTEND, MVT::f64, Custom);
+
+    setOperationAction(ISD::LLROUND, MVT::f16, Expand);
+    setOperationAction(ISD::LROUND, MVT::f16, Expand);
     setOperationAction(ISD::LRINT, MVT::f16, Expand);
     setOperationAction(ISD::LLRINT, MVT::f16, Expand);
 
+    setOperationAction(ISD::STRICT_LLROUND, MVT::f16, Promote);
+    setOperationAction(ISD::STRICT_LROUND, MVT::f16, Promote);
+    setOperationAction(ISD::STRICT_LRINT, MVT::f16, Promote);
+    setOperationAction(ISD::STRICT_LLRINT, MVT::f16, Promote);
+
+    AddPromotedToType(ISD::STRICT_LLROUND, MVT::f16, MVT::f32);
+    AddPromotedToType(ISD::STRICT_LROUND, MVT::f16, MVT::f32);
+    AddPromotedToType(ISD::STRICT_LRINT, MVT::f16, MVT::f32);
+    AddPromotedToType(ISD::STRICT_LLRINT, MVT::f16, MVT::f32);
+
     // Lower this to MOVMSK plus an AND.
     setOperationAction(ISD::FGETSIGN, MVT::i64, Custom);
     setOperationAction(ISD::FGETSIGN, MVT::i32, Custom);
diff --git a/llvm/test/CodeGen/LoongArch/lrint-conv.ll b/llvm/test/CodeGen/LoongArch/lrint-conv.ll
index 85de820025614..262d1c16a6486 100644
--- a/llvm/test/CodeGen/LoongArch/lrint-conv.ll
+++ b/llvm/test/CodeGen/LoongArch/lrint-conv.ll
@@ -5,16 +5,31 @@
 ; RUN: sed 's/ITy/i32/g' %s | llc -mtriple=loongarch64 | FileCheck %s --check-prefixes=LA64-I32
 ; RUN: sed 's/ITy/i64/g' %s | llc -mtriple=loongarch64 | FileCheck %s --check-prefixes=LA64-I64
 
-; FIXME: crash
-; define ITy @test_lrint_ixx_f16(half %x) nounwind {
-;   %res = tail call ITy @llvm.lrint.ITy.f16(half %x)
-;   ret ITy %res
-; }
+define ITy @test_lrint_ixx_f16(half %x) nounwind {
+; LA32-LABEL: test_lrint_ixx_f16:
+; LA32:         bl lrintf
+;
+; LA64-I32-LABEL: test_lrint_ixx_f16:
+; LA64-I32:         pcaddu18i $ra, %call36(lrintf)
+;
+; LA64-I64-LABEL: test_lrint_ixx_f16:
+; LA64-I64:         pcaddu18i $t8, %call36(lrintf)
+  %res = tail call ITy @llvm.lrint.ITy.f16(half %x)
+  ret ITy %res
+}
 
-; define ITy @test_llrint_ixx_f16(half %x) nounwind {
-;   %res = tail call ITy @llvm.llrint.ITy.f16(half %x)
-;   ret ITy %res
-; }
+define ITy @test_llrint_ixx_f16(half %x) nounwind {
+; LA32-LABEL: test_llrint_ixx_f16:
+; LA32:         bl llrintf
+;
+; LA64-I32-LABEL: test_llrint_ixx_f16:
+; LA64-I32:         pcaddu18i $ra, %call36(llrintf)
+;
+; LA64-I64-LABEL: test_llrint_ixx_f16:
+; LA64-I64:         pcaddu18i $t8, %call36(llrintf)
+  %res = tail call ITy @llvm.llrint.ITy.f16(half %x)
+  ret ITy %res
+}
 
 define ITy @test_lrint_ixx_f32(float %x) nounwind {
 ; LA32-LABEL: test_lrint_ixx_f32:
diff --git a/llvm/test/CodeGen/Mips/llrint-conv.ll b/llvm/test/CodeGen/Mips/llrint-conv.ll
index 592d40c0f65aa..8eaef5d4135bb 100644
--- a/llvm/test/CodeGen/Mips/llrint-conv.ll
+++ b/llvm/test/CodeGen/Mips/llrint-conv.ll
@@ -1,19 +1,18 @@
 ; RUN: llc < %s -mtriple=mips64el -mattr=+soft-float | FileCheck %s
 ; RUN: llc < %s -mtriple=mips -mattr=+soft-float     | FileCheck %s
 
-; FIXME: crash
-; define signext i32 @testmswh(half %x) {
-; entry:
-;   %0 = tail call i64 @llvm.llrint.i64.f16(half %x)
-;   %conv = trunc i64 %0 to i32
-;   ret i32 %conv
-; }
+define signext i32 @testmswh(half %x) {
+entry:
+  %0 = tail call i64 @llvm.llrint.i64.f16(half %x)
+  %conv = trunc i64 %0 to i32
+  ret i32 %conv
+}
 
-; define i64 @testmsxh(half %x) {
-; entry:
-;   %0 = tail call i64 @llvm.llrint.i64.f16(half %x)
-;   ret i64 %0
-; }
+define i64 @testmsxh(half %x) {
+entry:
+  %0 = tail call i64 @llvm.llrint.i64.f16(half %x)
+  ret i64 %0
+}
 
 define signext i32 @testmsws(float %x) {
 ; CHECK-LABEL: testmsws:
diff --git a/llvm/test/CodeGen/Mips/lrint-conv.ll b/llvm/test/CodeGen/Mips/lrint-conv.ll
index 6d2e392675f1c..64c5cb9ac5b07 100644
--- a/llvm/test/CodeGen/Mips/lrint-conv.ll
+++ b/llvm/test/CodeGen/Mips/lrint-conv.ll
@@ -1,19 +1,22 @@
 ; RUN: llc < %s -mtriple=mips64el -mattr=+soft-float | FileCheck %s
 ; RUN: llc < %s -mtriple=mips -mattr=+soft-float     | FileCheck %s
 
-; FIXME: crash
-; define signext i32 @testmswh(half %x) {
-; entry:
-;   %0 = tail call i64 @llvm.lrint.i64.f16(half %x)
-;   %conv = trunc i64 %0 to i32
-;   ret i32 %conv
-; }
+define signext i32 @testmswh(half %x) {
+; CHECK-LABEL: testmswh:
+; CHECK:       jal     lrintf
+entry:
+  %0 = tail call i64 @llvm.lrint.i64.f16(half %x)
+  %conv = trunc i64 %0 to i32
+  ret i32 %conv
+}
 
-; define i64 @testmsxh(half %x) {
-; entry:
-;   %0 = tail call i64 @llvm.lrint.i64.f16(half %x)
-;   ret i64 %0
-; }
+define i64 @testmsxh(half %x) {
+; CHECK-LABEL: testmsxh:
+; CHECK:       jal     lrintf
+entry:
+  %0 = tail call i64 @llvm.lrint.i64.f16(half %x)
+  ret i64 %0
+}
 
 define signext i32 @testmsws(float %x) {
 ; CHECK-LABEL: testmsws:
diff --git a/llvm/test/CodeGen/RISCV/lrint-conv.ll b/llvm/test/CodeGen/RISCV/lrint-conv.ll
index d3af2153588a1..ecb6bd0932ef3 100644
--- a/llvm/test/CodeGen/RISCV/lrint-conv.ll
+++ b/llvm/test/CodeGen/RISCV/lrint-conv.ll
@@ -5,14 +5,25 @@
 ; RUN: sed 's/ITy/i32/g' %s | llc -mtriple=riscv64 | FileCheck %s --check-prefixes=RV64
 ; RUN: sed 's/ITy/i64/g' %s | llc -mtriple=riscv64 | FileCheck %s --check-prefixes=RV64
 
-; FIXME: crash
-; define ITy @test_lrint_ixx_f16(half %x) nounwind {
-;   %res = tail call ITy @llvm.lrint.ITy.f16(half %x)
-; }
+define ITy @test_lrint_ixx_f16(half %x) nounwind {
+; RV32-LABEL: test_lrint_ixx_f16:
+; RV32:         call lrintf
+;
+; RV64-LABEL: test_lrint_ixx_f16:
+; RV64:         call lrintf
+  %res = tail call ITy @llvm.lrint.ITy.f16(half %x)
+  ret ITy %res
+}
 
-; define ITy @test_llrint_ixx_f16(half %x) nounwind {
-;   %res = tail call ITy @llvm.llrint.ITy.f16(half %x)
-; }
+define ITy @test_llrint_ixx_f16(half %x) nounwind {
+; RV32-LABEL: test_llrint_ixx_f16:
+; RV32:         call llrintf
+;
+; RV64-LABEL: test_llrint_ixx_f16:
+; RV64:         call llrintf
+  %res = tail call ITy @llvm.llrint.ITy.f16(half %x)
+  ret ITy %res
+}
 
 define ITy @test_lrint_ixx_f32(float %x) nounwind {
 ; RV32-LABEL: test_lrint_ixx_f32:
diff --git a/llvm/test/CodeGen/X86/llrint-conv.ll b/llvm/test/CodeGen/X86/llrint-conv.ll
index 5f38645f74636..72bef2701d2d3 100644
--- a/llvm/test/CodeGen/X86/llrint-conv.ll
+++ b/llvm/test/CodeGen/X86/llrint-conv.ll
@@ -7,12 +7,44 @@
 ; RUN: llc < %s -mtriple=x86_64-unknown -mattr=avx | FileCheck %s --check-prefixes=X64,X64-AVX
 ; RUN: llc < %s -mtriple=x86_64-unknown -mattr=avx512f | FileCheck %s --check-prefixes=X64,X64-AVX
 
-; FIXME: crash
-; define i64 @test_llrint_i64_f16(half %x) nounwind {
-; entry:
-;   %0 = tail call i64 @llvm.llrint.i64.f16(half %x)
-;   ret i64 %0
-; }
+define i64 @test_llrint_i64_f16(half %x) nounwind {
+; X86-NOSSE-LABEL: test_llrint_i64_f16:
+; X86-NOSSE:       # %bb.0: # %entry
+; X86-NOSSE-NEXT:    pushl %eax
+; X86-NOSSE-NEXT:    movzwl {{[0-9]+}}(%esp), %eax
+; X86-NOSSE-NEXT:    movl %eax, (%esp)
+; X86-NOSSE-NEXT:    calll __extendhfsf2
+; X86-NOSSE-NEXT:    fstps (%esp)
+; X86-NOSSE-NEXT:    calll llrintf
+; X86-NOSSE-NEXT:    popl %ecx
+; X86-NOSSE-NEXT:    retl
+;
+; X86-SSE2-LABEL: test_llrint_i64_f16:
+; X86-SSE2:       # %bb.0: # %entry
+; X86-SSE2-NEXT:    pushl %eax
+; X86-SSE2-NEXT:    pinsrw $0, {{[0-9]+}}(%esp), %xmm0
+; X86-SSE2-NEXT:    pextrw $0, %xmm0, %eax
+; X86-SSE2-NEXT:    movw %ax, (%esp)
+; X86-SSE2-NEXT:    calll __extendhfsf2
+; X86-SSE2-NEXT:    fstps (%esp)
+; X86-SSE2-NEXT:    calll llrintf
+; X86-SSE2-NEXT:    popl %ecx
+; X86-SSE2-NEXT:    retl
+;
+; X64-SSE-LABEL: test_llrint_i64_f16:
+; X64-SSE:       # %bb.0: # %entry
+; X64-SSE-NEXT:    pushq %rax
+; X64-SSE-NEXT:    callq __extendhfsf2@PLT
+; X64-SSE-NEXT:    callq rintf@PLT
+; X64-SSE-NEXT:    callq __truncsfhf2@PLT
+; X64-SSE-NEXT:    callq __extendhfsf2@PLT
+; X64-SSE-NEXT:    cvttss2si %xmm0, %rax
+; X64-SSE-NEXT:    popq %rcx
+; X64-SSE-NEXT:    retq
+entry:
+  %0 = tail call i64 @llvm.llrint.i64.f16(half %x)
+  ret i64 %0
+}
 
 define i64 @test_llrint_i64_f32(float %x) nounwind {
 ; X86-NOSSE-LABEL: test_llrint_i64_f32:
@@ -217,12 +249,11 @@ entry:
   ret i64 %0
 }
 
-; FIXME: crash
-; define i64 @test_llrint_i64_f16_strict(half %x) nounwind strictfp {
-; entry:
-;   %0 = tail call i64 @llvm.experimental.constrained.llrint.i64.f16(half %x, metadata!"round.dynamic", metadata!"fpexcept.strict")
-;   ret i64 %0
-; }
+define i64 @test_llrint_i64_f16_strict(half %x) nounwind strictfp {
+entry:
+  %0 = tail call i64 @llvm.experimental.constrained.llrint.i64.f16(half %x, metadata!"round.dynamic", metadata!"fpexcept.strict")
+  ret i64 %0
+}
 
 define i64 @test_llrint_i64_f32_strict(float %x) nounwind strictfp {
 ; X86-NOSSE-LABEL: test_llrint_i64_f32_strict:
diff --git a/llvm/test/CodeGen/X86/llround-conv.ll b/llvm/test/CodeGen/X86/llround-conv.ll
index ef4df82e9e57e..0b50acbcf7685 100644
--- a/llvm/test/CodeGen/X86/llround-conv.ll
+++ b/llvm/test/CodeGen/X86/llround-conv.ll
@@ -5,11 +5,10 @@
 ; RUN: llc < %s -mtriple=i686-linux-gnu -global-isel -global-isel-abort=1 | FileCheck %s --check-prefixes=GISEL-X86
 ; RUN: llc < %s -mtriple=x86_64-linux-gnu -global-isel -global-isel-abort=1 | FileCheck %s --check-prefixes=GISEL-X64
 
-; FIXME: crash
-; define i64 @test_llround_f16(half %x) nounwind {
-;   %conv = tail call i64 @llvm.llround.f16(half %x)
-;   ret i64 %conv
-; }
+define i64 @test_llround_f16(half %x) nounwind {
+  %conv = tail call i64 @llvm.llround.f16(half %x)
+  ret i64 %conv
+}
 
 define i64 @test_llround_f32(float %x) nounwind {
 ; X86-NOSSE-LABEL: test_llround_f32:
@@ -184,11 +183,10 @@ define i64 @test_llround_f128(fp128 %x) nounwind {
   ret i64 %conv
 }
 
-; FIXME: crash
-; define i64 @test_llround_i64_f16(half %x) nounwind {
-;   %conv = call i64 @llvm.llround.i64.f16(half %x)
-;   ret i64 %conv
-; }
+define i64 @test_llround_i64_f16(half %x) nounwind {
+  %conv = call i64 @llvm.llround.i64.f16(half %x)
+  ret i64 %conv
+}
 
 define i64 @test_llround_i64_f32(float %x) nounwind {
 ; X86-NOSSE-LABEL: test_llround_i64_f32:
diff --git a/llvm/test/CodeGen/X86/lrint-conv-i32.ll b/llvm/test/CodeGen/X86/lrint-conv-i32.ll
index 2b99b4c50f58a..fefb841b4b631 100644
--- a/llvm/test/CodeGen/X86/lrint-conv-i32.ll
+++ b/llvm/test/CodeGen/X86/lrint-conv-i32.ll
@@ -7,12 +7,10 @@
 ; RUN: llc < %s -mtriple=x86_64-unknown -mattr=avx | FileCheck %s --check-prefixes=X64,X64-AVX
 ; RUN: llc < %s -mtriple=x86_64-unknown -mattr=avx512f | FileCheck %s --check-prefixes=X64,X64-AVX
 
-; FIXME: crash
-; define i32 @test_lrint_i32_f16(half %x) nounwind {
-; entry:
-;   %0 = tail call i32 @llvm.lrint.i32.f16(half %x)
-;   ret i32 %0
-; }
+define i32 @test_lrint_i32_f16(half %x) nounwind {
+  %conv = tail call i32 @llvm.lrint.i32.f16(half %x)
+  ret i32 %conv
+}
 
 define i32 @test_lrint_i32_f32(float %x) nounwind {
 ; X86-NOSSE-LABEL: test_lrint_i32_f32:
@@ -154,11 +152,10 @@ define i32 @test_lrint_i32_f128(fp128 %x) nounwind {
   ret i32 %conv
 }
 
-; FIXME: crash
-; define i32 @test_lrint_i32_f16_strict(half %x) nounwind strictfp {
-;   %conv = tail call i32 @llvm.experimental.constrained.lrint.i32.f16(half %x, metadata!"round.dynamic", metadata!"fpexcept.strict")
-;   ret i32 %conv
-; }
+define i32 @test_lrint_i32_f16_strict(half %x) nounwind strictfp {
+  %conv = tail call i32 @llvm.experimental.constrained.lrint.i32.f16(half %x, metadata!"round.dynamic", metadata!"fpexcept.strict")
+  ret i32 %conv
+}
 
 define i32 @test_lrint_i32_f32_strict(float %x) nounwind strictfp {
 ; X86-NOSSE-LABEL: test_lrint_i32_f32_strict:
diff --git a/llvm/test/CodeGen/X86/lrint-conv-i64.ll b/llvm/test/CodeGen/X86/lrint-conv-i64.ll
index 731c03bf0d747..c3d9fb39ae4da 100644
--- a/llvm/test/CodeGen/X86/lrint-conv-i64.ll
+++ b/llvm/test/CodeGen/X86/lrint-conv-i64.ll
@@ -5,11 +5,10 @@
 ; RUN: llc < %s -mtriple=x86_64-unknown -mattr=avx | FileCheck %s --check-prefixes=CHECK,AVX
 ; RUN: llc < %s -mtriple=x86_64-unknown -mattr=avx512f | FileCheck %s --check-prefixes=CHECK,AVX
 
-; FIXME: crash
-; define i64 @test_lrint_i64_f16(half %x) nounwind {
-;   %conv = tail call i64 @llvm.lrint.i64.f16(half %x)
-;   ret i64 %conv
-; }
+define i64 @test_lrint_i64_f16(half %x) nounwind {
+  %conv = tail call i64 @llvm.lrint.i64.f16(half %x)
+  ret i64 %conv
+}
 
 define i64 @test_lrint_i64_f32(float %x) nounwind {
 ; X86-NOSSE-LABEL: test_lrint_i64_f32:
@@ -149,11 +148,10 @@ define i64 @test_lrint_i64_f128(fp128 %x) nounwind {
   ret i64 %conv
 }
 
-; FIXME: crash
-; define i64 @test_lrint_i64_f16_strict(half %x) nounwind {
-;   %conv = tail call i64 @llvm.experimental.constrained.lrint.i64.f16(half %x, metadata!"round.dynamic", metadata!"fpexcept.strict")
-;   ret i64 %conv
-; }
+define i64 @test_lrint_i64_f16_strict(half %x) nounwind {
+  %conv = tail call i64 @llvm.experimental.constrained.lrint.i64.f16(half %x, metadata!"round.dynamic", metadata!"fpexcept.strict")
+  ret i64 %conv
+}
 
 define i64 @test_lrint_i64_f32_strict(float %x) nounwind {
 ; X86-NOSSE-LABEL: test_lrint_i64_f32_strict:
diff --git a/llvm/test/CodeGen/X86/lround-conv-i32.ll b/llvm/test/CodeGen/X86/lround-conv-i32.ll
index 389f29233dcce..7e7b6e779ee23 100644
--- a/llvm/test/CodeGen/X86/lround-conv-i32.ll
+++ b/llvm/test/CodeGen/X86/lround-conv-i32.ll
@@ -5,11 +5,10 @@
 ; RUN: llc < %s -mtriple=i686-linux-gnu -global-isel -global-isel-abort=1 | FileCheck %s --check-prefixes=GISEL-X86
 ; RUN: llc < %s -mtriple=x86_64-linux-gnu -global-isel -global-isel-abort=1 | FileCheck %s --check-prefixes=GISEL-X64
 
-; FIXME: crash
-; define i32 @test_lround_i32_f16(half %x) nounwind {
-;   %conv = tail call i32 @llvm.lround.i32.f16(half %x)
-;   ret i32 %conv
-; }
+define i32 @test_lround_i32_f16(half %x) nounwind {
+  %conv = tail call i32 @llvm.lround.i32.f16(half %x)
+  ret i32 %conv
+}
 
 define i32 @test_lround_i32_f32(float %x) nounwind {
 ; X86-LABEL: test_lround_i32_f32:
diff --git a/llvm/test/CodeGen/X86/lround-conv-i64.ll b/llvm/test/CodeGen/X86/lround-conv-i64.ll
index 8b8230074728f..a1fe08ff7ad50 100644
--- a/llvm/test/CodeGen/X86/lround-conv-i64.ll
+++ b/llvm/test/CodeGen/X86/lround-conv-i64.ll
@@ -5,12 +5,11 @@
 ; RUN: llc < %s -mtriple=i686-linux-gnu -global-isel -global-isel-abort=1 | FileCheck %s --check-prefixes=GISEL-X86
 ; RUN: llc < %s -mtriple=x86_64-linux-gnu -global-isel -global-isel-abort=1 | FileCheck %s --check-prefixes=GISEL-X64
 
-; FIXME: crash
-; define i64 @test_lround_i64_f16(half %x) nounwind {
-; entry:
-;   %0 = tail call i64 @llvm.lround.i64.f16(half %x)
-;   ret i64 %0
-; }
+define i64 @test_lround_i64_f16(half %x) nounwind {
+entry:
+  %0 = tail call i64 @llvm.lround.i64.f16(half %x)
+  ret i64 %0
+}
 
 define i64 @test_lround_i64_f32(float %x) nounwind {
 ; X86-NOSSE-LABEL: test_lround_i64_f32:

@github-actions

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff origin/main HEAD --extensions cpp -- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp llvm/lib/Target/X86/X86ISelLowering.cpp --diff_from_common_commit

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
index 8a0f7e254..5d808be5e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
@@ -3781,7 +3781,9 @@ bool DAGTypeLegalizer::SoftPromoteHalfOperand(SDNode *N, unsigned OpNo) {
   case ISD::FAKE_USE:
     Res = SoftPromoteHalfOp_FAKE_USE(N, OpNo);
     break;
-  case ISD::FCOPYSIGN:  Res = SoftPromoteHalfOp_FCOPYSIGN(N, OpNo); break;
+  case ISD::FCOPYSIGN:
+    Res = SoftPromoteHalfOp_FCOPYSIGN(N, OpNo);
+    break;
   case ISD::FP_TO_SINT:
   case ISD::FP_TO_UINT:
   case ISD::STRICT_FP_TO_SINT:

@arsenm left a comment

Contributor

Patch title should be more descriptive

@folkertdev changed the title from "Lrint f16 fix updated" to "[SelectionDAG] Add f16 soft promotion for lrint and lround take 2" on Dec 17, 2025
@github-actions

🪟 Windows x64 Test Results

  • 128744 tests passed
  • 2826 tests skipped
  • 1 test failed

Failed Tests

MLIR

MLIR.Dialect/XeGPU/propagate-layout-subgroup.mlir
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
c:\_work\llvm-project\llvm-project\build\bin\mlir-opt.exe -xevm-attach-target='chip=pvc' -xegpu-propagate-layout="layout-kind=subgroup" -split-input-file C:\_work\llvm-project\llvm-project\mlir\test\Dialect\XeGPU\propagate-layout-subgroup.mlir | c:\_work\llvm-project\llvm-project\build\bin\filecheck.exe C:\_work\llvm-project\llvm-project\mlir\test\Dialect\XeGPU\propagate-layout-subgroup.mlir
# executed command: 'c:\_work\llvm-project\llvm-project\build\bin\mlir-opt.exe' -xevm-attach-target=chip=pvc -xegpu-propagate-layout=layout-kind=subgroup -split-input-file 'C:\_work\llvm-project\llvm-project\mlir\test\Dialect\XeGPU\propagate-layout-subgroup.mlir'
# note: command had no output on stdout or stderr
# executed command: 'c:\_work\llvm-project\llvm-project\build\bin\filecheck.exe' 'C:\_work\llvm-project\llvm-project\mlir\test\Dialect\XeGPU\propagate-layout-subgroup.mlir'
# .---command stderr------------
# | C:\_work\llvm-project\llvm-project\mlir\test\Dialect\XeGPU\propagate-layout-subgroup.mlir:10:17: error: CHECK-SAME: expected string not found in input
# |  // CHECK-SAME: {layout_result_0 = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>}
# |                 ^
# | <stdin>:5:90: note: scanning from here
# |  %1 = xegpu.load_nd %0 <{layout = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>}> : !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>> -> vector<256x128xf32>
# |                                                                                          ^
# | <stdin>:6:17: note: possible intended match here
# |  xegpu.store_nd %1, %0 <{layout = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>}> : vector<256x128xf32>, !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>>
# |                 ^
# | C:\_work\llvm-project\llvm-project\mlir\test\Dialect\XeGPU\propagate-layout-subgroup.mlir:36:17: error: CHECK-SAME: expected string not found in input
# |  // CHECK-SAME: {layout_result_0 = #xegpu.layout<sg_layout = [4, 8], sg_data = [64, 32], order = [0, 1]>} :
# |                 ^
# | <stdin>:18:112: note: scanning from here
# |  %2 = xegpu.load_nd %0[0, 0] <{layout = #xegpu.layout<sg_layout = [4, 8], sg_data = [64, 32], order = [0, 1]>}> : !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [4, 8], sg_data = [64, 32], order = [0, 1]>> -> vector<256x128xf32>
# |                                                                                                                ^
# | <stdin>:19:35: note: possible intended match here
# |  %3 = vector.transpose %2, [1, 0] {layout_result_0 = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 64], order = [1, 0]>} : vector<256x128xf32> to vector<128x256xf32>
# |                                   ^
# | 
# | Input file: <stdin>
# | Check file: C:\_work\llvm-project\llvm-project\mlir\test\Dialect\XeGPU\propagate-layout-subgroup.mlir
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |            1: module { 
# |            2:  gpu.module @test [#xevm.target<chip = "pvc">] { 
# |            3:  func.func @store_nd(%arg0: memref<256x128xf32>) { 
# |            4:  %0 = xegpu.create_nd_tdesc %arg0 : memref<256x128xf32> -> !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>> 
# |            5:  %1 = xegpu.load_nd %0 <{layout = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>}> : !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>> -> vector<256x128xf32> 
# | same:10'0                                                                                              X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |            6:  xegpu.store_nd %1, %0 <{layout = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>}> : vector<256x128xf32>, !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>> 
# | same:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | same:10'1                     ?                                                                                                                                                                                        possible intended match
# |            7:  return 
# | same:10'0     ~~~~~~~~
# |            8:  } 
# | same:10'0     ~~~
# |            9:  } 
# | same:10'0     ~~~
# |           10: } 
# | same:10'0     ~~
# |           11:  
# | same:10'0     ~
# |           12: // ----- 
# | same:10'0     ~~~~~~~~~
# |           13: module { 
# | same:10'0     ~~~~~~~~~
# |           14:  gpu.module @test [#xevm.target<chip = "pvc">] { 
# | same:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           15:  func.func @vector_transpose(%arg0: memref<256x128xf32>, %arg1: memref<128x256xf32>) { 
# | same:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           16:  %0 = xegpu.create_nd_tdesc %arg0 : memref<256x128xf32> -> !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [4, 8], sg_data = [64, 32], order = [0, 1]>> 
# |           17:  %1 = xegpu.create_nd_tdesc %arg1 : memref<128x256xf32> -> !xegpu.tensor_desc<128x256xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 64], order = [1, 0]>> 
# |           18:  %2 = xegpu.load_nd %0[0, 0] <{layout = #xegpu.layout<sg_layout = [4, 8], sg_data = [64, 32], order = [0, 1]>}> : !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [4, 8], sg_data = [64, 32], order = [0, 1]>> -> vector<256x128xf32> 
# | same:36'0                                                                                                                    X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |           19:  %3 = vector.transpose %2, [1, 0] {layout_result_0 = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 64], order = [1, 0]>} : vector<256x128xf32> to vector<128x256xf32> 
# | same:36'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | same:36'1                                       ?                                                                                                                                       possible intended match
# |           20:  xegpu.store_nd %3, %1[0, 0] <{layout = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 64], order = [1, 0]>}> : vector<128x256xf32>, !xegpu.tensor_desc<128x256xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 64], order = [1, 0]>> 
# | same:36'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           21:  return 
# | same:36'0     ~~~~~~~~
# |           22:  } 
# | same:36'0     ~~~
# |           23:  } 
# | same:36'0     ~~~
# |           24: } 
# | same:36'0     ~~
# |           25:  
# | same:36'0     ~
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

If these failures are unrelated to your changes (for example tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the infrastructure label.

@github-actions

🐧 Linux x64 Test Results

  • 167261 tests passed
  • 2955 tests skipped
  • 1 test failed

Failed Tests

MLIR

MLIR.Dialect/XeGPU/propagate-layout-subgroup.mlir (Likely Already Failing) This test is already failing at the base commit.
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/mlir-opt -xevm-attach-target='chip=pvc' -xegpu-propagate-layout="layout-kind=subgroup" -split-input-file /home/gha/actions-runner/_work/llvm-project/llvm-project/mlir/test/Dialect/XeGPU/propagate-layout-subgroup.mlir | /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck /home/gha/actions-runner/_work/llvm-project/llvm-project/mlir/test/Dialect/XeGPU/propagate-layout-subgroup.mlir
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/mlir-opt -xevm-attach-target=chip=pvc -xegpu-propagate-layout=layout-kind=subgroup -split-input-file /home/gha/actions-runner/_work/llvm-project/llvm-project/mlir/test/Dialect/XeGPU/propagate-layout-subgroup.mlir
# note: command had no output on stdout or stderr
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck /home/gha/actions-runner/_work/llvm-project/llvm-project/mlir/test/Dialect/XeGPU/propagate-layout-subgroup.mlir
# .---command stderr------------
# | /home/gha/actions-runner/_work/llvm-project/llvm-project/mlir/test/Dialect/XeGPU/propagate-layout-subgroup.mlir:10:17: error: CHECK-SAME: expected string not found in input
# |  // CHECK-SAME: {layout_result_0 = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>}
# |                 ^
# | <stdin>:5:90: note: scanning from here
# |  %1 = xegpu.load_nd %0 <{layout = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>}> : !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>> -> vector<256x128xf32>
# |                                                                                          ^
# | <stdin>:6:17: note: possible intended match here
# |  xegpu.store_nd %1, %0 <{layout = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>}> : vector<256x128xf32>, !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>>
# |                 ^
# | /home/gha/actions-runner/_work/llvm-project/llvm-project/mlir/test/Dialect/XeGPU/propagate-layout-subgroup.mlir:36:17: error: CHECK-SAME: expected string not found in input
# |  // CHECK-SAME: {layout_result_0 = #xegpu.layout<sg_layout = [4, 8], sg_data = [64, 32], order = [0, 1]>} :
# |                 ^
# | <stdin>:18:112: note: scanning from here
# |  %2 = xegpu.load_nd %0[0, 0] <{layout = #xegpu.layout<sg_layout = [4, 8], sg_data = [64, 32], order = [0, 1]>}> : !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [4, 8], sg_data = [64, 32], order = [0, 1]>> -> vector<256x128xf32>
# |                                                                                                                ^
# | <stdin>:19:35: note: possible intended match here
# |  %3 = vector.transpose %2, [1, 0] {layout_result_0 = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 64], order = [1, 0]>} : vector<256x128xf32> to vector<128x256xf32>
# |                                   ^
# | 
# | Input file: <stdin>
# | Check file: /home/gha/actions-runner/_work/llvm-project/llvm-project/mlir/test/Dialect/XeGPU/propagate-layout-subgroup.mlir
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |            1: module { 
# |            2:  gpu.module @test [#xevm.target<chip = "pvc">] { 
# |            3:  func.func @store_nd(%arg0: memref<256x128xf32>) { 
# |            4:  %0 = xegpu.create_nd_tdesc %arg0 : memref<256x128xf32> -> !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>> 
# |            5:  %1 = xegpu.load_nd %0 <{layout = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>}> : !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>> -> vector<256x128xf32> 
# | same:10'0                                                                                              X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |            6:  xegpu.store_nd %1, %0 <{layout = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>}> : vector<256x128xf32>, !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32]>> 
# | same:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | same:10'1                     ?                                                                                                                                                                                        possible intended match
# |            7:  return 
# | same:10'0     ~~~~~~~~
# |            8:  } 
# | same:10'0     ~~~
# |            9:  } 
# | same:10'0     ~~~
# |           10: } 
# | same:10'0     ~~
# |           11:  
# | same:10'0     ~
# |           12: // ----- 
# | same:10'0     ~~~~~~~~~
# |           13: module { 
# | same:10'0     ~~~~~~~~~
# |           14:  gpu.module @test [#xevm.target<chip = "pvc">] { 
# | same:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           15:  func.func @vector_transpose(%arg0: memref<256x128xf32>, %arg1: memref<128x256xf32>) { 
# | same:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           16:  %0 = xegpu.create_nd_tdesc %arg0 : memref<256x128xf32> -> !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [4, 8], sg_data = [64, 32], order = [0, 1]>> 
# |           17:  %1 = xegpu.create_nd_tdesc %arg1 : memref<128x256xf32> -> !xegpu.tensor_desc<128x256xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 64], order = [1, 0]>> 
# |           18:  %2 = xegpu.load_nd %0[0, 0] <{layout = #xegpu.layout<sg_layout = [4, 8], sg_data = [64, 32], order = [0, 1]>}> : !xegpu.tensor_desc<256x128xf32, #xegpu.layout<sg_layout = [4, 8], sg_data = [64, 32], order = [0, 1]>> -> vector<256x128xf32> 
# | same:36'0                                                                                                                    X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |           19:  %3 = vector.transpose %2, [1, 0] {layout_result_0 = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 64], order = [1, 0]>} : vector<256x128xf32> to vector<128x256xf32> 
# | same:36'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | same:36'1                                       ?                                                                                                                                       possible intended match
# |           20:  xegpu.store_nd %3, %1[0, 0] <{layout = #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 64], order = [1, 0]>}> : vector<128x256xf32>, !xegpu.tensor_desc<128x256xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 64], order = [1, 0]>> 
# | same:36'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           21:  return 
# | same:36'0     ~~~~~~~~
# |           22:  } 
# | same:36'0     ~~~
# |           23:  } 
# | same:36'0     ~~~
# |           24: } 
# | same:36'0     ~~
# |           25:  
# | same:36'0     ~
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

If these failures are unrelated to your changes (for example tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the infrastructure label.

@folkertdev folkertdev closed this Dec 17, 2025