[LV] Remove EVLIndVarSimplify pass #160454
Initially this pass was needed to replace the fixed-step canonical IV with the variable-step EVL IV, but it was eventually superseded by the loop vectorizer doing this transform itself in llvm#147222. The pass was then removed from the RISC-V pipeline in llvm#151483, and the loop vectorizer stopped emitting the metadata used by the pass in llvm#155760, so there are no remaining users of it.
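
For readers unfamiliar with the two induction variables mentioned above, here is a minimal sketch in plain C++ (not LLVM code) of why the exit test can use either one. The function names are hypothetical, and `modelGetVectorLength` is a simplifying assumption: it models `llvm.experimental.get.vector.length` as `min(remaining, max lanes)`, whereas the real intrinsic is allowed to return fewer lanes on some iterations.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Simplified stand-in for llvm.experimental.get.vector.length (assumption:
// it always returns min(Remaining, MaxLanes)).
static uint64_t modelGetVectorLength(uint64_t Remaining, uint64_t MaxLanes) {
  return std::min(Remaining, MaxLanes);
}

// Iteration count when the exit test uses the fixed-step canonical IV, which
// advances by MaxLanes and is compared against N rounded up to a multiple of
// MaxLanes.
uint64_t iterationsWithCanonicalIV(uint64_t N, uint64_t MaxLanes) {
  assert(MaxLanes != 0 && "a vector has at least one lane");
  uint64_t RoundedUp = (N + MaxLanes - 1) / MaxLanes * MaxLanes;
  uint64_t Iterations = 0;
  for (uint64_t IV = 0; IV != RoundedUp; IV += MaxLanes)
    ++Iterations;
  return Iterations;
}

// Iteration count when the exit test uses the variable-step EVL IV, which
// advances by the number of elements actually processed and is compared
// against the trip count N itself.
uint64_t iterationsWithEVLIV(uint64_t N, uint64_t MaxLanes) {
  assert(MaxLanes != 0 && "a vector has at least one lane");
  uint64_t Iterations = 0;
  uint64_t EVLIV = 0;
  while (EVLIV != N) {
    EVLIV += modelGetVectorLength(N - EVLIV, MaxLanes);
    ++Iterations;
  }
  return Iterations;
}
```

Under this model both loops run the same number of times, so the canonical IV is redundant once the exit condition is expressed in terms of the EVL IV.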
@llvm/pr-subscribers-vectorizers
@llvm/pr-subscribers-llvm-transforms

Author: Luke Lau (lukel97)

Changes

Initially this pass was needed to replace the fixed-step canonical IV with the variable-step EVL IV, but it was eventually superseded by the loop vectorizer doing this transform itself in #147222. The pass was then removed from the RISC-V pipeline in #151483, and the loop vectorizer stopped emitting the metadata used by the pass in #155760, so there are no remaining users of it.

Patch is 33.57 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/160454.diff

6 Files Affected:
diff --git a/llvm/include/llvm/Transforms/Vectorize/EVLIndVarSimplify.h b/llvm/include/llvm/Transforms/Vectorize/EVLIndVarSimplify.h
deleted file mode 100644
index 3178dc762a195..0000000000000
--- a/llvm/include/llvm/Transforms/Vectorize/EVLIndVarSimplify.h
+++ /dev/null
@@ -1,31 +0,0 @@
-//===------ EVLIndVarSimplify.h - Optimize vectorized loops w/ EVL IV------===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-//
-// This pass optimizes a vectorized loop with canonical IV to using EVL-based
-// IV if it was tail-folded by predicated EVL.
-//
-//===----------------------------------------------------------------------===//
-
-#ifndef LLVM_TRANSFORMS_VECTORIZE_EVLINDVARSIMPLIFY_H
-#define LLVM_TRANSFORMS_VECTORIZE_EVLINDVARSIMPLIFY_H
-
-#include "llvm/Analysis/LoopAnalysisManager.h"
-#include "llvm/IR/PassManager.h"
-
-namespace llvm {
-class Loop;
-class LPMUpdater;
-
-/// Turn vectorized loops with canonical induction variables into loops that
-/// only use a single EVL-based induction variable.
-struct EVLIndVarSimplifyPass : public PassInfoMixin<EVLIndVarSimplifyPass> {
- PreservedAnalyses run(Loop &L, LoopAnalysisManager &LAM,
- LoopStandardAnalysisResults &AR, LPMUpdater &U);
-};
-} // namespace llvm
-#endif
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index e4dab4acc0b4a..f84a16bd97224 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -375,7 +375,6 @@
#include "llvm/Transforms/Utils/SymbolRewriter.h"
#include "llvm/Transforms/Utils/UnifyFunctionExitNodes.h"
#include "llvm/Transforms/Utils/UnifyLoopExits.h"
-#include "llvm/Transforms/Vectorize/EVLIndVarSimplify.h"
#include "llvm/Transforms/Vectorize/LoadStoreVectorizer.h"
#include "llvm/Transforms/Vectorize/LoopIdiomVectorize.h"
#include "llvm/Transforms/Vectorize/LoopVectorize.h"
diff --git a/llvm/lib/Passes/PassRegistry.def b/llvm/lib/Passes/PassRegistry.def
index 49d5d08474f0f..f0e7d36f78aab 100644
--- a/llvm/lib/Passes/PassRegistry.def
+++ b/llvm/lib/Passes/PassRegistry.def
@@ -755,7 +755,6 @@ LOOP_ANALYSIS("should-run-extra-simple-loop-unswitch",
#endif
LOOP_PASS("canon-freeze", CanonicalizeFreezeInLoopsPass())
LOOP_PASS("dot-ddg", DDGDotPrinterPass())
-LOOP_PASS("evl-iv-simplify", EVLIndVarSimplifyPass())
LOOP_PASS("guard-widening", GuardWideningPass())
LOOP_PASS("extra-simple-loop-unswitch-passes",
ExtraLoopPassManager<ShouldRunExtraSimpleLoopUnswitch>())
diff --git a/llvm/lib/Transforms/Vectorize/CMakeLists.txt b/llvm/lib/Transforms/Vectorize/CMakeLists.txt
index 96670fe3ea195..9f4a242214471 100644
--- a/llvm/lib/Transforms/Vectorize/CMakeLists.txt
+++ b/llvm/lib/Transforms/Vectorize/CMakeLists.txt
@@ -1,5 +1,4 @@
add_llvm_component_library(LLVMVectorize
- EVLIndVarSimplify.cpp
LoadStoreVectorizer.cpp
LoopIdiomVectorize.cpp
LoopVectorizationLegality.cpp
diff --git a/llvm/lib/Transforms/Vectorize/EVLIndVarSimplify.cpp b/llvm/lib/Transforms/Vectorize/EVLIndVarSimplify.cpp
deleted file mode 100644
index 5dd689799b828..0000000000000
--- a/llvm/lib/Transforms/Vectorize/EVLIndVarSimplify.cpp
+++ /dev/null
@@ -1,300 +0,0 @@
-//===---- EVLIndVarSimplify.cpp - Optimize vectorized loops w/ EVL IV------===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-//
-// This pass optimizes a vectorized loop with canonical IV to using EVL-based
-// IV if it was tail-folded by predicated EVL.
-//
-//===----------------------------------------------------------------------===//
-
-#include "llvm/Transforms/Vectorize/EVLIndVarSimplify.h"
-#include "llvm/ADT/Statistic.h"
-#include "llvm/Analysis/IVDescriptors.h"
-#include "llvm/Analysis/LoopInfo.h"
-#include "llvm/Analysis/LoopPass.h"
-#include "llvm/Analysis/OptimizationRemarkEmitter.h"
-#include "llvm/Analysis/ScalarEvolution.h"
-#include "llvm/Analysis/ScalarEvolutionExpressions.h"
-#include "llvm/Analysis/ValueTracking.h"
-#include "llvm/IR/IRBuilder.h"
-#include "llvm/IR/PatternMatch.h"
-#include "llvm/Support/CommandLine.h"
-#include "llvm/Support/Debug.h"
-#include "llvm/Support/MathExtras.h"
-#include "llvm/Support/raw_ostream.h"
-#include "llvm/Transforms/Scalar/LoopPassManager.h"
-#include "llvm/Transforms/Utils/Local.h"
-
-#define DEBUG_TYPE "evl-iv-simplify"
-
-using namespace llvm;
-
-STATISTIC(NumEliminatedCanonicalIV, "Number of canonical IVs we eliminated");
-
-static cl::opt<bool> EnableEVLIndVarSimplify(
- "enable-evl-indvar-simplify",
- cl::desc("Enable EVL-based induction variable simplify Pass"), cl::Hidden,
- cl::init(true));
-
-namespace {
-struct EVLIndVarSimplifyImpl {
- ScalarEvolution &SE;
- OptimizationRemarkEmitter *ORE = nullptr;
-
- EVLIndVarSimplifyImpl(LoopStandardAnalysisResults &LAR,
- OptimizationRemarkEmitter *ORE)
- : SE(LAR.SE), ORE(ORE) {}
-
- /// Returns true if modify the loop.
- bool run(Loop &L);
-};
-} // anonymous namespace
-
-/// Returns the constant part of vectorization factor from the induction
-/// variable's step value SCEV expression.
-static uint32_t getVFFromIndVar(const SCEV *Step, const Function &F) {
- if (!Step)
- return 0U;
-
- // Looking for loops with IV step value in the form of `(<constant VF> x
- // vscale)`.
- if (const auto *Mul = dyn_cast<SCEVMulExpr>(Step)) {
- if (Mul->getNumOperands() == 2) {
- const SCEV *LHS = Mul->getOperand(0);
- const SCEV *RHS = Mul->getOperand(1);
- if (const auto *Const = dyn_cast<SCEVConstant>(LHS);
- Const && isa<SCEVVScale>(RHS)) {
- uint64_t V = Const->getAPInt().getLimitedValue();
- if (llvm::isUInt<32>(V))
- return V;
- }
- }
- }
-
- // If not, see if the vscale_range of the parent function is a fixed value,
- // which makes the step value to be replaced by a constant.
- if (F.hasFnAttribute(Attribute::VScaleRange))
- if (const auto *ConstStep = dyn_cast<SCEVConstant>(Step)) {
- APInt V = ConstStep->getAPInt().abs();
- ConstantRange CR = llvm::getVScaleRange(&F, 64);
- if (const APInt *Fixed = CR.getSingleElement()) {
- V = V.zextOrTrunc(Fixed->getBitWidth());
- uint64_t VF = V.udiv(*Fixed).getLimitedValue();
- if (VF && llvm::isUInt<32>(VF) &&
- // Make sure step is divisible by vscale.
- V.urem(*Fixed).isZero())
- return VF;
- }
- }
-
- return 0U;
-}
-
-bool EVLIndVarSimplifyImpl::run(Loop &L) {
- if (!EnableEVLIndVarSimplify)
- return false;
-
- if (!getBooleanLoopAttribute(&L, "llvm.loop.isvectorized"))
- return false;
- const MDOperand *EVLMD =
- findStringMetadataForLoop(&L, "llvm.loop.isvectorized.tailfoldingstyle")
- .value_or(nullptr);
- if (!EVLMD || !EVLMD->equalsStr("evl"))
- return false;
-
- BasicBlock *LatchBlock = L.getLoopLatch();
- ICmpInst *OrigLatchCmp = L.getLatchCmpInst();
- if (!LatchBlock || !OrigLatchCmp)
- return false;
-
- InductionDescriptor IVD;
- PHINode *IndVar = L.getInductionVariable(SE);
- if (!IndVar || !L.getInductionDescriptor(SE, IVD)) {
- const char *Reason = (IndVar ? "induction descriptor is not available"
- : "cannot recognize induction variable");
- LLVM_DEBUG(dbgs() << "Cannot retrieve IV from loop " << L.getName()
- << " because" << Reason << "\n");
- if (ORE) {
- ORE->emit([&]() {
- return OptimizationRemarkMissed(DEBUG_TYPE, "UnrecognizedIndVar",
- L.getStartLoc(), L.getHeader())
- << "Cannot retrieve IV because " << ore::NV("Reason", Reason);
- });
- }
- return false;
- }
-
- BasicBlock *InitBlock, *BackEdgeBlock;
- if (!L.getIncomingAndBackEdge(InitBlock, BackEdgeBlock)) {
- LLVM_DEBUG(dbgs() << "Expect unique incoming and backedge in "
- << L.getName() << "\n");
- if (ORE) {
- ORE->emit([&]() {
- return OptimizationRemarkMissed(DEBUG_TYPE, "UnrecognizedLoopStructure",
- L.getStartLoc(), L.getHeader())
- << "Does not have a unique incoming and backedge";
- });
- }
- return false;
- }
-
- // Retrieve the loop bounds.
- std::optional<Loop::LoopBounds> Bounds = L.getBounds(SE);
- if (!Bounds) {
- LLVM_DEBUG(dbgs() << "Could not obtain the bounds for loop " << L.getName()
- << "\n");
- if (ORE) {
- ORE->emit([&]() {
- return OptimizationRemarkMissed(DEBUG_TYPE, "UnrecognizedLoopStructure",
- L.getStartLoc(), L.getHeader())
- << "Could not obtain the loop bounds";
- });
- }
- return false;
- }
- Value *CanonicalIVInit = &Bounds->getInitialIVValue();
- Value *CanonicalIVFinal = &Bounds->getFinalIVValue();
-
- const SCEV *StepV = IVD.getStep();
- uint32_t VF = getVFFromIndVar(StepV, *L.getHeader()->getParent());
- if (!VF) {
- LLVM_DEBUG(dbgs() << "Could not infer VF from IndVar step '" << *StepV
- << "'\n");
- if (ORE) {
- ORE->emit([&]() {
- return OptimizationRemarkMissed(DEBUG_TYPE, "UnrecognizedIndVar",
- L.getStartLoc(), L.getHeader())
- << "Could not infer VF from IndVar step "
- << ore::NV("Step", StepV);
- });
- }
- return false;
- }
- LLVM_DEBUG(dbgs() << "Using VF=" << VF << " for loop " << L.getName()
- << "\n");
-
- // Try to find the EVL-based induction variable.
- using namespace PatternMatch;
- BasicBlock *BB = IndVar->getParent();
-
- Value *EVLIndVar = nullptr;
- Value *RemTC = nullptr;
- Value *TC = nullptr;
- auto IntrinsicMatch = m_Intrinsic<Intrinsic::experimental_get_vector_length>(
- m_Value(RemTC), m_SpecificInt(VF),
- /*Scalable=*/m_SpecificInt(1));
- for (PHINode &PN : BB->phis()) {
- if (&PN == IndVar)
- continue;
-
- // Check 1: it has to contain both incoming (init) & backedge blocks
- // from IndVar.
- if (PN.getBasicBlockIndex(InitBlock) < 0 ||
- PN.getBasicBlockIndex(BackEdgeBlock) < 0)
- continue;
- // Check 2: EVL index is always increasing, thus its inital value has to be
- // equal to either the initial IV value (when the canonical IV is also
- // increasing) or the last IV value (when canonical IV is decreasing).
- Value *Init = PN.getIncomingValueForBlock(InitBlock);
- using Direction = Loop::LoopBounds::Direction;
- switch (Bounds->getDirection()) {
- case Direction::Increasing:
- if (Init != CanonicalIVInit)
- continue;
- break;
- case Direction::Decreasing:
- if (Init != CanonicalIVFinal)
- continue;
- break;
- case Direction::Unknown:
- // To be more permissive and see if either the initial or final IV value
- // matches PN's init value.
- if (Init != CanonicalIVInit && Init != CanonicalIVFinal)
- continue;
- break;
- }
- Value *RecValue = PN.getIncomingValueForBlock(BackEdgeBlock);
- assert(RecValue && "expect recurrent IndVar value");
-
- LLVM_DEBUG(dbgs() << "Found candidate PN of EVL-based IndVar: " << PN
- << "\n");
-
- // Check 3: Pattern match to find the EVL-based index and total trip count
- // (TC).
- if (match(RecValue,
- m_c_Add(m_ZExtOrSelf(IntrinsicMatch), m_Specific(&PN))) &&
- match(RemTC, m_Sub(m_Value(TC), m_Specific(&PN)))) {
- EVLIndVar = RecValue;
- break;
- }
- }
-
- if (!EVLIndVar || !TC)
- return false;
-
- LLVM_DEBUG(dbgs() << "Using " << *EVLIndVar << " for EVL-based IndVar\n");
- if (ORE) {
- ORE->emit([&]() {
- DebugLoc DL;
- BasicBlock *Region = nullptr;
- if (auto *I = dyn_cast<Instruction>(EVLIndVar)) {
- DL = I->getDebugLoc();
- Region = I->getParent();
- } else {
- DL = L.getStartLoc();
- Region = L.getHeader();
- }
- return OptimizationRemark(DEBUG_TYPE, "UseEVLIndVar", DL, Region)
- << "Using " << ore::NV("EVLIndVar", EVLIndVar)
- << " for EVL-based IndVar";
- });
- }
-
- // Create an EVL-based comparison and replace the branch to use it as
- // predicate.
-
- // Loop::getLatchCmpInst check at the beginning of this function has ensured
- // that latch block ends in a conditional branch.
- auto *LatchBranch = cast<BranchInst>(LatchBlock->getTerminator());
- assert(LatchBranch->isConditional() &&
- "expect the loop latch to be ended with a conditional branch");
- ICmpInst::Predicate Pred;
- if (LatchBranch->getSuccessor(0) == L.getHeader())
- Pred = ICmpInst::ICMP_NE;
- else
- Pred = ICmpInst::ICMP_EQ;
-
- IRBuilder<> Builder(OrigLatchCmp);
- auto *NewLatchCmp = Builder.CreateICmp(Pred, EVLIndVar, TC);
- OrigLatchCmp->replaceAllUsesWith(NewLatchCmp);
-
- // llvm::RecursivelyDeleteDeadPHINode only deletes cycles whose values are
- // not used outside the cycles. However, in this case the now-RAUW-ed
- // OrigLatchCmp will be considered a use outside the cycle while in reality
- // it's practically dead. Thus we need to remove it before calling
- // RecursivelyDeleteDeadPHINode.
- (void)RecursivelyDeleteTriviallyDeadInstructions(OrigLatchCmp);
- if (llvm::RecursivelyDeleteDeadPHINode(IndVar))
- LLVM_DEBUG(dbgs() << "Removed original IndVar\n");
-
- ++NumEliminatedCanonicalIV;
-
- return true;
-}
-
-PreservedAnalyses EVLIndVarSimplifyPass::run(Loop &L, LoopAnalysisManager &LAM,
- LoopStandardAnalysisResults &AR,
- LPMUpdater &U) {
- Function &F = *L.getHeader()->getParent();
- auto &FAMProxy = LAM.getResult<FunctionAnalysisManagerLoopProxy>(L, AR);
- OptimizationRemarkEmitter *ORE =
- FAMProxy.getCachedResult<OptimizationRemarkEmitterAnalysis>(F);
-
- if (EVLIndVarSimplifyImpl(AR, ORE).run(L))
- return PreservedAnalyses::allInSet<CFGAnalyses>();
- return PreservedAnalyses::all();
-}
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/evl-iv-simplify.ll b/llvm/test/Transforms/LoopVectorize/RISCV/evl-iv-simplify.ll
deleted file mode 100644
index 4de0e666149f3..0000000000000
--- a/llvm/test/Transforms/LoopVectorize/RISCV/evl-iv-simplify.ll
+++ /dev/null
@@ -1,333 +0,0 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
-; RUN: opt -S -mtriple=riscv64 -mattr='+v' --passes='loop(evl-iv-simplify)' < %s | FileCheck %s
-; RUN: opt -S -mtriple=riscv64 -mattr='+v' --passes='loop(evl-iv-simplify),function(simplifycfg,dce)' < %s | FileCheck %s --check-prefix=LOOP-DEL
-
-define void @simple(ptr noalias %a, ptr noalias %b, <vscale x 4 x i32> %c, i64 %N) vscale_range(2, 1024) {
-; CHECK-LABEL: define void @simple(
-; CHECK-SAME: ptr noalias [[A:%.*]], ptr noalias [[B:%.*]], <vscale x 4 x i32> [[C:%.*]], i64 [[N:%.*]]) #[[ATTR0:[0-9]+]] {
-; CHECK-NEXT: entry:
-; CHECK-NEXT: [[TMP0:%.*]] = sub i64 -1, [[N]]
-; CHECK-NEXT: [[TMP1:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT: [[TMP2:%.*]] = mul i64 [[TMP1]], 4
-; CHECK-NEXT: [[TMP3:%.*]] = icmp ult i64 [[TMP0]], [[TMP2]]
-; CHECK-NEXT: br i1 [[TMP3]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
-; CHECK: vector.ph:
-; CHECK-NEXT: [[TMP4:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT: [[TMP5:%.*]] = mul i64 [[TMP4]], 4
-; CHECK-NEXT: [[TMP6:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT: [[TMP7:%.*]] = mul i64 [[TMP6]], 4
-; CHECK-NEXT: [[TMP8:%.*]] = sub i64 [[TMP7]], 1
-; CHECK-NEXT: [[N_RND_UP:%.*]] = add i64 [[N]], [[TMP8]]
-; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[N_RND_UP]], [[TMP5]]
-; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[N_RND_UP]], [[N_MOD_VF]]
-; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
-; CHECK: vector.body:
-; CHECK-NEXT: [[EVL_BASED_IV:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_EVL_NEXT:%.*]], [[VECTOR_BODY]] ]
-; CHECK-NEXT: [[TMP11:%.*]] = sub i64 [[N]], [[EVL_BASED_IV]]
-; CHECK-NEXT: [[TMP12:%.*]] = call i32 @llvm.experimental.get.vector.length.i64(i64 [[TMP11]], i32 4, i1 true)
-; CHECK-NEXT: [[TMP13:%.*]] = add i64 [[EVL_BASED_IV]], 0
-; CHECK-NEXT: [[TMP14:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[TMP13]]
-; CHECK-NEXT: [[TMP17:%.*]] = getelementptr inbounds i32, ptr [[TMP14]], i32 0
-; CHECK-NEXT: [[VP_OP_LOAD1:%.*]] = call <vscale x 4 x i32> @llvm.vp.load.nxv4i32.p0(ptr align 4 [[TMP17]], <vscale x 4 x i1> splat (i1 true), i32 [[TMP12]])
-; CHECK-NEXT: [[TMP18:%.*]] = add nsw <vscale x 4 x i32> [[C]], [[VP_OP_LOAD1]]
-; CHECK-NEXT: [[TMP19:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[TMP13]]
-; CHECK-NEXT: [[TMP20:%.*]] = getelementptr inbounds i32, ptr [[TMP19]], i32 0
-; CHECK-NEXT: call void @llvm.vp.store.nxv4i32.p0(<vscale x 4 x i32> [[TMP18]], ptr align 4 [[TMP20]], <vscale x 4 x i1> splat (i1 true), i32 [[TMP12]])
-; CHECK-NEXT: [[TMP21:%.*]] = zext i32 [[TMP12]] to i64
-; CHECK-NEXT: [[INDEX_EVL_NEXT]] = add i64 [[TMP21]], [[EVL_BASED_IV]]
-; CHECK-NEXT: [[TMP22:%.*]] = icmp eq i64 [[INDEX_EVL_NEXT]], [[N]]
-; CHECK-NEXT: br i1 [[TMP22]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
-; CHECK: middle.block:
-; CHECK-NEXT: br i1 true, label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
-; CHECK: scalar.ph:
-; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
-; CHECK-NEXT: br label [[FOR_BODY:%.*]]
-; CHECK: for.body:
-; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[FOR_BODY]] ]
-; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[IV]]
-; CHECK-NEXT: [[ADD:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
-; CHECK-NEXT: [[ARRAYIDX4:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[IV]]
-; CHECK-NEXT: store i32 [[ADD]], ptr [[ARRAYIDX4]], align 4
-; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
-; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[N]]
-; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_COND_CLEANUP_LOOPEXIT:%.*]], label [[FOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
-; CHECK: for.cond.cleanup.loopexit:
-; CHECK-NEXT: br label [[FOR_COND_CLEANUP]]
-; CHECK: for.cond.cleanup:
-; CHECK-NEXT: ret void
-;
-; LOOP-DEL-LABEL: define void @simple(
-; LOOP-DEL-SAME: ptr noalias [[A:%.*]], ptr noalias [[B:%.*]], <vscale x 4 x i32> [[C:%.*]], i64 [[N:%.*]]) #[[ATTR0:[0-9]+]] {
-; LOOP-DEL-NEXT: entry:
-; LOOP-DEL-NEXT: [[TMP0:%.*]] = sub i64 -1, [[N]]
-; LOOP-DEL-NEXT: [[TMP1:%.*]] = call i64 @llvm.vscale.i64()
-; LOOP-DEL-NEXT: [[TMP2:%.*]] = mul i64 [[TMP1]], 4
-; LOOP-DEL-NEXT: [[TMP3:%.*]] = icmp ult i64 [[TMP0]], [[TMP2]]
-; LOOP-DEL-NEXT: br i1 [[TMP3]], label [[FOR_BODY:%.*]], label [[VECTOR_PH:%.*]]
-; LOOP-DEL: vector.ph:
-; LOOP-DEL-NEXT: br label [[VECTOR_BODY:%.*]]
-; LOOP-DEL: vector.body:
-; LOOP-DEL-NEXT: [[EVL_BASED_IV:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_EVL_NEXT:%.*]], [[VECTOR_BODY]] ]
-; LOOP-DEL-NEXT: [[TMP4:%.*]] = sub i64 [[N]], [[EVL_BASED_IV]]
-; LOOP-DEL-NEXT: [[TMP5:%.*]] = call i32 @llvm.experimental.get.vector.length.i64(i64 [[TMP4]], i32 4, i1 true)
-; LOOP-DEL-NEXT: [[TMP6:%.*]] = add i64 [[EVL_BASED_IV]], 0
-; LOOP-DEL-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[TMP6]]
-; LOOP-DEL-NEXT: [[TMP10:%.*]] = getelementptr inbounds i32, ptr [[TMP7]], i32 0
-; LOOP-DEL-NEXT: [[VP_OP_LOAD1:%.*]] = call <vscale x 4 x i32> @llvm.vp.load.nxv4i32.p0(ptr align 4 [[TMP10]], <vscale x 4 x i1> splat (i1 true), i32 [[TMP5]])
-; LOOP-DEL-NEXT: [[TMP11:%.*]] = add nsw <vscale x 4 x i32> [[C]], [[VP_OP_LOAD1]]
-; LOOP-DEL-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[TMP6]]
-; LOOP-DEL-NEXT: [[TMP13:%.*]] = getelementpt...
[truncated]
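
As a reading aid for the removed `getVFFromIndVar` helper in the diff above, the following is a rough sketch of its fallback path, where the IV step has folded to a constant because `vscale_range` pins vscale to a single value. It is written in plain C++ with built-in integers rather than `APInt` and SCEV, and `inferVFFromConstantStep` is a hypothetical name, so treat it as an approximation of the arithmetic rather than the deleted code itself.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Returns the constant VF multiplier for a constant IV step when vscale is
// known to be one fixed value, or 0 when no VF can be inferred.
uint32_t inferVFFromConstantStep(int64_t Step, uint64_t FixedVScale) {
  assert(FixedVScale != 0 && "vscale is at least 1");
  // The canonical IV may count down, so work with the step's magnitude.
  uint64_t Magnitude = Step < 0 ? 0 - static_cast<uint64_t>(Step)
                                : static_cast<uint64_t>(Step);
  // The step must be an exact multiple of vscale.
  if (Magnitude % FixedVScale != 0)
    return 0;
  uint64_t VF = Magnitude / FixedVScale;
  // Reject a zero VF and anything that does not fit in 32 bits.
  if (VF == 0 || VF > std::numeric_limits<uint32_t>::max())
    return 0;
  return static_cast<uint32_t>(VF);
}
```

For instance, with vscale fixed at 2, a step of 8 or -8 gives a VF of 4, while a step of 7 is rejected because it is not divisible by vscale.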
deleted file mode 100644
index 4de0e666149f3..0000000000000
--- a/llvm/test/Transforms/LoopVectorize/RISCV/evl-iv-simplify.ll
+++ /dev/null
@@ -1,333 +0,0 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
-; RUN: opt -S -mtriple=riscv64 -mattr='+v' --passes='loop(evl-iv-simplify)' < %s | FileCheck %s
-; RUN: opt -S -mtriple=riscv64 -mattr='+v' --passes='loop(evl-iv-simplify),function(simplifycfg,dce)' < %s | FileCheck %s --check-prefix=LOOP-DEL
-
-define void @simple(ptr noalias %a, ptr noalias %b, <vscale x 4 x i32> %c, i64 %N) vscale_range(2, 1024) {
-; CHECK-LABEL: define void @simple(
-; CHECK-SAME: ptr noalias [[A:%.*]], ptr noalias [[B:%.*]], <vscale x 4 x i32> [[C:%.*]], i64 [[N:%.*]]) #[[ATTR0:[0-9]+]] {
-; CHECK-NEXT: entry:
-; CHECK-NEXT: [[TMP0:%.*]] = sub i64 -1, [[N]]
-; CHECK-NEXT: [[TMP1:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT: [[TMP2:%.*]] = mul i64 [[TMP1]], 4
-; CHECK-NEXT: [[TMP3:%.*]] = icmp ult i64 [[TMP0]], [[TMP2]]
-; CHECK-NEXT: br i1 [[TMP3]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
-; CHECK: vector.ph:
-; CHECK-NEXT: [[TMP4:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT: [[TMP5:%.*]] = mul i64 [[TMP4]], 4
-; CHECK-NEXT: [[TMP6:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT: [[TMP7:%.*]] = mul i64 [[TMP6]], 4
-; CHECK-NEXT: [[TMP8:%.*]] = sub i64 [[TMP7]], 1
-; CHECK-NEXT: [[N_RND_UP:%.*]] = add i64 [[N]], [[TMP8]]
-; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[N_RND_UP]], [[TMP5]]
-; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[N_RND_UP]], [[N_MOD_VF]]
-; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
-; CHECK: vector.body:
-; CHECK-NEXT: [[EVL_BASED_IV:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_EVL_NEXT:%.*]], [[VECTOR_BODY]] ]
-; CHECK-NEXT: [[TMP11:%.*]] = sub i64 [[N]], [[EVL_BASED_IV]]
-; CHECK-NEXT: [[TMP12:%.*]] = call i32 @llvm.experimental.get.vector.length.i64(i64 [[TMP11]], i32 4, i1 true)
-; CHECK-NEXT: [[TMP13:%.*]] = add i64 [[EVL_BASED_IV]], 0
-; CHECK-NEXT: [[TMP14:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[TMP13]]
-; CHECK-NEXT: [[TMP17:%.*]] = getelementptr inbounds i32, ptr [[TMP14]], i32 0
-; CHECK-NEXT: [[VP_OP_LOAD1:%.*]] = call <vscale x 4 x i32> @llvm.vp.load.nxv4i32.p0(ptr align 4 [[TMP17]], <vscale x 4 x i1> splat (i1 true), i32 [[TMP12]])
-; CHECK-NEXT: [[TMP18:%.*]] = add nsw <vscale x 4 x i32> [[C]], [[VP_OP_LOAD1]]
-; CHECK-NEXT: [[TMP19:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[TMP13]]
-; CHECK-NEXT: [[TMP20:%.*]] = getelementptr inbounds i32, ptr [[TMP19]], i32 0
-; CHECK-NEXT: call void @llvm.vp.store.nxv4i32.p0(<vscale x 4 x i32> [[TMP18]], ptr align 4 [[TMP20]], <vscale x 4 x i1> splat (i1 true), i32 [[TMP12]])
-; CHECK-NEXT: [[TMP21:%.*]] = zext i32 [[TMP12]] to i64
-; CHECK-NEXT: [[INDEX_EVL_NEXT]] = add i64 [[TMP21]], [[EVL_BASED_IV]]
-; CHECK-NEXT: [[TMP22:%.*]] = icmp eq i64 [[INDEX_EVL_NEXT]], [[N]]
-; CHECK-NEXT: br i1 [[TMP22]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
-; CHECK: middle.block:
-; CHECK-NEXT: br i1 true, label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
-; CHECK: scalar.ph:
-; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
-; CHECK-NEXT: br label [[FOR_BODY:%.*]]
-; CHECK: for.body:
-; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[FOR_BODY]] ]
-; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[IV]]
-; CHECK-NEXT: [[ADD:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
-; CHECK-NEXT: [[ARRAYIDX4:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[IV]]
-; CHECK-NEXT: store i32 [[ADD]], ptr [[ARRAYIDX4]], align 4
-; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
-; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[N]]
-; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_COND_CLEANUP_LOOPEXIT:%.*]], label [[FOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
-; CHECK: for.cond.cleanup.loopexit:
-; CHECK-NEXT: br label [[FOR_COND_CLEANUP]]
-; CHECK: for.cond.cleanup:
-; CHECK-NEXT: ret void
-;
-; LOOP-DEL-LABEL: define void @simple(
-; LOOP-DEL-SAME: ptr noalias [[A:%.*]], ptr noalias [[B:%.*]], <vscale x 4 x i32> [[C:%.*]], i64 [[N:%.*]]) #[[ATTR0:[0-9]+]] {
-; LOOP-DEL-NEXT: entry:
-; LOOP-DEL-NEXT: [[TMP0:%.*]] = sub i64 -1, [[N]]
-; LOOP-DEL-NEXT: [[TMP1:%.*]] = call i64 @llvm.vscale.i64()
-; LOOP-DEL-NEXT: [[TMP2:%.*]] = mul i64 [[TMP1]], 4
-; LOOP-DEL-NEXT: [[TMP3:%.*]] = icmp ult i64 [[TMP0]], [[TMP2]]
-; LOOP-DEL-NEXT: br i1 [[TMP3]], label [[FOR_BODY:%.*]], label [[VECTOR_PH:%.*]]
-; LOOP-DEL: vector.ph:
-; LOOP-DEL-NEXT: br label [[VECTOR_BODY:%.*]]
-; LOOP-DEL: vector.body:
-; LOOP-DEL-NEXT: [[EVL_BASED_IV:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_EVL_NEXT:%.*]], [[VECTOR_BODY]] ]
-; LOOP-DEL-NEXT: [[TMP4:%.*]] = sub i64 [[N]], [[EVL_BASED_IV]]
-; LOOP-DEL-NEXT: [[TMP5:%.*]] = call i32 @llvm.experimental.get.vector.length.i64(i64 [[TMP4]], i32 4, i1 true)
-; LOOP-DEL-NEXT: [[TMP6:%.*]] = add i64 [[EVL_BASED_IV]], 0
-; LOOP-DEL-NEXT: [[TMP7:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[TMP6]]
-; LOOP-DEL-NEXT: [[TMP10:%.*]] = getelementptr inbounds i32, ptr [[TMP7]], i32 0
-; LOOP-DEL-NEXT: [[VP_OP_LOAD1:%.*]] = call <vscale x 4 x i32> @llvm.vp.load.nxv4i32.p0(ptr align 4 [[TMP10]], <vscale x 4 x i1> splat (i1 true), i32 [[TMP5]])
-; LOOP-DEL-NEXT: [[TMP11:%.*]] = add nsw <vscale x 4 x i32> [[C]], [[VP_OP_LOAD1]]
-; LOOP-DEL-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[TMP6]]
-; LOOP-DEL-NEXT: [[TMP13:%.*]] = getelementpt...
[truncated]
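In scalar terms, the latch rewrite performed by the removed pass (which the loop vectorizer now does itself) can be sketched roughly as below. This is a minimal model, not LLVM code: `getVectorLength` is a hypothetical stand-in that assumes the intrinsic returns `min(remaining, VF)`, and the two `count*Iterations` helpers are names invented for illustration. The first loop exits on the fixed-step canonical IV reaching `n.vec`; the second exits on the EVL-based IV reaching the trip count, as in the `icmp eq i64 [[INDEX_EVL_NEXT]], [[N]]` check above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical scalar stand-in for llvm.experimental.get.vector.length.
// Assumption: it returns min(Remaining, VF).
static uint64_t getVectorLength(uint64_t Remaining, uint64_t VF) {
  return std::min(Remaining, VF);
}

// Before the rewrite: the latch compares the fixed-step canonical IV
// against the trip count rounded up to a multiple of VF (n.vec).
static uint64_t countCanonicalIterations(uint64_t N, uint64_t VF) {
  assert(VF > 0 && "vectorization factor must be non-zero");
  if (N == 0)
    return 0; // the vector loop is assumed to be guarded against N == 0
  uint64_t NVec = (N + VF - 1) / VF * VF;
  uint64_t Index = 0;  // canonical IV, step VF
  uint64_t EVLIV = 0;  // EVL-based IV, variable step
  uint64_t Iterations = 0;
  do {
    EVLIV += getVectorLength(N - EVLIV, VF);
    Index += VF;
    ++Iterations;
  } while (Index != NVec);
  return Iterations;
}

// After the rewrite: the latch compares the EVL-based IV against the
// trip count directly, so the canonical IV is dead and can be deleted.
static uint64_t countEVLIterations(uint64_t N, uint64_t VF) {
  assert(VF > 0 && "vectorization factor must be non-zero");
  if (N == 0)
    return 0;
  uint64_t EVLIV = 0;
  uint64_t Iterations = 0;
  do {
    EVLIV += getVectorLength(N - EVLIV, VF);
    ++Iterations;
  } while (EVLIV != N);
  return Iterations;
}
```

Under the stated `min` assumption both loops run the same number of iterations for any non-zero `N`, which is why the exit condition could be switched without changing behaviour.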
|
fhahn
left a comment
LGTM, thanks, but good to wait a bit before landing to see if there are any concerns from potential users
|
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/186/builds/12673
|
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/76/builds/12985
## Version and Build Flow
- fetch branch / apply local patch -> fetch exact submodule commit with
`git fetch origin "${LLVM_COMMIT}"`
- `LLVM_COMMIT + riscv-jitlink.patch hash` -> `LLVM_COMMIT` cache key
- apply `riscv-jitlink.patch` -> no patch; current LLVM already has the fix
- [#172581](llvm/llvm-project#172581) / 3d7018c70b97: `nanobind` /
`nanobind==2.4.*` -> `nanobind>=2.9,<3.0` / `nanobind==2.9.*`
## Python Bindings
- [#172581](llvm/llvm-project#172581) / 3d7018c70b97:
`PybindAdaptors.h` -> `NanobindAdaptors.h`
- [#172581](llvm/llvm-project#172581) / 3d7018c70b97:
`PYBIND11_MODULE` -> `NB_MODULE`
- [#172581](llvm/llvm-project#172581) / 3d7018c70b97: manual
pybind/nanobind CMake setup -> `mlir_configure_python_dev_packages()`
## MLIR API
- [#162167](llvm/llvm-project#162167) / ea291d0e8c93:
`vector::SplatOp` -> `vector::BroadcastOp`
- [#149603](llvm/llvm-project#149603) / 33465bb2bb75:
`vector::InsertElementOp` -> `vector::InsertOp`
- [#149603](llvm/llvm-project#149603) / 33465bb2bb75:
`vector::ExtractElementOp` -> `vector::ExtractOp`
- [#145445](llvm/llvm-project#145445) / 63f30d7d820c:
`ValueRange{std::nullopt}` -> `ValueRange{}`
- [#145445](llvm/llvm-project#145445) / 63f30d7d820c:
`TypeRange(std::nullopt)` -> `TypeRange{}`
- [#145445](llvm/llvm-project#145445) / 63f30d7d820c: empty
`scf.yield` with null operands -> empty `scf.yield`
- [#196082](llvm/llvm-project#196082) / dd57b0c9728a:
`computeConstantBound(..., /*closedUB=*/true)` ->
`computeConstantBound(..., ValueBoundsOptions{/*closedUB=*/true})`
- [#182715](llvm/llvm-project#182715) / 72f5050ae765:
`applyPatternsAndFoldGreedily(...)` -> `applyPatternsGreedily(...)`
## GPU and AMX
- [#188905](llvm/llvm-project#188905) / c1cff89bdcea:
`launchOp.getWorkgroupAttributions()` ->
`launchOp.getWorkgroupAttributionBBArgs()`
- [#188905](llvm/llvm-project#188905) / c1cff89bdcea: manual
`gpu.kernel` attr -> `outlinedFunc.setKernel(true)`
- [#188905](llvm/llvm-project#188905) / c1cff89bdcea:
`setAttr(getKnown*AttrName(), ...)` ->
`setKnownBlockSizeAttr(...)` / `setKnownGridSizeAttr(...)`
- [#111197](llvm/llvm-project#111197) / 2f743ac52e94:
`mlir/Dialect/AMX/AMXDialect.h` -> `mlir/Dialect/X86/X86Dialect.h`
- [#111197](llvm/llvm-project#111197) / 2f743ac52e94:
`mlir::amx::AMXDialect` -> `mlir::x86::X86Dialect`
## LLVM Source Lists
- [#167271](llvm/llvm-project#167271) / 2345b7d98f75:
`InlineSizeEstimatorAnalysis.cpp` -> removed upstream; no Buddy source-list
entry needed
- [#172680](llvm/llvm-project#172680) / 71760f324ff9:
`ExpandLargeDivRem.cpp` -> merged into `ExpandFp.cpp`
- [#172681](llvm/llvm-project#172681) / 5c05824d2bd3:
`ExpandFp.cpp` / merged `ExpandLargeDivRem.cpp` -> `ExpandIRInsts.cpp`
- [#77370](llvm/llvm-project#77370) / 5e0a06b34d09:
`ExpandMemCmp.cpp` in CodeGen -> moved to `Transforms/Scalar`
- [#181547](llvm/llvm-project#181547) / 9a0d65cdfd0d:
`CallBrPrepare.cpp` -> `InlineAsmPrepare.cpp`
- [#160454](llvm/llvm-project#160454) / aa6a33ae6556:
`EVLIndVarSimplify.cpp` -> removed upstream; no Buddy source-list entry needed
- [#192635](llvm/llvm-project#192635) / 680a9908194e:
`VPlanSLP.cpp` -> removed upstream; no Buddy source-list entry needed
## RISC-V Backend
- local Buddy `M0`-`M7` definitions -> upstream RISC-V `M0`-`M7` definitions
- Buddy `MatrixReg` using local mask regs -> `MatrixReg` using upstream mask regs
- IME `*N` instructions in disassembler tables -> mark affected instructions
`isCodeGenOnly = 1`
## Tool API and Link Libraries
- [#156715](llvm/llvm-project#156715) / dfbd76bda01e:
`Expected<std::unique_ptr<ToolOutputFile>>` from
`setupLLVMOptimizationRemarks` -> `Expected<LLVMRemarkFileHandle>`
- [#186998](llvm/llvm-project#186998) / 69cd746bd2f1:
`codegen::setFunctionAttributes(CPUStr, FeaturesStr, F)` ->
`codegen::setFunctionAttributes(F, CPUStr, FeaturesStr[, TuneCPUStr])`
- [#186998](llvm/llvm-project#186998) / 69cd746bd2f1:
`codegen::setFunctionAttributes(CPUStr, FeaturesStr, *M)` ->
`codegen::setFunctionAttributes(*M, CPUStr, FeaturesStr[, TuneCPUStr])`
- [#87226](llvm/llvm-project#87226) / d0e72ccc54df:
`TargetMachine::buildCodeGenPipeline(MPM, OS, DwoOut, FileType, Opt, PIC)` ->
pass `MAM` and `MMI.getContext()`
- [#170846](llvm/llvm-project#170846) / cd806d7e7689: `buddy-llc`
without `Plugins` -> link LLVM `Plugins`
- [#150805](llvm/llvm-project#150805) / ace42cf063a5,
[#151150](llvm/llvm-project#151150) / e68a20e0b762: implicit MLIR
register-all symbols -> link `MLIRRegisterAllDialects`,
`MLIRRegisterAllExtensions`, `MLIRRegisterAllPasses`