[IR2Vec] Consider only reachable BBs and non-debug instructions by svkeerthy · Pull Request #143476 · llvm/llvm-project

svkeerthy · 2025-06-10T05:12:47Z

This change considers only the BBs that are reachable from the entry block when computing embeddings. Similarly, debug instructions are skipped.

(Tracking issue - #141817)

svkeerthy · 2025-06-10T05:13:07Z

[MLGO][IR2Vec] Integrating IR2Vec with MLInliner #143479 : 2 dependent PRs (#143986 , #144139 )
[IR2Vec] Consider only reachable BBs and non-debug instructions #143476 (this PR)
[IR2Vec] Minor vocab changes and exposing weights #143200
[IR2Vec] Exposing Embedding as an data type wrapped around std::vector<double> #143197
main

This stack of pull requests is managed by Graphite.

llvmbot · 2025-06-10T05:45:45Z

@llvm/pr-subscribers-mlgo

@llvm/pr-subscribers-llvm-analysis

Author: S. VenkataKeerthy (svkeerthy)

Changes

This change considers only the BBs that are reachable from the entry block when computing embeddings. Similarly, debug instructions are skipped.

(Tracking issue - #141817)

Full diff: https://github.com/llvm/llvm-project/pull/143476.diff

3 Files Affected:

(modified) llvm/lib/Analysis/IR2Vec.cpp (+11-4)
(added) llvm/test/Analysis/IR2Vec/dbg-inst.ll (+13)
(added) llvm/test/Analysis/IR2Vec/unreachable.ll (+42)

diff --git a/llvm/lib/Analysis/IR2Vec.cpp b/llvm/lib/Analysis/IR2Vec.cpp
index 2ad65c2f40c33..8a392e0709c7f 100644
--- a/llvm/lib/Analysis/IR2Vec.cpp
+++ b/llvm/lib/Analysis/IR2Vec.cpp
@@ -13,7 +13,10 @@
 
 #include "llvm/Analysis/IR2Vec.h"
 
+#include "llvm/ADT/PostOrderIterator.h"
 #include "llvm/ADT/Statistic.h"
+#include "llvm/IR/CFG.h"
+#include "llvm/IR/Function.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/Support/Debug.h"
@@ -193,7 +196,8 @@ Embedding SymbolicEmbedder::getOperandEmbedding(const Value *Op) const {
 void SymbolicEmbedder::computeEmbeddings(const BasicBlock &BB) const {
   Embedding BBVector(Dimension, 0);
 
-  for (const auto &I : BB) {
+  // We consider only the non-debug and non-pseudo instructions
+  for (const auto &I : BB.instructionsWithoutDebug()) {
     Embedding InstVector(Dimension, 0);
 
     const auto OpcVec = lookupVocab(I.getOpcodeName());
@@ -218,9 +222,12 @@ void SymbolicEmbedder::computeEmbeddings(const BasicBlock &BB) const {
 void SymbolicEmbedder::computeEmbeddings() const {
   if (F.isDeclaration())
     return;
-  for (const auto &BB : F) {
-    computeEmbeddings(BB);
-    FuncVector += BBVecMap[&BB];
+
+  // Consider only the basic blocks that are reachable from entry
+  ReversePostOrderTraversal<const Function *> RPOT(&F);
+  for (const BasicBlock *BB : RPOT) {
+    computeEmbeddings(*BB);
+    FuncVector += BBVecMap[BB];
   }
 }
 
diff --git a/llvm/test/Analysis/IR2Vec/dbg-inst.ll b/llvm/test/Analysis/IR2Vec/dbg-inst.ll
new file mode 100644
index 0000000000000..0f486b0ba6a52
--- /dev/null
+++ b/llvm/test/Analysis/IR2Vec/dbg-inst.ll
@@ -0,0 +1,13 @@
+; RUN: opt -passes='print<ir2vec>' -o /dev/null -ir2vec-vocab-path=%S/Inputs/dummy_3D_vocab.json %s 2>&1 | FileCheck %s
+
+define void @bar2(ptr %foo)  {
+  store i32 0, ptr %foo, align 4
+  tail call void @llvm.dbg.value(metadata !{}, i64 0, metadata !{}, metadata !{})
+  ret void
+}
+
+declare void @llvm.dbg.value(metadata, i64, metadata, metadata) nounwind readnone
+
+; CHECK: Instruction vectors:
+; CHECK-NEXT: Instruction:   store i32 0, ptr %foo, align 4 [ 7.00  8.00  9.00 ]
+; CHECK-NEXT: Instruction:   ret void [ 0.00  0.00  0.00 ]
diff --git a/llvm/test/Analysis/IR2Vec/unreachable.ll b/llvm/test/Analysis/IR2Vec/unreachable.ll
new file mode 100644
index 0000000000000..370fe6881d6ce
--- /dev/null
+++ b/llvm/test/Analysis/IR2Vec/unreachable.ll
@@ -0,0 +1,42 @@
+; RUN: opt -passes='print<ir2vec>' -o /dev/null -ir2vec-vocab-path=%S/Inputs/dummy_3D_vocab.json %s 2>&1 | FileCheck %s
+
+define dso_local i32 @abc(i32 noundef %a, i32 noundef %b) #0 {
+entry:
+  %retval = alloca i32, align 4
+  %a.addr = alloca i32, align 4
+  %b.addr = alloca i32, align 4
+  store i32 %a, ptr %a.addr, align 4
+  store i32 %b, ptr %b.addr, align 4
+  %0 = load i32, ptr %a.addr, align 4
+  %1 = load i32, ptr %b.addr, align 4
+  %cmp = icmp sgt i32 %0, %1
+  br i1 %cmp, label %if.then, label %if.else
+
+if.then:                                          ; preds = %entry
+  %2 = load i32, ptr %b.addr, align 4
+  store i32 %2, ptr %retval, align 4
+  br label %return
+
+if.else:                                          ; preds = %entry
+  %3 = load i32, ptr %a.addr, align 4
+  store i32 %3, ptr %retval, align 4
+  br label %return
+
+unreachable:                                      ; Unreachable
+  store i32 0, ptr %retval, align 4
+  br label %return
+
+return:                                           ; preds = %if.else, %if.then
+  %4 = load i32, ptr %retval, align 4
+  ret i32 %4
+}
+
+; CHECK: Basic block vectors:
+; CHECK-NEXT: Basic block: entry:
+; CHECK-NEXT:  [ 25.00 32.00 39.00 ]
+; CHECK-NEXT: Basic block: if.then:
+; CHECK-NEXT:  [ 11.00 13.00 15.00 ]
+; CHECK-NEXT: Basic block: if.else:
+; CHECK-NEXT:  [ 11.00 13.00 15.00 ]
+; CHECK-NEXT: Basic block: return:
+; CHECK-NEXT:  [ 4.00 5.00 6.00 ]

boomanaiden154 · 2025-06-10T08:21:30Z


-  for (const auto &I : BB) {
+  // We consider only the non-debug and non-pseudo instructions
+  for (const auto &I : BB.instructionsWithoutDebug()) {


Does this actually matter? LLVM migrated to debug records recently, so I don't believe modern frontends will actually produce debug instructions, although I'm not too familiar with all of that infrastructure.

Right. We just want to ensure that only non-debug, non-pseudo instructions are considered. Some of the old .ll files in the tests still contain debug instructions.
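
To make the effect of the filter concrete, here is a minimal standalone sketch of a block embedding that skips debug instructions. It does not use the LLVM API: `ToyInst`, `lookupToyVocab`, and `embedBlock` are hypothetical names, and the vocabulary values are invented for illustration. In the PR itself the filtering is done by iterating over `BB.instructionsWithoutDebug()`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an instruction; only the fields this sketch needs.
struct ToyInst {
  std::string Opcode;
  bool IsDebug; // true for llvm.dbg.*-style debug instructions
};

using Embedding = std::vector<double>;

// Hypothetical 3-D vocabulary; the numbers are made up for this example.
Embedding lookupToyVocab(const std::string &Opcode) {
  if (Opcode == "store")
    return {1.0, 2.0, 3.0};
  if (Opcode == "call")
    return {10.0, 10.0, 10.0};
  return {0.0, 0.0, 0.0}; // unknown opcodes (and "ret" here) map to zero
}

// Sum the opcode vectors of a block, ignoring debug instructions.
Embedding embedBlock(const std::vector<ToyInst> &Block) {
  Embedding BBVector(3, 0.0);
  for (const ToyInst &I : Block) {
    if (I.IsDebug)
      continue; // debug instructions contribute nothing to the embedding
    const Embedding V = lookupToyVocab(I.Opcode);
    assert(V.size() == BBVector.size() && "dimension mismatch");
    for (std::size_t K = 0; K < V.size(); ++K)
      BBVector[K] += V[K];
  }
  return BBVector;
}
```

With a block shaped like the one in `dbg-inst.ll` (a store, a debug call, a return), only the store and the return contribute to the result, so the debug call's vector never enters the sum.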

boomanaiden154 · 2025-06-10T08:22:25Z

-    FuncVector += BBVecMap[&BB];
+
+  // Consider only the basic blocks that are reachable from entry
+  ReversePostOrderTraversal<const Function *> RPOT(&F);


Do you have a motivating example for trimming unreachable basic blocks from the computation? I'm not exactly sure where the inliner falls in the pipeline, but apparently it is not close enough to a SimplifyCFG run for unreachable blocks to have been eliminated?

Correct, the inliner itself can leave some unreachable BBs behind. Later passes clean those up.

Yeah. Also, the current MLInliner considers only reachable blocks (tests have some examples). This check ensures parity, and it makes sense to consider only reachable blocks in general. If we need the embedding of an unreachable block, it can still be obtained by directly invoking getBBVector().
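
As an illustration of what "only reachable blocks" means for the function-level sum, the sketch below walks a toy CFG from the entry block with a worklist and adds up per-block values. `ToyCFG` and `embedFunction` are hypothetical names, and a single `double` per block stands in for an embedding vector. This is a simplified model under those assumptions, not the code in `IR2Vec.cpp`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: block I has successors Succs[I]; block 0 is the entry block.
struct ToyCFG {
  std::vector<std::vector<std::size_t>> Succs;
  std::vector<double> BlockValue; // 1-D stand-in for a per-block embedding
};

// Sum the values of the blocks reachable from the entry block.
// Blocks with no path from entry are never pushed, so they are skipped.
double embedFunction(const ToyCFG &G) {
  if (G.Succs.empty())
    return 0.0; // models a declaration: nothing to embed
  assert(G.BlockValue.size() == G.Succs.size() && "one value per block");

  std::vector<bool> Visited(G.Succs.size(), false);
  std::vector<std::size_t> Worklist{0};
  Visited[0] = true;

  double Sum = 0.0;
  while (!Worklist.empty()) {
    const std::size_t B = Worklist.back();
    Worklist.pop_back();
    Sum += G.BlockValue[B];
    for (std::size_t S : G.Succs[B]) {
      assert(S < G.Succs.size() && "successor index out of range");
      if (!Visited[S]) {
        Visited[S] = true;
        Worklist.push_back(S);
      }
    }
  }
  return Sum;
}
```

For a CFG shaped like the one in `unreachable.ll` (entry, if.then, if.else, unreachable, return), the block with no predecessors is left out of the sum even though it branches to `return`.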

mtrofin · 2025-06-10T17:37:41Z

-    FuncVector += BBVecMap[&BB];
+
+  // Consider only the basic blocks that are reachable from entry
+  ReversePostOrderTraversal<const Function *> RPOT(&F);


Can you check the overhead of RPO? Would it be cheaper to do a depth-first traversal?

Thanks. It seems that constructing the RPO is costly, so I changed it to a depth-first traversal.
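
One way to see the difference in cost is to compare the two traversals on a toy graph. In the sketch below (hypothetical names, not LLVM's iterators), reverse post-order has to finish the whole walk and buffer every block before the first one can be processed, whereas a depth-first pre-order hands each block to the caller as soon as it is discovered. Both visit the same set of reachable blocks, and because the function vector is a sum, the visiting order should not affect the result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Adjacency list; block 0 is the entry block.
using Graph = std::vector<std::vector<std::size_t>>;

// Reverse post-order: the full DFS must complete and be stored before
// the caller sees the first block.
std::vector<std::size_t> reversePostOrder(const Graph &G) {
  std::vector<std::size_t> Order;
  if (G.empty())
    return Order;
  std::vector<bool> Visited(G.size(), false);
  std::function<void(std::size_t)> Visit = [&](std::size_t B) {
    assert(B < G.size() && "block index out of range");
    Visited[B] = true;
    for (std::size_t S : G[B])
      if (!Visited[S])
        Visit(S);
    Order.push_back(B); // emitted only after all successors are done
  };
  Visit(0);
  std::reverse(Order.begin(), Order.end());
  return Order;
}

// Depth-first pre-order: each block is reported on discovery, so no
// intermediate list of blocks is built.
void depthFirstPreorder(const Graph &G,
                        const std::function<void(std::size_t)> &OnBlock) {
  if (G.empty())
    return;
  std::vector<bool> Visited(G.size(), false);
  std::function<void(std::size_t)> Visit = [&](std::size_t B) {
    assert(B < G.size() && "block index out of range");
    Visited[B] = true;
    OnBlock(B);
    for (std::size_t S : G[B])
      if (!Visited[S])
        Visit(S);
  };
  Visit(0);
}
```

In LLVM, the lazy counterpart would be `depth_first` from `llvm/ADT/DepthFirstIterator.h`; whether the merged change uses exactly that helper is not shown in the diff quoted above.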

mtrofin

Looks fine, but please check the RPO overhead.

svkeerthy · 2025-06-17T17:55:28Z

Merge activity

Jun 17, 5:55 PM UTC: A user started a stack merge that includes this pull request via Graphite.
Jun 17, 5:57 PM UTC: @svkeerthy merged this pull request with Graphite.

svkeerthy mentioned this pull request Jun 10, 2025

[IR2Vec] Exposing Embedding as an data type wrapped around std::vector<double> #143197

Merged

svkeerthy mentioned this pull request Jun 10, 2025

[IR2Vec] Minor vocab changes and exposing weights #143200

Merged

svkeerthy changed the title ~~reachable BB~~ [IR2Vec] Consider only reachable BBs and non-debug instructions Jun 10, 2025

svkeerthy requested review from kazutakahirata, mtrofin and snehasish June 10, 2025 05:17

svkeerthy mentioned this pull request Jun 10, 2025

[MLGO][IR2Vec] Integrating IR2Vec with MLInliner #143479

Merged

svkeerthy marked this pull request as ready for review June 10, 2025 05:45

llvmbot added the mlgo and llvm:analysis labels Jun 10, 2025

boomanaiden154 reviewed Jun 10, 2025

mtrofin reviewed Jun 10, 2025

svkeerthy force-pushed the users/svkeerthy/06-06-vocab_changes1 branch from 96e4a8b to d639ce4 Compare June 10, 2025 18:14

svkeerthy force-pushed the users/svkeerthy/06-10-reachable_bb branch 2 times, most recently from abee4cd to cd2cdf5 Compare June 10, 2025 19:54

mtrofin approved these changes Jun 10, 2025

svkeerthy force-pushed the users/svkeerthy/06-06-vocab_changes1 branch from d639ce4 to c200f67 Compare June 10, 2025 21:22

svkeerthy force-pushed the users/svkeerthy/06-10-reachable_bb branch from cd2cdf5 to 716f3d2 Compare June 10, 2025 21:22

svkeerthy force-pushed the users/svkeerthy/06-06-vocab_changes1 branch from c200f67 to 8685c74 Compare June 10, 2025 22:13

svkeerthy force-pushed the users/svkeerthy/06-10-reachable_bb branch from 716f3d2 to 173c3b1 Compare June 10, 2025 22:14

svkeerthy requested a review from boomanaiden154 June 12, 2025 18:14

svkeerthy force-pushed the users/svkeerthy/06-10-reachable_bb branch from 173c3b1 to 6e44bb0 Compare June 12, 2025 21:48

svkeerthy force-pushed the users/svkeerthy/06-06-vocab_changes1 branch from 8685c74 to 5a7dd50 Compare June 12, 2025 21:48

This was referenced Jun 12, 2025

[IR2Vec] Scale embeddings once in vocab analysis instead of repetitive scaling #143986

Merged

[IR2Vec] Simplifying creation of Embedder #143999

Merged

Base automatically changed from users/svkeerthy/06-06-vocab_changes1 to main June 13, 2025 17:43

svkeerthy force-pushed the users/svkeerthy/06-10-reachable_bb branch from 6e44bb0 to 40402d2 Compare June 13, 2025 17:44

reachable BB

ece3e21

svkeerthy force-pushed the users/svkeerthy/06-10-reachable_bb branch from 40402d2 to ece3e21 Compare June 13, 2025 18:18

svkeerthy mentioned this pull request Jun 13, 2025

[NFC] Formatting PassRegistry.def #144139

Merged

svkeerthy merged commit e29bb9a into main Jun 17, 2025
7 checks passed

svkeerthy deleted the users/svkeerthy/06-10-reachable_bb branch June 17, 2025 17:57

This was referenced Jun 20, 2025

[NFC][IR2Vec] Increasing tolerance in approximatelyEquals() of Embedding #145117

Merged

[IR2Vec] Add out-of-place arithmetic operators to Embedding class #145118

Merged

[IR2Vec] Restructuring Vocabulary #145119

Merged
