FIR Transforms by idavis · Pull Request #3187 · microsoft/qdk

idavis · 2026-04-29T17:28:33Z

Summary

This PR adds FIR passes to enable broader code generation scenarios.

QIR does not support:

Function pointers (and thus dynamic dispatch)
Structs
Tuples
Generics

RIR doesn't currently support:

Multiple returns: return_unify tries to remove this constraint but there are some odd things we can't deal with.

The passes peel each unsupported piece off in the pipeline.

Monomorphize cleans up generics
Return unify gets rid of multiple returns and allows us to better understand control flow
defunctionalization gets rid of callable exprs
erase_utds rewrites the FIR so that it uses tuples in place of structs
lower_tuple_comparison handles a special case of binop, replacing it with short-circuiting element-wise comparisons that can be codegen'd
sroa and arg_promote work together to get rid of all possible tuple element usage
dce and gc passes clean up code that is no longer called so that RCA doesn't pay attention to it
there are also some mini passes that collapse very specific patterns, like the defunc prepass
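
As a rough illustration of the lower_tuple_comparison step, the sketch below rewrites an equality between two tuple literals into a chain of element-wise comparisons joined by a short-circuiting AND. The `Expr` type and `lower_tuple_eq` are hypothetical stand-ins on a toy AST, not the FIR types or the actual pass in this PR.

```rust
// Toy expression tree; hypothetical, not the FIR `ExprKind`.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Var(String),
    Bool(bool),
    Tuple(Vec<Expr>),
    Eq(Box<Expr>, Box<Expr>),
    // Assumed to short-circuit: the right side runs only if the left is true.
    And(Box<Expr>, Box<Expr>),
}

/// Rewrites `(a0, a1, ..) == (b0, b1, ..)` into `a0 == b0 && a1 == b1 && ..`.
/// Anything that is not an equality of two same-length tuple literals is
/// returned unchanged.
pub fn lower_tuple_eq(expr: Expr) -> Expr {
    match expr {
        Expr::Eq(lhs, rhs) => match (*lhs, *rhs) {
            (Expr::Tuple(ls), Expr::Tuple(rs)) if ls.len() == rs.len() => ls
                .into_iter()
                .zip(rs)
                // Recurse so nested tuple elements are lowered as well.
                .map(|(l, r)| lower_tuple_eq(Expr::Eq(Box::new(l), Box::new(r))))
                .reduce(|acc, next| Expr::And(Box::new(acc), Box::new(next)))
                // Two empty tuples are trivially equal.
                .unwrap_or(Expr::Bool(true)),
            (l, r) => Expr::Eq(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}
```

The result contains only scalar comparisons, which is the property the later codegen stages rely on.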

Aside from the passes, this PR also tries to unify how the code goes through RCA and codegen compilation. There are some side effects which leak into circuits as we have to generate new functions as part of the passes that we don't necessarily want reflected in the circuit representation.

Suggested Review Assignment

Reviewer	Best-fit parts
@swernli	Core FIR transform pipeline, qsc_fir_transforms, QIR/codegen integration, partial eval, RCA, RIR, circuit behavior, root Cargo changes
@minestarks	Broad compiler integration, qsc/qsc_circuit/qsc_frontend/qsc_lowerer/qsc_openqasm_compiler/qsc_passes, Python/package-facing changes, RIR, circuit behavior, root Cargo changes
@billti	Python package tests/snapshots, fuzz target, wasm diagnostic touchpoint, resource estimator test, default-owned samples/index_map fallout
@ScottCarda-MS	Language service and npm snapshot changes

Crate organization

Integrating qsc_fir_transforms with qsc_passes was going to make the PR look much bigger with a lot of moved files. My plan was to merge qsc_fir_transforms into qsc_passes and organize them by HIR and FIR. This way we'd have a clean refactoring PR with no functional changes. This PR is already very large and I thought this integration was just too much to add.

Error types

ErrorKind::FirTransform will merge with ErrorKind::Pass in source/compiler/qsc/src/compile.rs in a follow up PR unless we want to differentiate between HIR and FIR passes at this level. We may want to differentiate at the qsc_passes level but merge them at this level as diagnostic transparent pass errors. The same follows for Error::Pass and Error::FirTransform in source/compiler/qsc/src/interpret.rs.

Interpret

This crate has two major changes. First, the codegen module has a lot of added code for preparing the compilation. When we have both callables with interpreter values (which may themselves be callables/structs/tuples, which may in turn contain the same complicated values) and entry expressions, we need to update the compilation in very different ways. For callables we need to effectively generate a new synthetic entry expr which can use the interpreter values. There is a case when dealing with closures where we need to partially abandon this pass and use a fallback of pinned non-entry-reachable items which are passed into the pipeline for processing. Entry expressions are the easy path and just work as normal heading into the pipeline.

The interpret module does some setup work to help the codegen module.

The openqasm module has some fixes related to the profile not being plumbed correctly. We weren't handling the user's specified profile and the code's annotated profile correctly when they were used together, and we were assuming that a profile missing from the code meant the profile was unrestricted. You'll see this update propagated into the Python and parser.

qsc_fir

The big addition here is the assigner. The FIR transforms do a lot of code generation and mutation, but it is additive. When we are generating new code, we need consistent, non-overlapping ids, for blocks, exprs, items, etc. This assigner update allows us to create an assigner from a package which finds the next values of each id needed so that we can safely allocate.
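
A minimal sketch of the idea, assuming arenas shaped like `Vec<Option<V>>`: scan each arena for its highest populated slot and start handing out ids one past it. `Package`, `ExprId`, `BlockId`, and `Assigner::from_package` here are simplified, hypothetical versions for illustration; the real assigner covers more id kinds and its API may differ.

```rust
// Hypothetical id newtypes; the real FIR has more id kinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ExprId(pub u32);
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlockId(pub u32);

// Hypothetical package with two arenas; payloads are plain strings here.
#[derive(Default)]
pub struct Package {
    pub exprs: Vec<Option<String>>,
    pub blocks: Vec<Option<String>>,
}

pub struct Assigner {
    next_expr: u32,
    next_block: u32,
}

// One past the highest populated slot, or 0 for an arena with no entries.
fn next_free<T>(slots: &[Option<T>]) -> u32 {
    slots
        .iter()
        .rposition(Option::is_some)
        .map_or(0, |max| max as u32 + 1)
}

impl Assigner {
    /// Seeds every counter from the ids already present in `package`,
    /// so freshly allocated ids cannot collide with existing nodes.
    pub fn from_package(package: &Package) -> Self {
        Self {
            next_expr: next_free(&package.exprs),
            next_block: next_free(&package.blocks),
        }
    }

    pub fn next_expr(&mut self) -> ExprId {
        let id = ExprId(self.next_expr);
        self.next_expr += 1;
        id
    }

    pub fn next_block(&mut self) -> BlockId {
        let id = BlockId(self.next_block);
        self.next_block += 1;
        id
    }
}
```

Because allocation only ever moves the counters forward, passes that add nodes stay additive with respect to the ids already in the package.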

Testing

Some tests have been added to seemingly random places. These tests were added after I broke things and didn't know as no tests were failing. They are there to prevent regressions.

New instruction `frem`

The frem instruction is added to support OpenQASM dynamic angle support. Hopefully it will be added to the adaptive profile soon. Without this instruction we cannot do runtime angle calculations in OpenQASM as the angle type requires this computation.
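
To show why a floating-point remainder is needed for angle arithmetic, here is a small sketch that wraps an angle into [0, 2π). Rust's `%` on `f64` follows the sign of the dividend, which matches the semantics of LLVM's `frem`. The function `wrap_angle` is hypothetical, and how the OpenQASM compiler actually lowers angle operations is not shown in this PR text, so treat this only as an illustration of the computation.

```rust
use std::f64::consts::TAU;

/// Wraps `theta` (radians) into the half-open range [0, 2π).
/// Hypothetical helper, not taken from the compiler.
pub fn wrap_angle(theta: f64) -> f64 {
    // `%` keeps the sign of the dividend, like LLVM's `frem`,
    // so negative inputs leave a negative remainder.
    let r = theta % TAU;
    if r < 0.0 {
        r + TAU
    } else {
        r
    }
}
```

When the angle is only known at runtime, this remainder cannot be folded away at compile time, which is why the generated code needs an instruction for it.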

Codegen

The qir codegen now requires RCA to have been already done before calling into fir_to_rir. We had too many places where we were or were not running RCA and then having to run it after the fact. This made it difficult to know when RCA was actually taking place. There are a few refactorings around this so that we have this more consolidated, but we might want to take a deeper step towards unifying in the future.

Circuits

Transformed callables are cloned into the user package. In order to maintain the same visualization as before, we have to detect whether we are in a 'synthetic' callable context so that we don't emit the call as a grouping context.

Partial eval

There is a lot of code in partial eval for dealing with return statements. I've documented source/compiler/qsc_partial_eval/src/evaluation_context.rs indicating that this is no longer required, but such a refactoring adds a lot of risk and code change which is better deferred to a follow up PR.

LLVM IR Changes

There are a few test files which are updated as the passes enable better code generation options that were impossible to handle before and were forced to be inlined.

Performance

The FIR transforms can be made faster, but they take less than 1/5 the time of the regular compilation and 1/15 as much time as RCA, so they are fast enough for the moment.

Random looking changes

source/compiler/qsc_frontend/src/closure.rs - documented here as the exact shape of closures has downstream effects and we can't vary from this structure without also changing many other sites.
source/compiler/qsc_frontend/src/resolve.rs - fixes a bug in type resolution where supplying an explicit : Qubit type on use statements leads to the var's pat type being error.

orpuente-MS · 2026-05-14T20:35:27Z

+    /// Pin-based fallback for callable args containing closures with captures.
+    ///
+    /// Seeds concrete (non-arrow-input) callables into the entry for reachability,
+    /// pins arrow-input callables and the target for DCE survival, and lets
+    /// `fir_to_qir_from_callable` handle specialization at QIR generation time.
+    fn prepare_codegen_fir_from_callable_args_pinned(
+        package_store: &PackageStore,
+        callable: qsc_hir::hir::ItemId,
+        _args: &Value,
+        capabilities: TargetCapabilityFlags,
+        mut concrete_callables: FxHashSet<qsc_fir::fir::StoreItemId>,
+    ) -> Result<CodegenFir, Vec<Error>> {


Not using _args here? Maybe can delete the _args parameter?

swernli · 2026-05-18T16:47:06Z

+            assigner.set_next_stmt(StmtId::from(max + 1));
+        }
+
+        // NodeId — scan callable and spec decls


As it turns out, NodeId is only used in three places in FIR where it is set as an id, but then never read. I think we can drop it from FIR entirely.

swernli · 2026-05-18T20:32:45Z

@@ -65,6 +87,38 @@ fn test_single_qubit() {
    );
 }

+#[test]
+fn test_explicitly_annotated_single_qubit_rewrite_preserves_binding_name_and_types() {


This test and the one above are effectively identical... they don't verify anything different, they just use different mechanisms to do so.

swernli · 2026-05-18T20:36:19Z

+    let qir = generate_qir_from_ast(
+        package,
+        unit.source_map,
+        unit.profile.unwrap_or(Profile::Unrestricted),


I get that this is only used for tests, but it seems odd for the default for QIR generation to be a profile that we know will fail QIR generation. Should this be Adaptive_RIF?

swernli · 2026-05-18T21:39:35Z

+            Value::Array(vs) => {
+                let mut lowered_ids = Vec::with_capacity(vs.len());
+                for v in vs.iter() {
+                    lowered_ids.push(lower_value_to_expr(package, assigner, v, callable_types));
+                }
+                let elem_ty = lowered_ids.first().map_or(qsc_fir::ty::Ty::Err, |id| {
+                    package.exprs.get(*id).expect("just inserted").ty.clone()
+                });
+                (
+                    qsc_fir::fir::ExprKind::Array(lowered_ids),
+                    qsc_fir::ty::Ty::Array(Box::new(elem_ty)),
+                )
+            }
+            Value::Range(r) => {


Since we know some folks invoke Q# callables with very large arrays (RE and chemistry scenarios, for example), we may pay a high cost of generating a large array literal into the synthetic entry expression only for it to be mostly ignored (since the synthetic entry is used for analysis in the passes and not execution). It might be worth trying to detect this case and avoid emitting constant arrays when not needed.

swernli · 2026-05-19T20:36:33Z

+/// 3. Asserts the two results match (both succeed with equal values, or
+///    both fail).
+#[cfg(test)]
+#[allow(dead_code)]


Looks like this allow isn't needed anymore.

swernli · 2026-05-19T20:48:59Z

+testutil = ["qsc_frontend", "qsc_hir", "qsc_passes"]
+
+[dev-dependencies]
+qsc_fir_transforms = { path = ".", features = ["testutil"] }


We started talking about this, and I see why it's needed now to make the testutil functionality available via the public API to scenario tests. It seems like there might be another way around that (maybe moving the tests, maybe moving the utils), but it's not critical for this PR.

swernli · 2026-06-03T16:19:48Z

+    package: &Package,
+    block_id: BlockId,
+    local_id: LocalVarId,
+    uses: &mut Vec<bool>,


Looking at how this is used, it doesn't really need to be a vector. It could instead be a three value enum with something like Unused, FieldOnly, and GeneralUse.

Done in c43fba3
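
The suggestion can be sketched as follows, assuming the three states form an ordered lattice where any general use dominates field-only use. The variant names come from the review comment; `Access`, `join`, and `classify` are hypothetical, and the change in c43fba3 may be shaped differently.

```rust
// Declaration order doubles as the lattice order via the derived `Ord`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum LocalUse {
    Unused,
    FieldOnly,
    GeneralUse,
}

impl LocalUse {
    /// Combines two observations; the "stronger" use wins.
    pub fn join(self, other: Self) -> Self {
        self.max(other)
    }
}

// Hypothetical summary of how a local is touched at one use site.
pub enum Access {
    Field(usize), // e.g. a tuple element access
    Whole,        // the local is used as a whole value
}

pub fn classify(accesses: &[Access]) -> LocalUse {
    accesses.iter().fold(LocalUse::Unused, |acc, access| {
        acc.join(match access {
            Access::Field(_) => LocalUse::FieldOnly,
            Access::Whole => LocalUse::GeneralUse,
        })
    })
}
```

A single value replaces the growing `Vec<bool>`, and the caller can match on it directly.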

…oft#3268) This PR adds support for the `frem`, `fptoui`, and `uitofp` instructions to the QIR simulators to handle the codegen changes in microsoft#3187 .

…ed it to tuple decomp. This also saves convergence rounds.

minestarks · 2026-06-03T20:57:59Z

+//! # Append-only arena contract
+//!
+//! FIR arenas (`Package.blocks`, `.stmts`, `.exprs`, `.pats`) are backed by
+//! `IndexMap<K, V>` which stores `Vec<Option<V>>`. FIR transform passes
+//! create new nodes via `Assigner::next_*()` and may mutate existing nodes
+//! in-place, but they **never remove entries** from the arenas. This means
+//! pre-transform nodes remain as populated-but-unreachable entries ("orphans")
+//! after transforms complete.
+//!
+//! Any code that iterates a FIR arena directly (via `IndexMap::iter()`) will
+//! encounter orphan entries alongside live entries. Analyzers must either:
+//! - Filter to reachable nodes before processing (see `qsc_rca::common`), or
+//! - Tolerate orphan entries gracefully (e.g., in-place type mutations).
+//!
+//! The `gc_unreachable` pass in `qsc_fir_transforms` can tombstone orphan
+//! entries after the pipeline completes, making `iter()` skip them.


I'm having trouble making sense of this. First, it says

FIR transform passes should never remove nodes

Other code should expect that iter() may yield orphaned nodes

Then, it says:

The gc_unreachable FIR transform pass does remove nodes

Therefore, other code can assume iter() only yields reachable nodes.

So which one is true? Can FIR contain orphaned nodes or not?

Nice catch, old docs that didn't get updated as the passes evolved in a separate crate, I'll clean up the assigner docs.

This has been updated. As part of other bits we discovered I removed the NodeId from FIR as it is effectively dead code now, so there are a bunch of updates to the assigner since you looked.
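
For readers following the thread, the two behaviors can coexist: passes only append or mutate, and a final GC step tombstones whatever is no longer reachable. The sketch below models that with a toy arena backed by `Vec<Option<V>>`. `Arena` and this `gc_unreachable` signature are hypothetical simplifications, not the `IndexMap` or pass from the repository.

```rust
use std::collections::HashSet;

// Toy arena: a slot is `None` if it was never filled or was tombstoned.
pub struct Arena<V> {
    slots: Vec<Option<V>>,
}

impl<V> Arena<V> {
    pub fn new() -> Self {
        Self { slots: Vec::new() }
    }

    pub fn insert(&mut self, id: usize, value: V) {
        if id >= self.slots.len() {
            self.slots.resize_with(id + 1, || None);
        }
        self.slots[id] = Some(value);
    }

    /// Clears a slot without shifting any other ids.
    pub fn tombstone(&mut self, id: usize) {
        if let Some(slot) = self.slots.get_mut(id) {
            *slot = None;
        }
    }

    /// Yields only populated slots, so tombstoned entries are skipped.
    pub fn iter(&self) -> impl Iterator<Item = (usize, &V)> + '_ {
        self.slots
            .iter()
            .enumerate()
            .filter_map(|(id, slot)| slot.as_ref().map(|v| (id, v)))
    }
}

/// Tombstones every populated entry whose id is not in `reachable`.
pub fn gc_unreachable<V>(arena: &mut Arena<V>, reachable: &HashSet<usize>) {
    let orphans: Vec<usize> = arena
        .iter()
        .map(|(id, _)| id)
        .filter(|id| !reachable.contains(id))
        .collect();
    for id in orphans {
        arena.tombstone(id);
    }
}
```

Before the GC step, `iter()` still yields orphans; afterwards it yields only reachable entries, and surviving ids are unchanged.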

minestarks · 2026-06-03T21:40:33Z

        self.pragma_config
            .pragmas
            .get(&PragmaKind::QdkQirProfile)
-            .map_or(Profile::Unrestricted, |profile_str| {
+            .map(|profile_str| {


What did we gain from propagating the Option here? Every single usage site calls .unwrap_or(Profile::Unrestricted) . It's the same thing with more steps, AFAICT

The old code wasn't handling profile overrides correctly. We have a profile specified via the Python API and a profile specified via attrs/annotations. The Option represents the profile as it was parsed from the source via attrs/annotations, and it is overridden by the API call. If nothing was specified, a default is applied: Unrestricted, unless we are in a codegen path, in which case there should be a different default.
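
A small sketch of the precedence described in this reply, under the assumption that the API value wins, the source annotation comes next, and the caller supplies the path-specific default. `Profile` here is a stand-in enum and `resolve_profile` is a hypothetical helper, not a function from the PR.

```rust
// Stand-in for the real profile enum; variants are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Profile {
    Unrestricted,
    Base,
    AdaptiveRIF,
}

/// Picks the effective profile:
/// 1. the profile passed through the API, if any;
/// 2. otherwise the profile parsed from the source, if any;
/// 3. otherwise the default for the current path.
pub fn resolve_profile(
    api: Option<Profile>,
    source: Option<Profile>,
    path_default: Profile,
) -> Profile {
    api.or(source).unwrap_or(path_default)
}
```

Keeping the parsed value as an `Option` is what lets the caller tell "not specified" apart from "explicitly Unrestricted" before applying its own default.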

minestarks · 2026-06-03T21:43:55Z

-    profile: Profile,
+    /// The QIR profile for compilation, derived from pragmas.
+    /// Returns `None` if no profile pragma was specified in the `OpenQASM` source.
+    profile: Option<Profile>,


Same comment as above - changing to Option doesn't seem necessary for this PR

swernli · 2026-06-03T21:55:06Z

+        changed |= normalize_tuple_destructuring(store, package_id, assigner);
+


now that normalize_tuple_destructuring happens as part of the tuple decompose pass, it doesn't actually need to be called here. I tried commenting it out and all the tests passed without any changes, so this is likely a no-op.

Dropping it from arg_promote is valid, it was put there as a defensive measure in case we generate that pattern in the future, but we're ok to drop for now as long as we keep calling it as part of the outer fixed point loop.
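
The outer fixed point loop mentioned here follows a common pattern: run every pass, OR together their "changed" flags, and stop when a full round changes nothing. The sketch below is generic and hypothetical (`run_to_fixed_point`, `halve`, and `third` are illustrative names); it does not reproduce the pipeline driver in this PR.

```rust
/// Runs all `passes` repeatedly until one full round reports no change,
/// or until `max_rounds` is reached. Returns the number of rounds executed.
pub fn run_to_fixed_point<S>(
    state: &mut S,
    passes: &[fn(&mut S) -> bool],
    max_rounds: usize,
) -> usize {
    for round in 1..=max_rounds {
        let mut changed = false;
        for pass in passes {
            // Each pass returns true if it modified the state.
            changed |= pass(&mut *state);
        }
        if !changed {
            return round;
        }
    }
    max_rounds
}

// Two toy "passes" over an integer, used only to exercise the loop.
pub fn halve(n: &mut u32) -> bool {
    if *n != 0 && *n % 2 == 0 {
        *n /= 2;
        true
    } else {
        false
    }
}

pub fn third(n: &mut u32) -> bool {
    if *n != 0 && *n % 3 == 0 {
        *n /= 3;
        true
    } else {
        false
    }
}
```

A pass that is a no-op in practice only costs its own run time per round, so removing a redundant call from an inner pass is safe as long as the outer loop still invokes it.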

minestarks · 2026-06-04T00:06:14Z

@@ -80,7 +80,7 @@
            <g>
              <g class="gate" data-location="0,0-0,0">
                <a href="#" class="qs-circuit-source-link">
-                  <title>lambda.qs:3:24 let lambda = (q =&gt; H(q));</title>
+                  <title>lambda.qs:4:5 lambda(q);</title>


This location isn't correct, and I can't understand why it changed.

It changed because (q) => H(q) is rewritten by the peephole optimization pass as let lambda = H;. The next part of defunc then sees let lambda = H; as a HOF expression, tracks its usage, and effectively replaces the call on lambda with a call to H, so the entire let binding is removed.

@minestarks 94fadcd
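
The rewrite of `(q) => H(q)` into `H` is an instance of eta reduction. Below is a textbook sketch of that rule on a toy lambda-calculus AST, offered only to make the transformation concrete. `Expr`, `mentions`, and `eta_reduce` are hypothetical and are not the peephole pass from this PR.

```rust
// Toy AST with single-parameter lambdas; hypothetical, not FIR.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Var(String),
    Call(Box<Expr>, Box<Expr>),
    Lambda(String, Box<Expr>),
}

/// True if `name` occurs free in `expr`.
fn mentions(expr: &Expr, name: &str) -> bool {
    match expr {
        Expr::Var(v) => v == name,
        Expr::Call(callee, arg) => mentions(callee, name) || mentions(arg, name),
        // An inner lambda with the same parameter name shadows `name`.
        Expr::Lambda(param, body) => param != name && mentions(body, name),
    }
}

/// Rewrites `x => f(x)` to `f` when `x` does not occur free in `f`.
/// Everything else is returned unchanged.
pub fn eta_reduce(expr: Expr) -> Expr {
    match expr {
        Expr::Lambda(param, body) => match *body {
            Expr::Call(callee, arg)
                if *arg == Expr::Var(param.clone()) && !mentions(&callee, &param) =>
            {
                *callee
            }
            other => Expr::Lambda(param, Box::new(other)),
        },
        other => other,
    }
}
```

Once the lambda collapses to a plain callable reference, a later step can replace calls through the binding with direct calls, which is why the source location attached to the gate moved to the call site.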

minestarks · 2026-06-04T18:16:29Z

+            self.values[index] = Some(value);
+            true
+        } else {
+            false


I think this method is essentially dead code, we should delete it in favor of just insert

I thought it was odd that we'd want this method at all - when would you ever want to blindly try to double-insert a value and reject the second attempt? Felt like a pattern that would hide bugs in the calling code.

So I added a panic on line 138 and ran the build. All the tests run fine (except for the insert_if_absent tests of course, which I commented out). This would imply the "skip if exists" logic is actually doing nothing. insert seems like the right alternative to me.

This was needed as part of RCA arity issues. It may no longer be needed after I fixed a bunch of RCA related issues. I'll dig into it. Thank you!

Fixed in 937e363, no longer needed after other updates.

minestarks · 2026-06-04T18:26:01Z

+    // This matches the codegen pipeline ordering in qsc/src/codegen.rs.
+    // The transforms require an entry expression (defunctionalize uses reachability from entry),
+    // so only run when the package has one.
+    if fir_store.get(fir_package_id).entry.is_some() {


Did this introduce a delay in the language service? It seems like enough work to slow down the squiggles showing up as you type

The FIR transforms themselves on Dynamics.qs (and all reachable std/core code) take about 6ms. There is a perf issue around the invariants that I need to optimize, but the passes themselves are very fast.

swernli · 2026-06-05T17:50:55Z

+fn find_tuple_bindings_in_block(
+    store: &PackageStore,
+    package_id: PackageId,
+    block_id: BlockId,
+) -> Vec<TupleBinding> {
+    let mut bindings = Vec::new();
+    // Collect the root block's own `Local` patterns; the walker below only
+    // visits the block's expressions, not the block node itself.
+    collect_block_local_binds(store, package_id, block_id, &mut bindings);
+    let package = store.get(package_id);
+    for_each_expr_in_block(package, block_id, &mut |_expr_id, expr| {
+        // `Block` and `While` are the only `ExprKind` variants that hold a
+        // block directly; `If` bodies are themselves `Block` expressions, so
+        // they are reached through the `Block` arm. Each block is visited
+        // exactly once, so its `Local` patterns are collected exactly once.
+        match &expr.kind {
+            ExprKind::Block(nested_block_id) | ExprKind::While(_, nested_block_id) => {
+                collect_block_local_binds(store, package_id, *nested_block_id, &mut bindings);
+            }
+            _ => {}
+        }
+    });
+    bindings
+}
+
+/// Collects candidate tuple-typed bindings from a single block's
+/// `StmtKind::Local` patterns, without descending into nested blocks (the
+/// caller's [`for_each_expr_in_block`] walk handles expression descent).
+fn collect_block_local_binds(
+    store: &PackageStore,
+    package_id: PackageId,
+    block_id: BlockId,
+    bindings: &mut Vec<TupleBinding>,
+) {
+    let package = store.get(package_id);
+    let block = package.get_block(block_id);
+    for &stmt_id in &block.stmts {
+        let stmt = package.get_stmt(stmt_id);
+        if let StmtKind::Local(_, pat_id, _) = &stmt.kind {
+            find_binds_in_pat(store, package_id, *pat_id, bindings);
+        }
+    }
+}


This logic is correct, but is it possible it could be simplified further by just using an FIR visitor that only looks for StmtKind::Local?

Done in a34971c
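
A rough sketch of the visitor approach suggested above, on a toy AST: the default `walk_*` functions handle descent, and the collector overrides only `visit_stmt` to record local bindings. All types and names here (`Stmt`, `Expr`, `Visitor`, `LocalCollector`) are hypothetical; the FIR visitor trait and the change in a34971c are not reproduced.

```rust
// Toy AST; hypothetical, far smaller than FIR.
pub enum Stmt {
    Local(String, Expr), // `let name = init;`
    Expr(Expr),
}

pub enum Expr {
    Lit(i64),
    Block(Vec<Stmt>),
    While(Box<Expr>, Vec<Stmt>),
}

pub trait Visitor {
    fn visit_stmt(&mut self, stmt: &Stmt) {
        walk_stmt(self, stmt);
    }
    fn visit_expr(&mut self, expr: &Expr) {
        walk_expr(self, expr);
    }
}

pub fn walk_stmt<V: Visitor + ?Sized>(visitor: &mut V, stmt: &Stmt) {
    match stmt {
        Stmt::Local(_, init) => visitor.visit_expr(init),
        Stmt::Expr(expr) => visitor.visit_expr(expr),
    }
}

pub fn walk_expr<V: Visitor + ?Sized>(visitor: &mut V, expr: &Expr) {
    match expr {
        Expr::Lit(_) => {}
        Expr::Block(stmts) => {
            for stmt in stmts {
                visitor.visit_stmt(stmt);
            }
        }
        Expr::While(cond, body) => {
            visitor.visit_expr(cond);
            for stmt in body {
                visitor.visit_stmt(stmt);
            }
        }
    }
}

/// Records the name of every `Local` statement, at any nesting depth.
#[derive(Default)]
pub struct LocalCollector {
    pub names: Vec<String>,
}

impl Visitor for LocalCollector {
    fn visit_stmt(&mut self, stmt: &Stmt) {
        if let Stmt::Local(name, _) = stmt {
            self.names.push(name.clone());
        }
        // Keep walking so nested blocks are covered too.
        walk_stmt(self, stmt);
    }
}
```

With this shape the collector no longer needs to know which expression kinds hold blocks; the walker owns that knowledge.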

swernli · 2026-06-05T23:03:24Z

+    // Mark export targets that resolve to local callables as reachable so
+    // the preserved exports don't point at removed items. Cross-package
+    // export targets and unresolved (Res::Err) exports are ignored.
+    for item in package.items.values() {
+        if let ItemKind::Export(_name, Res::Item(item_id)) = &item.kind
+            && item_id.package == package_id
+        {
+            local_reachable.insert(item_id.item);
+        }
+    }


we talked about this already, but just leaving a note here: exported items don't need to be considered "used" or reachable, and in fact the export/import and namespace items might not even need to be in FIR at all.

swernli · 2026-06-05T23:07:31Z

+    gc_unreachable::gc_unreachable(store.get_mut(package_id));
+    invariants::check(store, package_id, invariants::InvariantLevel::PostGc);
+    if matches!(stage, PipelineStage::Gc) {
+        return result;
+    }


I don't think PostGc and PostItemDce need to be separate stages. Since this call to gc_unreachable will clean up the unreachable expressions, blocks, and statements, you could just run it once unconditionally after item dce, rather than the current logic that runs it unconditionally before and then conditionally afterward based on whether item dce removed anything.

swernli · 2026-06-05T23:35:46Z

+        // For the synthetic entry, we emit a Var referencing the closure's underlying
+        // callable. Captures are irrelevant for pipeline reachability — defunc handles
+        // specialization. Both captureless and capturing closures use the same Var form.
+        let ty = callable_types
+            .get(&closure.id)
+            .expect("Closure callable type must be pre-computed")
+            .clone();
+        let kind = qsc_fir::fir::ExprKind::Var(
+            qsc_fir::fir::Res::Item(qsc_fir::fir::ItemId {
+                package: closure.id.package,
+                item: closure.id.item,
+            }),
+            Vec::new(),
+        );


there is a subtle bug here... when creating the synthetic entry point, the arguments are meant to be represented in a way that assists with computing reachability. But by dropping the captured values, it might be dropping values that are Value::Global or Value::Closure that would need to be preserved. The comment mentions that defunc will handle finding the captures, which it would do, but it never sees them because they are not present in the synthetic entry.
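
To make the concern concrete, the sketch below collects every item referenced by a runtime value, including items that appear only inside a closure's captures. `Value`, `ItemId`, and `referenced_items` are simplified, hypothetical stand-ins for the interpreter types; the point is only that skipping `captures` would lose item 7 in the example.

```rust
use std::collections::BTreeSet;

// Hypothetical, simplified stand-ins for interpreter values and item ids.
pub type ItemId = u32;

pub enum Value {
    Int(i64),
    Global(ItemId),
    Closure { callable: ItemId, captures: Vec<Value> },
    Array(Vec<Value>),
}

/// Adds every item reachable through `value` to `out`.
pub fn referenced_items(value: &Value, out: &mut BTreeSet<ItemId>) {
    match value {
        Value::Int(_) => {}
        Value::Global(id) => {
            out.insert(*id);
        }
        Value::Closure { callable, captures } => {
            out.insert(*callable);
            // Captures can hold globals or further closures, so they
            // must be walked as well as the closure's own callable.
            for capture in captures {
                referenced_items(capture, out);
            }
        }
        Value::Array(items) => {
            for item in items {
                referenced_items(item, out);
            }
        }
    }
}
```

For an argument such as a closure over item 1 that captures a global with id 7, the walk yields both 1 and 7; emitting only a reference to item 1 would leave 7 invisible to reachability.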

swernli · 2026-06-05T23:50:26Z

+fn adaptive_capabilities() -> TargetCapabilityFlags {
+    TargetCapabilityFlags::Adaptive
+        | TargetCapabilityFlags::IntegerComputations
+        | TargetCapabilityFlags::FloatingPointComputations
+}


the profile type supports conversion into capabilities, so you can just replace calls to this function with Profile::AdaptiveRIF.into() and remove the function.
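
The pattern behind this suggestion is a `From` implementation from the profile type to the capability flags, which gives callers `.into()` for free. The sketch uses stand-in types: `Capabilities` is a hand-rolled bit set and `Profile` has only two illustrative variants, and the mapping for `Base` is an assumption. The real `TargetCapabilityFlags` and `Profile` definitions are not reproduced here.

```rust
// Hand-rolled bit set standing in for the real capability flags.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Capabilities(u8);

impl Capabilities {
    pub const EMPTY: Self = Self(0);
    pub const ADAPTIVE: Self = Self(1);
    pub const INTEGER_COMPUTATIONS: Self = Self(1 << 1);
    pub const FLOATING_POINT_COMPUTATIONS: Self = Self(1 << 2);

    pub const fn union(self, other: Self) -> Self {
        Self(self.0 | other.0)
    }

    pub const fn contains(self, other: Self) -> bool {
        (self.0 & other.0) == other.0
    }
}

// Stand-in profile enum; the mapping below is assumed for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Profile {
    Base,
    AdaptiveRIF,
}

impl From<Profile> for Capabilities {
    fn from(profile: Profile) -> Self {
        match profile {
            Profile::Base => Capabilities::EMPTY,
            Profile::AdaptiveRIF => Capabilities::ADAPTIVE
                .union(Capabilities::INTEGER_COMPUTATIONS)
                .union(Capabilities::FLOATING_POINT_COMPUTATIONS),
        }
    }
}
```

With the conversion in one place, a test helper that rebuilds the same flag combination by hand becomes redundant and cannot drift from the profile definition.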

idavis self-assigned this Apr 29, 2026

swernli reviewed Apr 29, 2026

Comment thread source/compiler/qsc_eval/src/lib.rs Outdated

swernli reviewed Apr 29, 2026

Comment thread source/index_map/src/lib.rs Outdated

swernli reviewed Apr 29, 2026

Comment thread source/compiler/qsc_codegen/src/qir/v1.rs Outdated

swernli reviewed Apr 29, 2026

Comment thread source/compiler/qsc_partial_eval/src/evaluation_context.rs

swernli reviewed Apr 30, 2026

Comment thread source/compiler/qsc/src/interpret.rs

swernli reviewed Apr 30, 2026

Comment thread source/compiler/qsc/src/codegen/tests.rs Outdated

swernli reviewed Apr 30, 2026

Comment thread source/compiler/qsc/src/codegen.rs Outdated

swernli reviewed Apr 30, 2026

Comment thread source/compiler/qsc/src/codegen.rs Outdated

swernli reviewed May 7, 2026

Comment thread source/compiler/qsc/src/codegen/tests.rs Outdated

orpuente-MS reviewed May 14, 2026

Comment thread source/compiler/qsc/src/codegen.rs Outdated

orpuente-MS reviewed May 14, 2026

Comment thread source/compiler/qsc/src/codegen.rs Outdated

orpuente-MS reviewed May 14, 2026

swernli reviewed May 18, 2026

Comment thread source/compiler/qsc_fir/src/assigner.rs Outdated

swernli reviewed May 18, 2026

idavis marked this pull request as ready for review May 18, 2026 17:06

idavis requested review from ScottCarda-MS, billti and minestarks as code owners May 18, 2026 17:06

swernli reviewed May 18, 2026

Comment thread source/compiler/qsc/src/lib.rs

swernli reviewed May 18, 2026
