feat: allocator driver + shadow-allocation pass — measures the wiring need on real benches (#209/#242)#280

Merged
avrabe merged 5 commits into main from feat/vcr-ra-driver-shadow-alloc
Jun 5, 2026

@avrabe avrabe commented Jun 5, 2026

Contributor

Breaking the staring contest: running, measuring allocator code

Instead of "next is the wiring," this assembles the allocator entry point and runs it (measure-only) on gale's real benches — turning the plan into concrete on-bench data.

What

  • liveness::allocate_function(instrs, k, precolored) → AllocationOutcome (Allocated{coloring, remat_opportunities} | NeedsSpill(set) | Declined). Builds the interference graph, k-colours it with reserved regs precoloured, and returns the result plus the #209 const-CSE/rematerialization headroom count. Pure; this is the call the virtual-register wiring will make.
  • Shadow pass in arm_backend behind SYNTH_SHADOW_ALLOC=1 (default-off, eprintln-only, zero codegen impact): runs the allocator on every real function and logs whether it colours within R0–R8 and the remat headroom.
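
To make the shape of the driver concrete, here is a minimal sketch of a k-colouring step that returns an outcome in the same three-way form. It is not the PR's implementation: the `color_graph` name, the edge-list input, the `u32` node ids, and the rule that `k == 0` yields `Declined` are all assumptions for illustration, and the remat count is left out. The colouring is a simple greedy heuristic, so it can report a spill where an optimal colouring exists.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical outcome type modelled on the PR description
/// (the remat_opportunities field is omitted in this sketch).
#[derive(Debug, PartialEq)]
pub enum AllocationOutcome {
    Allocated { coloring: BTreeMap<u32, u8> },
    NeedsSpill(BTreeSet<u32>),
    Declined,
}

/// Greedy k-colouring of an interference graph given as an edge list.
/// `precolored` pins reserved nodes to fixed colours before colouring starts;
/// this sketch does not check that the pinned colours are mutually consistent.
pub fn color_graph(
    nodes: &[u32],
    edges: &[(u32, u32)],
    k: u8,
    precolored: &BTreeMap<u32, u8>,
) -> AllocationOutcome {
    // Assumption: with no colours available the driver simply declines.
    if k == 0 {
        return AllocationOutcome::Declined;
    }

    // Build a symmetric adjacency map; self-edges are ignored.
    let mut adj: BTreeMap<u32, BTreeSet<u32>> = BTreeMap::new();
    for &n in nodes {
        adj.entry(n).or_default();
    }
    for &(a, b) in edges {
        if a != b {
            adj.entry(a).or_default().insert(b);
            adj.entry(b).or_default().insert(a);
        }
    }

    let mut coloring = precolored.clone();
    let mut spilled = BTreeSet::new();

    // Visit nodes in ascending id order and take the lowest free colour.
    for (&n, neighbours) in &adj {
        if coloring.contains_key(&n) {
            continue;
        }
        let used: BTreeSet<u8> = neighbours
            .iter()
            .filter_map(|m| coloring.get(m).copied())
            .collect();
        match (0..k).find(|c| !used.contains(c)) {
            Some(c) => {
                coloring.insert(n, c);
            }
            None => {
                spilled.insert(n);
            }
        }
    }

    if spilled.is_empty() {
        AllocationOutcome::Allocated { coloring }
    } else {
        AllocationOutcome::NeedsSpill(spilled)
    }
}
```

A triangle of three mutually interfering nodes needs three colours, so calling this with `k = 2` reports one node as a spill candidate, while `k = 3` colours all three.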

Concrete finding (the justification for virtual-register output)

Running the allocator on the existing physical-register stream reports spurious spills (flat_flight "would spill R1") and 0 remat opportunities — because the greedy selector already overloaded each physical register:

  • R1 is one interference-graph node conflicting with every value it was ever reused for → looks uncolourable.
  • every redundant #0x7e shows as "not redundant" because its register was already clobbered for reuse.

Both artifacts prove, on-bench, that the allocator is blind until the selector emits virtual registers (one node per value). This is no longer a guess — it's measured. The shadow pass will quantify the real win the moment virtual-register output lands.
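
The remat artifact can be reproduced on a toy stream. The sketch below is a hypothetical model, not code from this PR: `Instr`, `redundant_loads_physical`, and `repeated_constant_loads` are illustrative names. It counts redundant constant loads twice: once by tracking what each physical register currently holds, and once by tracking which constants have been loaded at all.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified instruction stream for illustration only.
pub enum Instr {
    /// Load an immediate constant into register `rd`.
    MovImm { rd: u8, imm: u32 },
    /// Any other instruction that overwrites `rd`.
    Clobber { rd: u8 },
}

/// Physical view: a constant load counts as redundant only if some
/// register still holds that constant at the point of the load.
pub fn redundant_loads_physical(instrs: &[Instr]) -> usize {
    let mut held: HashMap<u8, u32> = HashMap::new();
    let mut redundant = 0;
    for ins in instrs {
        match *ins {
            Instr::MovImm { rd, imm } => {
                if held.values().any(|&v| v == imm) {
                    redundant += 1;
                }
                held.insert(rd, imm);
            }
            Instr::Clobber { rd } => {
                held.remove(&rd);
            }
        }
    }
    redundant
}

/// Value view: every repeat of an already-loaded constant is counted,
/// regardless of which register it went through.
pub fn repeated_constant_loads(instrs: &[Instr]) -> usize {
    let mut seen: HashSet<u32> = HashSet::new();
    let mut repeats = 0;
    for ins in instrs {
        if let Instr::MovImm { imm, .. } = *ins {
            // `insert` returns false when the constant was already present.
            if !seen.insert(imm) {
                repeats += 1;
            }
        }
    }
    repeats
}
```

For the stream "load #0x7e into R1, overwrite R1, load #0x7e into R1 again", the physical view reports 0 redundant loads and the value view reports 1. The value view is only an upper bound on the headroom, since keeping a constant live costs a register.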

Safety

Shadow pass off by default → all three differential fixtures bit-identical (control_step 0x00210A55, flight_seam 0x07FDF307, div_const 338/338, verified). 316 lib tests (6 new: driver + clamp SelectMove/Select modeling); clippy clean.

Next

Virtual-register selector output (step 3 of docs/design/vcr-ra-allocator-wiring.md) — now with a measure-only harness to watch the spurious spills + 0-remat flip to real wins as it lands.

🤖 Generated with Claude Code

…e wiring need on real benches (#209/#242)

Assemble the allocator entry point and run it (measure-only) on gale's real
benches to break the "next is the wiring" stalemate with running, measuring code.

- `liveness::allocate_function(instrs, k, precolored) -> AllocationOutcome`
  (Allocated{coloring, remat_opportunities} | NeedsSpill(set) | Declined):
  interference graph → k-colouring with reserved regs precoloured → result,
  plus the #209 const-CSE/rematerialization headroom count. Pure; the call the
  virtual-register wiring will make.
- arm_backend SHADOW pass behind `SYNTH_SHADOW_ALLOC=1` (default-off,
  eprintln-only, zero codegen impact): runs the allocator on every real
  function and logs whether it colours within R0-R8 and the remat headroom.
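
A default-off, stderr-only gate of this kind could look roughly like the sketch below. Only the `SYNTH_SHADOW_ALLOC=1` variable comes from this commit; the helper names and the log line format are hypothetical.

```rust
use std::env;

/// Pure helper: the pass is enabled only when the value is exactly "1".
fn shadow_alloc_enabled_from(value: Option<&str>) -> bool {
    value == Some("1")
}

/// Reads the environment variable named in the commit message.
fn shadow_alloc_enabled() -> bool {
    shadow_alloc_enabled_from(env::var("SYNTH_SHADOW_ALLOC").ok().as_deref())
}

/// Hypothetical report line; the real log format is not shown in the PR.
fn shadow_report(func_name: &str, fits: bool, remat: usize) -> String {
    format!(
        "[shadow-alloc] {}: fits={} remat_opportunities={}",
        func_name, fits, remat
    )
}

/// Logs to stderr only, so generated code is unaffected either way.
fn run_shadow_pass(func_name: &str, fits: bool, remat: usize) {
    if shadow_alloc_enabled() {
        eprintln!("{}", shadow_report(func_name, fits, remat));
    }
}
```

Splitting the check into a pure function keeps the gate testable without mutating the process environment.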

CONCRETE FINDING (on-bench, the justification for virtual-register output):
running the allocator on the existing PHYSICAL-register stream reports spurious
spills (flat_flight "would spill R1") and 0 remat opportunities — because the
greedy selector already overloaded each physical register: R1 is one
interference node conflicting with every value it was ever reused for, and each
redundant const shows as "not redundant" since its register was already
clobbered for reuse. Both artifacts PROVE the allocator is blind until the
selector emits VIRTUAL registers (one node per value). The shadow pass will
quantify the real win the moment virtual-register output lands.

Safe: shadow pass off by default → all fixtures bit-identical (control_step
0x00210A55 / flight_seam 0x07FDF307 / div_const 338/338 verified). 316 lib
tests (6 new: driver + clamp modeling); clippy clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
avrabe and others added 4 commits June 5, 2026 18:11
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e spurious (9 fits in 9) (#209/#242)

Add straight_line_peak_pressure / function_peak_pressure: the max number of
distinct VALUES (def-to-last-use ranges) live at once — the true register
requirement, vs the physical-register count the greedy selector inflates by
reusing one register for many values. Wired into the shadow report.

CONCRETE RESULT on flat_flight (SYNTH_SHADOW_ALLOC=1):
  physical-graph would spill {R1}, but peak value-pressure is 9 (<=9 => spurious;
  fits once virtually allocated)

i.e. flat_flight's true register need is exactly 9 = the R0-R8 pool, so it fits
with ZERO spills under virtual-register allocation — gale's 17 greedy spills are
almost entirely spurious, eliminable by the allocator. A measured projection of
the allocator win (≈17 str/ldr removed) on top of the const-CSE headroom, not a
guess.
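
As an illustration of the pressure measure, the sketch below computes the maximum number of value ranges live at once with a sweep over range endpoints. It is a simplified stand-in rather than the code in liveness.rs: the `peak_pressure` name and the `(def, last_use)` tuple input are assumptions, and ranges are treated as inclusive on both ends, which is conservative (a value defined at another value's last use is counted as overlapping it).

```rust
/// Each range is (def index, last-use index), inclusive on both ends.
pub fn peak_pressure(ranges: &[(usize, usize)]) -> usize {
    // +1 when a value becomes live, -1 just after its last use.
    let mut events: Vec<(usize, i32)> = Vec::new();
    for &(def, last_use) in ranges {
        events.push((def, 1));
        events.push((last_use + 1, -1));
    }
    // Tuples sort by position first; at equal positions -1 sorts before +1,
    // so a range ending at i does not overlap one starting at i + 1.
    events.sort();

    let mut live: i32 = 0;
    let mut peak: i32 = 0;
    for (_, delta) in events {
        live += delta;
        peak = peak.max(live);
    }
    peak as usize
}
```

Three back-to-back ranges such as `(0, 1)`, `(2, 3)`, `(4, 5)` give a pressure of 1 no matter how many physical registers they were spread across or folded into, which is the reuse-invariance the commit's tests describe.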

Unwired analysis; shadow pass off by default → fixtures bit-identical
(control_step 0x00210A55 / flight_seam 0x07FDF307 / div_const 338/338 verified).
2 new pressure tests (counts values not pregs; reuse-invariant). clippy/fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…280)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…cation consumes (#209/#242)

straight_line_value_ranges(segment) -> Vec<ValueRange{vreg, reg, def, last_use}>:
splits each physical register's def-use chains into distinct virtual registers
(value ranges). Upgrades the peak-pressure COUNT into the actual per-value
ASSIGNMENT — the renaming a re-allocation pass colours and rewrites.

Sound for straight-line code (the reaching def of every use is the most recent
def, unambiguous); cross-block web merging is the next step. The number of
ranges a physical register splits into IS the overloading that inflated the
physical interference graph and produced the spurious spill — e.g. R1 splitting
into a dozen ranges. Colouring these ranges instead is what removes the spurious
spill (the 9-fits-in-9 finding made concrete per-value).
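
The splitting step for straight-line code can be sketched as follows. The `ValueRange` field names follow the commit message, but the `Instr` type, the `value_ranges_sketch` name, and the choice to ignore uses of registers that are live-in to the segment are assumptions for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct ValueRange {
    pub vreg: usize,
    pub reg: u8,
    pub def: usize,
    pub last_use: usize,
}

/// Hypothetical simplified instruction: registers read, then an optional
/// register written.
pub struct Instr {
    pub uses: Vec<u8>,
    pub def: Option<u8>,
}

/// Splits each physical register's def-use chains into one range per def.
pub fn value_ranges_sketch(segment: &[Instr]) -> Vec<ValueRange> {
    let mut ranges: Vec<ValueRange> = Vec::new();
    // Physical register -> index of its most recent value range.
    let mut current: HashMap<u8, usize> = HashMap::new();

    for (i, ins) in segment.iter().enumerate() {
        // Uses are processed before the def, so `add r1, r1, r2` reads the
        // old value of r1. In straight-line code the reaching def of a use
        // is always the most recent def of that register.
        for r in &ins.uses {
            if let Some(&idx) = current.get(r) {
                ranges[idx].last_use = i;
            }
            // Uses with no def in this segment (live-in) are skipped here.
        }
        if let Some(r) = ins.def {
            let vreg = ranges.len();
            ranges.push(ValueRange { vreg, reg: r, def: i, last_use: i });
            current.insert(r, vreg);
        }
    }
    ranges
}
```

On a four-instruction segment where R1 is defined, used, redefined, and used again, this yields two separate ranges for R1, each of which becomes its own interference-graph node.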

Unwired analysis; no codegen change. Test: a reused R1 splits into two distinct
vregs with the right def/last-use bounds. clippy/fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
codecov Bot commented Jun 5, 2026

Codecov Report

❌ Patch coverage is 87.39837% with 31 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/synth-synthesis/src/liveness.rs | 92.60% | 17 Missing ⚠️ |
| crates/synth-backend/src/arm_backend.rs | 12.50% | 14 Missing ⚠️ |

@avrabe avrabe merged commit 6b61244 into main Jun 5, 2026
14 checks passed
@avrabe avrabe deleted the feat/vcr-ra-driver-shadow-alloc branch June 5, 2026 19:01