feat: const-CSE rematerialization-avoidance — sound, measures the residency gap (#209/#242) #281

Merged: avrabe merged 1 commit into main from feat/vcr-ra-const-cse on Jun 5, 2026
Conversation

@avrabe

@avrabe avrabe commented Jun 5, 2026

The const-CSE pass, built and validated — and the measurement that pins the real blocker

apply_const_cse drops a movw rd,#v that re-materializes a constant already resident in ra, and retargets rd's reads to ra. Every rewrite is proven by the liveness analysis: rename_use rewrites uses, never defs, and safe_cse_uses requires that ra still holds v through every read and that rd's value is segment-local. The pass only removes materializations, so register pressure never rises; unlike the mla fusion (#277), it cannot regress on-target.
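
To make the safety conditions concrete, here is a minimal sketch of the same idea on a toy straight-line segment. It is an illustration under stated assumptions, not the code in liveness.rs: the `Insn` enum, `resident`, and `const_cse` are hypothetical, and only the names `rename_use` and `safe_cse_uses` are borrowed from the description above (their real signatures are not shown in this PR).

```rust
/// Hypothetical miniature IR for illustration; not the types in liveness.rs.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Insn {
    Movw { rd: u8, imm: u16 },      // rd = imm
    Add { rd: u8, rn: u8, rm: u8 }, // rd = rn + rm
}

impl Insn {
    fn def(&self) -> u8 {
        match self {
            Insn::Movw { rd, .. } | Insn::Add { rd, .. } => *rd,
        }
    }
    fn reads(&self, r: u8) -> bool {
        matches!(self, Insn::Add { rn, rm, .. } if *rn == r || *rm == r)
    }
    /// Rewrites source operands only; the destination is never touched.
    fn rename_use(&mut self, from: u8, to: u8) {
        if let Insn::Add { rn, rm, .. } = self {
            if *rn == from { *rn = to; }
            if *rm == from { *rm = to; }
        }
    }
}

/// A register other than `rd` that holds `imm` after executing `prefix`.
fn resident(prefix: &[Insn], imm: u16, rd: u8) -> Option<u8> {
    let mut holds: [Option<u16>; 16] = [None; 16]; // assumes registers r0..r15
    for insn in prefix {
        holds[insn.def() as usize] = match insn {
            Insn::Movw { imm, .. } => Some(*imm),
            Insn::Add { .. } => None,
        };
    }
    (0..16u8).find(|&r| r != rd && holds[r as usize] == Some(imm))
}

/// Returns the index where `rd` is redefined if every read of `rd` before
/// that point happens while `ra` is still intact. Returns None if `ra` is
/// clobbered before a read, or if `rd` may be live out of the segment.
fn safe_cse_uses(seg: &[Insn], at: usize, rd: u8, ra: u8) -> Option<usize> {
    let mut ra_intact = true;
    for (j, insn) in seg.iter().enumerate().skip(at + 1) {
        if insn.reads(rd) && !ra_intact {
            return None; // clobbered-resident: decline
        }
        if insn.def() == rd {
            return Some(j); // rd's value is segment-local
        }
        if insn.def() == ra {
            ra_intact = false;
        }
    }
    None // maybe-live-out: decline
}

/// Drops redundant movw instructions; returns the new segment and fire count.
fn const_cse(seg: &[Insn]) -> (Vec<Insn>, usize) {
    let mut out = seg.to_vec();
    let (mut i, mut fires) = (0, 0);
    while i < out.len() {
        if let Insn::Movw { rd, imm } = out[i] {
            let fold = resident(&out[..i], imm, rd)
                .and_then(|ra| safe_cse_uses(&out, i, rd, ra).map(|end| (ra, end)));
            if let Some((ra, end)) = fold {
                for insn in &mut out[i + 1..=end] {
                    insn.rename_use(rd, ra);
                }
                out.remove(i);
                fires += 1;
                continue;
            }
        }
        i += 1;
    }
    (out, fires)
}
```

In this sketch the output can only be shorter than the input, since the single transformation is a removal plus operand renames. That mirrors the "removes only" property claimed for the real pass, though the real pass works on the actual ARM instruction set and declines on RMW and unmodeled ops.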

Validated sound

Behind SYNTH_CONST_CSE=1 (default-off). With it on, all three differential fixtures are result-identical: control_step 0x00210A55, flight_seam 0x07FDF307, div_const 338/338.

The measured finding (the point)

It fires zero times on the real flat_flight: 170→170 insns, 26→26 movw, byte-identical. The greedy selector has already destroyed the residency the pass needs: by the second movw #0x7e, the register holding the first has been reused for another value, so there is no resident copy to fold.
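
The residency gap can be shown with a small counting sketch. This is a hypothetical helper, not part of the PR: `resident_hits` takes a list of register definitions (destination register, plus `Some(constant)` for a movw or `None` for any other value) and counts the movw instructions whose constant is still held in some register at that point.

```rust
/// Counts movw definitions whose constant is already resident in a register.
/// Each entry is (destination register, Some(constant) for a movw, None otherwise).
/// Hypothetical helper for illustration; assumes registers r0..r15.
fn resident_hits(defs: &[(u8, Option<u16>)]) -> usize {
    let mut holds: [Option<u16>; 16] = [None; 16];
    let mut hits = 0;
    for &(rd, val) in defs {
        if let Some(v) = val {
            // A fold is only possible if some register still holds v.
            if holds.iter().any(|h| *h == Some(v)) {
                hits += 1;
            }
        }
        holds[rd as usize] = val;
    }
    hits
}
```

With a greedy-style sequence such as `[(1, Some(0x7e)), (1, None), (2, Some(0x7e))]`, r1 is reused before the second movw, so the count is 0. If the intermediate value goes to a different register, as in `[(1, Some(0x7e)), (3, None), (2, Some(0x7e))]`, the first copy stays resident and the count is 1.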

This is the third time measurement redirected the work (dead-store #246 no-op → mla regressed #277 → const-CSE fires 0). The consistent lesson: const-CSE is an allocator capability, not a post-pass. It only fires after re-allocation gives each value a stable register.

So

The pass stays flag-gated infra — differential-validated and de-risked for the post-re-allocation pipeline stage — not default-on (0 delta on greedy code, which the audit rule forbids shipping). The genuine prerequisite is the re-allocation that restores residency; that's the next and final piece.

3 unit tests (fold a clamp bound; decline on clobbered-resident; decline on maybe-live-out). clippy/fmt clean; 322 lib tests pass.

🤖 Generated with Claude Code

…ures the residency gap (#209/#242)

apply_const_cse(instrs): drop a `movw rd,#v` that re-materializes a constant
already resident in `ra`, and retarget rd's in-segment reads to ra — every
rewrite proven by the liveness analysis (rename_use + safe_cse_uses), and it
ONLY removes materializations (pressure never rises), so unlike the mla fusion
(#277) it cannot regress on-target. Plus the operand rewriter (rename_use:
rewrites uses, never defs; declines on RMW/unmodeled ops) and value ranges.

Wired behind SYNTH_CONST_CSE=1 (default-off). VALIDATED SOUND: all three
differential fixtures RESULT-IDENTICAL with it on (control_step 0x00210A55,
flight_seam 0x07FDF307, div_const 338/338).

MEASURED FINDING (the point of this commit): it fires ZERO times on the real
flat_flight (170→170 insns, 26→26 movw, byte-identical) — because the greedy
selector already DESTROYED the residency it needs: by the second `movw #0x7e`
the register that held the first was reused for another value, so there is no
resident copy to fold. This is the third time measurement redirected the work
(dead-store #246 no-op; mla regressed; const-CSE fires 0). The consistent
lesson: const-CSE is an ALLOCATOR capability, not a post-pass — it only fires
AFTER re-allocation gives each value a stable register. So the pass stays
flag-gated infra (de-risked, differential-validated, ready for the
post-re-allocation pipeline stage), NOT default-on (0 delta on greedy code,
which the audit rule forbids shipping).

Tests: fold a redundant clamp bound; decline on clobbered-resident; decline on
maybe-live-out. clippy/fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@avrabe

avrabe commented Jun 5, 2026

On-target validation across the acceptance suite with SYNTH_CONST_CSE=1 (#281) — it's sound, a clean win where it fires, and pins the residency gap per-bench:

| bench | const-CSE off | on (SYNTH_CONST_CSE=1) | result |
| --- | --- | --- | --- |
| control_step | 158 | 157 | −1 cyc, seam 2165333 ✅; 1 opportunity survives |
| flat_flight | 255 | 255 (byte-identical) | 0 fires: residency gap (its ~14 need vreg output) |
| filter_axis | 37 | 37 | 0 fires |
| controller_step | 162 | 162 | 0 fires |

So your "sound but measures the residency gap" framing is exactly right on silicon: const-CSE is correct and strictly non-regressing (the one control_step site that stays resident folds to a real −1 cyc), and the 0-fires on flat_flight is the measured proof that the residency comes from the vreg output, not the CSE pass. The pass is ready; it's starved of resident constants until the selector stops overloading registers.

This is the clean half of the #277 lesson playing out: const-CSE can't regress (removes only), so unlike mla there's no on-target risk to flipping it on once residency exists — the 0 → 14 on flat_flight will land entirely from the vreg output feeding this same pass. I'll re-run the full suite (shadow + silicon) the moment that's in; expect control_step 157 to hold and flat_flight to take its first real drop. Nice — the pass being non-regressing by construction is the right property.

@avrabe avrabe merged commit 195d15b into main Jun 5, 2026
13 checks passed
@avrabe avrabe deleted the feat/vcr-ra-const-cse branch June 5, 2026 19:53
@codecov

codecov Bot commented Jun 5, 2026

Codecov Report

❌ Patch coverage is 46.85864% with 203 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| crates/synth-synthesis/src/liveness.rs | 46.70% | 202 Missing ⚠️ |
| crates/synth-backend/src/arm_backend.rs | 66.66% | 1 Missing ⚠️ |