feat: const-CSE rematerialization-avoidance — sound, measures the residency gap (#209/#242)#281
…ures the residency gap (#209/#242)

apply_const_cse(instrs): drop a `movw rd,#v` that re-materializes a constant already resident in `ra`, and retarget rd's in-segment reads to ra. Every rewrite is proven by the liveness analysis (rename_use + safe_cse_uses), and it ONLY removes materializations (pressure never rises), so unlike the mla fusion (#277) it cannot regress on-target. Plus the operand rewriter (rename_use: rewrites uses, never defs; declines on RMW/unmodeled ops) and value ranges. Wired behind SYNTH_CONST_CSE=1 (default-off).

VALIDATED SOUND: all three differential fixtures are RESULT-IDENTICAL with it on (control_step 0x00210A55, flight_seam 0x07FDF307, div_const 338/338).

MEASURED FINDING (the point of this commit): it fires ZERO times on the real flat_flight (170→170 insns, 26→26 movw, byte-identical), because the greedy selector already DESTROYED the residency it needs: by the second `movw #0x7e` the register that held the first was reused for another value, so there is no resident copy to fold.

This is the third time measurement redirected the work (dead-store #246 no-op; mla regressed; const-CSE fires 0). The consistent lesson: const-CSE is an ALLOCATOR capability, not a post-pass. It only fires AFTER re-allocation gives each value a stable register. So the pass stays flag-gated infra (de-risked, differential-validated, ready for the post-re-allocation pipeline stage), NOT default-on (0 delta on greedy code, which the audit rule forbids shipping).

Tests: fold a redundant clamp bound; decline on clobbered-resident; decline on maybe-live-out. clippy/fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
On-target validation across the acceptance suite with
So your "sound but measures the residency gap" framing is exactly right on silicon: const-CSE is correct and strictly non-regressing (the one This is the clean half of the #277 lesson playing out: const-CSE can't regress (removes only), so unlike mla there's no on-target risk to flipping it on once residency exists — the |
The const-CSE pass, built and validated — and the measurement that pins the real blocker
`apply_const_cse` drops a `movw rd,#v` that re-materializes a constant already resident in `ra` and retargets rd's reads to ra. Every rewrite is proven by the liveness analysis (`rename_use` rewrites uses, never defs; `safe_cse_uses` requires ra to still hold v through every read and rd's value to be segment-local), and it only removes materializations (pressure never rises), so unlike the mla fusion (#277) it cannot regress on-target.

Validated sound
Behind `SYNTH_CONST_CSE=1` (default-off). With it on, all three differential fixtures are result-identical: `control_step` 0x00210A55, `flight_seam` 0x07FDF307, `div_const` 338/338.

The measured finding (the point)
It fires zero times on the real `flat_flight`: `170→170` insns, `26→26` movw, byte-identical. The greedy selector has already destroyed the residency it needs: by the second `movw #0x7e`, the register holding the first was reused for another value, so there's no resident copy to fold.

This is the third time measurement redirected the work (dead-store #246 no-op → mla regressed #277 → const-CSE fires 0). The consistent lesson: const-CSE is an allocator capability, not a post-pass. It only fires after re-allocation gives each value a stable register.
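The residency gap can be illustrated with a minimal sketch on a toy IR. Everything here is an assumption made for illustration: the `Instr` enum, `fold_end`, `rename_read`, and `const_cse_sketch` are hypothetical stand-ins, not the PR's `apply_const_cse`, `rename_use`, or `safe_cse_uses`, and the real liveness analysis is certainly richer. The sketch folds a `movw` only when another register still holds the same constant, that register is not overwritten before rd is redefined, and rd is redefined inside the segment (so it cannot be live-out).

```rust
use std::collections::BTreeMap;

/// Toy IR for this sketch only (hypothetical, not the project's instruction type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Instr {
    /// `movw rd, #imm`: materialize a constant into rd.
    Movw { rd: u8, imm: u16 },
    /// `add rd, rn, rm`: reads rn and rm, then writes rd.
    Add { rd: u8, rn: u8, rm: u8 },
}

/// Register written by an instruction.
fn def(ins: &Instr) -> u8 {
    match ins {
        Instr::Movw { rd, .. } | Instr::Add { rd, .. } => *rd,
    }
}

/// Returns the index where `rd` is next redefined if folding is safe.
/// Declines (None) if `ra` is overwritten first, or if `rd` is never
/// redefined in the segment and so may be live-out. This is conservative.
fn fold_end(instrs: &[Instr], start: usize, rd: u8, ra: u8) -> Option<usize> {
    for (j, ins) in instrs.iter().enumerate().skip(start + 1) {
        if def(ins) == rd {
            return Some(j);
        }
        if def(ins) == ra {
            return None;
        }
    }
    None
}

/// Rewrites a read of `r` at index `i` through an active rename, if any.
/// Only reads are rewritten; definitions are never touched.
fn rename_read(rename: &BTreeMap<u8, (u8, usize)>, r: u8, i: usize) -> u8 {
    match rename.get(&r) {
        Some(&(ra, end)) if i <= end => ra,
        _ => r,
    }
}

/// Returns the rewritten segment and how many `movw` were removed.
fn const_cse_sketch(instrs: &[Instr]) -> (Vec<Instr>, usize) {
    let mut out = Vec::with_capacity(instrs.len());
    // Constants known to be held by a register in the rewritten code.
    let mut resident: BTreeMap<u8, u16> = BTreeMap::new();
    // rd -> (ra, last index at which reads of rd are redirected to ra).
    let mut rename: BTreeMap<u8, (u8, usize)> = BTreeMap::new();
    let mut fired = 0;
    for (i, ins) in instrs.iter().enumerate() {
        match *ins {
            Instr::Movw { rd, imm } => {
                let holder = resident
                    .iter()
                    .find(|&(&r, &v)| v == imm && r != rd)
                    .map(|(&r, _)| r);
                resident.remove(&rd);
                let fold = holder.and_then(|ra| fold_end(instrs, i, rd, ra).map(|end| (ra, end)));
                match fold {
                    Some(target) => {
                        // Drop the movw; later reads of rd go to the resident copy.
                        rename.insert(rd, target);
                        fired += 1;
                    }
                    None => {
                        resident.insert(rd, imm);
                        out.push(*ins);
                    }
                }
            }
            Instr::Add { rd, rn, rm } => {
                let (rn, rm) = (rename_read(&rename, rn, i), rename_read(&rename, rm, i));
                out.push(Instr::Add { rd, rn, rm });
                resident.remove(&rd);
            }
        }
    }
    (out, fired)
}
```

On a sequence where `r0` keeps `#0x7e` until a second `movw r3,#0x7e`, this sketch removes the second `movw` and redirects the read of `r3` to `r0`. Insert a single `add r0, r2, r2` between the two materializations (the register reuse a greedy selector produces) and it removes nothing, returning the input unchanged. That matches the zero-fire result measured on `flat_flight`.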
So
The pass stays flag-gated infra — differential-validated and de-risked for the post-re-allocation pipeline stage — not default-on (0 delta on greedy code, which the audit rule forbids shipping). The genuine prerequisite is the re-allocation that restores residency; that's the next and final piece.
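A default-off gate of this kind might look like the sketch below. Only the variable name `SYNTH_CONST_CSE` and the value `1` come from the PR; the helper names, the rule that any other value leaves the pass off, and the generic `run_gated` wrapper are assumptions for illustration, not the project's actual wiring.

```rust
use std::env;

/// Pure decision helper. Assumption: only the exact string "1" enables the pass.
fn const_cse_enabled(raw: Option<&str>) -> bool {
    raw == Some("1")
}

/// Reads the flag from the process environment; unset or non-UTF-8 means off.
fn const_cse_enabled_from_env() -> bool {
    const_cse_enabled(env::var("SYNTH_CONST_CSE").ok().as_deref())
}

/// Runs `pass` only when enabled; otherwise returns the input unchanged,
/// so the default pipeline output stays byte-identical.
fn run_gated<T: Clone>(instrs: &[T], enabled: bool, pass: impl Fn(&[T]) -> Vec<T>) -> Vec<T> {
    if enabled {
        pass(instrs)
    } else {
        instrs.to_vec()
    }
}
```

Keeping the decision in a pure function makes the default-off behavior testable without mutating the process environment.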
3 unit tests (fold a clamp bound; decline on clobbered-resident; decline on maybe-live-out). clippy/fmt clean; 322 lib tests pass.
🤖 Generated with Claude Code