feat(synthesis): const-CSE detection + function-wide application (VCR-RA-001)#245
Merged
Conversation
…analysis) Second analysis on the verified-codegen allocator track (#242), building on the liveness primitive from #243. Adds redundant_const_defs(): within a straight-line block, the constant materializations that re-derive a value ALREADY resident in a register (not clobbered since) — the forward dual of local_dead_defs (which finds a write with no later read; this finds a write whose value was already computed). This is the analysis behind const-CSE / rematerialization-avoidance — the dominant hot-path waste gale measured on flat_flight (#209): the int8 saturation clamps #0x7e / #0x7f re-materialized 6x each, 61% of all constant loads redundant. A new test reproduces that exact pattern (6 clamp loads, first stays live -> 5 flagged redundant). Pure detection — reports the opportunity; the fix (keep the constant resident across its uses) is the allocator's job. Recognizes the single-instruction const forms (Movw 16-bit, Mov #imm); a Movw;Movt pair is correctly NOT treated as a pure 16-bit const (Movt invalidates the tracked value). Returns None for any stream it cannot fully model (branch / call / unmodeled op), so a result is never wrong. Not wired into codegen -> emitted bytes unchanged. Tests (5): rematerialization-while-resident flagged; clobber respected; flat_flight 6x-clamp pattern -> 5 redundant; Movw;Movt pair not a pure const; declines on branch. Full lib suite 274 passed; control_step ORACLE PASS (bit-identical). Part of #242. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
6 tasks
Make the analyses usable on real (branchy) functions. The per-block detectors decline (None) on any branch; real functions like flat_flight (the br_table clamps gale measures, #209) are a sequence of straight-line blocks separated by control flow. analyze_function() splits the stream into MAXIMAL straight-line, fully-modeled segments (a branch/call/unmodeled op ends the current segment), runs both local_dead_defs and redundant_const_defs per segment, and reports global indices + segment/skip counts. Sound by construction: state resets at every boundary, so cross-block redundancy is under-reported, never over-reported, and an unmodeled instruction never lands inside a segment. Never returns None — it analyzes whatever is analyzable. This is what lets the const-CSE detector be pointed at flat_flight (gale's frozen Track-A baseline 315 cyc / 34 const-loads / 13 distinct) and yield a real function-wide redundancy count, instead of declining on the first branch. Tests (2): redundancy found in BOTH segments across a branch (per-block detector declines on the whole stream); no cross-boundary redundancy claimed (state reset at the boundary). Full lib suite 276 passed; control_step ORACLE PASS (bit-identical, still pure analysis — not wired into codegen). Part of #242. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
avrabe
added a commit
that referenced
this pull request
Jun 4, 2026
…re transform) (#246) The analysis layer is complete on main (#243 + #245); this adds the first TRANSFORM built on it — eliminate_dead_stores(): removes the provably-dead stores analyze_function finds (a write overwritten before any use within its straight-line segment, dead regardless of cross-block liveness). The generic, function-wide form of the hand-rolled #221 peephole. Branch-offset safety is by construction: dead instructions are interior to a straight-line segment and never branch targets, so the rewrite is offset-neutral — intended to run on select_with_stack output BEFORE resolve_label_branches, where branches are still symbolic labels. Side-by-side discipline: this is a PURE function, NOT wired into codegen by this change, so emitted bytes are unchanged (control_step ORACLE PASS). The wiring is a separate, fully oracle-gated step (every differential fixture result-identical + a measured size/cycle delta + fuzz) — the migration that finally emits gale's awaited flat_flight delta. Tests (3): overwritten def removed (keeps the live one); no-op + identical stream when nothing dead; a def at a segment boundary is never removed (could be live past the branch). Full lib suite 279 passed. Part of #242. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
avrabe
added a commit
that referenced
this pull request
Jun 4, 2026
…put (VCR-RA-001) (#247) * test(synthesis): validate liveness analyses against real selector output (VCR-RA-001) The liveness/CSE analyses (#243/#245) and the dead-store transform (#246) are unit-tested on synthetic instruction vectors. Before wiring the transform into the codegen path, confirm the analyses are sound on REAL select_with_stack output: run the actual selector on a constant-reusing sequence ((p & 0x7e) + (p & 0x7e), mirroring gale's repeated-clamp shape) and assert analyze_function / redundant_const_defs / eliminate_dead_stores are internally consistent with the emitted instructions — every reported redundant-const index is a materialization of the claimed value, dead-def indices are valid, and elimination is length-consistent. This is the precondition for the gated wiring step: the analysis correctly reads real codegen, not just hand-built vectors. Pure test addition — no codegen change. Part of #242. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * style: rustfmt the VCR validation test Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
This was referenced Jun 4, 2026
avrabe
added a commit
that referenced
this pull request
Jun 5, 2026
…rectness fixes (#260) Promote the accumulated v0.11.30 work into the CHANGELOG before tagging: native-pointer ABI (#237) + the VCR-* constant-immediate folding (#250/#252/#254) + analysis foundation (#243/#245) + three latent-miscompile encoder fixes (#251 ORR/EOR NOP, #253 ADD/SUB large-frame, #255 CMP/ADDS/SUBS ThumbExpandImm). Adds a falsification statement covering the encoder correctness class. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Two analysis increments on the verified-codegen allocator track (VCR-RA-001, epic #242), building on the liveness primitive from #243. Pure analysis — not wired into codegen → emitted bytes unchanged.
1. Redundant-constant detection (const-CSE analysis)
redundant_const_defs(): within a straight-line block, the constant materializations that re-derive a value already resident in a register (not clobbered since) — the forward dual oflocal_dead_defs(which finds a write with no later read; this finds a write whose value was already computed).Targets gale's #1 measured lever (#209): the int8 saturation clamps
#0x7e/#0x7fre-materialized 6× each, 61% of all constant loads redundant onflat_flight. A test reproduces that exact pattern → 5 of 6 flagged redundant. Recognizes the single-instruction const forms (Movw16-bit,Mov #imm); aMovw;Movtpair is correctly not a pure 16-bit const.2. Function-wide application
analyze_function(): the per-block detectors decline (None) on any branch; real functions (flat_flight'sbr_tableclamps) are a sequence of straight-line blocks. This splits the stream into maximal straight-line, fully-modeled segments, runs both detectors per segment, and reports global indices + segment/skip counts. Sound by construction: state resets at every boundary → cross-block redundancy is under-reported, never over-reported. This is what lets the detector be pointed atflat_flightand yield a real function-wide count instead of declining on the first branch.Verification
flat_flight6×-clamp pattern; 2 function-wide incl. across-a-branch + boundary-reset soundness).0x00210A55ORACLE PASS (bit-identical).Side-by-side discipline
Both halves of gale's dominant waste are now detectable on real functions without changing a byte. gale confirmed (#209) the Track-A baseline is frozen and byte-identical off merged
main— "ready to reflash the moment [code] emits a delta." The transform (allocator keeps the clamp bounds resident; or DCE deletes a dead store) migrates in as a separate oracle-gated step underVCR-VER-001.Part of #242.
🤖 Generated with Claude Code