
feat(synthesis): const-CSE detection + function-wide application (VCR-RA-001)#245

Merged
avrabe merged 2 commits into main from feat/vcr-const-cse-detection
Jun 4, 2026
Conversation

@avrabe

@avrabe avrabe commented Jun 4, 2026

Contributor

Two analysis increments on the verified-codegen allocator track (VCR-RA-001, epic #242), building on the liveness primitive from #243. Both are pure analysis: neither is wired into codegen, so emitted bytes are unchanged.

1. Redundant-constant detection (const-CSE analysis)

redundant_const_defs() reports, within a straight-line block, the constant materializations that re-derive a value already resident in a register (and not clobbered since). It is the forward dual of local_dead_defs: that one finds a write with no later read, this one finds a write whose value was already computed.

Targets gale's top measured lever (#209): the int8 saturation clamps #0x7e / #0x7f are re-materialized 6× each, and 61% of all constant loads on flat_flight are redundant. A test reproduces that exact pattern and flags 5 of the 6 loads as redundant. The detector recognizes the single-instruction const forms (Movw 16-bit, Mov #imm); a Movw;Movt pair is correctly not treated as a pure 16-bit const.
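The detection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in liveness.rs: the `Instr` enum, its variants, and `clamp_pattern()` are hypothetical stand-ins for the crate's real instruction type, and "redundant" is taken to mean "the same value is still resident in any register".

```rust
use std::collections::HashMap;

/// Hypothetical, simplified instruction model (NOT the crate's real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Instr {
    /// Single-instruction constant: `mov rd, #imm` or `movw rd, #imm16`.
    MovImm { rd: u8, imm: u32 },
    /// `movt rd, #imm16`: rewrites the top half, so rd no longer holds a tracked constant.
    Movt { rd: u8 },
    /// Any modeled ALU op that overwrites rd.
    Alu { rd: u8, rn: u8, rm: u8 },
    /// Branch, call, or unmodeled op: the analysis declines.
    Unmodeled,
}

/// Indices of constant loads whose value is already resident in some register.
/// Returns None for any stream that cannot be fully modeled.
pub fn redundant_const_defs(block: &[Instr]) -> Option<Vec<usize>> {
    // register -> constant it is currently known to hold
    let mut resident: HashMap<u8, u32> = HashMap::new();
    let mut redundant = Vec::new();
    for (i, ins) in block.iter().enumerate() {
        match *ins {
            Instr::MovImm { rd, imm } => {
                if resident.values().any(|&v| v == imm) {
                    redundant.push(i);
                }
                resident.insert(rd, imm);
            }
            // Any other write to rd invalidates what was tracked for it.
            Instr::Movt { rd } | Instr::Alu { rd, .. } => {
                resident.remove(&rd);
            }
            Instr::Unmodeled => return None,
        }
    }
    Some(redundant)
}

/// Assumed shape of the 6x clamp pattern: the first load stays live in r4,
/// the other five are re-materialized into a scratch register that is then clobbered.
pub fn clamp_pattern() -> Vec<Instr> {
    let mut v = vec![Instr::MovImm { rd: 4, imm: 0x7f }];
    for _ in 0..5 {
        v.push(Instr::MovImm { rd: 5, imm: 0x7f });
        v.push(Instr::Alu { rd: 5, rn: 0, rm: 5 });
    }
    v
}
```

On `clamp_pattern()` the sketch flags the five scratch loads and leaves the first one alone, which matches the 5-of-6 result reported here.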

2. Function-wide application

analyze_function() makes the detectors usable on branchy code. The per-block detectors decline (return None) on any branch, but real functions (flat_flight's br_table clamps) are a sequence of straight-line blocks separated by control flow. analyze_function() splits the stream into maximal straight-line, fully-modeled segments, runs both detectors per segment, and reports global indices plus segment/skip counts. It is sound by construction: state resets at every boundary, so cross-block redundancy is under-reported, never over-reported. This is what lets the detector be pointed at flat_flight and yield a real function-wide count instead of declining on the first branch.

Verification

  • 7 new tests total (5 const-CSE incl. the flat_flight 6×-clamp pattern; 2 function-wide incl. across-a-branch + boundary-reset soundness).
  • Full synth-synthesis lib suite: 276 passed.
  • control_step 0x00210A55 ORACLE PASS (bit-identical).

Side-by-side discipline

Both halves of gale's dominant waste are now detectable on real functions without changing a byte. gale confirmed (#209) the Track-A baseline is frozen and byte-identical off merged main — "ready to reflash the moment [code] emits a delta." The transform (allocator keeps the clamp bounds resident; or DCE deletes a dead store) migrates in as a separate oracle-gated step under VCR-VER-001.

Part of #242.

🤖 Generated with Claude Code

…analysis)

Second analysis on the verified-codegen allocator track (#242), building on the
liveness primitive from #243. Adds redundant_const_defs(): within a straight-line
block, the constant materializations that re-derive a value ALREADY resident in a
register (not clobbered since) — the forward dual of local_dead_defs (which finds
a write with no later read; this finds a write whose value was already computed).

This is the analysis behind const-CSE / rematerialization-avoidance — the
dominant hot-path waste gale measured on flat_flight (#209): the int8 saturation
clamps #0x7e / #0x7f re-materialized 6x each, 61% of all constant loads
redundant. A new test reproduces that exact pattern (6 clamp loads, first stays
live -> 5 flagged redundant).

Pure detection — reports the opportunity; the fix (keep the constant resident
across its uses) is the allocator's job. Recognizes the single-instruction const
forms (Movw 16-bit, Mov #imm); a Movw;Movt pair is correctly NOT treated as a
pure 16-bit const (Movt invalidates the tracked value). Returns None for any
stream it cannot fully model (branch / call / unmodeled op), so a result is never
wrong. Not wired into codegen -> emitted bytes unchanged.

Tests (5): rematerialization-while-resident flagged; clobber respected;
flat_flight 6x-clamp pattern -> 5 redundant; Movw;Movt pair not a pure const;
declines on branch. Full lib suite 274 passed; control_step ORACLE PASS
(bit-identical).

Part of #242.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Make the analyses usable on real (branchy) functions. The per-block detectors
decline (None) on any branch; real functions like flat_flight (the br_table
clamps gale measures, #209) are a sequence of straight-line blocks separated by
control flow.

analyze_function() splits the stream into MAXIMAL straight-line, fully-modeled
segments (a branch/call/unmodeled op ends the current segment), runs both
local_dead_defs and redundant_const_defs per segment, and reports global
indices + segment/skip counts. Sound by construction: state resets at every
boundary, so cross-block redundancy is under-reported, never over-reported, and
an unmodeled instruction never lands inside a segment. Never returns None — it
analyzes whatever is analyzable.
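The segmentation can be sketched roughly like this. Everything named here (`Op`, `FunctionReport`, `flush`, `redundant_in_segment`) is a hypothetical simplification for illustration; the sketch runs only a const-redundancy detector per segment, whereas the real function also runs local_dead_defs.

```rust
use std::collections::HashMap;

/// Hypothetical instruction model (NOT the crate's real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Const { rd: u8, imm: u32 },
    Write { rd: u8 },
    /// Branch, call, or unmodeled op: ends the current segment.
    Boundary,
}

/// Hypothetical report shape: global indices plus segment/skip counts.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct FunctionReport {
    pub redundant: Vec<usize>,
    pub segments: usize,
    pub skipped: usize,
}

/// Per-segment detector; starts from empty state, so nothing leaks across a boundary.
fn redundant_in_segment(seg: &[Op]) -> Vec<usize> {
    let mut resident: HashMap<u8, u32> = HashMap::new();
    let mut out = Vec::new();
    for (i, op) in seg.iter().enumerate() {
        match *op {
            Op::Const { rd, imm } => {
                if resident.values().any(|&v| v == imm) {
                    out.push(i);
                }
                resident.insert(rd, imm);
            }
            Op::Write { rd } => {
                resident.remove(&rd);
            }
            Op::Boundary => unreachable!("segments never contain a boundary"),
        }
    }
    out
}

/// Analyze one non-empty segment and translate local indices to global ones.
fn flush(seg: &[Op], base: usize, report: &mut FunctionReport) {
    if seg.is_empty() {
        return;
    }
    report.segments += 1;
    report
        .redundant
        .extend(redundant_in_segment(seg).into_iter().map(|i| i + base));
}

/// Never declines: analyzes every straight-line segment and skips the boundaries.
pub fn analyze_function(stream: &[Op]) -> FunctionReport {
    let mut report = FunctionReport::default();
    let mut start = 0;
    for (i, op) in stream.iter().enumerate() {
        if matches!(op, Op::Boundary) {
            flush(&stream[start..i], start, &mut report);
            report.skipped += 1;
            start = i + 1;
        }
    }
    flush(&stream[start..], start, &mut report);
    report
}
```

Because each segment starts from empty state, a constant loaded before a boundary is never used to flag a load after it. That is the under-reporting direction the soundness argument relies on.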

This is what lets the const-CSE detector be pointed at flat_flight (gale's frozen
Track-A baseline 315 cyc / 34 const-loads / 13 distinct) and yield a real
function-wide redundancy count, instead of declining on the first branch.

Tests (2): redundancy found in BOTH segments across a branch (per-block detector
declines on the whole stream); no cross-boundary redundancy claimed (state reset
at the boundary). Full lib suite 276 passed; control_step ORACLE PASS
(bit-identical, still pure analysis — not wired into codegen).

Part of #242.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@avrabe avrabe changed the title from "feat(synthesis): redundant-constant detection (VCR-RA-001, const-CSE analysis)" to "feat(synthesis): const-CSE detection + function-wide application (VCR-RA-001)" Jun 4, 2026
@avrabe avrabe merged commit b52229a into main Jun 4, 2026
12 checks passed
@avrabe avrabe deleted the feat/vcr-const-cse-detection branch June 4, 2026 19:01
@codecov

codecov Bot commented Jun 4, 2026


Codecov Report

❌ Patch coverage is 96.27660% with 7 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/synth-synthesis/src/liveness.rs | 96.27% | 7 Missing ⚠️ |


avrabe added a commit that referenced this pull request Jun 4, 2026
…re transform) (#246)

The analysis layer is complete on main (#243 + #245); this adds the first
TRANSFORM built on it — eliminate_dead_stores(): removes the provably-dead
stores analyze_function finds (a write overwritten before any use within its
straight-line segment, dead regardless of cross-block liveness). The generic,
function-wide form of the hand-rolled #221 peephole.

Branch-offset safety is by construction: dead instructions are interior to a
straight-line segment and never branch targets, so the rewrite is offset-neutral
— intended to run on select_with_stack output BEFORE resolve_label_branches,
where branches are still symbolic labels.
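A rough sketch of such a transform is shown below, under simplifying assumptions that are mine rather than the commit's: the `Ins` enum is hypothetical, every modeled instruction has no effect other than writing `rd` (no flags, no memory), and a single forward pass is used instead of consuming analyze_function's report.

```rust
use std::collections::HashMap;

/// Hypothetical instruction model (NOT the crate's real type).
/// Assumption: modeled instructions have no side effect besides writing rd.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ins {
    /// Writes rd, reads nothing (e.g. a constant load).
    Def { rd: u8 },
    /// Reads rn, then writes rd.
    Op { rd: u8, rn: u8 },
    /// Segment boundary; branches are still symbolic, so removal never shifts a target.
    Branch,
}

/// Remove writes that are overwritten before any read within the same segment.
pub fn eliminate_dead_stores(stream: &[Ins]) -> Vec<Ins> {
    let mut dead = vec![false; stream.len()];
    // register -> index of its latest write that has not been read yet
    let mut unread: HashMap<u8, usize> = HashMap::new();
    for (i, ins) in stream.iter().enumerate() {
        match *ins {
            // A write pending at a boundary may be live past the branch: keep it.
            Ins::Branch => unread.clear(),
            Ins::Def { rd } => {
                if let Some(prev) = unread.insert(rd, i) {
                    dead[prev] = true;
                }
            }
            Ins::Op { rd, rn } => {
                // The read happens before the write, so `r1 = f(r1)` keeps the old def.
                unread.remove(&rn);
                if let Some(prev) = unread.insert(rd, i) {
                    dead[prev] = true;
                }
            }
        }
    }
    stream
        .iter()
        .zip(dead)
        .filter(|(_, d)| !*d)
        .map(|(ins, _)| *ins)
        .collect()
}
```

The pass is deliberately conservative: a write still pending when a `Branch` is reached is kept, and a read by an instruction that is itself later found dead still counts as a read.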

Side-by-side discipline: this is a PURE function, NOT wired into codegen by this
change, so emitted bytes are unchanged (control_step ORACLE PASS). The wiring is
a separate, fully oracle-gated step (every differential fixture result-identical
+ a measured size/cycle delta + fuzz) — the migration that finally emits gale's
awaited flat_flight delta.

Tests (3): overwritten def removed (keeps the live one); no-op + identical stream
when nothing dead; a def at a segment boundary is never removed (could be live
past the branch). Full lib suite 279 passed.

Part of #242.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
avrabe added a commit that referenced this pull request Jun 4, 2026
…put (VCR-RA-001) (#247)

* test(synthesis): validate liveness analyses against real selector output (VCR-RA-001)

The liveness/CSE analyses (#243/#245) and the dead-store transform (#246) are
unit-tested on synthetic instruction vectors. Before wiring the transform into
the codegen path, confirm the analyses are sound on REAL select_with_stack
output: run the actual selector on a constant-reusing sequence
((p & 0x7e) + (p & 0x7e), mirroring gale's repeated-clamp shape) and assert
analyze_function / redundant_const_defs / eliminate_dead_stores are internally
consistent with the emitted instructions — every reported redundant-const index
is a materialization of the claimed value, dead-def indices are valid, and
elimination is length-consistent.
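The consistency conditions listed above could be checked along these lines. This is only a sketch: `Emitted` and `check_report` are hypothetical names, the report is assumed to carry `(index, value)` pairs for redundant constants, and "length-consistent" is read here as "the output is shorter by exactly the number of (unique) dead defs".

```rust
/// Hypothetical view of a selector-emitted instruction (NOT the crate's real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Emitted {
    ConstLoad { rd: u8, imm: u32 },
    Other,
}

/// Check that an analysis report is internally consistent with the emitted stream.
/// `dead` is assumed to contain unique indices.
pub fn check_report(
    stream: &[Emitted],
    redundant: &[(usize, u32)],
    dead: &[usize],
    eliminated_len: usize,
) -> Result<(), String> {
    // Every redundant-const index must point at a load of the claimed value.
    for &(idx, value) in redundant {
        match stream.get(idx) {
            Some(Emitted::ConstLoad { imm, .. }) if *imm == value => {}
            other => {
                return Err(format!(
                    "index {idx} is {other:?}, expected a load of {value:#x}"
                ))
            }
        }
    }
    // Every dead-def index must be in range.
    if let Some(bad) = dead.iter().find(|&&i| i >= stream.len()) {
        return Err(format!("dead-def index {bad} out of range"));
    }
    // Elimination must drop exactly the dead defs.
    if eliminated_len + dead.len() != stream.len() {
        return Err(format!(
            "eliminated length {eliminated_len} inconsistent with {} dead defs",
            dead.len()
        ));
    }
    Ok(())
}
```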

This is the precondition for the gated wiring step: the analysis correctly reads
real codegen, not just hand-built vectors. Pure test addition — no codegen
change.

Part of #242.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* style: rustfmt the VCR validation test

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
avrabe added a commit that referenced this pull request Jun 5, 2026
…rectness fixes (#260)

Promote the accumulated v0.11.30 work into the CHANGELOG before tagging:
native-pointer ABI (#237) + the VCR-* constant-immediate folding (#250/#252/#254)
+ analysis foundation (#243/#245) + three latent-miscompile encoder fixes
(#251 ORR/EOR NOP, #253 ADD/SUB large-frame, #255 CMP/ADDS/SUBS ThumbExpandImm).
Adds a falsification statement covering the encoder correctness class.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>