Skip to content

fix(opt): defensive panic on unmapped vreg instead of silent R0 fallback#101

Merged
avrabe merged 1 commit into
mainfrom
fix/defensive-vreg-fallback
May 14, 2026
Merged

fix(opt): defensive panic on unmapped vreg instead of silent R0 fallback#101
avrabe merged 1 commit into
mainfrom
fix/defensive-vreg-fallback

Conversation

@avrabe

@avrabe avrabe commented May 11, 2026

Copy link
Copy Markdown
Contributor

Follow-up to PR #97 (which fixed the silicon-blocking #93 memset bug).

The root cause of #93 was that optimizer_bridge::wasm_to_ir silently dropped three wasm ops (I64ExtendI32U/S, I32WrapI64). The downstream lookup helper get_arm_reg returned Reg::R0 as a silent fallback when a vreg had no assignment — so subsequent i64 shifts read R0 as their rm_lo/rm_hi, destroying memset's destination pointer.

This PR replaces the silent fallback with a diagnostic panic naming the offending vreg. Any future "wasm op silently dropped" bugs of this class will surface as a compiler crash with a useful error message, instead of producing miscompiled firmware that boots-and-loops on real silicon.

A compiler crash is strictly better than a hung MCU.

Test plan

🤖 Generated with Claude Code

avrabe added a commit that referenced this pull request May 11, 2026
Unblocks every open PR (#96, #97, #99, #100, #101) that was failing on `No space left on device` during z3-sys's C++ build on smithy runners.
@avrabe avrabe force-pushed the fix/defensive-vreg-fallback branch from 3530719 to 37b9787 Compare May 11, 2026 05:30
@avrabe avrabe closed this May 11, 2026
@avrabe avrabe reopened this May 11, 2026
@avrabe avrabe force-pushed the fix/defensive-vreg-fallback branch from 37b9787 to 592fd27 Compare May 11, 2026 18:34
@avrabe avrabe force-pushed the fix/defensive-vreg-fallback branch from 592fd27 to d752201 Compare May 12, 2026 04:28
avrabe added a commit that referenced this pull request May 13, 2026
#101

The systematic AAPCS-clobber audit (47 sites + 27 tests) is shippable
on its own. The defensive panic in get_arm_reg fires on at least one
remaining wasm_to_ir gap (vreg v13 during simple_add.wat fib compile,
seen on Bazel CI for genrules test_add_elf / test_add_m7_elf /
test_add_riscv_elf). That panic belongs in #101 once the remaining
gap is closed.

Cargo workspace tests + clippy + fmt clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
avrabe added a commit that referenced this pull request May 13, 2026
… usages (#108)

Sweep of 47 hardcoded R0..R3 sites across the optimizer + selector. Closes the recurring class that PR #86 (i64 const), #97 (i64 extend/wrap), #106 (24 i64 ops), and #107 (CSE MemLoad/Store) each addressed one slice of.

Categories audited:
* wasm_to_ir slot accounting for i64 arith/extend ops (15 + 3 ops)
* wasm_to_ir op-handler gaps (10 wasm ops → 6 new Opcode variants)
* CSE pass: missing src-vreg resolution for all i64 opcodes + alias invalidation on re-def
* ir_to_arm hardcoded `Reg::R0` / `Reg::R3` destinations (24 handlers fixed via `alloc_i32_scratch`)
* select_with_stack i64 sub-word loads hardcoding `R0:R1` dest pair

27 regression tests added across 4 new test files. Defensive panic stays in PR #101 (one remaining v13 vreg case during simple_add.wat fib compilation is being tracked down — silent R0 fallback preserved in this PR).
@avrabe avrabe force-pushed the fix/defensive-vreg-fallback branch from d752201 to 5a2c798 Compare May 13, 2026 19:06
@avrabe avrabe changed the base branch from main to fix/wasm-to-ir-call-result-vreg May 13, 2026 19:06
@codecov

codecov Bot commented May 13, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/synth-synthesis/src/optimizer_bridge.rs 0.00% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

avrabe added a commit that referenced this pull request May 14, 2026
Closes the last latent unmapped-vreg gap surfaced by the systematic AAPCS audit (PR #108). `WasmOp::Call` had no wasm_to_ir handler, fell through to `Opcode::Nop`. The call's result vreg was never mapped, and any downstream consumer (`i32.add(call_result, ...)`) resolved its operand via `get_arm_reg`'s silent R0 fallback — miscompiling on every fib-style recursive WAT.

Fix: new `Opcode::Call { dest, func_idx }` in synth-opt, mapped in wasm_to_ir, lowered to `BL func_<idx>` with `dest → R0` in ir_to_arm.

Narrow scope (does not yet model R0..R3 invalidation across BL — broader call-boundary regalloc rework deferred).

+2 regression tests. Unblocks #101 (defensive panic ships next).
Base automatically changed from fix/wasm-to-ir-call-result-vreg to main May 14, 2026 04:50
Follow-up to issue #93 (silicon-blocking memset bug). The root cause was
optimizer_bridge::wasm_to_ir silently dropping three wasm ops
(I64ExtendI32U/S, I32WrapI64) — they produced vregs that never got mapped
to ARM registers. The downstream lookup in get_arm_reg returned Reg::R0
as a silent fallback, so subsequent i64 shifts read R0 as rm_lo/rm_hi,
destroying memset's destination pointer on real silicon.

Replace the silent fallback with a panic that names the vreg. Future
"wasm op silently dropped" bugs of this class will surface at the
boundary instead of producing miscompiled firmware. A compiler crash is
strictly better than a hung MCU.

Verified:
* cargo test --workspace — all pass (no path legitimately hits the
  fallback — the post-#97 codebase produces mappings for every wasm op
  it emits IR for)
* cargo clippy --workspace --all-targets -- -D warnings — clean
* cargo fmt --check — clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@avrabe avrabe force-pushed the fix/defensive-vreg-fallback branch from 5a2c798 to 534735a Compare May 14, 2026 04:55
@avrabe avrabe merged commit b4c8780 into main May 14, 2026
6 of 7 checks passed
@avrabe avrabe deleted the fix/defensive-vreg-fallback branch May 14, 2026 05:29
avrabe added a commit that referenced this pull request May 14, 2026
Two harnesses (i64_lowering_doesnt_clobber_params, encoder_no_panic)
keep finding new AAPCS-class bugs at a rate faster than we close them
in a single release cycle. Mark them as non-gating (continue-on-error)
so the harness infrastructure can ship as part of #100 while the
bug-hunting continues in v0.3.x patches.

Promotion criteria for moving a harness from exploration → gating: 60s
smoke runs report zero findings on main for 2 consecutive weeks.

Open bugs tracked: #103 (fixed via #106), #104 (fixed via #107), the
class-wide sweep #108, the WasmOp::Call gap #109/#101, the I32WrapI64
preassign #111, and the new #112 (i64-extend chain + Movw R0).
avrabe added a commit that referenced this pull request May 21, 2026
…113 (#117)

* feat(riscv): WasmOp::Call lowering — leaf-call subset

v0.3.1 minimum-viable cross-function call support in the RISC-V
selector. WasmOp::Call(idx) no longer errors with `Unsupported` for
leaf-call shapes — it now lowers to a label-based RiscVOp::Call that
the ELF builder resolves to a PC-relative `auipc + jalr` when the
callee is in the same compilation unit.

Behavior:
  * Move top N vstack values (capped at 8) into a0..a(N-1).
  * Emit `RiscVOp::Call { label: format!("synth_func_{idx}") }`.
  * Push a fresh `a0` vreg as the return value.

What's deliberately deferred (documented in the lower_call doc + the
#[ignore]-marked `recursive_self_call_emits_two_call_ops` test):
  * Function-signature plumbing from the decoder. Without it, the
    selector can't know how many args to pop, so the v0.3.1 cut
    over-consumes the vstack on back-to-back calls with surviving
    results. v0.4 will pipe `FuncSig` through and lift this restriction.
  * Args beyond 8 (RV psABI says spill to stack at fixed offsets — not
    implemented).
  * Caller-side a0..a7 invalidation across the BL — callers wanting to
    survive a call should `drop` or `local.tee` their live values
    explicitly until v0.4 models this properly.
  * Multi-result returns (wasm 2.0).
  * Cross-`.text` relocations for multi-unit linking.

Tests:
  * `call_emits_label_and_argument_marshalling` — single-arg call, label
    encodes `synth_func_{idx}`.
  * `call_two_args_marshals_to_a0_a1` — two-arg call from i32.const seq.
  * `recursive_self_call_emits_two_call_ops` — #[ignore]'d documentation
    of the back-to-back-calls gap, to be flipped when v0.4 plumbing
    lands.

Total: 100 passing tests in synth-backend-riscv (was 99); 1 ignored
that documents the next milestone.

* fix(lowering): return Err on stack underflow instead of panic — fuzz #113

The gating fuzz harness `wasm_ops_lower_or_error` surfaced a panic on
`FuzzInput { num_params: 1, ops: [I32DivS] }`. The harness contract is
"lower or return Err — no panics", and the panic was an unmapped-vreg
defensive assert (added in PR #101) firing on malformed wasm input.

Root cause:

  `OptimizerBridge::wasm_to_ir` synthesizes binary-op IR by referencing
  `OptReg(inst_id.saturating_sub(2))` / `saturating_sub(1)` as src1/src2.
  For a *lone* `I32DivS` (inst_id == 0), both saturating subtractions
  return 0 — the IR self-references its own dest as src1/src2. The
  resulting vreg v0 was never produced by any prior op, so the unmapped-
  vreg defensive panic at optimizer_bridge.rs:1617 fired.

Fix:

  New `synth_core::wasm_stack_check::check_no_underflow(ops)` — a
  pre-flight wasm value-stack underflow detector. Called at the top of:
    * `OptimizerBridge::optimize_full`
    * `InstructionSelector::select_with_stack`

  Stack-effect modeling covers the FuzzOp surface (all i32/i64/f32/f64
  arithmetic, conversions, locals, memory, select). Control-flow ops
  (Block/Loop/If/Else/Br/BrIf/Return/Call/Unreachable) bail
  conservatively — they have block-type-dependent effects we can't
  compute without function signatures. Production callers come through
  the wasm decoder (wasmparser), which already does full validation;
  this is a safety net for *direct callers* like the fuzz harnesses.

  The defensive panic at line 1617 is *not* removed — it still catches
  genuine wasm_to_ir gaps (the class of bug from issues #93, #109). The
  pre-flight check disambiguates "malformed wasm input" → typed Err
  from "synth internal bug" → loud panic.

Tests:

  * `crates/synth-core/src/wasm_stack_check.rs` — 11 unit tests covering
    binary/unary/store/drop/select underflow, control-flow bail, and
    happy paths.
  * `crates/synth-synthesis/tests/regression_i32divs_lone_stack_underflow.rs`
    — 3 tests reproducing the fuzz crash exactly. Asserts both lowering
    paths (and the harness-shape combined path) return cleanly.

  Full workspace test (excluding synth-verify / z3): 0 regressions
  (241 synth-synthesis tests, 52 synth-core tests, all green).

Fuzz infrastructure:

  Added `fuzz/seed_corpus/<target>/` directory layout for committed
  regression seeds. The fuzz-smoke workflow now copies these into the
  per-target corpus before running, so this crash input gets replayed
  on every CI run — even if libfuzzer's random walk wouldn't rediscover
  it within the 60s budget.

  First seed: `fuzz/seed_corpus/wasm_ops_lower_or_error/seed-pr117-
  i32divs-empty-stack` (10 bytes, the exact crash artifact uploaded by
  the failing #113 run).

Local verification:
  * `cargo test --workspace --exclude synth-verify` — 0 failures
  * `cargo clippy --package synth-core --package synth-synthesis
     --all-targets -- -D warnings` — clean
  * `cargo fmt --check` — clean

* fix(lowering): also catch Unreachable+binary-op shape — fuzz follow-up

The initial fix in this PR modeled `Unreachable` as `StackEffect::Bail`,
which made `check_no_underflow` short-circuit to `Ok(())` as soon as it
saw an `Unreachable`. The gating fuzz harness immediately found a
follow-up crash:

    FuzzInput { num_params: ..., ops: [Unreachable, I32GeS] }

The `[Unreachable, I32GeS]` sequence slipped past the bail and tripped
the unmapped-vreg panic at `optimizer_bridge.rs:1624` — same panic site
as the original `[I32DivS]` crash, different path in.

Root cause:

  wasm_to_ir handles `Unreachable` via the catch-all `_ => Opcode::Nop`
  arm (line 1358). The Nop produces no value, but the subsequent
  `I32GeS` mechanically references `OptReg(inst_id.saturating_sub(2))`
  / `saturating_sub(1)` as its operands — both saturate to `OptReg(0)`,
  which was the Nop's "dest" and never got assigned to an ARM register.
  ir_to_arm's defensive panic fires.

Fix:

  Change `Unreachable => StackEffect::Bail` to
  `Unreachable => modeled(0, 0)`. The wasm spec treats post-unreachable
  ops as type-checking against a polymorphic stack — we don't model
  that (would need a type system). Pragmatically, modeling Unreachable
  as stack-neutral makes the next op see depth 0, triggering the
  underflow check exactly when needed.

  Cost: formally-valid wasm with code-after-Unreachable that doesn't
  re-push operands (e.g. `(unreachable) (i32.ge_s)`) is now rejected.
  Real compilers don't emit this shape — wasmparser-decoded production
  input always has `i32.const` / `local.get` between the `unreachable`
  and any binary op, so depth is non-zero when the op fires and the
  check passes. The pathological case is a fuzz-harness construction,
  not a real wasm pattern.

Tests:

  * `crates/synth-core/src/wasm_stack_check.rs`:
    - Removed `unreachable_terminates_check` (asserted the old Bail
      behavior).
    - Added `unreachable_then_binary_op_at_depth_zero_is_underflow`
      (asserts the new rejection).
    - Added `unreachable_then_consts_then_binary_op_is_ok` (asserts the
      formally-valid pattern is still accepted).
  * `crates/synth-synthesis/tests/regression_i32divs_lone_stack_underflow.rs`:
    - Added `unreachable_then_binary_op_does_not_panic_optimized_path`
    - Added `unreachable_then_binary_op_does_not_panic_non_optimized_path`

Seed corpus:

  Added `fuzz/seed_corpus/wasm_ops_lower_or_error/seed-pr117-followup-
  unreachable-i32ges` — the exact 16-byte crash artifact from CI run
  25920230872. The fuzz-smoke workflow seeds the corpus from this dir
  on every run, so future regressions on the same shape will be caught
  deterministically.

Verification:
  * `cargo test --package synth-core --lib wasm_stack_check` —
    12 passed (was 11; -1 +2).
  * `cargo test --package synth-synthesis --test
     regression_i32divs_lone_stack_underflow` — 5 passed (was 3; +2).

* fix(lowering): treat Return / Br / BrTable as stack-neutral too

Third pass at the fuzz crash class. d1b2958 fixed `Unreachable` by
changing it from `StackEffect::Bail` to `modeled(0, 0)`. The harness
immediately found the next-shallowest path:

    FuzzInput { num_params: ..., ops: [Return, I64Eqz, I32Const(0)] }

Same shape — Return was bailing the same way Unreachable did. All four
wasm terminators (Unreachable, Return, Br, BrTable) have stack-
polymorphic semantics per the wasm spec, but our pre-flight check can't
model polymorphism. Modeling them as stack-neutral makes subsequent ops
see their pre-terminator depth and trigger the underflow check exactly
when needed.

Also tightened the rest of the control-flow surface:

* `BrIf(_)` — pops 1 (the i32 condition); no longer bails. `[BrIf]` at
  depth 0 is now correctly rejected.
* `Block | Loop | If | Else | End` — modeled as 0/0 (the previous Bail
  was over-conservative; their effects depend on block types we don't
  have but the depth tracking we already do is more informative than
  silently accepting).
* `Call(_)` — still Bail. Callee signature is genuinely unknown without
  the function table; that's an upstream-validator concern.

Tests:

  * `wasm_stack_check.rs`: +4 cases (`return_*`, `br_*`, `br_if_*`,
    rename of the old `control_flow_*` → `call_bails_conservatively`).
    16 total, all pass.
  * `regression_i32divs_lone_stack_underflow.rs`: +2 cases
    (`return_then_binary_op_does_not_panic_{opt,non_opt}_path`). 7
    total, all pass.

Seed corpus:

  Added `seed-pr117-followup-return-i64eqz` (10 bytes) from CI run
  25958417317. The wasm_ops_lower_or_error gating harness now has three
  seeded regressions: the original I32DivS, the Unreachable+I32GeS, and
  this Return+I64Eqz.

Full workspace test (excluding synth-verify): 0 regressions.

* fix(wasm_to_ir): Nop must not consume an inst_id slot

Fourth pass at the PR #117 fuzz class. This one is a `wasm_to_ir` bug,
not a pre-flight gap:

    FuzzInput { num_params: ..., ops: [LocalGet(0), Nop, I64ExtendI32U] }

`wasm_to_ir` overloads `inst_id` for both "instruction position" and
"vreg slot index". Every wasm op consumed one inst_id slot, including
`Nop` (which fell through to the `_ => Opcode::Nop` catch-all). The
subsequent `I64ExtendI32U` referenced `inst_id - 1` for its src_slot —
which was the Nop's unmapped slot. Defensive panic.

Fix: explicit `WasmOp::Nop => continue` arm BEFORE the catch-all.
Skipping the slot allocation lets back-references jump cleanly over
the Nop and land on the previous producer (LocalGet's vreg, in this
case).

What stayed the same: the `_ => Opcode::Nop` catch-all for unsupported
ops. Its role is to *trigger* the unmapped-vreg panic on back-
references, which is the bug-finder for missing handlers (issue #93
class). Skipping the slot there would convert "loud panic" into
"silent miscompilation" — strictly worse.

So:
* `Nop` (documented no-op): skip slot — back-references work cleanly.
* Unsupported ops: keep producing Opcode::Nop slot — back-references
  panic loudly so we discover the missing handler.

Tests:
* `regression_i32divs_lone_stack_underflow.rs`:
  +2 cases (`local_get_then_nop_then_extend_does_not_panic_{opt,non_opt}`).
  9 total, all pass.
* Workspace tests (excluding synth-verify): 0 regressions.

Seed corpus:
  Added `seed-pr117-followup-nop-slot-accounting` (20 bytes) from CI
  run 25962942061. Fourth seeded regression in the gating harness.

Fuzz tooling:
  Added `fuzz/examples/decode_crash.rs` — a one-shot decoder that
  takes a crash artifact path and prints the FuzzInput + lowered
  WasmOps. Saves the 60-second roundtrip of writing a one-off
  reproducer every time the harness reports a new crash.

* fix(wasm_to_ir): Unreachable / Return must not consume slots either

Round 5 fix. The gating fuzz harness found `[LocalGet(0), Unreachable,
I64ExtendI32U]` — same shape as round 4's Nop crash, just with
`Unreachable` in the middle. `Unreachable` had no explicit handler in
`wasm_to_ir`, fell through `_ => Opcode::Nop`, consumed an `inst_id`
slot, and the I64ExtendI32U back-reference landed on its unmapped slot.

Fix: extend the round-4 `WasmOp::Nop => continue` arm to also cover
`Unreachable` and `Return`. Both fall through the same catch-all and
have no IR-level value to produce. `Return` is added preemptively to
close the obvious round-6 sibling before the harness finds it.

Deliberately NOT skipped:
* `Br` / `BrIf` / `BrTable` — they have explicit handlers above that
  emit `Opcode::Branch` for branch-target resolution. Skipping their
  slots would break branches. If a dead-code-after-Br crash surfaces,
  it needs a separate fix (decouple inst_id from vreg_slot — larger
  surgery).
* `Block` / `Loop` / `End` — emit `Opcode::Label` referenced by branch
  targets via inst_id. Same reasoning.

Tests:
* `regression_i32divs_lone_stack_underflow.rs`: +3 cases
  (`local_get_then_unreachable_then_extend_does_not_panic_{opt,non_opt}`,
  `local_get_then_return_then_extend_does_not_panic_optimized_path`).
  12 total, all pass.
* Workspace tests (excluding synth-verify): 0 regressions.

Seed corpus:
  Added `seed-pr117-followup-unreachable-slot-accounting` (20 bytes)
  from CI run 25981870327. Fifth seeded regression in the gating
  harness's corpus.

Watcher report: #117 (comment)

* ci(fuzz): temporarily demote wasm_ops_lower_or_error to non-gating

After five rounds of targeted fixes in this PR, the gating fuzz harness
found a structural class of crashes (Drop / LocalSet / Store
consume-without-producing) that needs a proper slot_stack refactor in
wasm_to_ir — tracked as issue #121. The simpler `continue` fix that
worked for Nop / Unreachable / Return is unsound for Drop (it would
cause silent miscompilation by reading the consumed value).

Rather than block this PR's merge on a 30+ handler refactor, demote the
harness to `gating: false` until #121 lands. The harness still runs and
uploads crash artifacts — it just doesn't block CI.

The other gating harness `wasm_to_ir_roundtrip_op_coverage` stays
gating; it catches a different bug class (silent op drops).

Will be reverted to `gating: true` when #121 lands.
avrabe added a commit that referenced this pull request May 21, 2026
* fix(opt): decouple slot-stack from inst_id in wasm_to_ir (#121)

`OptimizerBridge::wasm_to_ir` overloaded `inst_id` as both the unique IR
instruction id AND a vreg-slot index, with back-references like
`inst_id.saturating_sub(2)` assuming a one-to-one correspondence with
the wasm value stack. That assumption broke whenever any wasm op
consumed a stack slot without producing one — Drop, LocalSet, GlobalSet,
the i32/i64 store family, BrIf, and the structural Block/Loop/End
markers. The next binary or unary op's back-reference would then index
a stale or never-mapped vreg, and `get_arm_reg` would either trip the
PR #101 defensive panic or (pre-PR-101) silently fall back to R0 — the
silent-miscompilation class first surfaced in issue #93.

Gale (the real-hardware test rig) caught WASM modules in the field
that tripped this on production silicon; the cargo-fuzz
`wasm_ops_lower_or_error` harness on PR #117 surfaced the same class
six different ways (Nop/Unreachable/Return were closed there; Drop,
LocalSet, Store, Block/Loop/End remained until this PR).

Fix: introduce `slot_stack: Vec<u32>` in `wasm_to_ir` that mirrors the
wasm value stack. Each producer pushes its dest vreg onto slot_stack;
each consumer pops to discover its source vreg. `inst_id` reverts to
its original meaning — a monotonically increasing unique IR id — and
is no longer used for slot lookup.

i64 values occupy two consecutive entries on slot_stack (lo first,
then hi), matching the (dest_lo, dest_hi) two-vreg-pair layout already
used by i64 opcodes. I64ExtendI32U/S aliases dest_lo to the consumed
i32 src vreg by IR convention (preserved); I32WrapI64 aliases dest to
src_lo (preserved). Drop becomes an explicit `slot_stack.pop(); continue`
no-IR-emit arm; Nop/Unreachable/Return emit Opcode::Nop with no
slot_stack effect.

Drive-by: `i64_operand_count` was missing I64DivS/I64DivU/I64RemS/
I64RemU (so `analyze_i64_local_gets` failed to mark their i64
operands), which was masked by the same inst_id-slot scrambling.
Added them; the existing i64-div WAST tests now exercise the i64
LocalGet path instead of fortuitously-correct i32 Loads.

The catch-all `_ => Opcode::Nop` is preserved as a bug-finder: unknown
ops do not touch slot_stack, so subsequent consumers fail loudly via
`slot_stack.pop().expect(...)` instead of silently mis-binding vregs.

Regression coverage: new
`crates/synth-synthesis/tests/regression_issue_121_slot_stack.rs`
exercises Drop/LocalSet/GlobalSet/Store/BrIf/Block-Loop-End/LocalTee
between producer-and-consumer plus i32 and i64 variants. Two
semantic-correctness probes confirm that Popcnt reads the surviving
stack value (not the dropped one) — proving the fix addresses silent
miscompilation, not just the panic.

Test delta: +12 tests, 0 regressions. The 4 fuzz-related regression
tests from #100/#101/#103/#104 plus the #93 memset/i64-shift tests
all continue to pass.

Refs: issue #121, PR #117 (fuzz-harness reproductions), issue #93
(silent-drop class), PR #101 (defensive panic), PR #100 (fuzz harness).

* fix(#121): wire pre-flight wasm_stack_check into slot_stack path

The slot_stack refactor in f0becb6 closes the silent-miscompilation
class that Gale hit on silicon. But the .expect() panics inside the new
slot_stack.pop() sites turn malformed-input cases (which the fuzz
harness can construct) into a different panic class — same defensive
"crash the compiler not the firmware" intent, but the harness contract
is "lower or Err, never panic".

This commit ports #117's pre-flight wasm_stack_check into the #122
branch so malformed inputs return Err *before* reaching the slot_stack
pops. The two layers together give the full guarantee:

  Pre-flight (synth_core::wasm_stack_check::check_no_underflow)
    ↓ rejects malformed wasm before lowering — typed Err
  wasm_to_ir slot_stack model
    ↓ correct semantics for valid wasm — fixes Gale's silicon class
  ir_to_arm
    ↓ unmodified

Includes:
- `crates/synth-core/src/wasm_stack_check.rs` (365 lines, 16 unit
  tests) — copy from PR #117's branch.
- Module declaration in `crates/synth-core/src/lib.rs`.
- Pre-flight call at the top of `OptimizerBridge::optimize_full`
  (`optimizer_bridge.rs:1827`).
- Pre-flight call at the top of `InstructionSelector::select_with_stack`
  (`instruction_selector.rs:3559`).
- `.github/workflows/fuzz-smoke.yml` — ported from #117 with the
  seed_corpus replay step. `wasm_ops_lower_or_error` re-promoted to
  `gating: true` now that the slot_stack model + pre-flight together
  close the panic class.
- `fuzz/seed_corpus/wasm_ops_lower_or_error/seed-pr122-i32sub-empty-stack`
- `fuzz/seed_corpus/wasm_to_ir_roundtrip_op_coverage/seed-pr122-local-tee-empty-stack`
  — the two crash artifacts the harness found on the first #122 CI
  run; now seeded for deterministic replay.

When PR #117 lands first on main, the rebase will deduplicate
wasm_stack_check.rs cleanly (identical files).

Local verification:
- `cargo test -p synth-core wasm_stack` — 16/16 pass.
- `cargo test -p synth-synthesis --test regression_issue_121_slot_stack`
  — 12/12 pass.
- `cargo test --workspace --exclude synth-verify` — 0 regressions.

Refs: issues #121 (Gale silicon), #117 (pre-flight origin).

* fix(opt): wasm_to_ir returns Result — propagate slot_stack underflow

The slot_stack refactor in f0becb6 introduced 50 `.expect(...)` sites
on `slot_stack.pop()` with the comment "wasm validator + pre-flight
check guarantee stack depth". The pre-flight check (ported in 3e3e3b1)
tracks wasm-value depth, but the slot_stack model uses *slot* depth
where i64 = 2 slots. They diverge for type-mismatched malformed
wasm: `[I32Const, I64Clz]` looks fine at the wasm-value layer (depth
1, I64Clz pops 1 pushes 1) but at the slot layer I64Clz wants 2 slots
and only has 1 → pop on empty → `.expect` panic.

This shape is rejected by the wasm validator in production, but the
fuzz harness constructs it directly bypassing wasmparser. To honor the
harness contract ("lower or Err, never panic"), wasm_to_ir now returns
`Result<(Vec<Instruction>, Cfg)>` and each `slot_stack.pop()` becomes
`.ok_or_else(|| Error::validation(...))?` instead of `.expect(...)`.

48 mechanical site replacements via sed:
  `.expect("wasm validator + pre-flight check guarantee stack depth")`
  →
  `.ok_or_else(|| synth_core::Error::validation(
      "wasm stack underflow in wasm_to_ir (slot_stack pop on empty)"
  ))?`

The function signature changed:
  `fn wasm_to_ir(...) -> (Vec<Instruction>, Cfg)`
  →
  `fn wasm_to_ir(...) -> Result<(Vec<Instruction>, Cfg)>`

Both callers in `optimize_full` and `optimize` propagate via `?`.

Tests:
  * Three new cases in `regression_issue_121_slot_stack.rs`:
    `type_mismatch_i32_then_i64_clz_does_not_panic`,
    `i64_unary_on_empty_stack_does_not_panic`,
    `i64_binary_on_partial_stack_does_not_panic`.
    15 total, all pass.
  * Workspace tests (excluding synth-verify): 0 regressions.

Seed corpus:
  `fuzz/seed_corpus/wasm_ops_lower_or_error/seed-pr122-i64clz-on-i32`
  (16 bytes) — the third #122 CI crash artifact.
avrabe added a commit that referenced this pull request May 21, 2026
…ct selection (#120) (#126)

OptimizerBridge::wasm_to_ir had zero handlers for scalar f32/f64 ops —
every float op fell through the catch-all `_ => Opcode::Nop`. A
value-producing float op (e.g. f32.div) that emits Nop produces no
vreg, so any downstream consumer references an unmapped vreg and trips
the PR #101 defensive get_arm_reg panic in ir_to_arm. This is the #93
silent-drop class, for floats — the customer-reported crash on
compiler_builtins float::div and gale_compute_ipi_mask.

The IR Opcode enum (synth-opt) has no float opcodes at all, so the
optimized path cannot lower floats without a large feature addition.
Fix (Option A): optimize_full now detects scalar f32/f64 ops up front
and returns a typed Err(Error::UnsupportedInstruction). The ARM
backend's compile_wasm_to_arm catches that Err and falls back to the
non-optimized InstructionSelector::select_with_stack path, which
already handles f32 via VFP/FPU. Honest degradation: float-containing
functions still compile correctly, just without IR-level optimization.

The defensive panic at get_arm_reg is intentionally kept — the fix
makes float ops never reach it, rather than removing the bug-finder.

f64 remains unsupported in both paths; the optimized path now declines
it cleanly instead of panicking.

Tests:
- crates/synth-synthesis/tests/regression_issue_120_f32_optimized.rs:
  exercises optimize_full + ir_to_arm (the path that panicked) for
  f32.div and assorted f32/f64 ops; asserts a clean Err, not a panic.
- arm_backend.rs: end-to-end backend tests compiling f32 div/add/mul/sub
  on Cortex-M4F via the default (optimized) config, asserting non-empty
  machine code; plus a no-FPU rejection test.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
avrabe added a commit that referenced this pull request May 23, 2026
…re-gated (#132)

* fix(opt): ir_to_arm returns Result instead of panicking on unmapped vreg

`ir_to_arm`'s `get_arm_reg` closure panicked ("synth internal compiler
error: vreg vN has no assigned ARM register...") when an IR vreg had no
mapping and no spill slot. PR #101 introduced that panic — correct then
(a loud crash beats a silent R0-fallback miscompilation), but it was the
last remaining panic site in the optimized lowering path.

Convert the `panic!` into `Err(Error::synthesis(...))` carrying the same
issue-#93 diagnostic, and change `ir_to_arm`'s signature from
`Vec<ArmOp>` to `Result<Vec<ArmOp>>`. All 166 `get_arm_reg` call sites
propagate via `?`. The defensive check is preserved — a genuine
`wasm_to_ir` bug still surfaces as a rich typed error, it just no longer
kills the process.

Callers updated: `arm_backend.rs` folds the `ir_to_arm` step into the
existing `optimize_full` fallback so an `Err` falls back to the direct
selector (honest degradation, same as the issue-#120 float path); the
`wasm_ops_lower_or_error` fuzz harness skips on `Err`. New regression
test `regression_ir_to_arm_panic_free` confirms a hand-built IR with an
unmapped src vreg now yields `Err` (not a panic) and that the error
keeps its issue-#93 diagnostic content.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ci(fuzz): re-promote wasm_ops_lower_or_error to gating

The harness was demoted to `gating: false` in PR #117 pending the
issue-#121 slot_stack refactor. That refactor landed, and the last panic
site in the optimized path — `ir_to_arm`'s defensive `get_arm_reg` — now
returns `Result` instead of `panic!`-ing. The whole `optimize_full` +
`ir_to_arm` path is panic-free, so the harness (contract: "lower or Err,
never panic") gates again. Update the matrix comment accordingly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(opt): pop_i32_unary macro also returns Err on underflow

Sibling fix to the panic-free `ir_to_arm` work in this PR: the
`pop_i32_unary!` macro inside `wasm_to_ir` still used `.expect(...)`,
which the earlier slot_stack Result conversion missed because the
`.expect(...)` ended with `,` (the macro pattern) rather than `;` (the
statement pattern the original sed targeted).

The re-gated `wasm_ops_lower_or_error` fuzz harness — re-promoted to
gating in this PR — found a malformed input that slips past the
pre-flight check and reaches an i32 unary op via `pop_i32_unary!`,
triggering the leftover `.expect()` panic. Converted to
`.ok_or_else(...)?` matching `pop_i32_binary!`'s shape so the
function-level `Result` propagates instead of panicking.

This is exactly the bug class re-gating the harness exists to catch —
the gating fuzz did its job here.

* docs(opt): update stale catch-all comment — .expect → Err propagation

The catch-all comment still described downstream consumers failing
'loudly via slot_stack.pop().expect(...)'. After the Result conversion
and the pop_i32_unary! macro fix in this PR, no .expect() remains;
unknown-op back-references now surface as a typed Err via
.ok_or_else(...)?. Same bug-finder semantics, different failure mode.
The comment now matches the code.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
avrabe added a commit that referenced this pull request May 30, 2026
#185)

Two fuzz/lint fixes surfaced while landing #180:

- #185: `encoder_no_panic` found a pre-existing panic — feeding PC (R15) as a
  data operand to a Thumb-2 op guarded by `verify_reg_bits` aborts under
  `-Cdebug-assertions`. Convert the 11 `verify_reg_bits` debug-assert sites to a
  fallible `reg_bits_checked() -> Result<()>` that returns a typed Err, matching
  the established "return Err, not panic" pattern (#101/#120/#132). Add a
  deterministic unit test + seed the crash input into the fuzz corpus.

- Clippy (CI rust 1.96.0) flagged a collapsible `if let { if }` in the #167
  wast_compile test; rewrite as a let-chain.

The encoder has further pre-existing totality gaps (arithmetic underflow in
ARM32 BCond offset, etc.) tracked in #186 — out of scope for this bugfix.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
avrabe added a commit that referenced this pull request May 30, 2026
…cause #180, re-enable optimized memory path (#183)

* fix(encoder): high-register Thumb ADD/ADDS/SUBS use 32-bit .W form (#180)

Root-cause of the optimized linear-memory miscompilation (#178, mitigated in
#179 by declining memory ops). The bug was in the Thumb *encoder*, not the
optimizer lowering: `ArmOp::Add` (and `Adds`/`Subs`) reg-forms unconditionally
emitted the 16-bit encoding, whose 3-bit register fields overflow for high
registers. The MemLoad/MemStore base scratch is R12, so `add ip, ip, r0` was
emitted as the corrupt `adds r4, r5, r1` (0x186C) — silently dropping the
address operand, making every optimized pointer-deref read/write a fixed
address. (i64 `Adds`/`Subs` low-word ops hit the same class via R8-R11 pairs.)

Fix: guard the 16-bit form on rd/rn/rm < 8 and fall back to 32-bit
ADD.W/ADDS.W/SUBS.W for high registers, exactly as the Sub handler already did.

- arm_encoder.rs: high-reg guards + encode_thumb32_adds/subs_reg_raw helpers
- optimizer_bridge.rs: remove the #179 decline + is_linear_memory_op (obsolete)
- byte-level encoder regression tests (EB0C 0C00 / EB1A 0A08 / EBBA 0A08)
- re-enable the 4 #[ignore]d issue_104 memory CSE tests
- audit_optimized_aapcs: re-assert optimized memory codegen (not the decline)
- wast_compile: rewrite the #178 CLI test to verify the address operand is used
  (no corrupt 6C 18; contains add.w ip, ip, rN) instead of byte-equality

Verified: optimized `i32.load`/`i32.store` of a pointer param now lowers to
`movw/movt ip; add.w ip, ip, r0; ldr/str [ip,#off]` (byte-checked via objdump),
matching --no-optimize semantics. Full suite: 1282 passed, 0 failed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* chore(release): v0.11.4 — workspace pin sweep + CHANGELOG (#180)

Bump [workspace.package] version and every path-dep version pin 0.11.3 → 0.11.4
in lockstep (Cargo.toml ×11 + MODULE.bazel + Cargo.lock), per the mandatory
pre-tag pin-sweep. CHANGELOG [0.11.4] with falsification statement.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(encoder): return Err (not panic) on PC operand; fix collapsible-if (#185)

Two fuzz/lint fixes surfaced while landing #180:

- #185: `encoder_no_panic` found a pre-existing panic — feeding PC (R15) as a
  data operand to a Thumb-2 op guarded by `verify_reg_bits` aborts under
  `-Cdebug-assertions`. Convert the 11 `verify_reg_bits` debug-assert sites to a
  fallible `reg_bits_checked() -> Result<()>` that returns a typed Err, matching
  the established "return Err, not panic" pattern (#101/#120/#132). Add a
  deterministic unit test + seed the crash input into the fuzz corpus.

- Clippy (CI rust 1.96.0) flagged a collapsible `if let { if }` in the #167
  wast_compile test; rewrite as a let-chain.

The encoder has further pre-existing totality gaps (arithmetic underflow in
ARM32 BCond offset, etc.) tracked in #186 — out of scope for this bugfix.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* style(backend): fix doc_lazy_continuation in #167 test doc comment

CI clippy (rust 1.96.0) flags `-D clippy::doc-lazy-continuation`: the doc line
starting with `>= num_imports` reads as a malformed blockquote. Wrap it in a
code span so no doc line begins with `>`. Pre-existing on main; unblocks the
v0.11.4 Clippy gate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant