fix(encoder): ADD/SUB large immediates use ADDW/SUBW — fixes large-frame miscompile (VCR-RA-001) by avrabe · Pull Request #253 · pulseengine/synth

avrabe · 2026-06-04T21:10:19Z

Latent silent miscompile found while verifying the Add/Sub immediate path for folding.

The bug

encode_thumb32_add/sub emit ADD.W/SUB.W (T3), whose i:imm3:imm8 is a ThumbExpandImm modified immediate — but the code packed the raw bits without expansion. Correct only for imm ≤ 0xFF; for imm ≥ 0x100 it silently encoded a different value.

This path is live for frame alloc/dealloc: add sp, sp, #256 encoded as add sp, sp, #0 → a ≥256-byte frame (≥64 i32 locals, or fewer with spills) corrupted the stack with no error. No fixture hits it (they have small frames), so it was invisible — the kind of silent miscompile the verify-before-transform discipline exists to catch.

The fix

imm ∈ 0x100..=0xFFF → ADDW/SUBW (T4), a plain 12-bit immediate (0–4095, no ThumbExpandImm) → encodes any in-range frame correctly.
imm ≤ 0xFF → keep T3 (existing encodings bit-identical).
imm > 4095 → Err (no single-instruction form), not a silent mis-encode (Ok-or-Err, opt: repair + re-enable the optimized linear-memory path (root-cause the MemLoad ADD-register corruption) #180/arm encoder: panics (debug_assert) on PC/R15 operand instead of returning Err — fuzz-found, pre-existing #185).

Gate

clippy clean; backend suite passes; three frozen differentials stay result-identical (≤0xFF frames → bit-identical): control_step 0x00210A55, flight_seam 0x07FDF307, div_const 338/338. Test pins add sp,sp,#256 → 0d f2 00 1d, sub → ad f2 00 1d, #5000 → Err.

Part of #242.

🤖 Generated with Claude Code

…ame miscompile (VCR-RA-001) Latent silent miscompile found while verifying the Add/Sub immediate path for folding: encode_thumb32_add/sub emit ADD.W/SUB.W (T3), whose `i:imm3:imm8` is a ThumbExpandImm MODIFIED immediate — but the code packed the raw bits without expansion. Correct only for imm <= 0xFF; for imm >= 0x100 it silently encoded a different value. This path is LIVE for frame alloc/dealloc: `add sp, sp, #256` encoded as `add sp, sp, #0` → a >=256-byte frame (>=64 i32 locals, or fewer with spills) corrupted the stack with no error. Fix: for imm in 0x100..=0xFFF use ADDW/SUBW (T4) — a PLAIN 12-bit immediate (0..4095, no ThumbExpandImm) — which encodes any frame size in range correctly. Keep T3 for imm <= 0xFF so existing encodings stay bit-identical. Error for imm > 4095 (no single-instruction form) rather than silently mis-encode (Ok-or-Err, #180/#185). GATE: clippy clean; backend suite passes; the three frozen differential fixtures stay RESULT-identical (they use <= 0xFF frames → bit-identical): control_step 0x00210A55, flight_seam 0x07FDF307, div_const 338/338. Test add_sub_large_immediate_use_addw_subw_not_misencoded pins `add sp,sp,#256 → 0d f2 00 1d`, `sub → ad f2 00 1d`, and `#5000 → Err`. Part of #242. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

codecov · 2026-06-04T21:36:02Z

Codecov Report

❌ Patch coverage is 93.33333% with 3 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
crates/synth-backend/src/arm_encoder.rs	93.33%	3 Missing ⚠️

📢 Thoughts on this report? Let us know!

…-RA-001) (#254) Completes the arithmetic+bitwise immediate-folding family (and/or/xor #250/#252, now add/sub). When the operand before i32.add/i32.sub is `i32.const C` with C in 0..=0xFFF and its `movw` cleanly at the tail, fold to `add/sub rd, a, #C` and drop the materialization. Range is the full 0..=0xFFF (4095), wider than the bitwise 0..=0xFF, because the ADD/SUB immediate encoder now uses ADDW/SUBW (T4, plain imm12) for >0xFF (#253) — verified correct before folding into it. This one actively fires on real fixtures: control_step's frame `sub #16` and its const adds now fold (the differential stays result-identical, confirming correctness across a codegen change). Updated test_237_stack_pointer_global_is_ register_promoted to accept the frame-size 16 in its folded `sub #16` form (still a plain scalar immediate, still not __synth_wasm_data-relocated — the property it checks). GATE: clippy clean; 284 lib tests; three frozen differentials RESULT-identical (control_step 0x00210A55, flight_seam 0x07FDF307, div_const 338/338); test i32_add_sub_fold_const_into_immediate (byte + >0xFF ADDW-path values). CI fuzz adds totality. Part of #242. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

… silent-miscompile class (#255) (#256) gale #255: encode_thumb32_cmp_imm / _adds / _subs packed the raw `i:imm3:imm8` field with no ThumbExpandImm expansion — correct only for imm <= 0xFF, silently mis-encoding any larger value (e.g. `cmp #N` comparing the wrong constant → wrong branch). Same class as #251 (ORR/EOR NOP) and #253 (ADD/SUB frame). Fix (gale's suggested shared helper): add `try_thumb_expand_imm(u32) -> Option<u32>` — the correct ARMv7-M ThumbExpandImm reverse-encoder (zero-extended byte, the three replicated-byte patterns, and the 8-bit-rotated form). CMP/ADDS/SUBS now encode any valid modified immediate CORRECTLY and return Err on a genuinely unrepresentable one (CMP/S-forms have no plain-imm12 fallback), forcing the selector to materialize into a register rather than silently mis-compile. Note: gale's example `cmp #1000` is actually representable (1000 = 0xFA ror 30, field 0xF7A) — the old code mis-encoded it (raw 0x3E8); this encodes it right. The genuine error case is e.g. `#0x101` (bits too far apart for an 8-bit window). GATE: clippy clean; backend suite passes; three frozen differentials RESULT- identical (control_step 0x00210A55, flight_seam 0x07FDF307, div_const 338/338) — the fixed paths weren't exercised by fixtures (the comparison lowering uses reg-reg CMP), so this is purely defensive hardening of a latent class. Tests try_thumb_expand_imm_encodes_modified_immediates + cmp_adds_subs_immediate_error_on_non_modified_imm. Closes #255. Part of #242. (A follow-up can retrofit AND/ORR/EOR to share this helper too, extending their fold range beyond 0xFF.) Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

…rectness fixes (#260) Promote the accumulated v0.11.30 work into the CHANGELOG before tagging: native-pointer ABI (#237) + the VCR-* constant-immediate folding (#250/#252/#254) + analysis foundation (#243/#245) + three latent-miscompile encoder fixes (#251 ORR/EOR NOP, #253 ADD/SUB large-frame, #255 CMP/ADDS/SUBS ThumbExpandImm). Adds a falsification statement covering the encoder correctness class. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

…ress class (#259) (#261) gale #259: the eight thumb32 load/store immediate-offset encoders (encode_thumb32_ldr/str/ldrb_imm/ldrsb_imm/ldrh_imm/ldrsh_imm/strb_imm/strh_imm) masked `offset & 0xFFF` and emitted unconditionally — for offset >= 0x1000 the access silently targeted the wrong address (offset & 0xFFF). Same class as #253 (ADD/SUB) and #255 (CMP/ADDS/SUBS). Fix: a shared `check_ldst_imm12` guard — error for offset > 0xFFF (the imm12 range is 0..=4095), forcing the selector to use register-offset addressing, rather than silently wrap. Closes the load/store wrong-address class. Liveness: probing `i32.load offset=5000` compiles successfully (no Err) — the selector already materializes large offsets and never feeds the imm12 encoders >= 0x1000, so this is DEFENSIVE hardening (matching gale's note), not a live breakage. GATE: clippy clean; backend suite passes; the three frozen fixtures are BYTE-IDENTICAL between this change and main (control_step / flight_seam_flat / div_const .o compared with `cmp` — the guard only adds an error path for offset > 0xFFF, unreachable by the fixtures). Test ldst_imm12_offset_errors_when_out_of_range. NOTE: the /tmp/armv differential venv lost its `wasmtime` module (tooling regression, unrelated to this change) — verified via byte-comparison instead, which is strictly stronger than the result-level differential here. Closes #259. Part of #242. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

… the class (#266) The last two unguarded data-processing immediate encoders: - AND.W immediate raw-packed i:imm3:imm8 (correct only <=0xFF, silently wrong above — the fold guard kept it in range, but the encoder itself was unsafe). - CMN.W immediate fell back to a silent 0xBF00 NOP for imm > 0xFF. Both now go through the shared `try_thumb_expand_imm` (added in #255): encode any valid modified immediate correctly, else `Err` (forcing register materialization). This closes the silent-miscompile class across the whole data-proc immediate family — AND/ORR/EOR/CMP/CMN/ADD/SUB/ADDS/SUBS now share one correct, Ok-or-Err path (#251/#253/#255/#259 lineage). Behavior-frozen: for imm <= 0xFF the encoding is byte-identical (try_thumb_expand_imm returns the same field), so the cmp/cmn/and folding output is unchanged — the three differentials stay result-identical (control_step 0x00210A55, flight_seam 0x07FDF307, div_const 338/338). Test and_cmn_immediate_thumb_expand_else_error: byte-range unchanged, a replicated modified immediate now encodes, and 0x101 errors (AND was raw-pack, CMN was a NOP). Part of #242. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

avrabe merged commit 5cc26a0 into main Jun 4, 2026
12 checks passed

avrabe deleted the fix/encoder-addw-subw-large-imm branch June 4, 2026 21:14

avrabe mentioned this pull request Jun 4, 2026

feat(selector): fold constants into i32.add / i32.sub immediates (VCR-RA-001) #254

Merged

This was referenced Jun 5, 2026

encoder: load/store imm12 offset truncated (& 0xFFF) with no bounds check — silent wrong-address class (cf #253/#255) #259

Closed

docs(changelog): expand [0.11.30] for the release #260

Merged

avrabe mentioned this pull request Jun 5, 2026

fix(encoder): bounds-check LDR/STR imm12 offset — close the wrong-address class (#259) #261

Merged

This was referenced Jun 5, 2026

chore(release): v0.11.31 — gale codegen pass (#257/#258/#259) #265

Merged

fix(encoder): route AND/CMN immediates through ThumbExpandImm — close the class #266

Merged

avrabe mentioned this pull request Jun 10, 2026

v0.11.28: dissolved host-pointer primitive faults on silicon — wasm-static vs native-pointer base conflict (blocks mutex/sem drop-in) #237

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(encoder): ADD/SUB large immediates use ADDW/SUBW — fixes large-frame miscompile (VCR-RA-001)#253

fix(encoder): ADD/SUB large immediates use ADDW/SUBW — fixes large-frame miscompile (VCR-RA-001)#253
avrabe merged 1 commit into
mainfrom
fix/encoder-addw-subw-large-imm

avrabe commented Jun 4, 2026

Uh oh!

Uh oh!

codecov Bot commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

avrabe commented Jun 4, 2026

The bug

The fix

Gate

Uh oh!

Uh oh!

codecov Bot commented Jun 4, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant