compiler: string-wall slice 8b — type-directed string ++ lowering by hyperpolymath · Pull Request #578 · hyperpolymath/affinescript

hyperpolymath · 2026-06-13T17:18:40Z

Phase F slice 8b — type-directed string `++` lowering

This is the full fix that the slice-8a guard (#575) stood in for. String `++` now lowers correctly and completely to wasm, including pure variable-to-variable `a ++ b`, which the syntactic guard could not reach.

The channel (type-directed elaboration)

- `ast.ml`: new `ExprStringConcat of expr * expr` (never produced by the parser).
- `typecheck.ml`: `synth` records each `++` node it types as String concat by physical identity (`string_concat_sites`); `elaborate_string_concat` rewrites exactly those nodes to `ExprStringConcat`. Physical-identity keying is sound because typecheck and codegen run over the same `prog` object (`parse_with_face`'s lowered `prog`, shared by resolve/typecheck/codegen). `ExprBinary` carries no span and same-text `++` occurrences are value-equal, so structural equality could not tell two sites apart; physical equality (`==`) is the correct key.
- `bin/main.ml`: the wasm path runs `elaborate_string_concat` after typecheck and before `Opt.fold_constants_program`. The interpreter and non-wasm backends keep the original `prog` (`ExprBinary _ OpConcat _`), so the oracle is unchanged and only the wasm backend sees the new node.
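
The identity-keyed channel described above can be modelled in a few lines. This is a minimal JavaScript sketch, not the compiler's OCaml: the node shapes, `typeOf`, and `elaborate` are hypothetical names, and a `WeakSet` stands in for `string_concat_sites` because it keys on object identity, which is the closest JavaScript analogue of OCaml's `==`.

```javascript
// Hypothetical mini-AST. Node shapes and helper names are illustrative
// assumptions, not the definitions in ast.ml or typecheck.ml.
const v = (name) => ({ tag: "Var", name });
const concat = (lhs, rhs) => ({ tag: "Binary", op: "Concat", lhs, rhs });

// Type synthesis: returns "String" or "List" and records every ++ node
// typed as String in `sites`, keyed by object identity.
function typeOf(expr, env, sites) {
  if (expr.tag === "Var") return env[expr.name];
  const lhsType = typeOf(expr.lhs, env, sites);
  typeOf(expr.rhs, env, sites); // visit the right operand so nested ++ nodes are recorded
  if (lhsType === "String") sites.add(expr);
  return lhsType;
}

// Elaboration: rebuild the tree, rewriting exactly the recorded nodes.
function elaborate(expr, sites) {
  if (expr.tag !== "Binary") return expr;
  const lhs = elaborate(expr.lhs, sites);
  const rhs = elaborate(expr.rhs, sites);
  return sites.has(expr)
    ? { tag: "StringConcat", lhs, rhs }
    : { ...expr, lhs, rhs };
}

// Two `a ++ b` occurrences with identical text, typed in different scopes.
const inStringScope = concat(v("a"), v("b"));
const inListScope = concat(v("a"), v("b"));
const sites = new WeakSet();
typeOf(inStringScope, { a: "String", b: "String" }, sites);
typeOf(inListScope, { a: "List", b: "List" }, sites);

// The nodes are value-equal, so a structural key could not separate them...
console.log(JSON.stringify(inStringScope) === JSON.stringify(inListScope)); // true
// ...but the identity-keyed set rewrites only the String one.
console.log(elaborate(inStringScope, sites).tag); // "StringConcat"
console.log(elaborate(inListScope, sites).tag); // "Binary"
```

The sketch only holds if type synthesis and elaboration walk the very same objects, which is the condition the PR states for `prog`: a copy of the tree made between the two passes would leave the set with no matching entries.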

The lowering (`codegen.ml`)

Byte concat: allocate `4 + la + lb` bytes (where `la` and `lb` are the operands' byte lengths), write the length word, then copy `a`'s bytes followed by `b`'s. This mirrors the list-concat handler, but with 1-byte elements and a single length word instead of 4-byte i32 elements. The i32-element copy was exactly the bug: a string's `[len][utf8]` was copied as i32 elements, so `"ab" ++ "cd"` read byte 2 as the length word of `"cd"` (= 2) instead of `'c'` (= 99).
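
To make the byte arithmetic concrete, here is a small JavaScript model of the two copy strategies over a flat memory. It is a sketch under stated assumptions, not the code in `codegen.ml`: the helper names are hypothetical, the length word is assumed to be a little-endian u32, and strings are assumed to be bump-allocated back to back with no alignment padding, so that `"cd"` sits directly after `"ab"`.

```javascript
// Assumed layout: [len: u32 little-endian][utf8 bytes], bump-allocated
// contiguously. All helper names here are hypothetical.
const mem = new Uint8Array(256);
const view = new DataView(mem.buffer);
let heapTop = 0;

function alloc(size) {
  const ptr = heapTop;
  heapTop += size;
  return ptr;
}

function storeString(s) {
  const bytes = new TextEncoder().encode(s);
  const ptr = alloc(4 + bytes.length);
  view.setUint32(ptr, bytes.length, true);
  mem.set(bytes, ptr + 4);
  return ptr;
}

function readString(ptr) {
  const len = view.getUint32(ptr, true);
  return new TextDecoder().decode(mem.subarray(ptr + 4, ptr + 4 + len));
}

// Byte concat: 4 + la + lb bytes, one length word, 1-byte elements.
function concatBytes(a, b) {
  const la = view.getUint32(a, true);
  const lb = view.getUint32(b, true);
  const out = alloc(4 + la + lb);
  view.setUint32(out, la + lb, true);
  mem.copyWithin(out + 4, a + 4, a + 4 + la);
  mem.copyWithin(out + 4 + la, b + 4, b + 4 + lb);
  return out;
}

// List-style concat: treats each element as a 4-byte i32, so copying
// `la` elements reads 4 * la bytes and runs past the end of the string.
function concatAsI32List(a, b) {
  const la = view.getUint32(a, true);
  const lb = view.getUint32(b, true);
  const out = alloc(4 + 4 * (la + lb));
  view.setUint32(out, la + lb, true);
  mem.copyWithin(out + 4, a + 4, a + 4 + 4 * la);
  mem.copyWithin(out + 4 + 4 * la, b + 4, b + 4 + 4 * lb);
  return out;
}

const ab = storeString("ab");
const cd = storeString("cd");
const fixed = concatBytes(ab, cd);
const buggy = concatAsI32List(ab, cd);

console.log(readString(fixed)); // "abcd"
console.log(mem[fixed + 4 + 2]); // 99, the byte for 'c'
console.log(mem[buggy + 4 + 2]); // 2, the low byte of the length word of "cd"
```

Under the contiguous-layout assumption the list-style copy of `"ab"` overruns its two payload bytes and picks up the header of `"cd"`, which reproduces the byte-2 value of 2 described above.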

Effect-ordinal parity (`effect_sites.ml`)

`ExprStringConcat` recurses like `ExprBinary` and is not counted as an `ExprApp` call site, so effect-ordinals stay identical between the interpreter (which sees `ExprBinary`) and the wasm backend (which sees `ExprStringConcat`), avoiding a #555-class desync. An intrinsic-call encoding (`ExprApp "__string_concat"`) would have shifted the ordinals; the dedicated node avoids that. `opt.ml` folds its sub-expressions; `interp.ml` handles it defensively as ordinary String `++`.
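
The ordinal argument can be illustrated with a toy call-site counter. This is a hedged JavaScript sketch: the node shapes, the `callSites` helper, and the pre-order numbering are assumptions made for illustration, and `effect_sites.ml` may number sites differently.

```javascript
// Hypothetical node shapes and numbering scheme, for illustration only.
const app = (fn, ...args) => ({ tag: "App", fn, args });
const binary = (lhs, rhs) => ({ tag: "Binary", op: "Concat", lhs, rhs });
const stringConcat = (lhs, rhs) => ({ tag: "StringConcat", lhs, rhs });

// Collect call sites in pre-order; a site's ordinal is its index in the result.
// Binary and StringConcat recurse into their operands without adding a site.
function callSites(expr, out = []) {
  switch (expr.tag) {
    case "App":
      out.push(expr.fn);
      for (const arg of expr.args) callSites(arg, out);
      break;
    case "Binary":
    case "StringConcat":
      callSites(expr.lhs, out);
      callSites(expr.rhs, out);
      break;
  }
  return out;
}

// The same source expression, `read() ++ log()`, in three encodings.
const interpView = binary(app("read"), app("log"));
const wasmView = stringConcat(app("read"), app("log"));
const intrinsicView = app("__string_concat", app("read"), app("log"));

console.log(callSites(interpView).join(",")); // "read,log"
console.log(callSites(wasmView).join(",")); // "read,log"
console.log(callSites(intrinsicView).join(",")); // "__string_concat,read,log"
```

In this model the dedicated node leaves `read` at ordinal 0 in both views, while the intrinsic-call encoding pushes it to ordinal 1, which is the kind of shift the PR avoids.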

The 8a guard is retained as a backstop: any String ++ reaching codegen un-elaborated still errors loudly rather than emitting garbage.

Tests / verification

- `tests/codegen/string_concat.{affine,mjs}`: executable wasm parity, byte-exact via the slice-1 reader. Covers the `"ab" ++ "cd"` byte-2 = 99 regression (was 2), the var-var case the guard could not catch, chained `a ++ b ++ c`, and empty operands (oracle 6513269).
- `test/test_e2e.ml`: "E2E String-wall slice 8 guard" gains a lowers-after-elaboration case.
- Full `run_codegen_wasm_tests.sh` green, including list_concat, slices 1-7, and the effect tests; string `++` verified correct in if/match/fn/nested contexts. (`dune runtest` is not runnable in-sandbox because alcotest is unavailable; the codegen `.mjs` parity goes through the real CLI pipeline.)

Migration impact

This closes the string wall's last op: every name-dispatched string builtin (slices 1-7) + concatenation (8) now lower to wasm. The next compiler half is the effect wall (≈111 effect-gated corpus files).

Builds on #575 (guard, merged) and #574 (design, merged).

https://claude.ai/code/session_01WoKhFQePiRsAj7aqnxbG8s

Generated by Claude Code

The full fix the slice-8a guard (#575) stood in for. String ++ now lowers correctly AND completely to wasm (incl. pure variable-to-variable, which the syntactic guard could not reach).

Channel (type-directed elaboration):
- ast.ml: new ExprStringConcat of expr * expr (not produced by the parser).
- typecheck.ml: synth records each ++ node it types as String concat, by physical identity (string_concat_sites); elaborate_string_concat rewrites exactly those nodes to ExprStringConcat. Physical-identity keying is sound because typecheck and codegen run over the same prog object (parse_with_face's lowered prog, shared by resolve/typecheck/codegen); ExprBinary carries no span and same-text ++ occurrences are value-equal, so == is the correct key.
- bin/main.ml: the wasm path runs elaborate_string_concat after typecheck, before Opt.fold_constants_program. The interpreter and non-wasm backends keep the original prog (String ++ = ExprBinary _ OpConcat _), so the oracle is unchanged and only the wasm backend sees the new node.

Lowering (codegen.ml): byte concat (allocate 4 + la + lb, write the length word, copy a's then b's bytes), mirroring the list-concat handler but with 1-byte elements and a single length word (instead of 4-byte i32 elements, which was exactly the bug: the list path copied a string's [len][utf8] as i32 elements, so "ab" ++ "cd" read byte 2 as the length word of "cd" = 2 instead of 'c' = 99).

Effect parity (effect_sites.ml): ExprStringConcat recurses like ExprBinary and is NOT counted as an ExprApp call site, so effect-ordinals stay identical between the interpreter (which sees ExprBinary) and the wasm backend (which sees ExprStringConcat), avoiding a #555-class desync. An intrinsic-call encoding (ExprApp "__string_concat") would have shifted the ordinals; the dedicated node avoids that. opt.ml folds its sub-expressions; interp.ml handles it defensively as ordinary String ++.

The 8a guard is retained as a backstop: any String ++ reaching codegen un-elaborated still errors loudly rather than emitting garbage.

Tests: tests/codegen/string_concat.{affine,mjs}: executable wasm parity, byte-exact via the slice-1 reader: the "ab" ++ "cd" byte-2 = 99 regression (was 2), the var-var case the guard could not catch, chained a ++ b ++ c, and empty operands; oracle 6513269. test/test_e2e.ml "E2E String-wall slice 8 guard" gains a lowers-after-elaboration case.

Verified: full run_codegen_wasm_tests.sh green incl. list_concat + slices 1-7 + effect tests; string ++ correct in if/match/fn/nested contexts.

Design + ledger: proposals/DESIGN-string-concat.adoc (8b LANDED), proposals/MIGRATION-PLAN.adoc.

https://claude.ai/code/session_01WoKhFQePiRsAj7aqnxbG8s

github-actions · 2026-06-13T17:19:39Z

🔍 Hypatia Security Scan

Findings: 47 issues detected

| Severity | Count |
|---|---|
| 🔴 Critical | 2 |
| 🟠 High | 24 |
| 🟡 Medium | 21 |

⚠️ Action Required: Critical security issues found!

View findings

[
  {
    "reason": "Action actions/add-to-project@v1.0.2 needs attention",
    "type": "unpinned_action",
    "file": "add-to-roadmap.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Action denoland/setup-deno@v2 needs attention",
    "type": "unpinned_action",
    "file": "publish-jsr.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Action trufflesecurity/trufflehog@main needs attention",
    "type": "unpinned_action",
    "file": "secret-scanner.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Issue in add-to-roadmap.yml",
    "type": "missing_timeout_minutes",
    "file": "add-to-roadmap.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in scorecard-enforcer.yml",
    "type": "scorecard_publish_with_run_step",
    "file": "scorecard-enforcer.yml",
    "action": "split_scorecard_publish_job",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Issue in instant-sync.yml",
    "type": "secret_action_without_presence_gate",
    "file": "instant-sync.yml",
    "action": "peter-evans/repository-dispatch",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Shell execution -- validate input before passing to shell (1 occurrences, CWE-78)",
    "type": "js_exec_sync",
    "file": "/home/runner/work/affinescript/affinescript/packages/affinescript-cli/mod.js",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "high"
  },
  {
    "reason": "Shell execution -- validate input before passing to shell (2 occurrences, CWE-78)",
    "type": "js_exec_sync",
    "file": "/home/runner/work/affinescript/affinescript/packages/affine-vscode/mod.js",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "high"
  },
  {
    "reason": "Shell execution -- validate input before passing to shell (1 occurrences, CWE-78)",
    "type": "js_exec_sync",
    "file": "/home/runner/work/affinescript/affinescript/affinescript-vite/src/affine-plugin-improved.js",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "high"
  },
  {
    "reason": "expect() in hot path (32 occurrences, CWE-754)",
    "type": "expect_in_hot_path",
    "file": "/home/runner/work/affinescript/affinescript/affinescriptiser/src/codegen/wasm_gen.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "medium"
  }
]

Powered by Hypatia Neurosymbolic CI/CD Intelligence

…584)

## Migration wave: 7 integer-brain kernels from string-gated idaptik modules

First applied wave of the now-unblocked **string-gated corpus**. Phase B classified 71 string-gated files; closing the string wall (slices 1–8) plus the `len()` lowering (#583) opened the integer-brain extraction path.

Seven kernels re-decomposed from idaptik `.res` modules into AffineScript brains under `proposals/idaptik/migrated/`, fanned out across 6 parallel agents and **re-verified by me before commit**. Each is a four-gate deliverable: G1 compile, G2 independent-oracle parity sweep, G4 assail. Strings / floats / promises / mutable state stay **host-side** per the C1–C12 recipe; only the pure-integer decision core crosses to wasm.

| Kernel | Exports | Parity | Assail |
|---|---|---|---|
| PortScanner | 4 | 44/44 | clean |
| PasswordCracker | 7 | 215/215 | clean |
| FirewallDevice | 12 | 164/164 | clean |
| Inventory | 9 | 2840/2840 | clean |
| Drone | 32 | 1192/1192 | clean |
| SecurityDog | 29 | 31533/31533 | clean |
| GuardNPC | 19 | 359/359 | clean |

**Re-decompositions, not transliterations**, e.g. PasswordCracker inverts the djb2 string-loop so the host walks the string and the brain does i32 math (`Math.imul`/`|0` modelled in the oracle); Inventory packs slot-state into a base-3 Int instead of a mutable array; FirewallDevice keeps CIDR/protocol *string* parsing host-side and decides over integer flags. Floats cross as floored milli-units; out-of-band inputs return guarded `-1` sentinels (assail-clean, no in-band collapse). Each oracle is an independent JS reimplementation from the `.res` semantics, not copied from the `.affine`.

**Deduped:** `SecurityAI` dropped, as it is already tracked as `migrated/securityai/` (with a boundary proof) from an earlier wave; `GlobalNetworkData` likewise pre-existed and was left untouched.

**Two compiler quirks surfaced** (flagged for the playbook, not fixed here): `total` is a reserved keyword (parse error as an identifier); and an `if { … }` block immediately followed by a parenthesized expression parses as a function application. Both have trivial source-side workarounds.

Builds on #583 (`len`, merged) and the string-wall slices (#574/#575/#578).

https://claude.ai/code/session_01WoKhFQePiRsAj7aqnxbG8s

---

_Generated by [Claude Code](https://claude.ai/code/session_01WoKhFQePiRsAj7aqnxbG8s)_

Co-authored-by: Claude <noreply@anthropic.com>

hyperpolymath marked this pull request as ready for review June 13, 2026 17:42

hyperpolymath merged commit 1f6ba66 into main Jun 13, 2026
26 of 29 checks passed

hyperpolymath deleted the claude/cool-keller-gr5sl branch June 13, 2026 17:43

hyperpolymath mentioned this pull request Jun 14, 2026

migration: 7 integer-brain kernels from string-gated idaptik modules #584

Merged
