compiler: lower String ==/!= to a byte comparison (string-wall slice 9) #587
Merged
…ice 9)

String equality was polymorphic at the type level (`'a -> 'a -> Bool`) but lowered unconditionally to `I32Eq` in the wasm backend, comparing the two `[len][utf8]` *pointers* rather than their bytes. Two equal-valued strings at distinct heap addresses (a built string vs a literal) therefore compared unequal, which is the root cause of the "integer-gate everything" pattern in the idaptik migration.

The fix mirrors the slice-8b String-`++` machinery exactly:
- typecheck records String-typed `==`/`!=` nodes (string_eq_sites) by physical identity during synth;
- elaborate_string_concat rewrites them to a new ExprStringEq AST node;
- codegen lowers ExprStringEq to a length-prefixed byte comparison (length guard, then byte loop; `!=` negates), never reading past either string.

Type-blind passes get matching arms (interp/opt/effect_sites). Int/Bool `==` is untouched (only String operands are recorded).

Verified: value comparison plus empty / length-mismatch / byte-loop / negation edge cases; no regression across 38,975 int-`==` parity cases (DeviceType, VmState, VmInstruction, NetworkZones, Detection). New e2e fixture + test.

https://claude.ai/code/session_01WoKhFQePiRsAj7aqnxbG8s
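To make the pointer-versus-bytes distinction concrete, here is a minimal Rust sketch that models linear memory as a byte vector. It is not the wasm the compiler emits. The 4-byte little-endian length prefix is an assumption (the message only says `[len][utf8]`), and `Memory`, `alloc_str`, `pointer_eq`, `string_eq` and `string_ne` are hypothetical names chosen for illustration.

```rust
/// Toy stand-in for wasm linear memory.
/// Assumption: each string is stored as a 4-byte little-endian length
/// followed by its UTF-8 bytes, and a string value is the address of
/// the length word.
struct Memory {
    bytes: Vec<u8>,
}

impl Memory {
    fn new() -> Self {
        Memory { bytes: Vec::new() }
    }

    /// Appends `[len][utf8]` and returns the address of the length word.
    fn alloc_str(&mut self, s: &str) -> u32 {
        let addr = self.bytes.len() as u32;
        self.bytes.extend_from_slice(&(s.len() as u32).to_le_bytes());
        self.bytes.extend_from_slice(s.as_bytes());
        addr
    }

    /// Reads the length word stored at `addr`.
    fn load_len(&self, addr: u32) -> u32 {
        let a = addr as usize;
        u32::from_le_bytes([
            self.bytes[a],
            self.bytes[a + 1],
            self.bytes[a + 2],
            self.bytes[a + 3],
        ])
    }

    /// Reads a single byte at `addr`.
    fn load_byte(&self, addr: u32) -> u8 {
        self.bytes[addr as usize]
    }
}

/// What a bare `I32Eq` on two string values amounts to: an address compare.
fn pointer_eq(a: u32, b: u32) -> bool {
    a == b
}

/// Length guard first, then a byte loop bounded by the shared length,
/// so neither string is read past its end.
fn string_eq(mem: &Memory, a: u32, b: u32) -> bool {
    let len = mem.load_len(a);
    if len != mem.load_len(b) {
        return false;
    }
    for i in 0..len {
        if mem.load_byte(a + 4 + i) != mem.load_byte(b + 4 + i) {
            return false;
        }
    }
    true
}

/// `!=` is the negation of the equality result.
fn string_ne(mem: &Memory, a: u32, b: u32) -> bool {
    !string_eq(mem, a, b)
}
```

Allocating `"scan"` twice through `alloc_str` yields two different addresses, so `pointer_eq` reports them unequal while `string_eq` reports them equal, which is the behaviour change this commit describes.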
🔍 Hypatia Security Scan
Findings: 40 issues detected
[
{
"reason": "Action denoland/setup-deno@v2 needs attention",
"type": "unpinned_action",
"file": "publish-jsr.yml",
"action": "pin_sha",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in scorecard-enforcer.yml",
"type": "scorecard_publish_with_run_step",
"file": "scorecard-enforcer.yml",
"action": "split_scorecard_publish_job",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "Issue in instant-sync.yml",
"type": "secret_action_without_presence_gate",
"file": "instant-sync.yml",
"action": "peter-evans/repository-dispatch",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "Shell execution -- validate input before passing to shell (1 occurrences, CWE-78)",
"type": "js_exec_sync",
"file": "/home/runner/work/affinescript/affinescript/packages/affinescript-cli/mod.js",
"action": "flag",
"rule_module": "code_safety",
"severity": "high"
},
{
"reason": "Shell execution -- validate input before passing to shell (2 occurrences, CWE-78)",
"type": "js_exec_sync",
"file": "/home/runner/work/affinescript/affinescript/packages/affine-vscode/mod.js",
"action": "flag",
"rule_module": "code_safety",
"severity": "high"
},
{
"reason": "Shell execution -- validate input before passing to shell (1 occurrences, CWE-78)",
"type": "js_exec_sync",
"file": "/home/runner/work/affinescript/affinescript/affinescript-vite/src/affine-plugin-improved.js",
"action": "flag",
"rule_module": "code_safety",
"severity": "high"
},
{
"reason": "expect() in hot path (32 occurrences, CWE-754)",
"type": "expect_in_hot_path",
"file": "/home/runner/work/affinescript/affinescript/affinescriptiser/src/codegen/wasm_gen.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "medium"
},
{
"reason": "expect() in hot path (29 occurrences, CWE-754)",
"type": "expect_in_hot_path",
"file": "/home/runner/work/affinescript/affinescript/affinescriptiser/src/codegen/affine_gen.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "medium"
},
{
"reason": "unsafe block -- requires SAFETY comment (2 occurrences, CWE-676)",
"type": "unsafe_block",
"file": "/home/runner/work/affinescript/affinescript/runtime/src/panic.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "medium"
},
{
"reason": "unsafe block -- requires SAFETY comment (1 occurrences, CWE-676)",
"type": "unsafe_block",
"file": "/home/runner/work/affinescript/affinescript/runtime/src/alloc.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "medium"
}
]
Powered by Hypatia Neurosymbolic CI/CD Intelligence
Summary
String equality was polymorphic at the type level (`'a -> 'a -> Bool`) but the wasm backend lowered it unconditionally to `I32Eq`, comparing the two `[len][utf8]` pointers rather than their bytes. Two equal-valued strings at distinct heap addresses (e.g. a built string vs a literal) compared unequal. This is the root cause of the "integer-gate everything" pattern seen across the idaptik `.res → .affine` migration (every `cmd == "scan"` had to be re-decomposed to integer codes).

This is string-wall slice 9, the equality counterpart to slice 8b (String `++`).

Approach (mirrors the slice-8b `++` machinery exactly)
- typecheck records String-typed `==`/`!=` nodes by physical identity during `synth` (`string_eq_sites`); they land in the polymorphic-equality `else` branch, not the `comparison` branch
- `elaborate_string_concat` rewrites them to a new `ExprStringEq` AST node (one walk now drives both concat and eq; the existing single wasm-path call site is unchanged)
- codegen lowers `ExprStringEq` to a length-prefixed byte comparison: length guard, then a byte loop; `!=` negates via `I32Eqz`. The loop runs only after the length check passes, so it never reads past either string (no out-of-bounds linear-memory trap)
- Type-blind passes get matching arms (`interp`/`opt`/`effect_sites`)

Int/Bool `==` is untouched: only String operands are recorded, so the change is inert for every existing program.

Verification
- `"sc" ++ "an" == "scan"` is now `true` (was `false` under pointer compare)
- Empty / length-mismatch / byte-loop / `!=` negation edge cases
- No regression across 38,975 int-`==` parity cases (DeviceType 39, VmState 38577, VmInstruction 281, NetworkZones 64, Detection 14)
- New e2e fixture (`test/e2e/fixtures/string_eq.affine`) + test (`test_e2e.ml`, slice-9), mirroring the slice-8b concat test

Not in scope
String relational ops (`<`/`>`/`<=`/`>=`, #458) still lower to pointer compares. They typecheck in the `comparison` branch, but the byte-lexicographic lowering is a separate follow-up (the byte loop here generalizes to a `{-1,0,1}` compare). The var-to-var gap that the full "type channel" was needed to close for `++` is already closed here for `==`/`!=`, since the recording is type-directed from the start.

https://claude.ai/code/session_01WoKhFQePiRsAj7aqnxbG8s
Generated by Claude Code
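For reference, the `{-1,0,1}` generalization mentioned under "Not in scope" could look roughly like the Rust sketch below. It is a model over a plain byte slice, not the planned lowering for #458. The 4-byte little-endian length prefix is an assumption, and `read_len`, `push_str` and `string_cmp` are hypothetical names.

```rust
/// Reads the assumed 4-byte little-endian length prefix at `addr`.
fn read_len(mem: &[u8], addr: usize) -> usize {
    u32::from_le_bytes([mem[addr], mem[addr + 1], mem[addr + 2], mem[addr + 3]]) as usize
}

/// Helper for experiments: appends `[len][utf8]` and returns its address.
fn push_str(mem: &mut Vec<u8>, s: &str) -> usize {
    let addr = mem.len();
    mem.extend_from_slice(&(s.len() as u32).to_le_bytes());
    mem.extend_from_slice(s.as_bytes());
    addr
}

/// Byte-lexicographic three-way compare returning -1, 0 or 1.
/// The loop is bounded by the shorter length, so it stays in bounds
/// for both strings, as the equality loop does.
fn string_cmp(mem: &[u8], a: usize, b: usize) -> i32 {
    let (len_a, len_b) = (read_len(mem, a), read_len(mem, b));
    let common = len_a.min(len_b);
    for i in 0..common {
        let (x, y) = (mem[a + 4 + i], mem[b + 4 + i]);
        if x != y {
            return if x < y { -1 } else { 1 };
        }
    }
    // All shared bytes are equal: the shorter string sorts first.
    if len_a < len_b {
        -1
    } else if len_a > len_b {
        1
    } else {
        0
    }
}
```

Under this model `a < b` would be `string_cmp(mem, a, b) < 0`, `a >= b` would be `string_cmp(mem, a, b) >= 0`, and so on for the other two operators.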