
feat(security): #498 — pin SHA-256 of perry.nativeLibrary prebuilt archives in perry.lock #957

Closed
proggeramlug wants to merge 1 commit into main from feat/498-pinned-checksums


@proggeramlug
Contributor

Closes #498.

Summary

A swapped or tampered prebuilt static archive (.a / .lib / .dylib) was undetectable before this change — perry resolved the path and handed it to the linker without inspection. A malicious dep update could swap in arbitrary native code and the host got no signal.

This adds a small SHA-256-based lockfile at the project root, analogous to package-lock.json's integrity field but scoped to the linker-visible archives perry consumes through perry.nativeLibrary.targets.<key>.prebuilt.

Zero runtime cost — the check is a compile-time hash + lookup; the resulting binary is the same size/shape as a build without the gate.

Cross-platform — runs in the platform-agnostic compile_command driver, so every backend (LLVM / WASM / ArkTS / HarmonyOS / Glance / SwiftUI / JS) inherits the protection from one choke point. Per-target-arch key (macos-arm64, linux-x86_64, …) matches what perry.nativeLibrary.targets uses in package.json.

File shape (<project_root>/perry.lock)

{
  "version": 1,
  "native_libraries": {
    "@bloomengine/engine": {
      "macos-arm64": "sha256:abcd..."
    },
    "lodash-native": {
      "linux-x86_64": "sha256:..."
    }
  }
}

BTreeMap-sorted on both levels → on-disk JSON is byte-deterministic across builds. git diff perry.lock is a meaningful supply-chain review signal.
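The determinism claim can be illustrated with a minimal sketch. It assumes a hypothetical `PerryLock` struct and a hand-rolled `to_json` writer (the PR does not show the real type or its serializer), and it assumes package names, targets, and hashes need no JSON escaping. Because both levels are `BTreeMap`s, iteration order is the sorted key order regardless of insertion order, so the emitted text is identical across builds.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory shape of perry.lock:
/// package name -> target key -> "sha256:<hex>".
#[derive(Debug, Clone, PartialEq)]
pub struct PerryLock {
    pub version: u32,
    pub native_libraries: BTreeMap<String, BTreeMap<String, String>>,
}

impl PerryLock {
    pub fn new() -> Self {
        PerryLock { version: 1, native_libraries: BTreeMap::new() }
    }

    /// Record (or overwrite) the hash for one (package, target) pair.
    pub fn insert(&mut self, package: &str, target: &str, hash: &str) {
        self.native_libraries
            .entry(package.to_string())
            .or_default()
            .insert(target.to_string(), hash.to_string());
    }

    /// Illustrative JSON writer. Assumes no string needs escaping.
    /// BTreeMap iteration is key-sorted, so output is byte-stable.
    pub fn to_json(&self) -> String {
        let mut out = String::new();
        out.push_str("{\n");
        out.push_str(&format!("  \"version\": {},\n", self.version));
        out.push_str("  \"native_libraries\": {");
        let mut first_pkg = true;
        for (pkg, targets) in &self.native_libraries {
            out.push_str(if first_pkg { "\n" } else { ",\n" });
            first_pkg = false;
            out.push_str(&format!("    \"{}\": {{", pkg));
            let mut first_tgt = true;
            for (tgt, hash) in targets {
                out.push_str(if first_tgt { "\n" } else { ",\n" });
                first_tgt = false;
                out.push_str(&format!("      \"{}\": \"{}\"", tgt, hash));
            }
            out.push_str(if first_tgt { "}" } else { "\n    }" });
        }
        out.push_str(if first_pkg { "}\n" } else { "\n  }\n" });
        out.push_str("}\n");
        out
    }
}
```

Two locks populated in different insertion orders serialize to the same bytes, which is what makes a diff of the file a usable review signal.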

Verification semantics

| Lockfile state | Outcome |
| --- | --- |
| No perry.lock | Fresh lockfile written with current hashes. |
| Matching entry | Build proceeds. |
| Missing entry for (pkg, tgt) | Added to the lockfile (grows over time). |
| Mismatching entry | Build fails with verbose reviewer-actionable diagnostic. |
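The four outcomes can be sketched as a single lookup. This is an illustration only: `Lock`, `Outcome`, and `check_archive` are hypothetical names, and the archive hash is passed in as an already-computed string rather than hashed here. A missing perry.lock is modelled as an empty map, so the first build falls into the "added" branch.

```rust
use std::collections::BTreeMap;

/// Hypothetical lock contents: package -> target -> "sha256:<hex>".
pub type Lock = BTreeMap<String, BTreeMap<String, String>>;

#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// No entry existed; the current hash was recorded.
    Added,
    /// The locked hash matches; the build proceeds.
    Verified,
    /// The locked hash differs; the build should fail.
    Mismatch { expected: String, found: String },
}

/// Default-mode check for one (package, target) archive.
pub fn check_archive(lock: &mut Lock, package: &str, target: &str, found: &str) -> Outcome {
    let targets = lock.entry(package.to_string()).or_default();
    // Clone the locked value so no borrow is held while inserting.
    match targets.get(target).cloned() {
        Some(expected) if expected.as_str() == found => Outcome::Verified,
        Some(expected) => Outcome::Mismatch { expected, found: found.to_string() },
        None => {
            targets.insert(target.to_string(), found.to_string());
            Outcome::Added
        }
    }
}
```

On a mismatch the sketch leaves the locked hash untouched, so a later build with the original archive verifies again.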

Diagnostic example

Error: archive for `@bloomengine/engine` (target `macos-arm64`)
changed since last accepted:

  expected: sha256:abcd1234...
  found:    sha256:ef015678...
  path:     /repo/node_modules/@bloomengine/engine-darwin-arm64/lib/libbloom.a

Review the package — a swapped or tampered prebuilt static
archive is exactly the supply-chain attack class this lock
was added to catch (#498).

If the change is intentional (dep upgrade, vendored archive
rebuild), rerun the build with `PERRY_LOCK_REFRESH=1` to
rewrite the lockfile entry, or delete `perry.lock` to
regenerate it from scratch.
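A rough sketch of how such a message might be assembled follows. `mismatch_diagnostic` is a hypothetical helper, and the wording is abbreviated relative to the full diagnostic above; the point is only that the package name, target, both hashes, the path, the refresh hint, and the #498 reference all appear in the output.

```rust
/// Hypothetical formatter for the mismatch diagnostic (abbreviated wording).
pub fn mismatch_diagnostic(
    package: &str,
    target: &str,
    expected: &str,
    found: &str,
    path: &str,
) -> String {
    let lines = [
        format!("Error: archive for `{}` (target `{}`)", package, target),
        "changed since last accepted:".to_string(),
        String::new(),
        format!("  expected: {}", expected),
        format!("  found:    {}", found),
        format!("  path:     {}", path),
        String::new(),
        "Review the package (#498). If the change is intentional, rerun".to_string(),
        "the build with `PERRY_LOCK_REFRESH=1` or delete `perry.lock`.".to_string(),
    ];
    lines.join("\n")
}
```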

Env-var knobs

  • PERRY_LOCK_REFRESH=1 — one-time. Rewrites mismatched entries instead of failing. Logged so it's obvious in build output. Use after a deliberate dep upgrade.
  • PERRY_LOCK_FROZEN=1 — CI verification mode. Missing AND mismatching entries both fail. Ensures developers must commit perry.lock updates themselves; CI won't silently extend the lock.
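How the two knobs change the default behaviour can be sketched as a small decision function. The names `LockMode`, `Action`, `mode_from_env`, and `decide` are hypothetical. The sketch also assumes that `PERRY_LOCK_FROZEN` takes precedence when both variables are set and that refresh mode still adds missing entries; the PR does not state either point.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum LockMode {
    Normal,
    Refresh,
    Frozen,
}

#[derive(Debug, PartialEq)]
pub enum Action {
    Proceed,
    AddEntry,
    RewriteEntry,
    Fail,
}

/// Read the mode through a lookup function so the logic can be
/// exercised without touching the real process environment.
/// Assumption: FROZEN wins if both variables are set to "1".
pub fn mode_from_env<F>(get: F) -> LockMode
where
    F: Fn(&str) -> Option<String>,
{
    let is_set = |name: &str| get(name).as_deref() == Some("1");
    if is_set("PERRY_LOCK_FROZEN") {
        LockMode::Frozen
    } else if is_set("PERRY_LOCK_REFRESH") {
        LockMode::Refresh
    } else {
        LockMode::Normal
    }
}

/// Decide what to do for one archive, given the locked hash (if any).
pub fn decide(mode: LockMode, locked: Option<&str>, found: &str) -> Action {
    match (locked, mode) {
        (Some(h), _) if h == found => Action::Proceed,
        (Some(_), LockMode::Refresh) => Action::RewriteEntry,
        (Some(_), _) => Action::Fail,
        (None, LockMode::Frozen) => Action::Fail,
        (None, _) => Action::AddEntry,
    }
}
```

In a real build the lookup would presumably be `mode_from_env(|name| std::env::var(name).ok())`.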

Test coverage

9 unit tests in commands::perry_lock::tests:

  • round_trip_empty_lock — serialize/deserialize stable
  • missing_file_reads_empty — first-time semantics
  • sha256_of_file_matches_known_vector — pins SHA-256 of "hello\n"
  • first_build_adds_entries — adds entry, marks modified
  • matching_hash_verifies — pass-through, no modification
  • mismatching_hash_fails — checks diagnostic content (#498 cite, refresh hint, both hashes, package name)
  • refresh_env_var_rewrites_mismatch — PERRY_LOCK_REFRESH=1 path
  • frozen_env_var_refuses_new_entries — PERRY_LOCK_FROZEN=1 path
  • lockfile_is_deterministic_across_writes — BTreeMap order invariant

cargo test --release -p perry — 256 tests pass (247 pre-existing + 9 new).

Out of scope (deferred)

  • Crate-source builds (crate_path field, built via cargo): multi-file builds need a different hash mechanism (Cargo.lock + workspace tree hash). The MVP covers single-file prebuilt archives only.
  • Cross-target pre-hashing: hashing all targets a package declares at once instead of one-per-build.
  • perry lock CLI subcommand: env-var refresh + lockfile deletion cover the same workflow without a new subcommand.

Acceptance

  • [x] First-time resolution writes hash to perry.lock
  • [x] Subsequent builds verify against locked hash
  • [x] Mismatch fails with actionable error
  • [partial] perry lock --update <pkg> CLI subcommand deferred — PERRY_LOCK_REFRESH=1 covers the use case
  • [x] Hash covers every per-target archive declared by the package, not just host arch (multi-target builds accumulate hashes one target at a time)
  • [x] CI-friendly: --frozen mode via PERRY_LOCK_FROZEN=1

Notes

No Cargo.toml version bump, no CLAUDE.md version line touch, no CHANGELOG.md entry — maintainer folds those in at merge time.

…n perry.lock

A swapped or tampered prebuilt static archive (.a / .lib / .dylib)
was undetectable before this change: perry resolved the path and
handed it to the linker without inspection. A malicious dep update
could swap in arbitrary native code and the host got no signal.

This adds a small SHA-256-based lockfile at the project root,
analogous to package-lock.json's integrity field but scoped to the
linker-visible archives perry consumes through perry.nativeLibrary
`targets.<key>.prebuilt`.

File shape (project_root/perry.lock):

  {
    "version": 1,
    "native_libraries": {
      "@bloomengine/engine": {
        "macos-arm64": "sha256:abcd..."
      }
    }
  }

Per-package + per-target-arch entries (BTreeMap-sorted so on-disk
JSON is byte-deterministic across builds; `git diff perry.lock` is
a meaningful review signal).

Verification semantics:
- No perry.lock → fresh lockfile written, build proceeds.
- Matching entry → build proceeds.
- Missing entry → added to lockfile (grows over time as more
  packages/targets are built).
- Mismatching entry → build FAILS with a verbose, reviewer-actionable
  diagnostic that names the package, both hashes, and how to fix.

Two env-var knobs:
- PERRY_LOCK_REFRESH=1 — one-time; rewrite mismatched entries
  instead of failing. Use after a deliberate dep upgrade.
- PERRY_LOCK_FROZEN=1 — CI verification mode. Missing AND mismatching
  entries both fail. Ensures developers must commit perry.lock
  updates themselves; CI won't silently extend the lock.

Cross-platform applicability: the hash check runs in the platform-
agnostic compile_command driver, so every backend (LLVM / WASM /
ArkTS / HarmonyOS / Glance / SwiftUI / JS) inherits the protection
from one choke point. The per-target-arch key (`macos-arm64`,
`linux-x86_64`, …) matches what perry.nativeLibrary.targets uses
in package.json.

Test coverage — 9 unit tests in commands::perry_lock::tests:
- round_trip_empty_lock — serialize/deserialize stable
- missing_file_reads_empty — first-time semantics
- sha256_of_file_matches_known_vector — pins SHA-256 of "hello\n"
- first_build_adds_entries — adds entry, marks modified
- matching_hash_verifies — pass-through, no modification
- mismatching_hash_fails — checks diagnostic content (#498 cite,
  refresh hint, both hashes, package name)
- refresh_env_var_rewrites_mismatch — PERRY_LOCK_REFRESH=1 path
- frozen_env_var_refuses_new_entries — PERRY_LOCK_FROZEN=1 path
- lockfile_is_deterministic_across_writes — BTreeMap order invariant

What's NOT covered (deferred):
- Crate-source builds (crate_path field — multi-file builds need a
  different hash mechanism, follow-up).
- Cross-target pre-hashing (hashing all targets a package declares
  at once instead of accumulating per build).
- `perry lock` CLI subcommand (env-var refresh + lockfile deletion
  cover the same workflow without new surface).

Acceptance:
- [x] First-time resolution writes hash to perry.lock
- [x] Subsequent builds verify against locked hash
- [x] Mismatch fails with actionable error
- [partial] CLI `perry lock --update <pkg>` deferred — PERRY_LOCK_REFRESH=1 covers the use case
- [x] Hash covers every per-target archive declared by the package, not just host arch (multi-target builds accumulate hashes one target at a time)
- [x] CI-friendly: `perry lock --frozen` mode via PERRY_LOCK_FROZEN=1 env var
proggeramlug added a commit that referenced this pull request May 17, 2026
`import _ from "lodash"; _.add(1, 2)` resolved `_` to undefined under
`perry.compilePackages: ["lodash"]`. Two distinct bugs combined:

1. Inline `;(function() { ... }.call(this))` IIFE bodies never executed
   — `Closure.call` fell through generic method dispatch — so the CJS
   wrap's `module.exports = _` write was silently dropped. Fix:
   rewrite `<FnExpr|ArrowExpr>.call(thisArg, ...args)` to a direct
   call dropping the thisArg when the closure doesn't capture `this`.

2. `Expr::IndexUpdate` (`++arr[i]` / `obj[key]++`) bailed at codegen
   with `not yet supported`, stubbing lodash entirely. Fix: lower
   read/modify/write through `js_dyn_index_get` (extended for
   string-key dispatch) and a new `js_dyn_index_set` runtime helper
   that routes by gc_type.

Real lodash advances past the `_.add` undefined symptom; the next
runtime gap (`Function('return this')()` not callable, bare `global`
not truthy) is tracked separately.
proggeramlug added a commit that referenced this pull request May 17, 2026
…rs (#963)

Closes two distinct module-init holes flagged in PR #959's commit
message ("the next runtime gap") that kept real lodash throwing
`TypeError: value is not a function` before any user code ran:

  1. `var root = freeGlobal || freeSelf || Function('return this')();`
     The bare `Function` ident lowered to `Expr::GlobalGet(0)` (the
     no-resolution sentinel), so the inner call dispatched through
     `js_closure_call1` with a null handle. AST-match the two-call
     shape at HIR lower time and fold to a new `Expr::GlobalThisExpr`
     variant that lowers to `js_get_global_this()` — the same lazy
     singleton `globalThis[X] = V` already writes to (#611).

  2. `var reHasEscapedHtml = RegExp(reEscapedHtml.source);` (~6 sites
     in lodash). Bare `RegExp(...)` (and `new RegExp(<non-literal>)`)
     hit the same null-callee path. Fold both to a new
     `Expr::RegExpDynamic { pattern, flags }` that lowers to the
     existing `js_regexp_new(pattern, flags)` runtime entrypoint —
     the same entry the static `/foo/g` arm uses.

Real lodash advances past the IIFE-init crash; the next gap is
`var Array = context.Array` against the empty globalThis singleton
(lodash needs `globalThis.Array === Array` and friends), which is a
separate architectural change.

Regression test: test-files/test_lodash_function_return_this_regexp.ts
(12 assertions, byte-for-byte match with `node --experimental-strip-types`).
proggeramlug added a commit that referenced this pull request May 25, 2026
…#1678) (#1776)

Phase 0 of #1677 (AOT-first eval/new Function strategy). Establishes the
single decision point every later phase builds on: a classifier that
buckets each `new Function` / `Function(...)` / `eval(...)` site into

  1. const-foldable        — literal/substitution-free body (→ #1679)
  2. known-library-codegen  — from fast-json-stringify / ajv / find-my-way
                              (the Fastify JIT trio; → #1680/#1681/#1682)
  3. runtime-unknown        — genuinely runtime-dynamic code string

Only the runtime-unknown bucket is refused, with a precise diagnostic that
names the surface, file:line, and originating package (or "user source").
Buckets 1 and 2 keep their existing placeholder lowering so the phases that
own them can swap it in without a behaviour change here — Phase 0 is pure
analysis + reporting, it never compiles, folds, or evaluates anything.
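The three-way bucketing can be pictured roughly as follows. This is not the code in eval_classifier.rs: `Bucket`, `classify`, and `refusal_message` are hypothetical names, the inputs are reduced to two pre-computed facts about a site, and checking const-foldability before package provenance is an assumption about precedence.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Bucket {
    ConstFoldable,
    KnownLibraryCodegen,
    RuntimeUnknown,
}

/// The Fastify JIT trio named in the commit message.
const KNOWN_CODEGEN_PACKAGES: [&str; 3] = ["fast-json-stringify", "ajv", "find-my-way"];

/// Classify one `new Function` / `Function(...)` / `eval(...)` site.
/// `all_args_constant`: every argument is a compile-time-constant string.
/// `package`: originating package, or None for user source.
pub fn classify(all_args_constant: bool, package: Option<&str>) -> Bucket {
    if all_args_constant {
        Bucket::ConstFoldable
    } else if package.map_or(false, |p| KNOWN_CODEGEN_PACKAGES.iter().any(|k| *k == p)) {
        Bucket::KnownLibraryCodegen
    } else {
        Bucket::RuntimeUnknown
    }
}

/// Illustrative refusal text naming surface, file:line, and provenance.
pub fn refusal_message(surface: &str, file: &str, line: u32, package: Option<&str>) -> String {
    format!(
        "{} at {}:{} (from {}) takes a runtime-dynamic code string",
        surface,
        file,
        line,
        package.unwrap_or("user source")
    )
}
```

Only `Bucket::RuntimeUnknown` would lead to the refusal; the other two buckets pass through unchanged in Phase 0.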

Before this, both shapes silently fell through to broken lowerings (a bare
`Function`/`eval` ident → GlobalGet(0) sentinel → runtime TypeError, and
`new Function(...)` → an unknown-class class_id=0 empty-object placeholder)
with no indication of why.

New module crates/perry-hir/src/eval_classifier.rs (pure classification +
diagnostic + instrumentation), hooked at the two Function-shape lowering
sites (expr_new for `new Function`, expr_call for `Function(...)`/`eval`).
The `Function('return this')()` globalThis fold (#957/#959) runs first and
short-circuits, so it is unaffected.

Instrumentation: PERRY_EVAL_DIAG=1 logs every classified site (surface,
file:line, package, bucket, body preview) to stderr. Escape hatch:
PERRY_ALLOW_EVAL=1 downgrades the bucket-3 refusal to the legacy
fall-through for a one-off build (mirrors #503's PERRY_ALLOW_DYNAMIC_STDLIB).

Tests: 9 unit tests covering each bucket + provenance + line resolution +
preview truncation. Verified end-to-end on direct `new Function`/`eval`
samples (refused, with file:line + provenance), a const-foldable sample
(passes through), an ajv-path sample (known-library bucket), and the
existing `Function('return this')()` fold (still works).
proggeramlug added a commit that referenced this pull request May 25, 2026
…#1679) (#1783)

Phase 1 of #1677. When the Phase 0 classifier (#1678) would bucket a
`new Function(...)` / `Function(...)` site as const-foldable — every
argument is a compile-time-constant string — compile it to a real native
function instead of leaving it to fall through. This is true ahead-of-time
eval and builds the string→HIR plumbing Phase 3 will reuse.

How: synthesize the equivalent `(function (<params>) { <body> })` source
(joining all-but-last args as the param list, last arg as the body, per
Node's `new Function` semantics), parse it via perry-parser, and lower it
through the normal `lower_fn_expr` path — exactly as if the user had
written the function literal. The body references only its own params plus
globals, so it lowers to a capture-free closure (new Function has no
enclosing-scope access).
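The source-synthesis step can be sketched in a few lines. `synthesize_fn_source` is a hypothetical helper that follows the `(function (<params>) { <body> })` shape described above; the parse and `lower_fn_expr` steps are not reproduced here.

```rust
/// Build function-literal source from constant `new Function` arguments:
/// all but the last argument form the parameter list, the last is the body.
pub fn synthesize_fn_source(args: &[&str]) -> String {
    let (params, body) = match args.split_last() {
        Some((body, params)) => (params.join(","), *body),
        // `new Function()` with no arguments: no params, empty body.
        None => (String::new(), ""),
    };
    format!("(function ({}) {{ {} }})", params, body)
}
```

For example, the arguments `"a"`, `"b"`, `"return a + b"` yield `(function (a,b) { return a + b })`, and a single comma-joined parameter string such as `"a, b"` is passed through unchanged.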

Also folds the `(0, eval)('this')` / `(0, eval)('globalThis')` indirect-eval
idiom to Expr::GlobalThisExpr (indirect eval runs in global scope), the same
singleton `Function('return this')()` folds to (#957/#959).

New module crates/perry-hir/src/lower/const_fold_fn.rs, hooked at both
Function-shape sites BEFORE the Phase 0 refusal: expr_new.rs (`new
Function`) and expr_call/mod.rs (`Function(...)` / indirect eval). The
`Function('return this')()` fold still runs first and is unaffected.
Non-constant bodies still hit the Phase 0 refusal — no regression. A
const body that parses but can't lower surfaces a clear, span-tagged
compile error at the call site instead of the old broken placeholder.

perry-parser promoted from dev- to regular dependency of perry-hir (no
cycle; only adds swc_ecma_parser to the build) so lowering can parse the
synthesized source.

Tests: test-files/test_new_function_const_fold.ts (single-expression body,
multi-arg param names, comma-joined params, no-param, multi-statement body
referencing a global, the call form) and
test-files/test_indirect_eval_globalthis.ts — both byte-for-byte parity vs
`node --experimental-strip-types`. perry-hir suite green; fmt/clippy clean.
proggeramlug deleted the feat/498-pinned-checksums branch on June 6, 2026 06:15

Development

Successfully merging this pull request may close these issues.

security: pinned checksums for perry.nativeLibrary archives

1 participant