Skip to content

fix(#957 followup): Function('return this')() + bare RegExp(...) recognisers#963

Merged
proggeramlug merged 1 commit into
mainfrom
fix-957-function-return-this-regexp
May 17, 2026
Merged

fix(#957 followup): Function('return this')() + bare RegExp(...) recognisers#963
proggeramlug merged 1 commit into
mainfrom
fix-957-function-return-this-regexp

Conversation

@proggeramlug
Copy link
Copy Markdown
Contributor

Summary

Closes two distinct module-init holes flagged in PR #959's commit message ("the next runtime gap") that kept real lodash throwing TypeError: value is not a function at <anonymous> before any user code ran.

  • Function('return this')() — the canonical "give me globalThis" idiom every CJS/UMD library copies (lodash.js:436, also underscore + Effect prelude). Bare Function lowers to Expr::GlobalGet(0) (no-resolution sentinel), so the inner call dispatches through js_closure_call1 on a null closure handle and throw_not_callable fires with the hardcoded b"value" property name — hence the line-less TypeError: value is not a function. AST-match the two-call shape at HIR lower time and fold to a new Expr::GlobalThisExpr variant that lowers to js_get_global_this() — the same lazy singleton globalThis[X] = V already writes to (perry-runtime: Effect framework SIGSEGVs during Utils.ts module init — array-fast-path called with x0 = 0x1 (Effect end-to-end blocker, post-#592) #611).

  • RegExp(pattern[, flags]) bare function call + new RegExp(<non-literal>) — lodash builds ~6 of these at module init (e.g. var reHasEscapedHtml = RegExp(reEscapedHtml.source);). Both hit the same null-callee path because RegExp lowered identically to Function. Fold both to a new Expr::RegExpDynamic { pattern, flags } that lowers to the existing js_regexp_new(pattern, flags) runtime entry — the same entry the static /foo/g arm uses. Missing flags fall back to interning the empty string at codegen so js_regexp_new always sees a real StringHeader*.

Test plan

  • New regression test-files/test_lodash_function_return_this_regexp.ts (12 assertions, byte-for-byte match with node --experimental-strip-types):
    • Function('return this')() identity + write-then-read through the singleton
    • Whitespace + trailing-semicolon variants of the inner literal
    • RegExp(literal) / RegExp(literal, 'g') / RegExp(varPattern) / RegExp(/foo/.source) (the literal lodash shape)
    • new RegExp(varPattern[, varFlags]) (the previously-broken dynamic-arg new form)
    • new RegExp(literal) (regression guard for the path that already worked)
  • test-files/test_regex.ts parity unchanged (20/20 lines match node)
  • cargo build --release -p perry-runtime -p perry-stdlib -p perry green
  • cargo fmt --all clean

Known follow-up

import _ from "lodash" no longer throws at IIFE init — module.exports = _ propagates correctly. Real lodash still hits a separate compat gap further inside runInContext(): var Array = context.Array reads undefined because the empty globalThis singleton doesn't expose JS built-in constructors as properties, and the subsequent Array.prototype throws Cannot read properties of undefined (reading 'prototype'). Tracked separately — this PR is the IIFE-init slice (the bug PR #959 explicitly flagged); the globalThis.Array === Array work is its own architectural change. The other tier3 sweep failure (debug package silent — process.stderr is undefined under perry) is independent of these recognizers and out of scope here.

🤖 Generated with Claude Code

Closes two distinct module-init holes flagged in PR #959's commit
message ("the next runtime gap") that kept real lodash throwing
`TypeError: value is not a function` before any user code ran:

  1. `var root = freeGlobal || freeSelf || Function('return this')();`
     The bare `Function` ident lowered to `Expr::GlobalGet(0)` (the
     no-resolution sentinel), so the inner call dispatched through
     `js_closure_call1` with a null handle. AST-match the two-call
     shape at HIR lower time and fold to a new `Expr::GlobalThisExpr`
     variant that lowers to `js_get_global_this()` — the same lazy
     singleton `globalThis[X] = V` already writes to (#611).

  2. `var reHasEscapedHtml = RegExp(reEscapedHtml.source);` (~6 sites
     in lodash). Bare `RegExp(...)` (and `new RegExp(<non-literal>)`)
     hit the same null-callee path. Fold both to a new
     `Expr::RegExpDynamic { pattern, flags }` that lowers to the
     existing `js_regexp_new(pattern, flags)` runtime entrypoint —
     the same entry the static `/foo/g` arm uses.

Real lodash advances past the IIFE-init crash; the next gap is
`var Array = context.Array` against the empty globalThis singleton
(lodash needs `globalThis.Array === Array` and friends), which is a
separate architectural change.

Regression test: test-files/test_lodash_function_return_this_regexp.ts
(12 assertions, byte-for-byte match with `node --experimental-strip-types`).
@proggeramlug proggeramlug merged commit d2e1d3a into main May 17, 2026
6 of 9 checks passed
@proggeramlug proggeramlug deleted the fix-957-function-return-this-regexp branch May 17, 2026 22:05
proggeramlug added a commit that referenced this pull request May 17, 2026
PR #963 closed the `Function('return this')()` recogniser so lodash's
module-init IIFE no longer threw before `runInContext` ran, but the next
blocker surfaced at lodash.js:1463 — `var Array = context.Array; var
arrayProto = Array.prototype` threw `TypeError: Cannot read properties
of undefined (reading 'prototype')` because the canonical JS built-ins
were never registered on `globalThis`.

Two coordinated pieces fix this:

1. Runtime (`crates/perry-runtime/src/object.rs`): on first access,
   `js_get_global_this`'s CAS winner now calls
   `populate_global_this_builtins(singleton)`. Constructors get a
   ClosureHeader-backed sentinel (`typeof === "function"` via the
   CLOSURE_MAGIC tag); `Math`/`JSON`/`Reflect` get a plain ObjectHeader
   (`typeof === "object"`). Each constructor carries a `prototype`
   dynamic property pointing at an empty ObjectHeader so the lodash
   `Array.prototype` chained read returns a real pointer instead of
   throwing.

2. Codegen (`crates/perry-codegen/src/expr.rs`): the
   `Expr::PropertyGet { GlobalGet(_), <name> }` arm consults
   `is_global_this_builtin_name(<name>)` and, when matched, routes
   the read through `js_get_global_this` + `js_object_get_field_by_name_f64`
   (mirrors the existing IndexGet arm from #611). The `typeof` short-
   circuit gains a parallel arm for the same names.

`test-files/test_globalthis_builtins.ts` validates the round-trip
byte-for-byte against `node --experimental-strip-types`. The lodash
`_.chunk` repro still fails to load `_` but the throw point moves from
line 1463 (`var arrayProto = Array.prototype`) to line 1493
(`funcToString.call(Object)`) — three reads + one call deeper into
`runInContext`.
proggeramlug added a commit that referenced this pull request May 17, 2026
PR #963 closed the `Function('return this')()` recogniser so lodash's
module-init IIFE no longer threw before `runInContext` ran, but the next
blocker surfaced at lodash.js:1463 — `var Array = context.Array; var
arrayProto = Array.prototype` threw `TypeError: Cannot read properties
of undefined (reading 'prototype')` because the canonical JS built-ins
were never registered on `globalThis`.

Two coordinated pieces fix this:

1. Runtime (`crates/perry-runtime/src/object.rs`): on first access,
   `js_get_global_this`'s CAS winner now calls
   `populate_global_this_builtins(singleton)`. Constructors get a
   ClosureHeader-backed sentinel (`typeof === "function"` via the
   CLOSURE_MAGIC tag); `Math`/`JSON`/`Reflect` get a plain ObjectHeader
   (`typeof === "object"`). Each constructor carries a `prototype`
   dynamic property pointing at an empty ObjectHeader so the lodash
   `Array.prototype` chained read returns a real pointer instead of
   throwing.

2. Codegen (`crates/perry-codegen/src/expr.rs`): the
   `Expr::PropertyGet { GlobalGet(_), <name> }` arm consults
   `is_global_this_builtin_name(<name>)` and, when matched, routes
   the read through `js_get_global_this` + `js_object_get_field_by_name_f64`
   (mirrors the existing IndexGet arm from #611). The `typeof` short-
   circuit gains a parallel arm for the same names.

`test-files/test_globalthis_builtins.ts` validates the round-trip
byte-for-byte against `node --experimental-strip-types`. The lodash
`_.chunk` repro still fails to load `_` but the throw point moves from
line 1463 (`var arrayProto = Array.prototype`) to line 1493
(`funcToString.call(Object)`) — three reads + one call deeper into
`runInContext`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant