diff --git a/CHANGELOG.md b/CHANGELOG.md index ce0ecf097e..038cb05370 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,63 @@ Detailed changelog for Perry. See CLAUDE.md for concise summaries. +## v0.5.994 — feat(jsruntime): V8 ModuleLoader reads from embedded module map (self-contained binaries) + +**Symptom.** v0.5.993 closed half of #818: the `__perry_js_bundle.js` artifact now contains every transitive sibling a pure-ESM npm package re-exports, so the bundle's *content* matches reality. The other half (called out in that release's own "Known next blocker") was still open — `NodeModuleLoader::load` in `crates/perry-jsruntime/src/modules.rs` reads source via `std::fs::read_to_string(&path)`, never consults `globalThis.__COMPILETS_MODULES`, and walks the real `node_modules/` tree at runtime. Move a Perry-compiled binary that uses hono / express / any other V8-fallback package into a directory that doesn't have `node_modules/` and V8 throws `Cannot resolve module` (or in the cross-thread `app.fetch(req)` shape, segfaults rc=139) for every missing file. + +**Root cause — the bundle had no runtime consumer.** `generate_js_bundle` writes the bundle next to the binary purely as a debugging artifact; nothing on the runtime side knows about it. Resolution and source-reading both go to disk: + +``` +NodeModuleLoader::resolve_module_path → resolve_with_extensions → base.exists() / std::fs::read_to_string(package_json) +NodeModuleLoader::load → std::fs::read_to_string(&path) +``` + +If `node_modules/hono/` isn't on the filesystem when the binary runs, every probe fails. + +**Fix — embed an in-memory module map at link time + teach the loader to prefer it.** Three pieces: + +1. **`perry-jsruntime/src/modules.rs`** gains two process-wide `RwLock`s: + - `EMBEDDED_MODULES`: build-time canonical path → source code (`Arc<String>`). + - `EMBEDDED_ALIASES`: bare specifier ("hono", "@scope/x") → build-time canonical path. 
+ Plus matching `#[no_mangle] pub unsafe extern "C"` registration FFIs (`js_register_embedded_module`, `js_register_embedded_alias`) and Rust-facing wrappers (`register_embedded_module`, `register_embedded_alias`). + +2. **Loader integration.** `NodeModuleLoader::resolve_module_path` consults the alias map first for bare specifiers and the path map second when on-disk extension/index probes fail — including a `lookup_embedded_path_with_extensions` helper that mirrors `resolve_with_extensions`'s `.js`/`.mjs`/`.cjs`/`.json` and folder-index fallbacks against the in-memory keys. `NodeModuleLoader::load` checks `EMBEDDED_MODULES` before its `std::fs::read_to_string` call. The map keys are build-time canonical paths used as opaque identifiers; `canonicalize()` later in `resolve()` already falls back to `resolved_path.clone()` for paths that don't exist on the runtime filesystem, so the `file://`-URL synthesized for V8 works either way. + +3. **Compile-time emission.** New `targets::generate_embedded_js_object` in `crates/perry/src/commands/compile/targets.rs`: + - Walks `ctx.js_modules` and emits one C `static const char[]` literal per module's source + length pair. + - Walks every TypeScript import edge whose `resolved_path` is in `ctx.js_modules`, collects `(bare_specifier, resolved_path)` pairs, emits matching string literals. + - Wraps it in a `__attribute__((constructor(101)))` that calls `js_register_embedded_module` for every source pair and `js_register_embedded_alias` for every alias pair. Priority 101 runs before `main`'s `js_runtime_init()` call, so the map is populated before V8 first asks for a module. + - Calls `cc -c` on the generated `.c` and appends the `.o` to `obj_paths` so the existing linker invocation pulls it in. + +The generator escapes non-printable bytes octally (`\NNN`) and never emits raw multibyte UTF-8 into the C source, so the resulting `.c` is ASCII-clean regardless of the JS file's encoding. 
`?` is also escaped to defang any `??=`/`??/` trigraph hazard under `-trigraphs`. + +**Validation.** + +``` +cd /tmp/perry-selfcontained +cat > test.ts <<'EOF' +import { Hono } from 'hono'; +const app = new Hono(); +app.get('/', (c) => c.text('Hi')); +console.log(typeof app, typeof app.get); +EOF +npm install hono@4.6.5 --silent +perry test.ts -o out + +# Move the binary somewhere isolated — no node_modules/, no source +mkdir -p /tmp/perry-iso && cp out /tmp/perry-iso/ && cd /tmp/perry-iso +ls # → just `out`, nothing else +./out # → object function +``` + +Pre-fix the third line printed `Cannot resolve module 'hono'` and exited 1. Post-fix it prints `object function` from a 45 MB binary with zero filesystem dependencies. The on-disk `__perry_js_bundle.js` is still emitted (kept as a build-time debugging artifact) but is no longer needed at runtime. New test fixture: `test-files/test_v8_self_contained.ts`. + +**Files touched.** +- `crates/perry-jsruntime/src/modules.rs` — embedded-module map + FFIs + load/resolve consults. +- `crates/perry/src/commands/compile/targets.rs` — `generate_embedded_js_object` + `c_string_literal` helper. +- `crates/perry/src/commands/compile.rs` — append generated `.o` to `obj_paths` whenever `needs_js_runtime` + non-empty `js_modules`. +- `test-files/test_v8_self_contained.ts` — fixture documenting the self-contained-binary expectation. + ## v0.5.993 — fix(compile): recursively bundle transitive ESM imports for V8 fallback **Symptom.** A program that imports a pure-ESM npm package whose entry file re-exports siblings (`hono`'s `dist/index.js` → `./hono.js` → `./hono-base.js` → `./compose.js` → `./router/*` → `./utils/*` …) ended up with a `__perry_js_bundle.js` containing only the single entry file. Roughly 20 transitive `dist/**/*.js` files were silently dropped. 
Compiled binaries still worked when their `node_modules/` tree happened to sit alongside them (V8's `ModuleLoader::load` opens files off disk), but shipping the binary on its own — or running it in any sandbox where the resolved paths don't exist — left V8 throwing `Cannot resolve module` for every missing sibling, and in the realistic hono call path (`app.fetch(req)` running cross-thread) cascaded to an rc=139 segfault because the missing-module callback handed unboxed `undefined` back to compiled native code expecting a NaN-boxed pointer. diff --git a/CLAUDE.md b/CLAUDE.md index adef3a0c81..67ef35911c 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -8,7 +8,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co Perry is a native TypeScript compiler written in Rust that compiles TypeScript source code directly to native executables. It uses SWC for TypeScript parsing and LLVM for code generation. -**Current Version:** 0.5.993 +**Current Version:** 0.5.994 ## TypeScript Parity Status diff --git a/Cargo.lock b/Cargo.lock index f12e8b3f96..c5f35e6712 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -4905,7 +4905,7 @@ checksum = "9b4f627cb1b25917193a259e49bdad08f671f8d9708acfd5fe0a8c1455d87220" [[package]] name = "perry" -version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "base64", @@ -4960,14 +4960,14 @@ dependencies = [ [[package]] name = "perry-api-manifest" -version = "0.5.993" +version = "0.5.994" dependencies = [ "serde", ] [[package]] name = "perry-codegen" -version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "log", @@ -4980,7 +4980,7 @@ dependencies = [ [[package]] name = "perry-codegen-arkts" -version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "perry-hir", @@ -4989,7 +4989,7 @@ dependencies = [ [[package]] name = "perry-codegen-glance" -version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "perry-hir", @@ -4997,7 +4997,7 @@ dependencies = [ [[package]] name = "perry-codegen-js" 
-version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "perry-dispatch", @@ -5007,7 +5007,7 @@ dependencies = [ [[package]] name = "perry-codegen-swiftui" -version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "perry-hir", @@ -5016,7 +5016,7 @@ dependencies = [ [[package]] name = "perry-codegen-wasm" -version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "base64", @@ -5029,7 +5029,7 @@ dependencies = [ [[package]] name = "perry-codegen-wear-tiles" -version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "perry-hir", @@ -5037,7 +5037,7 @@ dependencies = [ [[package]] name = "perry-diagnostics" -version = "0.5.993" +version = "0.5.994" dependencies = [ "serde", "serde_json", @@ -5045,7 +5045,7 @@ dependencies = [ [[package]] name = "perry-dispatch" -version = "0.5.993" +version = "0.5.994" [[package]] name = "perry-doc-fixture-my-bindings" @@ -5056,7 +5056,7 @@ dependencies = [ [[package]] name = "perry-doc-tests" -version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "clap", @@ -5071,7 +5071,7 @@ dependencies = [ [[package]] name = "perry-ext-argon2" -version = "0.5.993" +version = "0.5.994" dependencies = [ "argon2", "perry-ffi", @@ -5079,7 +5079,7 @@ dependencies = [ [[package]] name = "perry-ext-axios" -version = "0.5.993" +version = "0.5.994" dependencies = [ "perry-ffi", "reqwest", @@ -5088,7 +5088,7 @@ dependencies = [ [[package]] name = "perry-ext-bcrypt" -version = "0.5.993" +version = "0.5.994" dependencies = [ "bcrypt", "perry-ffi", @@ -5096,7 +5096,7 @@ dependencies = [ [[package]] name = "perry-ext-better-sqlite3" -version = "0.5.993" +version = "0.5.994" dependencies = [ "perry-ffi", "rusqlite", @@ -5104,7 +5104,7 @@ dependencies = [ [[package]] name = "perry-ext-cheerio" -version = "0.5.993" +version = "0.5.994" dependencies = [ "perry-ffi", "scraper", @@ -5112,14 +5112,14 @@ dependencies = [ [[package]] name = "perry-ext-commander" -version = "0.5.993" +version = "0.5.994" 
dependencies = [ "perry-ffi", ] [[package]] name = "perry-ext-cron" -version = "0.5.993" +version = "0.5.994" dependencies = [ "chrono", "cron", @@ -5128,7 +5128,7 @@ dependencies = [ [[package]] name = "perry-ext-dayjs" -version = "0.5.993" +version = "0.5.994" dependencies = [ "chrono", "perry-ffi", @@ -5136,7 +5136,7 @@ dependencies = [ [[package]] name = "perry-ext-decimal" -version = "0.5.993" +version = "0.5.994" dependencies = [ "perry-ffi", "rust_decimal", @@ -5144,7 +5144,7 @@ dependencies = [ [[package]] name = "perry-ext-dotenv" -version = "0.5.993" +version = "0.5.994" dependencies = [ "perry-ffi", "serde_json", @@ -5152,7 +5152,7 @@ dependencies = [ [[package]] name = "perry-ext-ethers" -version = "0.5.993" +version = "0.5.994" dependencies = [ "perry-ffi", "rand 0.8.6", @@ -5160,21 +5160,21 @@ dependencies = [ [[package]] name = "perry-ext-events" -version = "0.5.993" +version = "0.5.994" dependencies = [ "perry-ffi", ] [[package]] name = "perry-ext-exponential-backoff" -version = "0.5.993" +version = "0.5.994" dependencies = [ "perry-ffi", ] [[package]] name = "perry-ext-fastify" -version = "0.5.993" +version = "0.5.994" dependencies = [ "bytes", "http-body-util", @@ -5188,7 +5188,7 @@ dependencies = [ [[package]] name = "perry-ext-fetch" -version = "0.5.993" +version = "0.5.994" dependencies = [ "lazy_static", "perry-ffi", @@ -5199,7 +5199,7 @@ dependencies = [ [[package]] name = "perry-ext-http" -version = "0.5.993" +version = "0.5.994" dependencies = [ "lazy_static", "perry-ext-http-server", @@ -5211,7 +5211,7 @@ dependencies = [ [[package]] name = "perry-ext-http-server" -version = "0.5.993" +version = "0.5.994" dependencies = [ "bytes", "http-body-util", @@ -5230,7 +5230,7 @@ dependencies = [ [[package]] name = "perry-ext-ioredis" -version = "0.5.993" +version = "0.5.994" dependencies = [ "lazy_static", "perry-ffi", @@ -5240,7 +5240,7 @@ dependencies = [ [[package]] name = "perry-ext-jsonwebtoken" -version = "0.5.993" +version = "0.5.994" 
dependencies = [ "base64", "jsonwebtoken", @@ -5251,7 +5251,7 @@ dependencies = [ [[package]] name = "perry-ext-lru-cache" -version = "0.5.993" +version = "0.5.994" dependencies = [ "lru", "perry-ffi", @@ -5259,7 +5259,7 @@ dependencies = [ [[package]] name = "perry-ext-moment" -version = "0.5.993" +version = "0.5.994" dependencies = [ "chrono", "perry-ffi", @@ -5267,7 +5267,7 @@ dependencies = [ [[package]] name = "perry-ext-mongodb" -version = "0.5.993" +version = "0.5.994" dependencies = [ "bson", "futures-util", @@ -5279,7 +5279,7 @@ dependencies = [ [[package]] name = "perry-ext-mysql2" -version = "0.5.993" +version = "0.5.994" dependencies = [ "chrono", "perry-ffi", @@ -5289,7 +5289,7 @@ dependencies = [ [[package]] name = "perry-ext-nanoid" -version = "0.5.993" +version = "0.5.994" dependencies = [ "nanoid", "perry-ffi", @@ -5298,7 +5298,7 @@ dependencies = [ [[package]] name = "perry-ext-net" -version = "0.5.993" +version = "0.5.994" dependencies = [ "perry-ffi", "rustls", @@ -5309,7 +5309,7 @@ dependencies = [ [[package]] name = "perry-ext-nodemailer" -version = "0.5.993" +version = "0.5.994" dependencies = [ "lettre", "perry-ffi", @@ -5319,7 +5319,7 @@ dependencies = [ [[package]] name = "perry-ext-pg" -version = "0.5.993" +version = "0.5.994" dependencies = [ "perry-ffi", "sqlx", @@ -5328,7 +5328,7 @@ dependencies = [ [[package]] name = "perry-ext-ratelimit" -version = "0.5.993" +version = "0.5.994" dependencies = [ "governor", "perry-ffi", @@ -5336,7 +5336,7 @@ dependencies = [ [[package]] name = "perry-ext-sharp" -version = "0.5.993" +version = "0.5.994" dependencies = [ "base64", "image", @@ -5345,14 +5345,14 @@ dependencies = [ [[package]] name = "perry-ext-slugify" -version = "0.5.993" +version = "0.5.994" dependencies = [ "perry-ffi", ] [[package]] name = "perry-ext-streams" -version = "0.5.993" +version = "0.5.994" dependencies = [ "lazy_static", "perry-ffi", @@ -5360,7 +5360,7 @@ dependencies = [ [[package]] name = "perry-ext-uuid" -version = 
"0.5.993" +version = "0.5.994" dependencies = [ "perry-ffi", "uuid", @@ -5368,7 +5368,7 @@ dependencies = [ [[package]] name = "perry-ext-validator" -version = "0.5.993" +version = "0.5.994" dependencies = [ "perry-ffi", "regex", @@ -5378,7 +5378,7 @@ dependencies = [ [[package]] name = "perry-ext-ws" -version = "0.5.993" +version = "0.5.994" dependencies = [ "futures-util", "lazy_static", @@ -5389,7 +5389,7 @@ dependencies = [ [[package]] name = "perry-ext-zlib" -version = "0.5.993" +version = "0.5.994" dependencies = [ "flate2", "perry-ffi", @@ -5397,7 +5397,7 @@ dependencies = [ [[package]] name = "perry-ffi" -version = "0.5.993" +version = "0.5.994" dependencies = [ "dashmap 6.1.0", "once_cell", @@ -5406,7 +5406,7 @@ dependencies = [ [[package]] name = "perry-hir" -version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "perry-api-manifest", @@ -5420,7 +5420,7 @@ dependencies = [ [[package]] name = "perry-jsruntime" -version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "deno_core", @@ -5440,7 +5440,7 @@ dependencies = [ [[package]] name = "perry-parser" -version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "perry-diagnostics", @@ -5452,7 +5452,7 @@ dependencies = [ [[package]] name = "perry-runtime" -version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "base64", @@ -5476,7 +5476,7 @@ dependencies = [ [[package]] name = "perry-stdlib" -version = "0.5.993" +version = "0.5.994" dependencies = [ "aes 0.8.4", "aes-gcm", @@ -5546,7 +5546,7 @@ dependencies = [ [[package]] name = "perry-transform" -version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "perry-hir", @@ -5556,7 +5556,7 @@ dependencies = [ [[package]] name = "perry-types" -version = "0.5.993" +version = "0.5.994" dependencies = [ "anyhow", "thiserror 1.0.69", @@ -5564,11 +5564,11 @@ dependencies = [ [[package]] name = "perry-ui" -version = "0.5.993" +version = "0.5.994" [[package]] name = "perry-ui-android" -version = "0.5.993" +version 
= "0.5.994" dependencies = [ "itoa", "jni", @@ -5583,7 +5583,7 @@ dependencies = [ [[package]] name = "perry-ui-geisterhand" -version = "0.5.993" +version = "0.5.994" dependencies = [ "rand 0.8.6", "serde", @@ -5593,7 +5593,7 @@ dependencies = [ [[package]] name = "perry-ui-gtk4" -version = "0.5.993" +version = "0.5.994" dependencies = [ "cairo-rs", "dirs 5.0.1", @@ -5612,7 +5612,7 @@ dependencies = [ [[package]] name = "perry-ui-ios" -version = "0.5.993" +version = "0.5.994" dependencies = [ "block2", "libc", @@ -5627,7 +5627,7 @@ dependencies = [ [[package]] name = "perry-ui-macos" -version = "0.5.993" +version = "0.5.994" dependencies = [ "block2", "libc", @@ -5645,11 +5645,11 @@ version = "0.1.0" [[package]] name = "perry-ui-testkit" -version = "0.5.993" +version = "0.5.994" [[package]] name = "perry-ui-tvos" -version = "0.5.993" +version = "0.5.994" dependencies = [ "block2", "libc", @@ -5664,7 +5664,7 @@ dependencies = [ [[package]] name = "perry-ui-visionos" -version = "0.5.993" +version = "0.5.994" dependencies = [ "block2", "libc", @@ -5679,7 +5679,7 @@ dependencies = [ [[package]] name = "perry-ui-watchos" -version = "0.5.993" +version = "0.5.994" dependencies = [ "block2", "libc", @@ -5692,7 +5692,7 @@ dependencies = [ [[package]] name = "perry-ui-windows" -version = "0.5.993" +version = "0.5.994" dependencies = [ "libc", "perry-runtime", @@ -5706,7 +5706,7 @@ dependencies = [ [[package]] name = "perry-updater" -version = "0.5.993" +version = "0.5.994" dependencies = [ "base64", "ed25519-dalek", @@ -5720,7 +5720,7 @@ dependencies = [ [[package]] name = "perry-wasm-host" -version = "0.5.993" +version = "0.5.994" dependencies = [ "wasmi", ] diff --git a/Cargo.toml b/Cargo.toml index f5693aa6b2..cbaab3bfe8 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -190,7 +190,7 @@ opt-level = "s" # Optimize for size in stdlib opt-level = 3 [workspace.package] -version = "0.5.993" +version = "0.5.994" edition = "2021" license = "MIT" repository = 
"https://github.com/PerryTS/perry" diff --git a/crates/perry-jsruntime/src/modules.rs b/crates/perry-jsruntime/src/modules.rs index c547cb3929..c8364d60a0 100644 --- a/crates/perry-jsruntime/src/modules.rs +++ b/crates/perry-jsruntime/src/modules.rs @@ -11,7 +11,165 @@ use deno_core::{ use deno_error::JsErrorBox; use once_cell::sync::Lazy; use regex::Regex; +use std::collections::HashMap; +use std::ffi::{c_char, CStr}; use std::path::{Path, PathBuf}; +use std::sync::{Arc, RwLock}; + +// Issue #818 follow-up: embedded module map for self-contained V8-fallback +// binaries. The compile pipeline emits a generated `.c` file (one entry per +// JS module pulled into the bundle by `collect_js_module_imports`) whose +// `__attribute__((constructor))` calls `js_register_embedded_module` for +// each `(canonical_path, source)` pair plus `js_register_embedded_alias` +// for each `(bare_specifier, canonical_path)` import edge. At runtime the +// `NodeModuleLoader` consults these maps BEFORE touching `node_modules/`, +// so the resulting binary boots correctly even when shipped without the +// source tree's `node_modules/` directory. +// +// Keys are kept as build-time canonical path strings — they don't need to +// exist on the runtime filesystem. The loader uses them as opaque +// identifiers; only the source string and the import-edge alias map are +// consulted on the load hot path. +static EMBEDDED_MODULES: Lazy<RwLock<HashMap<String, Arc<String>>>> = + Lazy::new(|| RwLock::new(HashMap::new())); +static EMBEDDED_ALIASES: Lazy<RwLock<HashMap<String, String>>> = + Lazy::new(|| RwLock::new(HashMap::new())); + +/// Register a JS module source against its build-time canonical path. +/// Called by `js_register_embedded_module` (the C FFI) at startup from the +/// generated bundle constructor; also usable directly from Rust for tests. 
+pub fn register_embedded_module(path: &str, source: String) { + if let Ok(mut map) = EMBEDDED_MODULES.write() { + map.insert(path.to_string(), Arc::new(source)); + } +} + +/// Register a bare specifier → build-time canonical path alias. Lets +/// `resolve()` redirect `import "hono"` to the embedded source without +/// walking `node_modules/`. +pub fn register_embedded_alias(specifier: &str, path: &str) { + if let Ok(mut map) = EMBEDDED_ALIASES.write() { + map.insert(specifier.to_string(), path.to_string()); + } +} + +/// Look up an embedded source by build-time canonical path. Returns +/// `None` when nothing's registered (the normal dev-build case). +fn lookup_embedded_module(path: &str) -> Option<Arc<String>> { + EMBEDDED_MODULES + .read() + .ok() + .and_then(|map| map.get(path).cloned()) +} + +/// Look up the build-time canonical path that a bare specifier maps to. +fn lookup_embedded_alias(specifier: &str) -> Option<String> { + EMBEDDED_ALIASES + .read() + .ok() + .and_then(|map| map.get(specifier).cloned()) +} + +/// C FFI: register an embedded JS module's source. Called from the +/// compile-emitted bundle constructor. Pointers are not retained — the +/// source string is copied into the global map. UTF-8 is assumed. +/// +/// # Safety +/// +/// `path_ptr` / `source_ptr` must point to valid `len`-byte regions of +/// UTF-8 text. The map takes ownership of an internal copy. 
+#[no_mangle] +pub unsafe extern "C" fn js_register_embedded_module( + path_ptr: *const c_char, + path_len: usize, + source_ptr: *const c_char, + source_len: usize, +) { + if path_ptr.is_null() || source_ptr.is_null() { + return; + } + let path_bytes = std::slice::from_raw_parts(path_ptr as *const u8, path_len); + let source_bytes = std::slice::from_raw_parts(source_ptr as *const u8, source_len); + let path = match std::str::from_utf8(path_bytes) { + Ok(s) => s, + Err(_) => return, + }; + let source = match std::str::from_utf8(source_bytes) { + Ok(s) => s.to_string(), + Err(_) => return, + }; + register_embedded_module(path, source); +} + +/// C FFI: register a bare specifier → embedded-path alias. Pointers are +/// not retained. +/// +/// # Safety +/// +/// Both pointers must reference valid UTF-8 of the given lengths. +#[no_mangle] +pub unsafe extern "C" fn js_register_embedded_alias( + specifier_ptr: *const c_char, + specifier_len: usize, + path_ptr: *const c_char, + path_len: usize, +) { + if specifier_ptr.is_null() || path_ptr.is_null() { + return; + } + let spec_bytes = std::slice::from_raw_parts(specifier_ptr as *const u8, specifier_len); + let path_bytes = std::slice::from_raw_parts(path_ptr as *const u8, path_len); + let specifier = match std::str::from_utf8(spec_bytes) { + Ok(s) => s, + Err(_) => return, + }; + let path = match std::str::from_utf8(path_bytes) { + Ok(s) => s, + Err(_) => return, + }; + register_embedded_alias(specifier, path); +} + +// Allow C-style null-terminated registration too — slightly nicer codegen +// from the bundle constructor (no manual `strlen`) and matches the +// convention used elsewhere in `perry-jsruntime` FFIs. +#[allow(dead_code)] +unsafe fn cstr_to_str<'a>(ptr: *const c_char) -> Option<&'a str> { + if ptr.is_null() { + return None; + } + CStr::from_ptr(ptr).to_str().ok() +} + +/// Probe the embedded map with the same extension/index candidates used +/// by `resolve_with_extensions` against the filesystem. 
Returns the +/// matching build-time canonical path on hit. Used when the file isn't on +/// disk because the binary's been shipped without its `node_modules/`. +fn lookup_embedded_path_with_extensions(base: &Path) -> Option<PathBuf> { + let key = base.to_string_lossy().to_string(); + if lookup_embedded_module(&key).is_some() { + return Some(PathBuf::from(&key)); + } + let extensions = [".js", ".mjs", ".cjs", ".json"]; + for ext in extensions { + let candidate = format!("{}{}", key, ext); + if lookup_embedded_module(&candidate).is_some() { + return Some(PathBuf::from(candidate)); + } + } + // Try as a directory containing an index file. + for ext in extensions { + let candidate = if key.ends_with('/') { + format!("{}index{}", key, ext) + } else { + format!("{}/index{}", key, ext) + }; + if lookup_embedded_module(&candidate).is_some() { + return Some(PathBuf::from(candidate)); + } + } + None +} // CJS heuristics regex set. These are tight, hot path on every loaded JS // module (called once per import); compiling them once amortizes the cost. @@ -79,6 +237,22 @@ impl NodeModuleLoader { /// Resolve a module specifier to an absolute path fn resolve_module_path(&self, specifier: &str, referrer: &Path) -> Result { + // Issue #818 follow-up: prefer embedded-bundle lookups over disk + // probes. For bare specifiers ("hono", "@scope/x") an alias map + // gives us the canonical build-time path directly; for relative + // and absolute paths we still walk the standard candidate chain + // and then check whether the resolved path matches an embedded + // entry even when the file is absent from the runtime filesystem. 
+ if !specifier.starts_with("./") + && !specifier.starts_with("../") + && !specifier.starts_with('/') + && !specifier.starts_with("file://") + { + if let Some(embedded_path) = lookup_embedded_alias(specifier) { + return Ok(PathBuf::from(embedded_path)); + } + } + // Handle file:// URLs if specifier.starts_with("file://") { let path_str = specifier.strip_prefix("file://").unwrap_or(specifier); @@ -86,6 +260,9 @@ impl NodeModuleLoader { if path.exists() && path.is_file() { return Ok(path); } + if lookup_embedded_module(&path.to_string_lossy()).is_some() { + return Ok(path); + } return self.resolve_with_extensions(path); } @@ -93,22 +270,49 @@ impl NodeModuleLoader { if specifier.starts_with("./") || specifier.starts_with("../") { let referrer_dir = referrer.parent().unwrap_or(&self.base_dir); let resolved = referrer_dir.join(specifier); - let resolved = self.resolve_with_extensions(resolved)?; - // Check browser field mapping (e.g., ethers geturl.js -> geturl-browser.js) - if let Some(browser_path) = self.check_browser_field(&resolved) { - return Ok(browser_path); + match self.resolve_with_extensions(resolved.clone()) { + Ok(resolved) => { + // Check browser field mapping (e.g., ethers geturl.js -> geturl-browser.js) + if let Some(browser_path) = self.check_browser_field(&resolved) { + return Ok(browser_path); + } + return Ok(resolved); + } + Err(e) => { + // Self-contained binary path: the file isn't on disk + // because node_modules/ was left behind. Probe the + // embedded map with the same extension/index candidates + // we'd try against the filesystem. 
+ if let Some(p) = lookup_embedded_path_with_extensions(&resolved) { + return Ok(p); + } + return Err(e); + } } - return Ok(resolved); } // Handle absolute paths if specifier.starts_with('/') { let resolved = PathBuf::from(specifier); + if let Ok(p) = self.resolve_with_extensions(resolved.clone()) { + return Ok(p); + } + if let Some(p) = lookup_embedded_path_with_extensions(&resolved) { + return Ok(p); + } return self.resolve_with_extensions(resolved); } // Handle node_modules - self.resolve_from_node_modules(specifier, referrer) + match self.resolve_from_node_modules(specifier, referrer) { + Ok(p) => Ok(p), + Err(e) => { + if let Some(embedded_path) = lookup_embedded_alias(specifier) { + return Ok(PathBuf::from(embedded_path)); + } + Err(e) + } + } } /// Try resolving a path with common extensions @@ -510,13 +714,24 @@ impl ModuleLoader for NodeModuleLoader { } }; - let code = match std::fs::read_to_string(&path) { - Ok(c) => c, - Err(e) => { - return ModuleLoadResponse::Sync(Err(JsErrorBox::generic(format!( - "Failed to read module {:?}: {}", - path, e - )))) + // Issue #818 follow-up: embedded-bundle first. Self-contained + // binaries register every JS module they import at startup; the + // map is keyed on build-time canonical paths, which is what + // `resolve()` returns. Falls through to disk only when nothing's + // registered for this path — preserves the dev-build behavior + // where `node_modules/` sits next to the binary. 
+ let path_key = path.to_string_lossy().to_string(); + let code = if let Some(embedded) = lookup_embedded_module(&path_key) { + (*embedded).clone() + } else { + match std::fs::read_to_string(&path) { + Ok(c) => c, + Err(e) => { + return ModuleLoadResponse::Sync(Err(JsErrorBox::generic(format!( + "Failed to read module {:?}: {}", + path, e + )))) + } } }; diff --git a/crates/perry/src/commands/compile.rs b/crates/perry/src/commands/compile.rs index 11dfa344fa..f84ce2f296 100644 --- a/crates/perry/src/commands/compile.rs +++ b/crates/perry/src/commands/compile.rs @@ -51,8 +51,8 @@ use strip_dedup::strip_duplicate_objects_from_lib; use targets::{ apple_sdk_version, compile_for_android_widget, compile_for_ios_widget, compile_for_wasm, compile_for_watchos_widget, compile_for_wearos_tile, compile_metallib_for_bundle, - find_visionos_swift_runtime, find_watchos_swift_runtime, generate_js_bundle, - lookup_bundle_id_from_toml, + find_visionos_swift_runtime, find_watchos_swift_runtime, generate_embedded_js_object, + generate_js_bundle, lookup_bundle_id_from_toml, }; /// Result of a successful compilation @@ -5516,6 +5516,33 @@ pub fn run_with_parse_cache( OutputFormat::Text => println!("Generated JS bundle: {}", bundle_path.display()), OutputFormat::Json => {} } + // Issue #818 follow-up: embed every JS module's source into the + // final binary too. The V8 fallback `ModuleLoader` consults this + // map before falling back to disk, so the resulting binary needs + // no `node_modules/` co-located at runtime. The compiled `.o` + // contributes a `__attribute__((constructor))` that calls + // `js_register_embedded_module` once per bundled file. 
+ let tmp_dir = std::env::temp_dir().join(format!("perry-embed-{}", std::process::id())); + let _ = fs::create_dir_all(&tmp_dir); + match generate_embedded_js_object(&ctx, &tmp_dir) { + Ok(obj) => { + if matches!(format, OutputFormat::Text) { + println!("Embedded JS bundle: {}", obj.display()); + } + obj_paths.push(obj); + } + Err(e) => { + // Don't hard-fail — the on-disk `__perry_js_bundle.js` + // still exists and the runtime falls back to filesystem + // reads. Surface a warning so the build is visibly + // degraded rather than silently shipping a binary that + // requires `node_modules/`. + eprintln!( + "warning: failed to embed JS bundle into binary ({}); the resulting binary will still require node_modules/ at runtime", + e + ); + } + } Some(bundle_path) } else { None diff --git a/crates/perry/src/commands/compile/targets.rs b/crates/perry/src/commands/compile/targets.rs index 0c2d854472..9e11cfa4bb 100644 --- a/crates/perry/src/commands/compile/targets.rs +++ b/crates/perry/src/commands/compile/targets.rs @@ -50,6 +50,172 @@ pub(super) fn generate_js_bundle(ctx: &CompilationContext, output_dir: &Path) -> Ok(bundle_path) } +/// Issue #818 follow-up: emit a generated C file whose constructor +/// registers every bundled JS module (and bare-specifier alias) into +/// `perry-jsruntime`'s embedded-module map at startup. Returns the path +/// to the compiled `.o`, ready to be appended to `obj_paths` ahead of +/// the final link. The result is a self-contained binary: the V8 +/// fallback `ModuleLoader` consults the in-memory map before touching +/// disk, so `node_modules/` is no longer required at runtime. +/// +/// Constructor priority `101` lands before any normal user constructors +/// and before `main`'s call to `js_runtime_init`, so by the time the +/// runtime asks for a module the map is fully populated. 
+pub(super) fn generate_embedded_js_object( + ctx: &CompilationContext, + output_dir: &Path, +) -> Result<PathBuf> { + let c_path = output_dir.join("__perry_embedded_js.c"); + let obj_path = output_dir.join("__perry_embedded_js.o"); + + let mut c = String::new(); + c.push_str("// Auto-generated by Perry — embedded JS module map\n"); + c.push_str("// Constructor populates the V8 fallback ModuleLoader's in-memory\n"); + c.push_str("// cache so the resulting binary needs no node_modules/ at runtime.\n"); + c.push_str("#include <stddef.h>\n\n"); + c.push_str("extern void js_register_embedded_module(const char *path, size_t path_len, const char *source, size_t source_len);\n"); + c.push_str("extern void js_register_embedded_alias(const char *specifier, size_t specifier_len, const char *path, size_t path_len);\n\n"); + + // Emit per-module string literals + a length pair. + for (idx, (specifier, module)) in ctx.js_modules.iter().enumerate() { + let path_lit = c_string_literal(specifier); + let src_lit = c_string_literal(&module.source); + c.push_str(&format!( + "static const char PERRY_EMB_PATH_{idx}[] = {path_lit};\n" + )); + c.push_str(&format!( + "static const size_t PERRY_EMB_PATH_LEN_{idx} = {};\n", + specifier.as_bytes().len() + )); + c.push_str(&format!( + "static const char PERRY_EMB_SRC_{idx}[] = {src_lit};\n" + )); + c.push_str(&format!( + "static const size_t PERRY_EMB_SRC_LEN_{idx} = {};\n", + module.source.as_bytes().len() + )); + } + + // Bare-specifier aliases: walk every native + JS module's resolved imports + // and pick the (source_specifier, resolved_path) pairs whose + // `resolved_path` is in `js_modules`. That captures both top-level + // bare imports like `import { Hono } from "hono"` and JS→JS bare + // imports (e.g. one V8-fallback module importing another package). 
+    let mut aliases: std::collections::BTreeMap<String, String> = std::collections::BTreeMap::new();
+    for hir_module in ctx.native_modules.values() {
+        for import in &hir_module.imports {
+            let spec = &import.source;
+            // Skip relative/absolute — only bare specs need an alias.
+            if spec.starts_with("./")
+                || spec.starts_with("../")
+                || spec.starts_with('/')
+                || spec.starts_with("file://")
+                || spec.starts_with("node:")
+            {
+                continue;
+            }
+            if let Some(ref resolved) = import.resolved_path {
+                if ctx.js_modules.contains_key(resolved) {
+                    aliases.insert(spec.clone(), resolved.clone());
+                }
+            }
+        }
+    }
+    // JS-bundled modules sometimes have a `package.json` "name" we could
+    // also register, but the import-edge alias map already covers the
+    // shapes the V8 loader sees (TS-side bare imports). For relative
+    // imports between JS files the path-keyed map handles it directly.
+
+    for (idx, (specifier, path)) in aliases.iter().enumerate() {
+        let spec_lit = c_string_literal(specifier);
+        let path_lit = c_string_literal(path);
+        c.push_str(&format!(
+            "static const char PERRY_ALIAS_SPEC_{idx}[] = {spec_lit};\n"
+        ));
+        c.push_str(&format!(
+            "static const size_t PERRY_ALIAS_SPEC_LEN_{idx} = {};\n",
+            specifier.as_bytes().len()
+        ));
+        c.push_str(&format!(
+            "static const char PERRY_ALIAS_PATH_{idx}[] = {path_lit};\n"
+        ));
+        c.push_str(&format!(
+            "static const size_t PERRY_ALIAS_PATH_LEN_{idx} = {};\n",
+            path.as_bytes().len()
+        ));
+    }
+
+    // Constructor: register every entry before the runtime spins up.
+    // Priority 101 is the lowest user-defined value (0-100 are reserved
+    // for the implementation); this runs before any normal C++ static
+    // initializer or user constructor.
+    c.push_str("__attribute__((constructor(101)))\n");
+    c.push_str("static void perry_register_embedded_js(void) {\n");
+    for idx in 0..ctx.js_modules.len() {
+        c.push_str(&format!(
+            "    js_register_embedded_module(PERRY_EMB_PATH_{idx}, PERRY_EMB_PATH_LEN_{idx}, PERRY_EMB_SRC_{idx}, PERRY_EMB_SRC_LEN_{idx});\n"
+        ));
+    }
+    for idx in 0..aliases.len() {
+        c.push_str(&format!(
+            "    js_register_embedded_alias(PERRY_ALIAS_SPEC_{idx}, PERRY_ALIAS_SPEC_LEN_{idx}, PERRY_ALIAS_PATH_{idx}, PERRY_ALIAS_PATH_LEN_{idx});\n"
+        ));
+    }
+    c.push_str("}\n");
+
+    fs::write(&c_path, &c)?;
+
+    let status = Command::new("cc")
+        .arg("-c")
+        .arg(&c_path)
+        .arg("-O0")
+        .arg("-o")
+        .arg(&obj_path)
+        .status()
+        .map_err(|e| anyhow!("Failed to invoke cc for embedded JS bundle: {}", e))?;
+    if !status.success() {
+        return Err(anyhow!(
+            "cc failed to compile embedded JS bundle ({})",
+            c_path.display()
+        ));
+    }
+    Ok(obj_path)
+}
+
+/// Render a Rust string as a C string literal, escaping all non-ASCII
+/// and special bytes byte-by-byte. We deliberately use octal escapes for
+/// every byte ≥ 0x80 (and quotes/backslash/control chars) instead of
+/// embedding raw UTF-8 — keeps the resulting source ASCII-clean and
+/// avoids surprising the host C compiler with a multi-byte character set
+/// it doesn't recognize. Length is reported as the byte length to match
+/// the `size_t` parameter on the runtime side.
+fn c_string_literal(s: &str) -> String {
+    let mut out = String::with_capacity(s.len() + 2);
+    out.push('"');
+    for &b in s.as_bytes() {
+        match b {
+            b'"' => out.push_str("\\\""),
+            b'\\' => out.push_str("\\\\"),
+            b'\n' => out.push_str("\\n"),
+            b'\r' => out.push_str("\\r"),
+            b'\t' => out.push_str("\\t"),
+            // `?` only needs escaping inside trigraph sequences; modern
+            // toolchains either disable trigraphs (clang default) or accept
+            // the literal. Escape just to be safe — embedded `??=`, `??/`
+            // etc. inside JS source could otherwise change meaning under
+            // `-trigraphs`.
+            b'?' => out.push_str("\\?"),
+            0x20..=0x7E => out.push(b as char),
+            _ => {
+                use std::fmt::Write;
+                let _ = write!(out, "\\{:03o}", b);
+            }
+        }
+    }
+    out.push('"');
+    out
+}
+
 /// Compile for iOS widget target: emit SwiftUI source for WidgetKit extension.
 /// Auto-invokes `swiftc` to produce a built `WidgetExtension.appex/` directory
 /// unless `--skip-swift-build` is passed.
diff --git a/test-files/test_v8_self_contained.ts b/test-files/test_v8_self_contained.ts
new file mode 100644
index 0000000000..db61969a19
--- /dev/null
+++ b/test-files/test_v8_self_contained.ts
@@ -0,0 +1,20 @@
+// Issue #818 follow-up: verify that a V8-fallback binary embeds the
+// imported JS module sources at compile time and can run without the
+// project's node_modules/ at runtime.
+//
+// Build, then move the resulting executable to a directory that does NOT
+// contain node_modules/. All three prints should still succeed.
+//
+// Expected output:
+//   object
+//   function
+//   self-contained ok
+
+import { Hono } from 'hono';
+
+const app = new Hono();
+app.get('/', (c) => c.text('Hi'));
+
+console.log(typeof app);
+console.log(typeof app.get);
+console.log('self-contained ok');
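
Reviewer note: to make the escaping rules of `c_string_literal` easier to check by hand, here is a minimal standalone sketch. The function body is copied from the patch above; the `main` function and its sample inputs are illustrative assumptions and are not part of the change.

```rust
use std::fmt::Write;

/// Copy of `c_string_literal` from the patch: renders `s` as an
/// ASCII-only C string literal, one input byte at a time.
fn c_string_literal(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for &b in s.as_bytes() {
        match b {
            b'"' => out.push_str("\\\""),
            b'\\' => out.push_str("\\\\"),
            b'\n' => out.push_str("\\n"),
            b'\r' => out.push_str("\\r"),
            b'\t' => out.push_str("\\t"),
            // Escaped so that `??=`-style trigraphs can never form.
            b'?' => out.push_str("\\?"),
            // Remaining printable ASCII passes through unchanged.
            0x20..=0x7E => out.push(b as char),
            // Everything else (control bytes, UTF-8 bytes >= 0x80)
            // becomes a fixed-width three-digit octal escape.
            _ => {
                let _ = write!(out, "\\{:03o}", b);
            }
        }
    }
    out.push('"');
    out
}

fn main() {
    // Hypothetical sample inputs chosen to hit each escaping branch.
    for sample in ["say \"hi\"", "a??=b", "caf\u{e9}\n"] {
        println!("{} bytes -> {}", sample.len(), c_string_literal(sample));
    }
}
```

The last sample is six bytes long because `é` occupies two bytes in UTF-8 (`0xC3 0xA9`), and it renders as `"caf\303\251\n"`. The registered length is the original byte length, not the length of the escaped literal, which is why the generator emits `as_bytes().len()` alongside each literal.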