Summary
@huggingface/transformers@4.x can't be used inside a Bun --compile single binary on Linux (and arguably on any platform) without patching the package, because of three independent issues:
- onnxruntime-node is statically imported at src/backends/onnx.js:23 and dist/transformers.node.mjs:7545. Bun's --compile extracts the native .node addon at runtime, but it doesn't bundle the accompanying shared library (libonnxruntime.so.1 on Linux, libonnxruntime.1.24.3.dylib on macOS) that the addon dlopens. The binary crashes on first import:
  Error [ERR_DLOPEN_FAILED]: libonnxruntime.so.1: cannot open shared object file: No such file or directory
- sharp is statically imported at src/utils/image.js:1 and dist/transformers.node.mjs:17733. The root cause is the same: sharp's native binding can't be bundled into a single binary, so even text-only pipelines (e.g. feature-extraction) crash at module load:
  Error: Could not load the "sharp" module using the darwin-arm64 runtime
- Even after routing to onnxruntime-web, getCoreModelFile/getModelDataFiles return file paths instead of buffers when apis.IS_NODE_ENV is true (dist/transformers.node.mjs:22330,22338). The bundled ort.webgpu.bundle.min.mjs has had its node:fs read path eliminated by tree-shaking (it sits behind an if (false) dead-code branch), so it tries to fetch() the bare path string and fails with TypeError [ERR_INVALID_URL]: fetch() URL is invalid.
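To illustrate the third issue: a bare filesystem path is not an absolute URL, and outside a browser there is no page origin to resolve it against, so anything that needs a URL rejects it. The sketch below uses only the standard URL constructor; the helper name and both example locations are hypothetical.

```typescript
// Returns true only if `input` parses as an absolute URL on its own,
// i.e. without a base URL to resolve it against.
function isAbsoluteUrl(input: string): boolean {
  try {
    new URL(input);
    return true;
  } catch {
    return false;
  }
}

// Hypothetical model locations, for illustration only.
const barePath = "/tmp/models/model.onnx";
const remoteUrl = "https://example.com/models/model.onnx";

console.log(isAbsoluteUrl(barePath));
console.log(isAbsoluteUrl(remoteUrl));
```

This is why handing the web runtime a buffer, rather than a path string, sidesteps the failure.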
Repro
Minimal reproduction in any Bun project:
// src/cli.ts
import { pipeline } from "@huggingface/transformers";
const extractor = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");
console.log((await extractor("hello world", { pooling: "mean", normalize: true })).data.length);
bun build --compile --minify ./src/cli.ts --outfile dist/app
./dist/app
# ERR_DLOPEN_FAILED on Linux, or sharp error on macOS
Real-world project hitting this: evantahler/mcpx. The patches we ended up shipping are at patches/@huggingface%2Ftransformers@4.2.0.patch.
What we patched (summary)
| Change | Reason |
| --- | --- |
| Replace import * as ONNX_NODE from "onnxruntime-node" with var ONNX_NODE = void 0 | Skip native binding load |
| In IS_NODE_ENV branch: set ONNX = ort_webgpu_bundle_min_exports, supportedDevices = ["wasm"], defaultDevices = ["wasm"] | Use the bundled web ort instead of native |
| Replace import sharp from "sharp" with a stub function that throws lazily | Skip native binding load; image processing fails only if actually invoked |
| In getCoreModelFile/getModelDataFiles: pass return_path = false instead of apis.IS_NODE_ENV | Return buffer so the bundled ort-web doesn't try to fetch() a bare path |
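For the sharp row, a minimal sketch of what "a stub function that throws lazily" can look like. This is not the exact text of our patch; the factory name and the error message are made up for illustration.

```typescript
// Hypothetical stand-in for the `sharp` default export. Creating it touches
// no native binding, so loading the module that defines it cannot fail.
type ThrowingStub = (...args: unknown[]) => never;

function makeLazyThrowingStub(name: string): ThrowingStub {
  return (..._args: unknown[]): never => {
    // The error surfaces only when image processing is actually attempted.
    throw new Error(`"${name}" is unavailable in this build; image processing is disabled`);
  };
}

const sharp = makeLazyThrowingStub("sharp");
```

Text-only pipelines never call the stub, so they run normally; an image pipeline fails at the call site with a readable message instead of at module load.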
Suggested upstream fixes
- Make onnxruntime-node an optional / dynamic import gated behind apis.IS_NODE_ENV && !apis.IS_BUN_COMPILE_ENV (or simply wrapped in a try/catch). This is the single biggest unblocker: once the static import is gone, end users can override the backend via globalThis[Symbol.for("onnxruntime")] (which is already supported and documented) without patching.
- Make sharp a dynamic import in src/utils/image.js, evaluated only when loadImageFunction is actually called. Text-only pipelines never need it.
- Ship a built target that doesn't statically import either native package. Today the exports field has node → transformers.node.mjs and default → transformers.web.js. A third entry, such as "bun-bundle": "./dist/transformers.bundle.mjs", or a runtime feature-detect would let Bun-compiled binaries opt into the WASM-only path without monkey-patching.
- Fix the if (false) dead code in the bundled ort-web so the node:fs read path stays alive. That would let users feed file paths through transformers in node-like environments without our return_path = false workaround. (This may be an upstream onnxruntime-web bundling issue rather than a transformers.js one.)
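The first two suggestions come down to the same pattern: load the native package at runtime and treat failure as "not available". A rough sketch, assuming hypothetical helper names (tryImport, selectBackend) that do not exist in transformers.js; whether a given bundler leaves such an import unresolved until runtime is also an assumption that would need checking per bundler.

```typescript
// Attempts to load a module at runtime. Resolves to undefined instead of
// throwing when the module (or its native binding) cannot be loaded.
async function tryImport<T = unknown>(specifier: string): Promise<T | undefined> {
  try {
    return (await import(specifier)) as T;
  } catch {
    return undefined;
  }
}

// Hypothetical selection logic: prefer the native backend, fall back to WASM.
async function selectBackend(): Promise<"native" | "wasm"> {
  const native = await tryImport("onnxruntime-node");
  return native === undefined ? "wasm" : "native";
}
```

With this shape, a missing libonnxruntime shared library degrades to the WASM path instead of crashing the whole module graph at load time.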
Why the existing escape hatch isn't enough
The globalThis[Symbol.for("onnxruntime")] override (lines 11551-11553 of dist/transformers.node.mjs) is a great hook, but it is only consulted after the top-level import * as ONNX_NODE from "onnxruntime-node" has already executed and crashed. Static imports are evaluated before any code in the module body runs, so the override never gets a chance to fire. The same ordering problem applies to the static sharp import.
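For contrast, here is a simplified model of the ordering that would make the hook usable: check the override first, and only attempt the default load when none is registered. This is not the library's actual code; resolveRuntime and the runtime object are hypothetical, and only the Symbol.for("onnxruntime") key comes from the behaviour described above.

```typescript
const ORT_KEY = Symbol.for("onnxruntime");
const globals = globalThis as unknown as Record<symbol, unknown>;

// Returns the registered override if present; otherwise runs the default
// loader. The loader is never called when an override exists.
function resolveRuntime<T>(loadDefault: () => T): T {
  const override = globals[ORT_KEY];
  return override !== undefined ? (override as T) : loadDefault();
}

// Usage: register a (hypothetical) replacement runtime before resolution.
globals[ORT_KEY] = { name: "custom-wasm-runtime" };

const runtime = resolveRuntime<{ name: string }>(() => {
  // Stands in for the native load that fails inside a compiled binary.
  throw new Error("native runtime unavailable");
});
console.log(runtime.name);
```

Because the failing loader is deferred behind the override check, the crash is avoided entirely. A top-level static import cannot be deferred this way.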
🤖 Generated with Claude Code