Add Cloudflare Workers sandbox runtime for isolated code execution#3

Merged: RhysSullivan merged 6 commits into main from feat/cloudflare-sandbox-runtime on Feb 12, 2026

@RhysSullivan
Owner

Summary

Adds a second code execution runtime (cloudflare-worker-loader) alongside the existing local-bun (node:vm) runtime. Agent-generated code runs in Cloudflare Workers V8 isolates with full network blocking, preventing sandbox escape and data exfiltration.

  • New package: @executor/sandbox-host (executor/packages/sandbox-host/) — Cloudflare Worker that spawns dynamic V8 isolates via the Worker Loader API
  • New runtime modules: runtime_catalog.ts, cloudflare_worker_loader_runtime.ts, transpile.ts — Convex-side dispatch, config, and shared TS transpilation
  • Runtime dispatch: executorNode.ts now routes tasks to either local-bun or cloudflare-worker-loader based on runtimeId

Architecture

Convex (runTask action)
  │ POST /v1/runs {taskId, code, timeoutMs, callback}
  ▼
CF Host Worker (executor/packages/sandbox-host)
  │ env.LOADER.get(id, () => WorkerCode)
  ▼
Dynamic V8 Isolate (globalOutbound: null)
  │ tools.* → env.TOOL_BRIDGE (RPC) → ToolBridge → POST callback/tool-call
  │ console.* → env.TOOL_BRIDGE.emitOutput() → POST callback/output
  ▼
Result JSON → Convex
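
As a rough illustration of the first hop in the diagram, the `POST /v1/runs` body could be validated on the host Worker along these lines. Only the four top-level field names (`taskId`, `code`, `timeoutMs`, `callback`) come from the diagram; the fields inside `callback` and the `parseRunRequest` helper are assumptions made for this sketch, not the PR's actual schema.

```typescript
// Sketch only: the four top-level field names come from the diagram above.
// The fields inside `callback` are assumptions, not the PR's actual schema.
interface RunRequest {
  taskId: string;
  code: string;
  timeoutMs: number;
  callback: { baseUrl: string; authToken: string };
}

// Returns a typed request, or null when the body does not match the sketch.
function parseRunRequest(body: unknown): RunRequest | null {
  if (typeof body !== "object" || body === null) return null;
  const { taskId, code, timeoutMs, callback } = body as Record<string, unknown>;
  if (typeof taskId !== "string" || typeof code !== "string") return null;
  if (typeof timeoutMs !== "number" || !Number.isFinite(timeoutMs) || timeoutMs <= 0) {
    return null;
  }
  if (typeof callback !== "object" || callback === null) return null;
  const { baseUrl, authToken } = callback as Record<string, unknown>;
  if (typeof baseUrl !== "string" || typeof authToken !== "string") return null;
  return { taskId, code, timeoutMs, callback: { baseUrl, authToken } };
}
```

Rejecting malformed bodies before an isolate is spawned keeps validation failures cheap and separate from sandbox errors.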

Security

  • Network fully blocked (globalOutbound: null on every isolate)
  • Module-level isolation — user code runs in a separate ES module (user-code.js), cannot access req, env, ctx, or the harness's fetch handler
  • Response.json hardened — captured in globals.js (evaluated before user code) so user code can't hijack result reporting
  • Timing-safe auth — constant-time token comparison on the Worker
  • No credential leakage — callback auth token lives in the host Worker's ToolBridge props, inaccessible from the isolate
  • Security tested — auth bypass (7 scenarios), sandbox escape (globals, constructor chains, cloudflare:workers import), network exfiltration (fetch, WebSocket, DNS, metadata), IIFE escape, Response.json hijack, prototype pollution, resource exhaustion

Tests

  • 18 runtime tests pass (including 2 new TypeScript transpilation tests)
  • E2E verified against both wrangler dev and production deployment

Deployment

Worker is deployed at https://executor-sandbox-host.rhys-669.workers.dev with AUTH_TOKEN secret set. To connect to Convex, set:

  • CLOUDFLARE_SANDBOX_RUN_URL=https://executor-sandbox-host.rhys-669.workers.dev/v1/runs
  • CLOUDFLARE_SANDBOX_AUTH_TOKEN=<the token>

Adds a second code execution runtime (cloudflare-worker-loader) alongside the
existing local-bun (node:vm) runtime. Agent-generated code runs in Cloudflare
Workers V8 isolates with full network blocking (globalOutbound: null), preventing
sandbox escape and data exfiltration.

Architecture:
- Host Worker (executor/packages/sandbox-host) receives code via POST /v1/runs
- Spawns a dynamic V8 isolate per task using the Worker Loader API
- User code runs in a separate ES module from the harness, preventing IIFE escape
  and Response.json hijacking
- Tool calls route through a ToolBridge RPC entrypoint back to Convex
- Console output is buffered and streamed back in real-time

Security hardening:
- User code in separate module (user-code.js) — cannot access req/env/ctx
- Response.json captured in globals.js before user module evaluation
- Timing-safe auth token comparison
- All network blocked from isolate; communication only via TOOL_BRIDGE binding

Also extracts transpileForRuntime() into a shared module so both runtimes
transpile TypeScript before execution.
@vercel

vercel Bot commented Feb 11, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
executor-web Error Error Feb 12, 2026 1:06am

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fbe5316299

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +42 to +44
const timeout = setTimeout(
() => controller.abort(),
config.requestTimeoutMs,

P1: Align host-request abort timeout with task timeout

runCodeWithCloudflareWorkerLoader always aborts the HTTP call after config.requestTimeoutMs (defaulted to 90,000ms), even though the task timeout passed to the sandbox is request.timeoutMs; this means any Cloudflare run configured above 90s (including the 300,000ms default task timeout in executor/convex/executor.ts) will be cut off early and reported as timed out before the sandbox’s own timeout elapses. This can systematically fail longer tasks and produce incorrect timeout behavior for the new runtime.
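
One possible way to address this, sketched under assumptions: derive the HTTP abort deadline from the task timeout so the host request never expires before the sandbox's own timeout. The helper name `hostRequestTimeoutMs` and the 5-second `HOST_OVERHEAD_MS` margin are hypothetical and do not appear in the PR.

```typescript
// Assumed margin for isolate startup and response transfer (not from the PR).
const HOST_OVERHEAD_MS = 5_000;

// Hypothetical helper: the abort deadline is never shorter than the task
// timeout plus the margin, and never shorter than the configured floor.
function hostRequestTimeoutMs(taskTimeoutMs: number, configuredMs: number): number {
  return Math.max(configuredMs, taskTimeoutMs + HOST_OVERHEAD_MS);
}
```

With the 300,000ms default task timeout and the 90,000ms configured value mentioned above, this yields 305,000ms, so the sandbox timeout fires first and is reported accurately.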

Comment on lines +35 to +38
return Boolean(
process.env.CLOUDFLARE_SANDBOX_RUN_URL
&& process.env.CLOUDFLARE_SANDBOX_AUTH_TOKEN,
);

P2: Validate full Cloudflare runtime config before start

isCloudflareWorkerLoaderConfigured() only checks CLOUDFLARE_SANDBOX_RUN_URL and CLOUDFLARE_SANDBOX_AUTH_TOKEN, but getCloudflareWorkerLoaderConfig() also requires callback env vars (CONVEX_SITE_URL/CONVEX_URL and EXECUTOR_INTERNAL_TOKEN); with partial config, runTask passes the preflight gate, marks the task as running, and then fails immediately when config is read. This creates misleading task.running transitions for runs that were never actually executable.
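
A minimal sketch of a stricter preflight check, assuming the environment is passed in as a plain record. The helper names `missingCloudflareConfig` and `isFullyConfigured` are hypothetical; the variable names are the ones listed in the comment above.

```typescript
type Env = Record<string, string | undefined>;

// Variables that must each be present (names taken from the review comment).
const REQUIRED_VARS = [
  "CLOUDFLARE_SANDBOX_RUN_URL",
  "CLOUDFLARE_SANDBOX_AUTH_TOKEN",
  "EXECUTOR_INTERNAL_TOKEN",
] as const;

// Hypothetical helper: lists everything the runtime would fail on later.
function missingCloudflareConfig(env: Env): string[] {
  const missing: string[] = REQUIRED_VARS.filter((name) => !env[name]);
  // Either callback base URL variable is acceptable.
  if (!env["CONVEX_SITE_URL"] && !env["CONVEX_URL"]) {
    missing.push("CONVEX_SITE_URL or CONVEX_URL");
  }
  return missing;
}

// Hypothetical preflight gate: true only when nothing is missing.
function isFullyConfigured(env: Env): boolean {
  return missingCloudflareConfig(env).length === 0;
}
```

Sharing one list between the preflight gate and the config reader would keep the two from drifting apart, and the list of missing names can go straight into the error message.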

Move HARNESS_CODE and GLOBALS_MODULE from inline template literals in
index.ts into separate *.isolate.js files under src/isolate/. Wrangler's
rules config treats the *.isolate.js glob as Text modules, so they're
bundled as string constants at build time while being authored as proper
JS files with syntax highlighting and linting.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a8f2a00fd9

return Response.json({ error: "Unauthorized" }, { status: 401 });
}
const token = authHeader.slice("Bearer ".length);
if (!timingSafeEqual(token, env.AUTH_TOKEN)) {

P1: Reject requests when AUTH_TOKEN is missing

The auth gate accepts requests even when the worker secret is unset because env.AUTH_TOKEN is used directly in timingSafeEqual without a presence check; in JS, TextEncoder.encode(undefined) is empty bytes, so Authorization: Bearer (empty token) can compare equal and execute /v1/runs. This turns a missing-secret misconfiguration into unauthenticated code execution instead of a hard failure, so the handler should fail closed when AUTH_TOKEN is empty/undefined.
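
A fail-closed version of the gate might look like the sketch below. The `timingSafeEqual` here is a stand-in written for this example, not the PR's implementation, and `isAuthorized` is a hypothetical helper name.

```typescript
// Stand-in constant-time comparison over UTF-8 bytes (not the PR's version).
// It always walks the longer input, so timing does not reveal the position
// of the first mismatch.
function timingSafeEqual(a: string, b: string): boolean {
  const enc = new TextEncoder();
  const x = enc.encode(a);
  const y = enc.encode(b);
  let diff = x.length ^ y.length;
  const n = Math.max(x.length, y.length);
  for (let i = 0; i < n; i++) {
    diff |= (x[i] ?? 0) ^ (y[i] ?? 0);
  }
  return diff === 0;
}

// Hypothetical gate: a missing or empty secret rejects every request.
function isAuthorized(authHeader: string | null, secret: string | undefined): boolean {
  if (!secret) return false; // fail closed on misconfiguration
  if (!authHeader || !authHeader.startsWith("Bearer ")) return false;
  const token = authHeader.slice("Bearer ".length);
  return token.length > 0 && timingSafeEqual(token, secret);
}
```

The presence check runs before any comparison, so an unset `AUTH_TOKEN` can never compare equal to an empty bearer token.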

Comment on lines +71 to +73
const requestTimeoutMs = Number(
process.env.CLOUDFLARE_SANDBOX_REQUEST_TIMEOUT_MS ?? "90000",
);

P2: Validate parsed sandbox request timeout

CLOUDFLARE_SANDBOX_REQUEST_TIMEOUT_MS is parsed with Number(...) and returned without validation, so values like "90s" or other non-numeric env input become NaN; downstream timeout scheduling treats that as immediate, causing Cloudflare runs to abort instantly as timed out. This should reject or sanitize non-finite/non-positive values before exposing the config.
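
A small sketch of the validation being suggested, assuming the raw env value is passed in as a string. `parseRequestTimeoutMs` is a hypothetical helper; the 90,000ms default mirrors the snippet above.

```typescript
const DEFAULT_REQUEST_TIMEOUT_MS = 90_000;

// Hypothetical parser: falls back to the default when unset, and throws on
// anything that is not a finite, positive number of milliseconds.
function parseRequestTimeoutMs(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_REQUEST_TIMEOUT_MS;
  const parsed = Number(raw);
  if (!Number.isFinite(parsed) || parsed <= 0) {
    throw new Error(
      `CLOUDFLARE_SANDBOX_REQUEST_TIMEOUT_MS must be a positive number, got "${raw}"`,
    );
  }
  return Math.floor(parsed);
}
```

Throwing at config-read time surfaces the typo once, instead of every run aborting immediately with a misleading timeout.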

…catch

Replace nested try/catch patterns with Result types from better-result:
- transpileForRuntime() returns Result<string, TranspileError> (sync)
- runtime_core uses Result.tryPromise() for VM execution
- CF worker loader uses Result.tryPromise() for fetch + response parsing
- typechecker uses Result.try() for TS module loading and semantic checks
- Add TranspileError as a TaggedError for typed error discrimination

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f3db484484

/** Forward a tool call to the Convex internal HTTP API. */
async callTool(toolPath: string, input: unknown): Promise<ToolCallResult> {
const { callbackBaseUrl, callbackAuthToken, taskId } = this.props;
const url = `${callbackBaseUrl}/internal/runs/${taskId}/tool-call`;

P1: Gate external tool callbacks on running task state

This bridge forwards every isolate tool call to /internal/runs/{taskId}/tool-call using the static internal token, but the callback handler (handleExternalToolCall in executor/convex/executorNode.ts) only checks that the task exists and does not require status === "running". If the Convex-side request ends early (e.g. host-request timeout or transport failure) and the task is marked terminal, the still-running isolate can continue invoking tools and mutating state after users already see timed_out/failed; callback processing should be rejected for non-running tasks (or use revocable per-run callback credentials).
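
The state check described here could be factored into a small guard that the callback handler runs before doing any work. This is a sketch under assumptions: the `TaskStatus` values other than `running`, `failed`, and `timed_out` are guesses, and `gateCallback` and the chosen HTTP status codes are hypothetical.

```typescript
// Status names beyond "running", "failed", and "timed_out" are assumptions.
type TaskStatus = "queued" | "running" | "completed" | "failed" | "timed_out";

interface TaskRecord {
  id: string;
  status: TaskStatus;
}

type CallbackGate =
  | { ok: true }
  | { ok: false; httpStatus: number; error: string };

// Hypothetical guard: only a task that is currently running may receive
// tool-call callbacks from its isolate.
function gateCallback(task: TaskRecord | null): CallbackGate {
  if (task === null) {
    return { ok: false, httpStatus: 404, error: "Run not found" };
  }
  if (task.status !== "running") {
    return {
      ok: false,
      httpStatus: 409,
      error: `Run is ${task.status}; callbacks are closed`,
    };
  }
  return { ok: true };
}
```

Per-run revocable credentials, the alternative mentioned in the comment, would close the same window without a status read on every callback.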

Replace try/catch patterns in the CF host worker with Result types:
- ToolBridge.callTool uses Result.tryPromise for fetch + response handling
- ToolBridge.emitOutput uses Result.tryPromise (best-effort, discard err)
- Request body parsing uses Result.tryPromise instead of try/catch
- Isolate execution uses nested Result.tryPromise, eliminates try/finally

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e81257d97a

throw new TranspileError({ message: `TypeScript transpile error: ${message}` });
}

return result.outputText || code;

P2: Preserve empty transpilation output from TypeScript

transpileForRuntime currently does return result.outputText || code, which treats an empty transpilation result as failure and sends the original TypeScript back to execution. Valid TS snippets that erase completely (for example type/interface-only code or import type) produce empty outputText with no diagnostics, so this path executes raw TS and fails with syntax errors instead of behaving as a no-op.
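
The core of the issue is that `||` treats the empty string as falsy. A sketch of the alternative, using a simplified stand-in for the transpiler result (the `TranspileOutput` shape and `selectExecutableCode` name are hypothetical): fail only on diagnostics, and otherwise return the output as-is, even when it is empty.

```typescript
// Simplified stand-in for the transpiler result; not the TypeScript API type.
interface TranspileOutput {
  outputText: string;
  diagnostics: string[];
}

// Hypothetical selector: diagnostics decide failure, never output length.
function selectExecutableCode(out: TranspileOutput): string {
  if (out.diagnostics.length > 0) {
    throw new Error(`TypeScript transpile error: ${out.diagnostics.join("; ")}`);
  }
  // An empty string is valid output for type-only input and runs as a no-op.
  return out.outputText;
}
```

With this shape, a snippet containing only an `interface` declaration executes as an empty program instead of being sent to the runtime as raw TypeScript.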

/** Stream a console output line back to Convex (best-effort). */
async emitOutput(stream: "stdout" | "stderr", line: string): Promise<void> {
const { callbackBaseUrl, callbackAuthToken, taskId } = this.props;
const url = `${callbackBaseUrl}/internal/runs/${taskId}/output`;

P2: Reject output callbacks after task leaves running state

This bridge forwards output with a reusable internal token to /internal/runs/{taskId}/output, but the Convex handler only checks that the run exists and does not require status === "running" before appending events. If an isolate outlives the task lifecycle (for example after host-request abort/timeout), it can keep writing stdout/stderr after completion, polluting logs and creating unbounded post-terminal event writes.

Start wrangler dev for the sandbox-host package alongside other services
in dev.ts. Runs on port 8787 (configurable via SANDBOX_PORT env var).
Also cleans up stale processes on that port.
- Introduce new database schema for tool calls, including status tracking and approval handling.
- Add mutations for creating, updating, and retrieving tool calls, with support for pending approvals.
- Enhance the executor to manage tool call lifecycle, including status updates and error handling.
- Update internal API to support new tool call functionalities, ensuring proper event publishing for task states.
- Refactor existing code to integrate new tool call logic, improving overall execution flow and error management.
@RhysSullivan RhysSullivan merged commit d34b9ae into main Feb 12, 2026
1 of 2 checks passed
RhysSullivan added a commit that referenced this pull request Mar 3, 2026
Add Cloudflare Workers sandbox runtime for isolated code execution
RhysSullivan added a commit that referenced this pull request Apr 5, 2026
Add Cloudflare Workers sandbox runtime for isolated code execution