Add Cloudflare Workers sandbox runtime for isolated code execution #3
Adds a second code execution runtime (cloudflare-worker-loader) alongside the existing local-bun (node:vm) runtime. Agent-generated code runs in Cloudflare Workers V8 isolates with full network blocking (globalOutbound: null), preventing sandbox escape and data exfiltration.
Architecture:
- Host Worker (executor/packages/sandbox-host) receives code via POST /v1/runs
- Spawns a dynamic V8 isolate per task using the Worker Loader API
- User code runs in a separate ES module from the harness, preventing IIFE escape and Response.json hijacking
- Tool calls route through a ToolBridge RPC entrypoint back to Convex
- Console output is buffered and streamed back in real time
Security hardening:
- User code in separate module (user-code.js): cannot access req/env/ctx
- Response.json captured in globals.js before user module evaluation
- Timing-safe auth token comparison
- All network blocked from isolate; communication only via TOOL_BRIDGE binding
Also extracts transpileForRuntime() into a shared module so both runtimes transpile TypeScript before execution.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fbe5316299
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
    const timeout = setTimeout(
      () => controller.abort(),
      config.requestTimeoutMs,
Align host-request abort timeout with task timeout
runCodeWithCloudflareWorkerLoader always aborts the HTTP call after config.requestTimeoutMs (defaulted to 90,000ms), even though the task timeout passed to the sandbox is request.timeoutMs; this means any Cloudflare run configured above 90s (including the 300,000ms default task timeout in executor/convex/executor.ts) will be cut off early and reported as timed out before the sandbox’s own timeout elapses. This can systematically fail longer tasks and produce incorrect timeout behavior for the new runtime.
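One possible way to address this, sketched under the assumption that the HTTP deadline should simply outlast the task timeout by a small margin. The helper names `resolveAbortTimeoutMs` and `createAbortController` and the 5,000 ms grace value are hypothetical and do not come from the PR.

```typescript
// Assumed allowance for network and isolate startup overhead (not from the PR).
const GRACE_MS = 5_000;

// Hypothetical helper: pick an HTTP abort deadline that never undercuts the task timeout,
// so the sandbox (not the HTTP client) is the one that reports a timeout.
function resolveAbortTimeoutMs(requestTimeoutMs: number, taskTimeoutMs: number): number {
  return Math.max(requestTimeoutMs, taskTimeoutMs + GRACE_MS);
}

// Hypothetical wiring: abort the host request only after the computed deadline.
function createAbortController(requestTimeoutMs: number, taskTimeoutMs: number) {
  const controller = new AbortController();
  const timeout = setTimeout(
    () => controller.abort(),
    resolveAbortTimeoutMs(requestTimeoutMs, taskTimeoutMs),
  );
  // Call cancel() once the response arrives to clear the pending timer.
  return { controller, cancel: () => clearTimeout(timeout) };
}
```

With the defaults quoted in the review (90,000 ms request timeout, 300,000 ms task timeout), the abort would fire only after the sandbox's own timeout has had a chance to elapse.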
    return Boolean(
      process.env.CLOUDFLARE_SANDBOX_RUN_URL
      && process.env.CLOUDFLARE_SANDBOX_AUTH_TOKEN,
    );
Validate full Cloudflare runtime config before start
isCloudflareWorkerLoaderConfigured() only checks CLOUDFLARE_SANDBOX_RUN_URL and CLOUDFLARE_SANDBOX_AUTH_TOKEN, but getCloudflareWorkerLoaderConfig() also requires callback env vars (CONVEX_SITE_URL/CONVEX_URL and EXECUTOR_INTERNAL_TOKEN); with partial config, runTask passes the preflight gate, marks the task as running, and then fails immediately when config is read. This creates misleading task.running transitions for runs that were never actually executable.
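A minimal sketch of a stricter preflight check, assuming the four groups of variables named in the comment are the complete set the config reader needs. Taking the environment as a parameter (the `SandboxEnv` interface) is an assumption made here for testability; the PR's version reads `process.env` directly.

```typescript
// Hypothetical shape listing only the variables named in the review comment.
interface SandboxEnv {
  CLOUDFLARE_SANDBOX_RUN_URL?: string;
  CLOUDFLARE_SANDBOX_AUTH_TOKEN?: string;
  CONVEX_SITE_URL?: string;
  CONVEX_URL?: string;
  EXECUTOR_INTERNAL_TOKEN?: string;
}

// Returns true only if every variable that the config reader requires is present,
// so a task is never marked as running with a partial configuration.
function isCloudflareWorkerLoaderConfigured(env: SandboxEnv): boolean {
  // Either callback URL variable satisfies the callback requirement.
  const hasCallbackUrl = Boolean(env.CONVEX_SITE_URL || env.CONVEX_URL);
  return Boolean(
    env.CLOUDFLARE_SANDBOX_RUN_URL
      && env.CLOUDFLARE_SANDBOX_AUTH_TOKEN
      && env.EXECUTOR_INTERNAL_TOKEN
      && hasCallbackUrl,
  );
}
```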
Move HARNESS_CODE and GLOBALS_MODULE from inline template literals in index.ts into separate *.isolate.js files under src/isolate/. Wrangler's rules config treats the *.isolate.js glob as Text modules, so they're bundled as string constants at build time while being authored as proper JS files with syntax highlighting and linting.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a8f2a00fd9
      return Response.json({ error: "Unauthorized" }, { status: 401 });
    }
    const token = authHeader.slice("Bearer ".length);
    if (!timingSafeEqual(token, env.AUTH_TOKEN)) {
Reject requests when AUTH_TOKEN is missing
The auth gate accepts requests even when the worker secret is unset, because env.AUTH_TOKEN is passed directly to timingSafeEqual without a presence check. In JS, new TextEncoder().encode(undefined) returns an empty byte array, so an Authorization: Bearer header carrying an empty token can compare equal and execute /v1/runs. This turns a missing-secret misconfiguration into unauthenticated code execution instead of a hard failure, so the handler should fail closed when AUTH_TOKEN is empty or undefined.
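A sketch of a fail-closed gate along the lines the comment suggests. The `isAuthorized` helper is hypothetical, and the `timingSafeEqual` below is a plain-JS stand-in written for this example, since the PR's own implementation is not shown in the conversation.

```typescript
// Stand-in constant-time comparison over UTF-8 bytes (assumed, not the PR's code).
// It always walks the longer input so timing does not depend on where bytes differ.
function timingSafeEqual(a: string, b: string): boolean {
  const encoder = new TextEncoder();
  const left = encoder.encode(a);
  const right = encoder.encode(b);
  let diff = left.length ^ right.length;
  const length = Math.max(left.length, right.length);
  for (let i = 0; i < length; i++) {
    diff |= (left[i] ?? 0) ^ (right[i] ?? 0);
  }
  return diff === 0;
}

// Hypothetical gate: true only when a secret is configured AND the bearer token matches.
function isAuthorized(authHeader: string | null, expectedToken: string | undefined): boolean {
  // Fail closed: a missing or empty secret never authorizes anything.
  if (!expectedToken) return false;
  if (!authHeader || !authHeader.startsWith("Bearer ")) return false;
  const token = authHeader.slice("Bearer ".length);
  return timingSafeEqual(token, expectedToken);
}
```

The presence check runs before any comparison, so an unset AUTH_TOKEN results in a rejection regardless of what the client sends.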
    const requestTimeoutMs = Number(
      process.env.CLOUDFLARE_SANDBOX_REQUEST_TIMEOUT_MS ?? "90000",
    );
Validate parsed sandbox request timeout
CLOUDFLARE_SANDBOX_REQUEST_TIMEOUT_MS is parsed with Number(...) and returned without validation, so values like "90s" or other non-numeric env input become NaN; downstream timeout scheduling treats that as immediate, causing Cloudflare runs to abort instantly as timed out. This should reject or sanitize non-finite/non-positive values before exposing the config.
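A minimal sketch of the validation described above. The function name `parseRequestTimeoutMs` is hypothetical, and treating an empty string like an unset variable is an assumption made here; rejecting it would be an equally valid choice.

```typescript
// Default taken from the snippet under review.
const DEFAULT_REQUEST_TIMEOUT_MS = 90_000;

// Hypothetical parser: rejects non-finite and non-positive values
// instead of letting NaN reach setTimeout.
function parseRequestTimeoutMs(raw: string | undefined): number {
  // Assumption: unset or blank input falls back to the default.
  if (raw === undefined || raw.trim() === "") return DEFAULT_REQUEST_TIMEOUT_MS;
  const parsed = Number(raw);
  if (!Number.isFinite(parsed) || parsed <= 0) {
    throw new Error(
      `CLOUDFLARE_SANDBOX_REQUEST_TIMEOUT_MS must be a positive number of milliseconds, got "${raw}"`,
    );
  }
  return parsed;
}
```

Throwing at config-read time surfaces a bad value such as "90s" as a configuration error rather than as a run that appears to time out instantly.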
…catch
Replace nested try/catch patterns with Result types from better-result:
- transpileForRuntime() returns Result<string, TranspileError> (sync)
- runtime_core uses Result.tryPromise() for VM execution
- CF worker loader uses Result.tryPromise() for fetch + response parsing
- typechecker uses Result.try() for TS module loading and semantic checks
- Add TranspileError as a TaggedError for typed error discrimination
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f3db484484
    /** Forward a tool call to the Convex internal HTTP API. */
    async callTool(toolPath: string, input: unknown): Promise<ToolCallResult> {
      const { callbackBaseUrl, callbackAuthToken, taskId } = this.props;
      const url = `${callbackBaseUrl}/internal/runs/${taskId}/tool-call`;
Gate external tool callbacks on running task state
This bridge forwards every isolate tool call to /internal/runs/{taskId}/tool-call using the static internal token, but the callback handler (handleExternalToolCall in executor/convex/executorNode.ts) only checks that the task exists and does not require status === "running". If the Convex-side request ends early (e.g. host-request timeout or transport failure) and the task is marked terminal, the still-running isolate can continue invoking tools and mutating state after users already see timed_out/failed; callback processing should be rejected for non-running tasks (or use revocable per-run callback credentials).
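The status gate suggested here could look roughly like the following sketch. The `TaskRecord` shape, the `authorizeCallback` helper, the HTTP status codes, and the `queued` and `completed` status names are all assumptions for illustration; only `running`, `timed_out`, and `failed` appear in the review text.

```typescript
// Assumed status set; "queued" and "completed" are hypothetical names.
type TaskStatus = "queued" | "running" | "completed" | "failed" | "timed_out";

// Hypothetical minimal task record as seen by the callback handler.
interface TaskRecord {
  id: string;
  status: TaskStatus;
}

type CallbackDecision =
  | { ok: true }
  | { ok: false; status: 404 | 409; error: string };

// Accept isolate callbacks only while the task is running, so an isolate that
// outlives its task cannot keep invoking tools after a terminal state.
function authorizeCallback(task: TaskRecord | null): CallbackDecision {
  if (!task) {
    return { ok: false, status: 404, error: "Task not found" };
  }
  if (task.status !== "running") {
    return {
      ok: false,
      status: 409,
      error: `Task is ${task.status}; callbacks are only accepted while running`,
    };
  }
  return { ok: true };
}
```

The same check would apply to any other callback route that mutates task state; revocable per-run credentials, the alternative mentioned in the comment, are not covered by this sketch.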
Replace try/catch patterns in the CF host worker with Result types:
- ToolBridge.callTool uses Result.tryPromise for fetch + response handling
- ToolBridge.emitOutput uses Result.tryPromise (best-effort, discard err)
- Request body parsing uses Result.tryPromise instead of try/catch
- Isolate execution uses nested Result.tryPromise, eliminates try/finally
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e81257d97a
      throw new TranspileError({ message: `TypeScript transpile error: ${message}` });
    }

    return result.outputText || code;
Preserve empty transpilation output from TypeScript
transpileForRuntime currently does return result.outputText || code, which treats an empty transpilation result as failure and sends the original TypeScript back to execution. Valid TS snippets that erase completely (for example type/interface-only code or import type) produce empty outputText with no diagnostics, so this path executes raw TS and fails with syntax errors instead of behaving as a no-op.
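To illustrate the difference, here is a small sketch that decides on diagnostics rather than on the truthiness of the output. The `TranspileOutput` shape and the `selectRuntimeSource` helper are hypothetical simplifications, not the PR's code or the TypeScript compiler's types.

```typescript
// Hypothetical, simplified view of a transpile result: emitted JS plus error messages.
interface TranspileOutput {
  outputText: string;
  diagnosticMessages: string[];
}

// Failure is signalled by diagnostics only. An empty outputText is a valid,
// empty program (e.g. type-only input) and is returned as-is.
function selectRuntimeSource(result: TranspileOutput): string {
  if (result.diagnosticMessages.length > 0) {
    throw new Error(`TypeScript transpile error: ${result.diagnosticMessages.join("; ")}`);
  }
  // Unlike `result.outputText || code`, this never falls back to the raw TypeScript.
  return result.outputText;
}
```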
    /** Stream a console output line back to Convex (best-effort). */
    async emitOutput(stream: "stdout" | "stderr", line: string): Promise<void> {
      const { callbackBaseUrl, callbackAuthToken, taskId } = this.props;
      const url = `${callbackBaseUrl}/internal/runs/${taskId}/output`;
Reject output callbacks after task leaves running state
This bridge forwards output with a reusable internal token to /internal/runs/{taskId}/output, but the Convex handler only checks that the run exists and does not require status === "running" before appending events. If an isolate outlives the task lifecycle (for example after host-request abort/timeout), it can keep writing stdout/stderr after completion, polluting logs and creating unbounded post-terminal event writes.
Start wrangler dev for the sandbox-host package alongside other services in dev.ts. Runs on port 8787 (configurable via SANDBOX_PORT env var). Also cleans up stale processes on that port.
- Introduce new database schema for tool calls, including status tracking and approval handling.
- Add mutations for creating, updating, and retrieving tool calls, with support for pending approvals.
- Enhance the executor to manage tool call lifecycle, including status updates and error handling.
- Update internal API to support new tool call functionalities, ensuring proper event publishing for task states.
- Refactor existing code to integrate new tool call logic, improving overall execution flow and error management.
Add Cloudflare Workers sandbox runtime for isolated code execution
Summary
Adds a second code execution runtime (cloudflare-worker-loader) alongside the existing local-bun (node:vm) runtime. Agent-generated code runs in Cloudflare Workers V8 isolates with full network blocking, preventing sandbox escape and data exfiltration.
- @executor/sandbox-host (executor/packages/sandbox-host/): Cloudflare Worker that spawns dynamic V8 isolates via the Worker Loader API
- runtime_catalog.ts, cloudflare_worker_loader_runtime.ts, transpile.ts: Convex-side dispatch, config, and shared TS transpilation
- executorNode.ts now routes tasks to either local-bun or cloudflare-worker-loader based on runtimeId
Architecture
Security
- globalOutbound: null on every isolate
- User code runs in a separate module (user-code.js) and cannot access req, env, ctx, or the harness's fetch handler
- Response.json is captured in globals.js (evaluated before user code) so user code can't hijack result reporting
Tests
- wrangler dev and production deployment
Deployment
Worker is deployed at https://executor-sandbox-host.rhys-669.workers.dev with AUTH_TOKEN secret set. To connect to Convex, set:
- CLOUDFLARE_SANDBOX_RUN_URL=https://executor-sandbox-host.rhys-669.workers.dev/v1/runs
- CLOUDFLARE_SANDBOX_AUTH_TOKEN=<the token>