tangle-network · drewstone · Jun 16, 2026
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -62,7 +62,7 @@ Types that stay in THIS repo because they're runtime-shaped (coupled to a runnin
 - `run-loop.ts` — `runLoop`, the round-synchronous leaf kernel. Per round: `driver.plan()`→N tasks→one sandbox/iteration (bounded by `maxConcurrency`, round-robin `agentRuns`)→`streamPrompt`→`output.parse`→`validator.validate`→`driver.decide`. Owns iteration accounting, concurrency, abort, cost+token aggregation, trace emission, box teardown. Exports `defaultSelectWinner` (best-valid-score, ties→earliest) — the single-sourced selection the personify combinators reuse.
 - `supervise/` — the recursive execution atom (keystone): `Scope` + `Supervisor` over the open `Executor` port, spawn/settle on a **conserved budget pool** so equal-compute holds by construction; journal→replay/resume. `runtime.ts` also holds `createExecutor({backend})` — the ONE built-in executor (backend-as-data: `router`/`router-tools`/`bridge`/`cli`/`sandbox`; `router-tools` is the off-box tool-using agentic loop — chat→tool_calls→`executeToolCall`→repeat — over the router's tool-calling, no sandbox); the per-backend bodies are internal case-arms, BYO agents implement `Executor` directly.
 - `personify/` — the content-free generic combinators (`fanout`/`loopUntil`/`widen`/`panel`/`verify`/`pipeline`) + `definePersona`/`runPersonified` + the cross-run `Corpus` + `createScopeAnalyst` (the analyst-on-scope steer firewall).
-- the **agent-driver** is the canonical "drive an agent" path: an `AgentProfile` driving another `AgentProfile` via the coordination toolbox (`createCoordinationTools`, `src/mcp/tools/coordination.ts`) over the `Scope`/`Supervisor`, plus `runAgentic`/`defineStrategy`/`runPersonified` (`strategy.ts`/`personify/persona.ts`) on the Supervisor. Child→parent messages ride ONE typed pipe — `createEventBus` (`supervise/event-bus.ts`): settled outputs, `ask_parent` questions, and analyst findings are all `CoordinationEvent` kinds, delivered pass-through (`subscribe`/`onEvent`, immediate) AND queued for the driver to pull (`await_event`, kind-filterable; `await_next` is the settled-only view). The pull queue is **priority-ordered** — a blocking question (urgency→priority: `blocks-run`=20/`blocks-step`=10) is bumped ahead of queued settles/findings; ties FIFO by `seq`. Observability is first-class: every event is stamped (`seq`/`at`/`priority`), the full `history()` is an audit/replay trail, `stats()` counts throughput (both surfaced on `CoordinationTools` and the MCP handle). `analyzeOnSettle` auto-fires trace analysts when a worker settles `done`, re-entering each result as a `finding` on the same bus (cost-governed opt-in; the firewall stays in the analyst registry). The in-process queue and a future cross-box durable mailbox share this one interface. `assertTraceDerivedFindings` (`personify/analyst.ts`) is the steer-firewall (selector≠judge). `types.ts` holds `Driver`/`AgentRunSpec`/`OutputAdapter`/`Validator`/`Iteration`/`LoopResult`/`SandboxClient` + the `LoopTraceEvent` union. `sandbox-run.ts` is `openSandboxRun` — the one run/stream/resume sandbox seam; `inline-sandbox-client.ts` is `inlineSandboxClient` — the one adapter presenting any non-box `Executor` as a `SandboxClient` for `runLoop`. `loop-dispatch.ts` adapts `runLoop`→agent-eval campaigns; `report-usage.ts` forwards token usage so the integrity guard sees a real backend.
+- the **agent-driver** is the canonical "drive an agent" path: an `AgentProfile` driving another `AgentProfile` via the coordination toolbox (`createCoordinationTools`, `src/mcp/tools/coordination.ts`) over the `Scope`/`Supervisor`, plus `runAgentic`/`defineStrategy`/`runPersonified` (`strategy.ts`/`personify/persona.ts`) on the Supervisor. Child→parent messages ride ONE typed pipe — `createEventBus` (`supervise/event-bus.ts`): settled outputs, `ask_parent` questions, and analyst findings are all `CoordinationEvent` kinds, delivered pass-through (`subscribe`/`onEvent`, immediate) AND queued for the driver to pull (`await_event({kinds?})` — the ONE wait verb; `kinds:['settled']` = next finished worker, omit = also questions/findings). The pull queue is **priority-ordered** — a blocking question (urgency→priority: `blocks-run`=20/`blocks-step`=10) is bumped ahead of queued settles/findings; ties FIFO by `seq`. The bus is **bidirectional**: UP (settled/question/finding) is queued+pullable; DOWN (`steer_worker` for any live worker — instruction/correction/continuation; `answer_question` routes an answer down) goes to the child inbox via `scope.send`→`deliver` AND records a `queue:false` event (history + subscribers, never pulled back). The receive end is `createInbox` (`supervise/inbox.ts`), which the owned tool-loop executor (`routerToolsInlineExecutor`) exposes as `Executor.deliver`: QUEUED messages flush at each step boundary AND before the worker may settle (it can't finish with an unread steer); a FORCEFUL `steer_worker({interrupt:true})` aborts the in-flight turn so the worker re-plans immediately. Black-box CLI harnesses can't be interrupted mid-step, so there the down-leg degrades to the next spawn. Observability is first-class: every event both ways is stamped (`seq`/`at`/`priority`), the full `history()` is an audit/replay trail, `stats()` counts throughput (both surfaced on `CoordinationTools` and the MCP handle). `analyzeOnSettle` auto-fires trace analysts when a worker settles `done`, re-entering each result as a `finding` on the same bus (cost-governed opt-in; the firewall stays in the analyst registry). Trace analysis is **substrate-agnostic** via `TraceSource` (`supervise/trace-source.ts`) — a worker's tool calls as agent-eval `ToolSpan`s from EITHER an owned loop (`createPushTraceSource`; `routerToolsInlineExecutor`'s `onToolStep` feeds `record`) OR a sandbox/fleet box (`sandboxSessionTraceSource(box, sessionId)` decodes `box.messages()` session parts; `decodeToolPart` is defensive across OpenAI + harness shapes). Two consumers ride a source: ONLINE `watchTrace` (`detector-monitor.ts`) folds live spans through agent-eval's published streaming kernel (`repeatedActionDetector`/`errorStreakDetector`, the SAME kernel `control-runtime` folds) → `onSignal` → a `finding`; SETTLE `analyzeTrace` (`trajectory-recorder.ts`) collects the spans and runs the published BATCH analyzers (`buildTrajectory`/`stuckLoopView`/`toolWasteView`). `ToolSpan` is the common currency; detection logic + the failure taxonomy live in agent-eval — never reimplement here. Production target = sandbox/fleet; the owned-loop push path is for local/router/cli-bridge. The in-process queue and a future cross-box durable mailbox share this one interface. `assertTraceDerivedFindings` (`personify/analyst.ts`) is the steer-firewall (selector≠judge). `types.ts` holds `Driver`/`AgentRunSpec`/`OutputAdapter`/`Validator`/`Iteration`/`LoopResult`/`SandboxClient` + the `LoopTraceEvent` union. `sandbox-run.ts` is `openSandboxRun` — the one run/stream/resume sandbox seam; `inline-sandbox-client.ts` is `inlineSandboxClient` — the one adapter presenting any non-box `Executor` as a `SandboxClient` for `runLoop`. `loop-dispatch.ts` adapts `runLoop`→agent-eval campaigns; `report-usage.ts` forwards token usage so the integrity guard sees a real backend.
 
 Two substrates coexist for the same "recursive agent decision" atom: the round-synchronous `runLoop` kernel (the leaf, what most sandbox benches drive today) and the reactive `Scope`/`Supervisor`+combinators (the canonical core — the agent-driver, `runAgentic`/`defineStrategy`/`runPersonified`). Prefer the latter for new recursive/keystone work. Both run over the one `Executor` port.
 
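Reviewer note on the CLAUDE.md hunk above: the pull semantics it describes (kind filter, blocking questions bumped ahead, ties FIFO by `seq`) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `supervise/event-bus.ts`: `PullQueue`, `publish`, `pull`, and `BusEvent` are hypothetical names, `pull` is a non-blocking stand-in for `await_event({kinds?})`, and priority 0 for non-question events is assumed.

```typescript
// Hypothetical sketch: none of these names are the real event-bus API.
type Kind = "settled" | "question" | "finding";
type Urgency = "blocks-run" | "blocks-step";

interface BusEvent {
  seq: number;      // monotonically increasing stamp, used for FIFO tie-breaks
  kind: Kind;
  priority: number; // 20 or 10 for blocking questions, 0 otherwise (assumed default)
  payload: unknown;
}

// Urgency-to-priority mapping as stated in the CLAUDE.md text.
const URGENCY_PRIORITY: Record<Urgency, number> = { "blocks-run": 20, "blocks-step": 10 };

class PullQueue {
  private seq = 0;
  private queue: BusEvent[] = []; // kept in arrival (seq) order

  publish(kind: Kind, payload: unknown, urgency?: Urgency): BusEvent {
    const priority = kind === "question" && urgency ? URGENCY_PRIORITY[urgency] : 0;
    const event: BusEvent = { seq: this.seq++, kind, priority, payload };
    this.queue.push(event);
    return event;
  }

  // Returns the highest-priority queued event matching `kinds` (all kinds if omitted).
  // The strict `>` keeps the earliest event on equal priority, so ties are FIFO by seq.
  pull(kinds?: Kind[]): BusEvent | undefined {
    let best: BusEvent | undefined;
    for (const e of this.queue) {
      if (kinds && !kinds.includes(e.kind)) continue;
      if (!best || e.priority > best.priority) best = e;
    }
    if (best) this.queue.splice(this.queue.indexOf(best), 1);
    return best;
  }
}

const q = new PullQueue();
q.publish("settled", { worker: "w1" });                          // seq 0
q.publish("finding", { note: "repeated action" });               // seq 1
q.publish("question", { text: "which branch?" }, "blocks-step"); // seq 2
q.publish("question", { text: "credentials?" }, "blocks-run");   // seq 3

const first = q.pull();                // the blocks-run question jumps the queue
const nextSettled = q.pull(["settled"]); // kind filter: only finished workers
```

With no filter, the remaining events come out as the `blocks-step` question and then the finding. A driver that only wants finished workers passes `["settled"]` and leaves questions queued.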

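Reviewer note on the down-leg described in CLAUDE.md: the inbox contract (queued steers flush at each step boundary and again before settle, and a forceful steer aborts the in-flight turn) might look roughly like this. It is a synchronous sketch under assumptions, not the `createInbox` implementation: `Inbox`, `beginTurn`, `flush`, `runWorker`, and the `Steer` shape are hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch: Inbox and runWorker are illustrative, not the repo's createInbox API.
interface Steer {
  text: string;
  interrupt?: boolean; // true = forceful: abort the turn that is currently running
}

class Inbox {
  private pending: Steer[] = [];
  private turn: AbortController | undefined;

  // Called by the worker loop at the start of each turn.
  beginTurn(): AbortSignal {
    this.turn = new AbortController();
    return this.turn.signal;
  }

  // The DOWN leg: queue the message; a forceful steer also aborts the in-flight turn.
  deliver(msg: Steer): void {
    this.pending.push(msg);
    if (msg.interrupt) this.turn?.abort();
  }

  // Hand over everything unread, in arrival order, and clear the queue.
  flush(): Steer[] {
    return this.pending.splice(0);
  }
}

// Worker loop: read the inbox at every step boundary, then once more before settling.
function runWorker(inbox: Inbox, steps: Array<(signal: AbortSignal) => void>): string[] {
  const log: string[] = [];
  for (const step of steps) {
    for (const m of inbox.flush()) log.push(`read: ${m.text}`);
    const signal = inbox.beginTurn();
    step(signal);
    log.push(signal.aborted ? "turn aborted, re-planning" : "step done");
  }
  // Final drain so the worker cannot settle with an unread steer.
  for (const m of inbox.flush()) log.push(`read before settle: ${m.text}`);
  log.push("settled");
  return log;
}

const inbox = new Inbox();
inbox.deliver({ text: "use the v2 API" }); // queued before the worker starts
const log = runWorker(inbox, [
  () => inbox.deliver({ text: "stop, wrong file", interrupt: true }), // arrives mid-turn
  () => {},
  () => inbox.deliver({ text: "also add tests" }), // arrives during the last step
]);
```

A black-box CLI harness has no equivalent of `beginTurn`, which is why the diff notes that the down-leg degrades to the next spawn there.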
diff --git a/bench/src/atom-humaneval.mts b/bench/src/atom-humaneval.mts
@@ -143,7 +143,7 @@ function humanEvalWorker(task: HumanEvalTask, label: string): Agent<unknown, unk
   }
 }
 
-const driverSystem = `You are an orchestrator driving worker agents to solve a Python coding task. You do NOT write code yourself. Each worker independently attempts the task and is graded by a deterministic, hidden test suite. Tools: spawn_worker (dispatch one attempt; the "profile" argument may be {} and "task" a short note), await_next (collect the next settled worker — its result tells you valid:true if its tests PASSED, valid:false if they failed), and stopping (reply with NO tool call) once a worker has DELIVERED. Spawn one worker, await it; if it delivered, stop; if not, spawn another, up to ${K} workers total. You cannot declare success yourself — only a delivered (valid:true) worker counts.`
+const driverSystem = `You are an orchestrator driving worker agents to solve a Python coding task. You do NOT write code yourself. Each worker independently attempts the task and is graded by a deterministic, hidden test suite. Tools: spawn_worker (dispatch one attempt; the "profile" argument may be {} and "task" a short note), await_event (collect the next settled worker — its result tells you valid:true if its tests PASSED, valid:false if they failed), and stopping (reply with NO tool call) once a worker has DELIVERED. Spawn one worker, await it; if it delivered, stop; if not, spawn another, up to ${K} workers total. You cannot declare success yourself — only a delivered (valid:true) worker counts.`
 
 interface TaskOutcome {
   taskId: string

diff --git a/bench/src/atom-mcp-e2e.mts b/bench/src/atom-mcp-e2e.mts
@@ -175,7 +175,7 @@ async function main(): Promise<void> {
           messages: [
             {
               role: 'user',
-              content: `${TASK}\n\nYou are a SUPERVISOR. You have the "supervise" skill and a "coordination" MCP with tools spawn_worker, await_next, stop. Do NOT write code yourself. Author a worker profile (a JSON object with name + a rich systemPrompt telling the worker exactly what to implement) and call spawn_worker with it, then await_next, and stop once a worker delivered (valid:true).`,
+              content: `${TASK}\n\nYou are a SUPERVISOR. You have the "supervise" skill and a "coordination" MCP with tools spawn_worker, await_event, stop. Do NOT write code yourself. Author a worker profile (a JSON object with name + a rich systemPrompt telling the worker exactly what to implement) and call spawn_worker with it, then await_event, and stop once a worker delivered (valid:true).`,
             },
           ],
           cwd: supCwd,

diff --git a/bench/src/mcp-mount-probe.mts b/bench/src/mcp-mount-probe.mts
@@ -3,7 +3,7 @@
  * actually MOUNT my coordination MCP and CALL spawn_worker — landing on a real Scope.spawn?
  *
  * Serves the coordination MCP over a live Scope, then asks the bridge's opencode (with that MCP in
- * its config) to call spawn_worker + await_next. If the Scope spawned+settled, the in-box driving
+ * its config) to call spawn_worker + await_event. If the Scope spawned+settled, the in-box driving
  * path is real. No mock.
  *
  *   ROUTER_BASE=http://127.0.0.1:3355/v1 TANGLE_API_KEY=<bridge-bearer> \
@@ -86,9 +86,9 @@ async function main(): Promise<void> {
             {
               role: 'user',
               content:
-                'You have an MCP server named "coordination" with tools: spawn_worker, await_next, stop. ' +
-                'Call spawn_worker with arguments {"profile":{},"task":"hello"}. Then call await_next. ' +
-                'Then reply with exactly what await_next returned.',
+                'You have an MCP server named "coordination" with tools: spawn_worker, await_event, stop. ' +
+                'Call spawn_worker with arguments {"profile":{},"task":"hello"}. Then call await_event. ' +
+                'Then reply with exactly what await_event returned.',
             },
           ],
           mcp.url,

diff --git a/bench/src/profiles.ts b/bench/src/profiles.ts
@@ -21,7 +21,7 @@ export const OPERATOR_TOOLS = [
   'run_analyst', // run an analyst over a worker's trace → findings (selector≠judge: trace, not score)
   'observe_worker', // a worker's in-flight trace, or its last finished episode/shot
   'spawn_worker', // start a worker (or a sub-analyst) — drive many; parallelize when independent
-  'steer_worker', // send a running/parked worker its next instruction / an interrupt
+  'steer_worker', // send a live worker a message: instruction, course-correction, or continuation (pass interrupt:true to abort its in-flight turn)
   'stop', // declare the task complete (verified) or abandon a line
 ] as const
 
@@ -95,7 +95,7 @@ export const driverProfile: RoleProfile = {
     '  analysts are cheap; make them when a worker’s failure mode needs a focused lens.',
     '- observe_worker(worker): the worker’s IN-FLIGHT trace if it is still running, else its last',
     '  finished episode/shot.',
-    '- spawn_worker(profile, task) / steer_worker(worker, instruction) / stop.',
+    '- spawn_worker(profile, task) / steer_worker(worker, instruction, interrupt?) / stop.',
     '- the artifact’s own tools (read/edit/run) — use them to inspect the workspace and to contribute',
     '  decisive work yourself.',
     '',

diff --git a/docs/architecture-visual.md b/docs/architecture-visual.md
@@ -107,7 +107,7 @@ that keeps it honest.
                         ▼
         Scope: spawn child agent(s) → run → settle → verdict on the artifact
                         │
-                        └──▶ await_next → terminal? → winner = argmax(valid score)
+                        └──▶ await_event → terminal? → winner = argmax(valid score)
 ```
 
 The firewall is the load-bearing line: the **analyst reads the trace and may not cite the score**, so

diff --git a/docs/execution-model.md b/docs/execution-model.md
@@ -49,11 +49,11 @@ Before, each bench hand-rolled its own pseudo-box client. Now there is **one exe
                     │  each round it decides the TOPOLOGY MOVE ─────┐ this IS
                     │   refine │ fanout │ select │ stop          │ │ "topology grown
                     │  then drives workers via the toolbox:      │ │  by LLM decision"
-                    │   spawn_worker · await_next · steer_worker │ │ (driver.ts:52)
+                    │  spawn_worker · await_event · steer_worker │ │ (driver.ts:52)
                     └───────────────┬────────────────────────────┘ │
        spawn_worker(profile,task) ──┤  reserves budget (fails       │
        steer_worker(id,msg) ────────┤  CLOSED if the pool is dry)   │
-       await_next ──────────────────┘                               │
+       await_event ─────────────────┘                               │
                     ┌───────────────┼───────────────┐               │
                     ▼               ▼                ▼               │
              ┌───────────┐   ┌───────────┐   ┌───────────┐          │
@@ -113,7 +113,7 @@ Before, each bench hand-rolled its own pseudo-box client. Now there is **one exe
         └─ 4. settle  ──►  pool.reconcile(ticket, actualSpend)
                                             │
                                             ▼
-                              await_next wakes the driver with this child's result
+                              await_event wakes the driver with this child's result
 ```
 
 **Net:** the "unified thing" is the `Executor` port. Everything that runs work — a router call, a cli-bridge turn, a `claude -p` subprocess, a full sandbox rollout, or a BYO agent — is an `Executor`, chosen by data via `createExecutor`, metered by one budget pool. Drivers and workers are both `act`s over that port; the only structural difference is the driver carries the operator toolbox (so it can spawn/steer) and the worker does not.
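The reserve/settle accounting in the diagrams above (`spawn_worker` reserves budget and fails closed when the pool is dry; settle calls `pool.reconcile(ticket, actualSpend)`) can be illustrated with a small sketch. This is an assumption-laden illustration, not the repo's pool: `BudgetPool`, `Ticket`, `reserve`, and `remaining` are hypothetical names, and clamping overspend to the reservation is a simplification made here.

```typescript
// Hypothetical sketch: BudgetPool and Ticket are illustrative, not the repo's implementation.
interface Ticket {
  id: number;
  reserved: number;
}

class BudgetPool {
  private available: number;
  private nextId = 0;
  private open = new Map<number, number>(); // ticket id -> reserved amount

  constructor(total: number) {
    this.available = total;
  }

  // Fails CLOSED: if the pool cannot cover the request, nothing is reserved.
  reserve(amount: number): Ticket {
    if (amount <= 0 || amount > this.available) throw new Error("budget pool is dry");
    this.available -= amount;
    const id = this.nextId++;
    this.open.set(id, amount);
    return { id, reserved: amount };
  }

  // On settle, charge the actual spend and return the unspent remainder to the pool.
  // Overspend is clamped to the reservation in this sketch.
  reconcile(ticket: Ticket, actualSpend: number): void {
    const reserved = this.open.get(ticket.id);
    if (reserved === undefined) throw new Error("unknown or already reconciled ticket");
    this.open.delete(ticket.id);
    this.available += reserved - Math.min(actualSpend, reserved);
  }

  remaining(): number {
    return this.available;
  }
}

const pool = new BudgetPool(100);
const t1 = pool.reserve(60);  // 40 left unreserved
let refused = false;
try {
  pool.reserve(60);           // exceeds what is left, so the spawn is refused
} catch {
  refused = true;
}
pool.reconcile(t1, 45);       // 15 unspent units flow back to the pool
```

Because every reservation comes out of the same pool and unspent budget returns on settle, the total across driver and workers never exceeds the initial allocation, which is the conservation property the text relies on.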