chore: update Tangle agent packages #2
Merged
tangletools pushed a commit that referenced this pull request on May 17, 2026
Sweep removes commentary that describes what code used to do, what bug it replaces, or which audit found a pattern, per the CLAUDE.md doc discipline. Trims "(NEW in 0.7.0)" markers and legal-agent migration narrative from README and example docs. Deletes two orphan docs under docs/ that no source or doc references; both were point-in-time release/issue snapshots that no longer describe current state.

- src/runtime-run.ts: drop "replaces legal-agent's bespoke..." paragraph from module doc; tighten complete() and randomSuffix comments.
- src/trace-bridge.ts: drop "Before this module, consumers hand-rolled..." paragraph; reword tool_call args-omission and text_delta drop comments to describe current behaviour.
- src/sanitize.ts: drop "the unified-union alternative was rejected because..." narrative on createRuntimeStreamEventCollector.
- src/chat-turn.ts: drop "Caller pattern (replaces ~400 lines of legal/gtm/creative chat-runtime wrappers)" and tax-agent file:line reference; reword transport / fallback comments.
- src/profile-conformance.ts: strip "from the canonical audit" / "#2 anti-pattern in the canonical audit" from issue messages and reword system-prompt-too-short message; trim docstring.
- src/profile-conformance.test.ts: rename "the gtm-agent anti-pattern audit-found is caught" -> describes current behaviour; same for the describe block + shell-cap test.
- src/index.ts: drop "(compat surface)" and "(new in 0.7.0)" section banners.
- README.md: drop "(NEW in 0.7.0)" markers from quickstart table and section headers; drop legal-agent migration narrative.
- examples/runtime-run: same treatment in README + .ts header.
- docs/domain-agent-runtime-integration-issues.md: deleted (165 lines of issue drafts referencing "GitHub connector returns 404"; zero references in tree).
- docs/product-runtime-kernel.md: deleted (326-line completion record for 0.5.0-0.5.2 release process; zero references in tree).
- package.json: drop "docs" from files (directory is gone).

Verification: pnpm typecheck, pnpm test (68 passing, unchanged), pnpm build all pass.
drewstone added a commit that referenced this pull request on May 17, 2026
tangletools pushed a commit that referenced this pull request on Jun 4, 2026
…-loud session continuity

Resolve all six findings from the review (none blocked landing; #1 gated enabling, #3/#4 wanted documenting). Lineage remains default-OFF and byte-identical to the fresh-box path when both flags are unset.

- #1 sessionContinuity silent no-op: `continue` now asserts the session is still known to the sandbox via `box.session(id).status()` before streaming. A `null` (platform never honored the client-minted id, or it was reaped) raises a ValidationError, which executeIteration now propagates as a hard structural failure instead of degrading to a soft empty iteration, so a non-honoring platform errors loudly rather than running contextless turns.
- #2 unbounded fork creation: `fork` provisions child boxes through `mapWithConcurrency` bounded by the loop's `maxConcurrency`, not a single `Promise.all` over all N branches.
- #3 fork ignores per-branch specs: documented on `fork` and `LoopLineageOptions.forkFanout` that a real CRIU fork inherits the parent image/profile (per-branch specs apply only on the degraded fresh path).
- #4 lineage holds every box to loop end: kernel prunes boxes no future round can descend from after each round, gated on a kernel-inferred (monotonic) branch point (skipped when the driver authors its own `parentIndex`). The unprunable case is documented as the box ceiling.
- #5 abort during fork: documented the SDK's signal-less fork; abort is now checked per branch (between bounded waves) + an abort-under-lineage test.
- #6 export order: alphabetized the loops barrel.

Adds `mapWithConcurrency` util and six lineage tests (session-liveness pass/fail, bounded-fork peak, mid-loop prune, no-prune-under-authored-parent, abort-under-lineage). 627 tests pass, typecheck + biome clean.
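Finding #2 above replaces a single `Promise.all` with a bounded helper. A minimal sketch of that idea follows, assuming a worker-pool design; the commit names `mapWithConcurrency` but does not show its signature or body, so the parameter order and the implementation here are hypothetical (the real helper may run in waves instead).

```typescript
// Hypothetical sketch: the signature is assumed, not copied from the repository.
export async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item

  // Each worker claims the next index until none remain. Claiming is
  // synchronous, so two workers can never take the same item.
  const worker = async (): Promise<void> => {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i] as T, i);
    }
  };

  // Start at most `limit` workers; results keep the input order.
  const workers = Array.from({ length: Math.min(limit, items.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

With `limit` set to the loop's `maxConcurrency`, at most that many child boxes would be provisioning at once, whatever the branch count N.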
drewstone added a commit that referenced this pull request on Jun 4, 2026
…r runLoop (backend-blind) (#150)

* feat(loops): opt-in session continuation + checkpoint-fork lineage (backend-blind)

Two @experimental, default-OFF seams on runLoop so a loop can CONTINUE a sandbox session across iterations (same box + sessionId, no prompt-text replay) and FORK fanout branches from a parent checkpoint (shared context prefix), both behind a capability probe so the kernel asks 'can I fork?' (client.criuStatus) and never names Docker/Firecracker, degrading to fresh boxes when CRIU is absent.

- sandbox-capabilities.ts: memoized, fail-closed criuStatus probe -> {canFork}.
- sandbox-lineage.ts: createSandboxLineage owns box+session handles with start/continue/fork/teardown; reuses the kernel's acquireSandbox / buildBackendOptions / deleteBoxSafe; fail-loud if the probe says canFork but the box has no fork().
- run-loop.ts: RunLoopOptions.lineage (sessionContinuity / forkFanout); refine continues, fanout forks-once, else fresh-through-lineage. Default OFF is byte-identical to today, so random@k stays N independent fresh boxes (the compute-control invariant). Rejects lineage + onWorkerBox (both own boxes).
- 7 new unit tests (continuation reuses session; fork when canFork; fresh fallback; default-off invariant). Full suite 621 pass, typecheck clean.

* fix(loops): address PR #150 review - bound forks, prune lineage, fail-loud session continuity

Resolve all six findings from the review (none blocked landing; #1 gated enabling, #3/#4 wanted documenting). Lineage remains default-OFF and byte-identical to the fresh-box path when both flags are unset.

- #1 sessionContinuity silent no-op: `continue` now asserts the session is still known to the sandbox via `box.session(id).status()` before streaming.
A `null` (platform never honored the client-minted id, or it was reaped) raises a ValidationError, which executeIteration now propagates as a hard structural failure instead of degrading to a soft empty iteration, so a non-honoring platform errors loudly rather than running contextless turns.
- #2 unbounded fork creation: `fork` provisions child boxes through `mapWithConcurrency` bounded by the loop's `maxConcurrency`, not a single `Promise.all` over all N branches.
- #3 fork ignores per-branch specs: documented on `fork` and `LoopLineageOptions.forkFanout` that a real CRIU fork inherits the parent image/profile (per-branch specs apply only on the degraded fresh path).
- #4 lineage holds every box to loop end: kernel prunes boxes no future round can descend from after each round, gated on a kernel-inferred (monotonic) branch point (skipped when the driver authors its own `parentIndex`). The unprunable case is documented as the box ceiling.
- #5 abort during fork: documented the SDK's signal-less fork; abort is now checked per branch (between bounded waves) + an abort-under-lineage test.
- #6 export order: alphabetized the loops barrel.

Adds `mapWithConcurrency` util and six lineage tests (session-liveness pass/fail, bounded-fork peak, mid-loop prune, no-prune-under-authored-parent, abort-under-lineage). 627 tests pass, typecheck + biome clean.
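The "memoized, fail-closed criuStatus probe -> {canFork}" described in this squash commit can be illustrated roughly as below. Only the names `criuStatus` and `canFork` come from the commit message; the `ProbeClient` interface, the `available` field, and `probeCapabilities` are hypothetical stand-ins, since the real client shape is not shown.

```typescript
// Hypothetical shapes: only `criuStatus` and `{ canFork }` appear in the
// commit message; everything else is assumed for illustration.
interface ProbeClient {
  criuStatus(): Promise<{ available: boolean }>;
}

interface SandboxCapabilities {
  canFork: boolean;
}

// One cached probe per client object, so repeated loop iterations do not
// re-query the platform.
const cache = new WeakMap<ProbeClient, Promise<SandboxCapabilities>>();

export function probeCapabilities(client: ProbeClient): Promise<SandboxCapabilities> {
  let pending = cache.get(client);
  if (!pending) {
    pending = Promise.resolve()
      .then(() => client.criuStatus())
      .then((status) => ({ canFork: status.available === true }))
      // Fail closed: any probe error is treated as "cannot fork", so the
      // caller degrades to fresh boxes instead of throwing.
      .catch(() => ({ canFork: false }));
    cache.set(client, pending);
  }
  return pending;
}
```

The caller only ever asks whether forking is possible; it never learns which backend answered, which is the "backend-blind" property the commit title refers to.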
drewstone added a commit that referenced this pull request on Jun 17, 2026
…es agent-eval)

createTrajectoryRecorder (supervise/trajectory-recorder.ts): the post-hoc half of the analyst pipe. Replays a worker's captured tool steps as agent-eval spans (InMemoryTraceStore) and runs its PUBLISHED batch analyzers: buildTrajectory (structured run summary), stuckLoopView (full-run repeated-call view, complementing the online consecutive detector), toolWasteView. No analysis reimplemented; the thin bridge from live tool steps to the substrate trace model. Feeds from the same onToolStep seam as the online monitor.

3 recorder tests (real spans → real agent-eval findings); full suite 1017 pass; typecheck/build/lint clean. Closes both legs: online (mid-run) + settle (post-hoc).
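The message above contrasts an online consecutive detector with a full-run repeated-call view. The sketch below shows why the two complement each other. It is not agent-eval's implementation: the `ToolStep` shape and both function names are hypothetical, and the real analyzers operate on spans rather than this simplified record.

```typescript
// Hypothetical step shape; agent-eval's real span and finding types are not shown here.
interface ToolStep {
  tool: string;
  argHash: string; // stable hash of the call arguments
}

const stepKey = (s: ToolStep): string => `${s.tool}:${s.argHash}`;

// Online-style view: longest run of identical back-to-back calls.
export function longestConsecutiveRun(steps: readonly ToolStep[]): number {
  let best = 0;
  let run = 0;
  let prev: string | undefined;
  for (const step of steps) {
    const k = stepKey(step);
    run = k === prev ? run + 1 : 1;
    best = Math.max(best, run);
    prev = k;
  }
  return best;
}

// Post-hoc view: calls repeated at least `threshold` times anywhere in the run.
export function repeatedCalls(
  steps: readonly ToolStep[],
  threshold: number,
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const step of steps) {
    const k = stepKey(step);
    counts.set(k, (counts.get(k) ?? 0) + 1);
  }
  const repeated = new Map<string, number>();
  counts.forEach((n, k) => {
    if (n >= threshold) repeated.set(k, n);
  });
  return repeated;
}
```

A worker that alternates between two calls never produces a consecutive run longer than 1, so only the full-run count reveals that it is looping.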
drewstone added a commit that referenced this pull request on Jun 17, 2026
…orker (#318)

* feat(supervise): bidirectional bus - down-leg (steer/answer/resume) + resume_worker

Close the bus to 100% bidirectional. The parent→child down-leg routes to the child inbox (scope.send→deliver) AND records a queue:false event on the same bus: it lands in history() + reaches subscribers for the audit trail, but is never pulled back by the parent. New: resume_worker (continue a parked worker; the protocol had {resume} but no verb); answer_question now routes the answer DOWN to the asking worker, unparking it. EventBus gains PublishOptions.queue for record-only events. down-leg + bidirectional history tests; full suite 1000 pass; typecheck/build/lint clean.

* fix(supervise): answer_question returns delivered; close down-leg review gaps

Address PR #318 review:

- BLOCKING: answer_question computed `delivered` but returned only { question }; it now returns { question, delivered }, consistent with steer_worker/resume_worker (no longer hides whether the answer reached a live worker).
- tests: answer routed down to a LIVE worker (delivered:true happy path); resume_worker delivered:false path; a focused event-bus queue:false unit test (history+subscribers see it, pull queue never does).
- resume_worker added to OPERATOR_TOOLS + the driver system prompt so the driver is actually prompted to use it.

* feat(supervise): functional down-leg - workers drain a steerable inbox

Make the down-leg actually move a live worker (was observable-only). New createInbox (supervise/inbox.ts) is the receive end an executor exposes as Executor.deliver; the owned tool-loop (routerToolsInlineExecutor) drains it two ways:

- QUEUED (default): flush at each step boundary AND before the worker may settle, so it can't finish while a steer/answer it never read is pending.
- FORCEFUL (steer_worker interrupt:true): aborts the in-flight turn so the worker re-plans immediately, breaking it off a wrong path mid-task.
Black-box CLI harnesses can't be interrupted mid-step → down-leg degrades to next spawn. inbox 4 + executor-drains-inbox integration test (flush-before-settle proven end to end through the real executor); full suite 1008 pass; typecheck/build/lint clean.

* chore(supervise): address review nits - accurate resume_worker desc, sendDown covers answer

PR #318 audit follow-ups (non-blocking):

- resume_worker description no longer implies a park/resume lifecycle the scope model lacks: a settled (drained) worker is gone; says so and points to spawning fresh.
- sendDown now covers the 'answer' down-leg too (removes the inline bus.publish duplication; one helper for all three down kinds).
- history() docstring lists the down-leg event kinds.

full suite 1008 pass; typecheck/lint clean.

* refactor(supervise): unify the coordination surface (12→10 tools)

Simplify without losing capability:

- MERGE steer_worker + resume_worker → one steer_worker (any live worker; the only real axis was interrupt forceful-vs-queued, already a param). 'Resume' = a non-interrupt steer. Removes a redundant verb + dissolves the resume-vs-steer prompt nits.
- REMOVE await_next: it was a strict subset of await_event({kinds:['settled']}). One wait-verb now; callers/prompts pass kinds:['settled'] for the next finished worker.
- DROP bus.peek(): speculative, only its own test used it (YAGNI).

Down-leg event union + inbox shed the dead 'resume' kind. Full suite 1007 pass; typecheck/build/lint clean.

* feat(supervise): online detector monitor on the worker pipe (reuses agent-eval kernel)

createDetectorMonitor (supervise/detector-monitor.ts): the online analyst on the live worker pipe. Folds each tool step through agent-eval 0.93.0's published streaming kernel (repeatedActionDetector/errorStreakDetector, the SAME kernel control-runtime folds; no detection logic reimplemented) and fires onSignal → a finding on the bus the moment a worker loops or error-storms.
routerToolsInlineExecutor feeds it via a new onToolStep seam. Bumps agent-eval ^0.93.0. monitor tests (4); full suite 1011 pass; typecheck/build/lint clean.

* fix(supervise): address #318 review + wire raiseFinding (last mile)

Last mile: createCoordinationTools.raiseFinding (exposed on the MCP handle), the seam an ONLINE detector uses to publish a finding on the live bus mid-run. Proven end-to-end: a stuck-loop on the worker pipe → monitor → raiseFinding → await_event surfaces it.

Review fixes (audit on the earlier commit):

- HIGH: AbortSignal.any (needs Node 20.3, floor is 20) → portable mergeAbortSignals.
- forceful interrupt: docstring no longer overpromises (aborts in-flight inference, a tool mid-exec finishes first); interrupted turns no longer count toward maxTurns; added the e2e test (forceful steer aborts the turn, re-plans, aborted turn is free).
- answer to a BLOCKING question is now delivered forcefully (interrupt) to unpark the worker immediately, not at its next boundary.
- sendDown 'answer' now REQUIRES questionId (overload; no silent ?? '' mask).
- tool-step status captured (error vs ok) for the error-streak detector.
- stale await_next purged from bench prompts + docs; history() docstring drops 'resume'.
- added tests: answer delivered:false + return asserted; await_event idle-on-mismatch.

full suite 1014 pass; typecheck/build/lint clean.

* feat(supervise): settle-time trajectory analyzer (last mile #2 - reuses agent-eval)

createTrajectoryRecorder (supervise/trajectory-recorder.ts): the post-hoc half of the analyst pipe. Replays a worker's captured tool steps as agent-eval spans (InMemoryTraceStore) and runs its PUBLISHED batch analyzers: buildTrajectory (structured run summary), stuckLoopView (full-run repeated-call view, complementing the online consecutive detector), toolWasteView. No analysis reimplemented; the thin bridge from live tool steps to the substrate trace model. Feeds from the same onToolStep seam as the online monitor.
3 recorder tests (real spans → real agent-eval findings); full suite 1017 pass; typecheck/build/lint clean. Closes both legs: online (mid-run) + settle (post-hoc).

* fix(supervise): address audit on 27dd2ce (listener leak, crash-safety, comment accuracy)

- mergeAbortSignals listener leak: pre-link external signals ONCE; per-turn add+remove the listener (no accumulation on long-lived signals over maxTurns).
- interrupt catch now requires a real AbortError (DOMException); a network fault coincident with an interrupt is no longer swallowed, it is rethrown.
- corrected the comment: an interrupted+re-planned turn DOES consume a maxTurns slot (bounded backstop, not a hang); it just doesn't bill a turn.
- onToolStep is an observability side-channel: wrapped so a throwing monitor can't crash the worker loop; detector-monitor.observeToolStep also defends argHash on circular/unhashable args.
- projectEvent preserves questionId on the answer branch.
- stale await_next purged from skills/{supervise,loop-writer}; trimmed CLAUDE.md redundancy; softened the recorder's per-span-duration claim.

full suite 1018 pass; typecheck/build/lint clean.
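The record-only `queue:false` events introduced in this squash commit can be pictured with a small bus like the one below. This is a sketch under assumptions: only `PublishOptions.queue`, `history()`, and the described behaviour (history and subscribers see the event, the pull queue never does) come from the commit message; the class body, `BusEvent`, `subscribe`, and `pull` are hypothetical.

```typescript
// Hypothetical minimal bus; the repository's EventBus is not shown in the source.
interface BusEvent {
  kind: string;
  payload?: unknown;
}

interface PublishOptions {
  queue?: boolean; // defaults to true; false means record-only
}

export class EventBus {
  private readonly log: BusEvent[] = [];
  private readonly pending: BusEvent[] = [];
  private readonly subscribers = new Set<(event: BusEvent) => void>();

  publish(event: BusEvent, options: PublishOptions = {}): void {
    this.log.push(event); // every event joins the audit trail
    if (options.queue !== false) this.pending.push(event); // pullable only when queued
    this.subscribers.forEach((fn) => fn(event)); // subscribers see every event
  }

  subscribe(fn: (event: BusEvent) => void): () => void {
    this.subscribers.add(fn);
    return () => {
      this.subscribers.delete(fn);
    };
  }

  // The parent's pull queue: record-only events never appear here.
  pull(): BusEvent | undefined {
    return this.pending.shift();
  }

  history(): readonly BusEvent[] {
    return [...this.log];
  }
}
```

Publishing a down-leg steer with `{ queue: false }` keeps it visible for auditing while ensuring the parent does not receive its own message back when it next pulls.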
Updates published Tangle agent packages:

- @tangle-network/agent-eval to 0.20.12
- @tangle-network/agent-knowledge to 1.2.0 where used

Lockfiles were refreshed where present.