Skip to content

feat(core): Workflow tool P1 — minimal node:vm sandbox + sequential agent() (#4721)#4732

Merged
LaZzyMan merged 31 commits into
mainfrom
lazzy/lucid-pare-974192
Jun 10, 2026
Merged

feat(core): Workflow tool P1 — minimal node:vm sandbox + sequential agent() (#4721)#4732
LaZzyMan merged 31 commits into
mainfrom
lazzy/lucid-pare-974192

Conversation

@LaZzyMan

@LaZzyMan LaZzyMan commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

What this PR does

Implements P1 of the Dynamic Workflows / Ultracode port — a minimal Workflow tool that runs a model-authored JavaScript script in a node:vm sandbox with args, phase, log, and sequential agent() globals. The tool is registered behind isWorkflowsEnabled() (off by default; opt-in via setting or QWEN_CODE_ENABLE_WORKFLOWS=1).

Architecture is three layers with clean injection seams:

  • WorkflowTool (packages/core/src/tools/workflow/workflow.ts) — BaseDeclarativeTool subclass that owns the tool registration, params validation, and error marshalling
  • WorkflowOrchestrator + createProductionDispatch(config, signal?) (packages/core/src/agents/runtime/workflow-orchestrator.ts) — owns run lifecycle, dispatch construction, and wraps AgentHeadless for the production code path
  • createWorkflowSandbox(opts) (packages/core/src/agents/runtime/workflow-sandbox.ts) — owns the node:vm context, hardened globals, and the determinism stubs (Date.now / Math.random throw)

P2/P3/P5 forward-compatibility seams are already in SandboxOptions (parallel?, pipeline?, budget?) so future phases can wire real implementations without re-opening the sandbox's already-reviewed security surface.

The subagent system prompt is verbatim from the claude-code 2.1.160 binary §XmO constant — five bullets including the "return ONLY the raw JSON" instruction critical for P3 schema mode.

Why it's needed

See #4721 for the full motivation and phased plan. TL;DR: this adds a third tier above the existing /swarm tool (#3433) and Agent Team (#2886, in progress) — model-authored JS that orchestrates many subagents through a small set of primitives, matching upstream's Agent + Workflow coexistence.

P1 is the minimum shippable skeleton:

  • The Workflow tool is registered and gated; the JS sandbox executes scripts
  • Sequential agent() only — parallel / pipeline / workflow / budget are throwing stubs that surface a clear "scheduled for PN" error (not ReferenceError)
  • Date.now() / Math.random() / new Date() throw to guarantee resume determinism
  • Foreground execution (no background queue yet — that's P4)
  • No schema / structured output (P3) / no budget tracking (P5) / no resume (P6) / no UI extension (P4)

Local design artifacts (not committed; they live in .qwen/design/ per repo convention) document the full upstream alignment study:

  • .qwen/design/dynamic-workflow-alignment-claude-code-2.1.160.md (788 lines) — per-axis upstream findings, qwen-code fit matrix, gap analysis
  • .qwen/design/dynamic-workflow-alignment-claude-code-2.1.160-liveprobe-delta.md — API surface confirmation against the real /deep-research workflow script captured from ~/.claude/projects/<session>/
  • .qwen/design/p1-implementation-plan.md — the task-by-task plan this branch executes

Reviewer Test Plan

How to verify

# 1. Run the new P1 tests in isolation (5 files, 80 tests):
cd packages/core && npx vitest run \
  src/agents/runtime/workflow-sandbox.test.ts \
  src/agents/runtime/workflow-orchestrator.test.ts \
  src/tools/workflow/workflow.test.ts \
  src/config/config.workflows.test.ts \
  src/config/config.workflow-registration.test.ts

# Expected: 80 tests pass.

# 2. Full core regression (363 files, 9920 tests):
cd packages/core && npm test

# 3. Type and lint:
cd packages/core && npm run typecheck
cd packages/core && npm run lint

# 4. Confirm the tool is gated by default:
QWEN_CODE_ENABLE_WORKFLOWS=  node -e '
  const { Config, ToolNames } = await import("./packages/core/dist/index.js");
  const cfg = new Config({ /* minimal */ });
  const reg = await cfg.createToolRegistry(undefined, { skipDiscovery: true });
  console.log("workflow registered?", reg.getAllToolNames().includes(ToolNames.WORKFLOW));
' 2>&1
# Expected: "workflow registered? false"
# Then with QWEN_CODE_ENABLE_WORKFLOWS=1 — expected "workflow registered? true"

Notable test categories under workflow-sandbox.test.ts:

  • stripExportMeta — 8 cases covering brace balancing, string-literal escapes
  • createWorkflowSandbox — args verbatim, Date.now/Math.random throw, top-level return capture
  • createWorkflowSandbox security — 18 cases including realm-escape PoCs that have been confirmed-blocked against host process (covers args.constructor.constructor, Math.__proto__, Math.toString.constructor, Math.abs.constructor, Date.constructor, array-args [].constructor, budget.spent.constructor, the 30s vm timeout, log/phases caps, agent({schema/isolation/model/agentType}) throws, and the injection seams for parallel/pipeline/budget)

Evidence (Before & After)

N/A — no user-visible UI change in P1. The tool is registered but the model only emits a Workflow call when explicitly prompted; no Footer//workflows UI lands until P4.

Tested on

OS Status
🍏 macOS
🪟 Windows ⚠️ not tested
🐧 Linux ⚠️ not tested

Environment (optional)

Node v22.21.1, npm 10.x.

Risk & Scope

Main risk or tradeoff: the JS sandbox is node:vm (zero dep), not isolated-vm. node:vm is not a true V8 isolate — six prototype-escape paths were identified and blocked over six review rounds (PoCs in the security test file act as regressions). Two architectural limitations remain documented and not fixed:

  • node:vm does not enforce a memory cap — a hostile script can OOM the host. Operators should configure --max-old-space-size; future phases may move to worker_threads.
  • Synchronous regex catastrophic backtracking can block the event loop up to the 30s vm timeout. The 30s value matches the upstream binary's EW6 constant.

Neither limitation is unique to qwen-code — claude-code 2.1.160 (binary inspected) ships the same trade-off.

Not validated / out of scope:

  • parallel / pipeline / workflow / budget are throwing stubs in P1 (clear "scheduled for PN" errors, not ReferenceError). Real implementations land in P2/P5 — injection seams in SandboxOptions are pre-wired so neither phase needs to modify the sandbox.
  • agent({schema}) throws with a clear "scheduled for P3" message. Until P3 lands, scripts that rely on StructuredOutput (including upstream's /deep-research) will fail at the first such call — by design.
  • No /workflows slash command or progress UI yet. The BackgroundTasksPill does not show workflow runs. All planned for P4.
  • No resume / resumeFromRunId. Planned for P6.

Breaking changes / migration notes: None. The tool is off by default (isWorkflowsEnabled() === false unless setting or env var is set). All changes are additive; no existing tool, slash command, or config field is modified.

Decisions deferred to maintainer sign-off (listed in issue #4721 §"Decisions needed before P1"):

  1. Sandbox choice: node:vm (P1) vs isolated-vm (potential P2+)
  2. workflowKeywordTriggerEnabled default: false (qwen) vs true (upstream)
  3. Whether to add QWEN_CODE_MAX_TOKENS_PER_WORKFLOW in P5
  4. Saved workflows directory: .qwen/workflows/ vs .claude/workflows/
  5. Cross-session resume scope in P6

Resolves the implementation half of #4721 — only P1 lands here; P2–P7 will be separate PRs per the phased plan.

@github-actions

github-actions Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

📋 Review Summary

This PR implements P1 of the Dynamic Workflows / Ultracode port — a minimal Workflow tool that executes model-authored JavaScript in a node:vm sandbox with sequential agent() dispatch. The implementation demonstrates exceptional security hardening with multiple defense-in-depth layers, comprehensive test coverage (80 tests across 5 files), and clean architectural separation between the tool, orchestrator, and sandbox layers. The code quality is outstanding with detailed inline documentation explaining security rationales.

🔍 General Feedback

  • Exceptional security-first design: The sandbox implementation shows deep understanding of vm escape vectors with multiple independent mitigation layers (JSON roundtrip, deepNullProto, hardenInjected, vm-realm primordials)
  • Clean architectural layering: Three clear layers (WorkflowTool → WorkflowOrchestrator → createWorkflowSandbox) with well-defined injection seams for future phases
  • Outstanding test coverage: 80 tests covering security PoCs, functionality, and edge cases — the security tests document the exact attack vectors being blocked
  • Excellent inline documentation: Comments reference specific FIX numbers, security classifications (SEC-C1, UP-C1), and phase schedules (P1/P2/P3/P5)
  • Forward-compatible design: Injection seams for parallel/pipeline/budget are already in place without compromising P1 security surface
  • Verbatim upstream alignment: Subagent system prompt is copied from claude-code 2.1.160 binary with documentation of the source

🎯 Specific Feedback

🔵 Low

  • File: packages/core/src/agents/runtime/workflow-sandbox.ts:~280 — The deepNullProto function has a depth cap of 64 with a clear error message, which is good. However, consider whether 64 is the right default — a deeply nested but legitimate args object could hit this. Consider documenting what typical nesting depth looks like in practice, or making this configurable via SandboxOptions.

  • File: packages/core/src/agents/runtime/workflow-sandbox.ts:~400 — The MAX_LOG_LINES and MAX_PHASE_ENTRIES caps (both 10,000) are reasonable, but consider whether these should be tuned based on expected workflow behavior. A workflow that logs 10,000 lines is already pathological, so this seems safe, but documenting the memory impact (~10KB-1MB depending on line length) would help future reviewers.

  • File: packages/core/src/tools/workflow/workflow.ts:147 — The error handling correctly suppresses stack traces from the LLM, but the known limitation note about scripts that capture e.stack and return it is worth documenting in user-facing docs as well. Consider adding this to the tool's description string or a separate security guidance doc.

  • File: packages/core/src/config/config.ts:3193 — The feature gate logic is clear, but the precedence order (env var kill switch → env var enable → params.workflowsEnabled) could use a comment documenting why this order was chosen. The test covers it, but a future maintainer might wonder about the rationale.

  • File: packages/core/src/agents/runtime/workflow-orchestrator.ts:17 — The WorkflowRunRequest interface has a good comment explaining why signal was removed, but consider whether this pattern (removing unused fields with documentation) should be called out in a contributing guide as a best practice.

✅ Highlights

  • Security hardening is exemplary: The multi-layer approach to blocking vm escape (JSON roundtrip → deepNullProto → hardenInjected → vm-realm primordials) represents best-in-class secure coding. Each layer is independently documented with the specific attack vector it blocks.

  • Test-driven security: The security PoC tests (TST-C1/C2/C3) that verify each escape vector is blocked demonstrate a mature security mindset. Tests like "blocks args.constructor.constructor realm escape" and "blocks Math.abs.constructor host-Function escape" are textbook examples of how to test security boundaries.

  • Clear phase scheduling: Every unsupported feature (parallel, pipeline, budget, schema, workflow nesting) has a throwing stub with a clear message indicating which phase it's scheduled for. This prevents semantic drift from silent no-ops.

  • Excellent error surfacing: Errors from the sandbox surface only the message (never stack traces with host paths) to the LLM, while still preserving full debuggability via stderr/logging. The distinction between user-facing and debug-facing error info is correctly implemented.

  • Comprehensive exclusion list: Adding WORKFLOW to EXCLUDED_TOOLS_FOR_SUBAGENTS prevents unbounded recursive fan-out — a subtle but critical security decision that shows systems-level thinking.

  • Type-safe forward compatibility: The WorkflowAgentOpts interface documents unsupported fields with clear runtime errors rather than silent dropping, and the type is re-exported from the orchestrator so external consumers get the same type definition.

@LaZzyMan

LaZzyMan commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator Author

Cross-check against upstream (@anthropic-ai/claude-code 2.1.160) — drift notes

Re-checked P1 implementation + phase plan against the local reverse-engineering artifacts. Three findings worth flagging:

1. Subagent disallowedTools missing (minor defense-in-depth gap)

Upstream Tg8 config (packages/core/src/agents/runtime/workflow-orchestrator.ts:71):

{ tools: ['*'], disallowedTools: [SendUserMessage, ExitPlanMode] }

P1 dispatch currently passes only { tools: ['*'] }. The §XmO system prompt explicitly says "Do NOT use SendUserMessage" so behavioral compliance comes from prompt discipline alone, not tool gating. Upstream uses both layers (prompt + tool block). Recommend adding disallowedTools: [ToolNames.SEND_USER_MESSAGE, ToolNames.EXIT_PLAN_MODE] to the AgentHeadless.create() call site in createProductionDispatch — a 1-line change. Holding for reviewer steer on whether to fold into this PR or land separately in P2.

2. runId format diverges from upstream

P1 generates wf_<16 hex chars> from randomBytes(8).toString('hex').
Upstream generates wf_<12 chars> from randomUUID().slice(0,12).

No functional impact today, but cross-tool interop (e.g. P6 resume reading a runId emitted by claude-code) would break. Cheap to align if maintainers prefer.

3. Documentation correction — schema retry count (already fixed in issue #4721)

The original issue #4721 said "Schema-mismatch retries: 5 per agent". This conflated two separate counters in the binary:

  • Schema-mismatch nudges: 2 in-conversation nudges (binary line 307789 — the literal error message text)
  • Stall retries: 5 (binary VOK = 5, fires on no-progress agents, not schema validation)

Issue #4721 has been edited to correct both the "Hard caps" section and the P3 phase description ("2-nudge in-conversation retry"). The fix is documentation-only — no code in this PR is affected since P1 throws on agent({schema}) rather than implementing retries.

4. P4 plan refinement — meta extraction (already fixed in issue #4721)

P1 uses stripExportMeta to remove export const meta = {...} because Node's vm script mode rejects ES-module syntax. P4 needs the meta fields (name, description, whenToUse, phases) for the /workflows UI, so P4 must replace stripExportMeta with extractAndStripMeta(source) → { meta, body } and thread meta to the task registry. Issue #4721 P4 description has been updated to call this out explicitly. No P1 code change needed.


The first two items are flagged for reviewer judgment (fold in here vs. defer). The third and fourth are documentation-only corrections to the issue, no PR change required.

Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts Outdated
Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts Outdated
Comment thread packages/core/src/index.ts
Comment thread packages/core/src/tools/workflow/workflow.test.ts
Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts Outdated
Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts Outdated
Comment thread packages/core/src/agents/runtime/workflow-orchestrator.test.ts Outdated

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fresh review pass at HEAD e767902. CI is green and this is not a self-PR, so no verdict downgrade applies.

Three new Critical issues are posted inline, plus three Suggestions. The headline is a host-realm escape on the success path of agent() (the always-on twin of the thrown-Error leak already flagged at workflow-sandbox.ts:293) — I confirmed it with a PoC that read process.env from inside a sandbox script with no error thrown.

For the record, the three earlier Critical inline comments still reproduce at this HEAD and were each re-verified by PoC — they block merge alongside the new findings:

  • sandbox escape via thrown host Error objects (workflow-sandbox.ts:293)
  • deepNullProto severing array prototypes so for...of / .map / spread on array args throw (workflow-sandbox.ts:175)
  • eager value-export of WorkflowTool vs export type siblings (index.ts:119)

Good things confirmed: typecheck + lint + the 80 new tests all pass, the build is green, the node:vm Math/Date in-realm init is genuinely not subvertible, __proto__ args keys do not pollute host Object.prototype, and the anti-recursion guard (WORKFLOW in EXCLUDED_TOOLS_FOR_SUBAGENTS) holds on the tools: ['*'] path.

— claude-opus-4-8 via Claude Code /qreview

Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts Outdated
Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts
Comment thread packages/core/src/agents/runtime/workflow-orchestrator.ts
Comment thread packages/core/src/agents/runtime/workflow-orchestrator.ts Outdated
Comment thread packages/core/src/tools/workflow/workflow.ts Outdated
Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Second-opinion review (qwen3.7-max) — R1 findings by claude-opus-4-6 are acknowledged. This round found 1 new Critical + 5 new Suggestions.

Also noted (Nice to have, not posted as inline):

  • 6 new files missing @license headers (workflow-sandbox.ts, workflow-orchestrator.ts, workflow-prompts.ts, and 3 test files). Only workflow.ts and the two config test files have the standard header.
  • Stale comment in config.workflows.test.ts:10 and config.workflow-registration.test.ts:10: references config-session-env.test.ts which does not exist in the codebase.

— qwen3.7-max via Qwen Code /review

Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts Outdated
Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts Outdated
Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts
Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts
Comment thread packages/core/src/tools/workflow/workflow.ts Outdated
Comment thread packages/core/src/agents/runtime/workflow-orchestrator.ts Outdated
@LaZzyMan

LaZzyMan commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator Author

E2E test report — real model (qwen3.7-plus)

Built locally from lazzy/lucid-pare-974192 HEAD and driven through node packages/cli/dist/index.js -p ... --approval-mode yolo --output-format json with QWEN_CODE_ENABLE_WORKFLOWS=1 set (except scenario 0). Raw outputs captured under /tmp/p1-e2e/scenario-*.json. Model: qwen3.7-plus. Total LLM cost across all 7 scenarios ≈ 200k input tokens.

Scenario 0 — Default-off gate

qwen -p "say hi" --output-format json   # no env var

init.tools did not contain workflow. ✅ Default-off gate works as designed.

Scenario 1 — Gate-on registration

QWEN_CODE_ENABLE_WORKFLOWS=1 qwen -p "say hi" --output-format json
| jq '.[] | select(.type=="system") | .tools | map(select(.=="workflow"))'
# → ["workflow"]

workflow appears in the init.tools list. ✅

Scenario 2 — Real sequential agent() dispatch (end-to-end)

Prompt: ask model to invoke workflow with an inline script that calls phase('plan') + two await agent(...) + returns a structured object.

Model emitted (raw tool_use.input):

{"name":"workflow","input":{"script":"\nphase('plan');\nconst a = await agent('Respond with exactly the literal text HELLO and nothing else', { label: 'h1' });\nconst b = await agent('Respond with exactly the literal text WORLD and nothing else', { label: 'h2' });\nreturn { greeting: a + ' ' + b, runIdShape: 'wf_<16hex>' };\n"}}

Tool result (tool_result.content):

{
  "runId": "wf_ea903721a899b1a9",
  "phases": ["plan"],
  "logs": [],
  "result": { "greeting": "HELLO WORLD", "runIdShape": "wf_<16hex>" }
}

Verified:

  • runId matches ^wf_[0-9a-f]{16}$
  • phases: ["plan"]phase('plan') was honored
  • ✅ Two real AgentHeadless subagents fired (telemetry stats.models.bySource keys: ["h1", "h2", "main"])
  • ✅ Subagents returned HELLO and WORLD as instructed by the §XmO system prompt ("Output the literal result")
  • ✅ Script-side concatenation a + ' ' + b"HELLO WORLD" made it through vm.runInContext
  • llmContent carries the script's raw result (not the wrapper); main-loop model summarized it as "greeting: HELLO WORLD" — confirming the FIX-7 (UP-C2) unwrap behaviour

Stats: 2 main-loop turns, 75 168 input / 256 output tokens, 10 001 ms wall-clock (includes 2 subagent LLM round-trips).

Scenario 3 — Date.now() throws (binary §axO verbatim)

{"is_error": true, "content": "Error: Date.now() / new Date() are unavailable in workflow scripts (breaks resume). Stamp results after the workflow returns, or pass timestamps via args."}

✅ Exact binary verbatim message. is_error: true propagates.

Scenario 4 — Math.random() throws (binary §sxO verbatim)

{"is_error": true, "content": "Error: Math.random() is unavailable in workflow scripts (breaks resume). For N independent samples, include the index in the agent label or prompt."}

✅ Exact binary verbatim message.

Scenario 5 — agent({schema}) throws with P3 scheduling note

{"is_error": true, "content": "Error: agent({schema}) is not supported in P1. Schema enforcement / StructuredOutput contract is scheduled for P3."}

✅ Clear "scheduled for P3" message — NOT ReferenceError. Per Round 1 review decision: throw is safer than silently dropping the opt, even though it means scripts like /deep-research won't run until P3 ships.

Scenario 6 — parallel() throws with P2 scheduling note

{"is_error": true, "content": "Error: parallel() is not supported in P1. Sequential agent() is the only execution mode in P1. Concurrent fan-out is scheduled for P2."}

✅ Clear "scheduled for P2" — NOT ReferenceError. Confirms the FIX-F throwing-stub approach reaches the model correctly.

Scenario 7 — pipeline() throws with P2 scheduling note

{"is_error": true, "content": "Error: pipeline() is not supported in P1. Staggered multi-stage execution is scheduled for P2."}

✅ Same pattern as parallel.


Summary matrix

# Behaviour Result
0 Default-off gate (no env var) ✅ tool absent from init.tools
1 QWEN_CODE_ENABLE_WORKFLOWS=1 gate workflow in init.tools
2 Sequential phase() + 2× agent() + structured return ✅ runId, phases, subagent dispatch, llmContent unwrap all verified end-to-end
3 Date.now() ✅ throws — binary §axO verbatim
4 Math.random() ✅ throws — binary §sxO verbatim
5 agent({schema}) ✅ throws — "scheduled for P3"
6 parallel(...) ✅ throws — "scheduled for P2"
7 pipeline(...) ✅ throws — "scheduled for P2"

All 7 P1-scope behaviours verified against a real model on real hardware. The 80 unit tests + 9920 core regression already in the PR cover the implementation seams; this report covers the integration path the user actually sees: model → tool description → tool_use → sandbox → subagent dispatch → tool_result → main-loop summary.

PR moving from draft → ready for review.

@LaZzyMan LaZzyMan marked this pull request as ready for review June 4, 2026 04:02
LaZzyMan added a commit that referenced this pull request Jun 4, 2026
…ening (PR #4732 R1)

Closes T1/T8/T14: thrown Errors and async-function Promises used to leak the
host Function constructor through their prototype chains. PoC:
  agent('x').constructor.constructor('return process')()
  try { throw } catch(e) { e.constructor.constructor('return process')() }
Build every async/sync global (agent, parallel, pipeline, workflow, budget,
console, phase, log, args) inside the vm-realm via the existing init
script. Host only exposes a primitive bridge that the init script reads
once and deletes from globalThis. Both rejection and resolution paths
cross the boundary as vm-realm values.

Closes T2: deepNullProto used to setPrototypeOf(null) on array args,
breaking for-of / .map / .filter / spread / destructuring. Replaced with
vm-realm JSON.parse of an args string — arrays retain vm-realm
Array.prototype methods.

Closes T13: runtime allowlist on agent() opts catches typos like 'scema'.

Closes T9/T16/T17: stripExportMeta now recognises //, /* */, and regex
literals; throws on unbalanced braces instead of silently returning ''.

Closes T6: validateArgs rejects functions, BigInt, circular refs, and
over-deep nesting (previously functions silently disappeared).

Closes T5: regression test for console.log/warn/error → getLogs routing.
LaZzyMan added a commit that referenced this pull request Jun 4, 2026
Production callers use Config.createToolRegistry's registerLazy path which
dynamic-imports './tools/workflow/workflow.js'. The barrel export at index.ts
previously forced eager evaluation of the workflow.js → workflow-orchestrator
→ workflow-sandbox → node:vm module chain for every consumer of
@qwen-code/core, even when workflows are disabled.

Sibling tool exports (AgentTool, SkillTool) are type-only; align WorkflowTool
with the same pattern. SDK consumers can still annotate types; instantiation
happens through the registry, not the barrel.
LaZzyMan added a commit that referenced this pull request Jun 4, 2026
…ls + failure context + defensive serialization (PR #4732 R1)

Closes T10: runReasoningLoop returns terminateMode = CANCELLED|MAX_TURNS|
TIMEOUT|ERROR rather than throwing. Without checking it, await agent(...)
resolved to '' on user cancel and the workflow kept looping. Now check
getTerminateMode() after execute() and throw on non-GOAL — mirrors
AgentTool's existing handling.

Closes T11: workflow subagents previously ran with runConfig: {} (no
max_turns / max_time_minutes guard) and tools: ['*'] without
disallowedTools. A single agent() could loop the model indefinitely. Bound
to 50 turns / 10 minutes; add disallowedTools: [SEND_MESSAGE, EXIT_PLAN_MODE]
to mirror upstream Tg8 — defense in depth with the §XmO system prompt.

Closes T19: phases / logs accumulated before a script failure used to be
discarded with the sandbox instance. WorkflowExecutionError carries them
through the rejection so the user-visible display can show what ran.

Closes T12 / T18: defensive serialization. A successful workflow returning
a BigInt or circular value used to be reported as 'Workflow failed:
Converting circular structure to JSON' because JSON.stringify was inside
the try block. safeStringifyResult / safeStringifyDisplayPayload degrade
to a clear placeholder so a serialization issue doesn't masquerade as a
run failure.

Closes T4: regression test for ToolErrorType.EXECUTION_FAILED assertion.

Closes T7: vi as vitest alias removed (now matches every other test file).
LaZzyMan added a commit that referenced this pull request Jun 4, 2026
…on-env reference (PR #4732 R1)

Closes T20: 6 of 9 new workflow files were missing the standard @license
Apache-2.0 header. Add Qwen-style header (matches sibling tools/agent/agent.ts
and others) to: workflow-sandbox.ts, workflow-sandbox.test.ts,
workflow-orchestrator.ts, workflow-orchestrator.test.ts,
workflow-prompts.ts, workflow.test.ts.

Closes T21: config.workflows.test.ts and config.workflow-registration.test.ts
both contained 'mirrors config-session-env.test.ts' in a setup comment, but
that file does not exist in the repo. Drop the dangling reference.
@LaZzyMan

LaZzyMan commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator Author

Round 1 — all 21 findings addressed

Commits (4): eba53fbad · 7ca5f82cb · 4b53619e5 · b787c9448

Full P1 test suite: 108 passed (was 80 — 28 new regression cases). Full core suite: 9960 passed / 4 skipped / 0 failed. typecheck: 0 errors. lint: clean on workflow files (only pre-existing unrelated errors in .claude/skills/e2e-testing/scripts/ remain).

Critical (all 4 unique roots fixed; 7 inline threads closed)

Root issue Fix Commit
Host-realm escape via thrown Errors (T1) and async Promises (T8/T14) — confirmed PoC read process.env and process.platform All async/sync globals (agent, parallel, pipeline, workflow, budget, console, phase, log, args) are now built inside the vm-realm via the existing init script. Host exposes only a primitive bridge that the init script reads once and deletes from globalThis. Both resolve and reject cross the boundary as vm-realm values eba53fbad
Array iteration broken (T2) — for...of, .map, .filter, spread, destructuring all threw on array args because Object.setPrototypeOf(out, null) removed Array.prototype Args are built from vm-realm JSON.parse, retaining vm-realm Array.prototype (whose constructor chain still cannot reach host process) eba53fbad
WorkflowTool value export defeats lazy loading (T3) Changed to export type to align with AgentTool/SkillTool siblings 7ca5f82cb
stripExportMeta silently deletes script on apostrophe-in-comment / unbalanced braces / regex literal (T9/T16/T17) Parser now recognises //, /* */, regex literals; throws on unbalanced braces instead of returning '' eba53fbad
Cancellation silently non-functional (T10) — runReasoningLoop returns terminateMode rather than throwing createProductionDispatch calls getTerminateMode() after execute() and throws on non-GOAL 4b53619e5

Suggestions (all 12 fixed)

Thread Fix Commit
T4 — missing ToolErrorType.EXECUTION_FAILED assertion added in workflow.test.ts 4b53619e5
T5 — missing console.log/warn/errorgetLogs() routing test added eba53fbad
T6 — missing non-serializable args validation test validateArgs() rejects functions/BigInt/circular/depth; 3 regressions added eba53fbad
T7 — vi as vitest alias inconsistent replaced with vi 4b53619e5
T11 — workflow subagents unbounded + over-privileged runConfig: { max_turns: 50, max_time_minutes: 10 } + disallowedTools: [SEND_MESSAGE, EXIT_PLAN_MODE] (mirrors upstream Tg8) 4b53619e5
T12 / T18 — BigInt / circular masquerades as workflow failure safeStringifyResult / safeStringifyDisplayPayload with their own try/catch + clear placeholder 4b53619e5
T13 — [key:string]: unknown defeats opts guard runtime allowlist inside vm-realm agent() wrapper throws on any opt not in [label, phase, schema, model, isolation, agentType] eba53fbad
T15 — budget injected hardening untested regression "opts.budget: spent/remaining constructors stay vm-realm-safe" added eba53fbad
T19 — phases/logs lost on script failure WorkflowExecutionError carries phases + logs through the rejection; surfaced in returnDisplay 4b53619e5

Nits (review-body)

Thread Fix Commit
T20 — 6 new files missing @license headers added Copyright 2025 Qwen header (matches sibling tools/agent/agent.ts style) b787c9448
T21 — stale mirrors config-session-env.test.ts reference removed dangling reference from both config test files b787c9448

Architecture note

The unified vector for T1/T8/T14 is "any host object handed to the script keeps its host prototype chain." Round 1's fix moves the cross-realm boundary to a primitive-only host bridge: the init script reads a host object exposed at globalThis.__workflowBridge, deletes the binding, and builds every script-visible global (Math, Date, args, agent, parallel, pipeline, workflow, budget, console, phase, log) inside the vm-realm. Closures (prompt) => bridge.hostAgent(prompt) cross via primitive args + primitive returns; vm-realm new Promise((res, rej) => { hostPromise.then(res, rej) }) wraps the host async result; vm-realm new Error(msg) wraps host rejection messages. The script never sees a host object on either the success or failure path.

The same pattern also lets P2's parallel(thunks) and P5's real budget slot in without re-opening the security surface: their host implementations get wired through __workflowBridge and the init-script wrappers stay unchanged.

Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts
Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts
LaZzyMan added 17 commits June 4, 2026 15:56
SEC-C1: deep-null-proto + hardenClosure blocks args/closure realm-escape PoC.
SEC-C2: vm 30s timeout kills sync infinite loops.
UP-C1: agent() throws on unsupported opts (schema/isolation/model/agentType).
UP-I1: keep verbatim subagent system prompt comment.
ARCH-C1: thread AbortSignal into buildProductionDispatch → subagent.execute().
UP-C2: llmContent carries script result verbatim; metadata moves to returnDisplay.
SEC-I1: add WORKFLOW to EXCLUDED_TOOLS_FOR_SUBAGENTS to prevent recursive fan-out.
SEC-I2: cap logs[] at 10 000 lines with a truncation marker.
REUSE-I1: use ToolErrorType.EXECUTION_FAILED in workflow error returns.
TST: add security PoC tests, unique-runId, dispatch-rejection, llmContent-unwrap.
TST-I1: remove setter/getter tautology test from config.workflows.test.ts.
- Extract WORKFLOW_SUBAGENT_SYSTEM_PROMPT into workflow-prompts.ts
- Lift buildProductionDispatch() into exported createProductionDispatch(config, signal?)
- WorkflowOrchestrator constructor now takes dispatch directly: (dispatch: WorkflowAgentDispatch)
- Remove WorkflowOrchestratorOptions interface
- WorkflowToolOptions.orchestratorOverrides replaced by WorkflowToolOptions.dispatch
- WorkflowToolInvocation.execute() calls createProductionDispatch() when no override is set
- Tests updated: orchestrator tests inject dispatch directly; production-dispatch tests moved to createProductionDispatch describe block
…Controller refactor (PR #4732)

The previous commit moved the bridging logic into the helper, so the
inline comments restating the helper's contract (parent forwarding,
already-aborted fast path, WeakRef, auto-removal) became redundant —
that's the helper's job to document.

Compress to the load-bearing semantics: the child controller sees
both caller-driven and wall-clock-driven aborts, and `finally` cancels
stragglers on normal completion. No code change.
LaZzyMan added a commit that referenced this pull request Jun 8, 2026
…contract

Three quality findings from the round-3 adversarial pass (after rounds 1+2
fixed 9 issues each). All low-severity but cheap; the runtime was correct,
the file-on-disk + public-API surface needed tightening.

1. **Stale runConfig.max_turns duplicated on every write.** Top-level maxTurns
   was promoted in 9b903ee but the serializer still emitted runConfig.max_turns
   verbatim. A file with both fields kept both indefinitely. The runtime
   already prefers top-level (so behavior is correct), but a third-party tool
   or future viewer reading the legacy nested field would act on stale data.
   Prune nested max_turns from the serialized runConfig when top-level
   maxTurns is set; drop the runConfig block entirely if pruning leaves it
   empty. Three new tests pin the prune, the drop-empty-block path, and the
   no-top-level fallback.

2. **14 internal symbols leaked into the public package API.** The PR
   re-exported every helper from agent-frontmatter-schema.ts via
   subagents/index.ts. grep against packages/cli + packages/vscode-ide-companion
   finds zero cross-package consumers — both internal users (subagent-manager,
   claude-converter) already import directly from the schema file. Locking
   constant names like EFFORT_VALUES and parseEffort in the public API would
   constrain follow-up PRs (e.g. when js-yaml lands and the schema shape
   changes). Removed the entire export block; left a comment explaining the
   choice so a future PR knows the bar for re-introducing it.

3. **convertClaudeAgentConfig silently dropped a standalone approvalMode.**
   Round-2 gated the permissionMode bridge on !claudeAgent.approvalMode for
   precedence parity with the loader, but never added the explicit copy for
   approvalMode-only callers. Result: caller passes {approvalMode: 'plan'} →
   output has no approvalMode at all. The in-tree convertAgentFiles path
   recovered via the unowned-key passthrough, but exported direct-call surface
   needs the explicit copy. Two new tests pin both the standalone case and
   the precedence-when-both case.

Also: pinned yaml-parser limitation tests (added in the round-3 e2e
independent check) document that the lightweight parser cannot round-trip
nested mcpServers/hooks; documentation + types comments updated to call out
the carve-out so users know the field works only for flat forms until
js-yaml lands.

Refs #4821 #4721 #4732
@LaZzyMan

LaZzyMan commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

@wenshao friendly ping for re-review when you have a moment — all 42 review threads across R1–R5 are now resolved, CI 8/8 green at HEAD 610c4005b. The R5 finding (createChildAbortController reuse) was already landed in a2fee2cfe before your review reached me; GitHub auto-marked it isOutdated. PR is currently BLOCKED on the active CHANGES_REQUESTED — could you re-review or dismiss when convenient? Happy to address anything new that comes up.

LaZzyMan added a commit that referenced this pull request Jun 8, 2026
Reduces this PR to ship only the fields that have an end-to-end runtime path
today: permissionMode (bridges to existing approvalMode), maxTurns (wires
into runConfig.max_turns), and a tightened color allowlist. Everything else
gets deferred to follow-up PRs that first land the prerequisite infra they
need.

Removed:
- effort (no model-layer effort param in qwen providers)
- mcpServers / hooks (need nested-aware YAML parser — yaml-parser.ts is
  hand-rolled and only handles 1 level)
- memory (qwen's auto-memory has no user/project/local scope distinction)
- isolation (workflow PR #4732 owns the runtime)
- initialPrompt (needs --agent CLI flag, no main-session-agent infra)
- skills (needs SkillManager consumption of config.skills)

This drops 7 fields from SubagentConfig + types.ts, deletes their parser /
serializer / test code, deletes the unused enum constants and helpers from
agent-frontmatter-schema.ts (EFFORT_VALUES, EFFORT_ALIASES, MEMORY_VALUES,
ISOLATION_VALUES, parseEffort, isMemory, isIsolation), and trims the user
docs to the supported field set with a pointer to the design doc for the
deferred fields.

Why ship the design doc alongside such a narrow v1: the full
reverse-engineering record in docs/declarative-agents-port.md (DL7/Ig5/GN/_Y
constants, error messages, schema parity decisions, coordination matrix with
workflow PR #4732) stays load-bearing for the follow-up PRs. A status table
at the top discloses what's deferred and why.

Net: -559 LOC across 7 files. Final PR is permissionMode bridge + maxTurns
wiring + color allowlist + design reference doc + the yaml-parser
nested-limitation pin tests added during round-3 review.

Refs #4821 #4721 #4732
@wenshao

wenshao commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator

✅ Local runtime verification — P1 Workflow tool

I verified this PR end-to-end at runtime (not just the unit suite) on the true post-merge state, as a merge reference.

Setup — tested the merged tree, not the branch in isolation

The branch is 38 commits behind current main, and main independently changed config.ts by +148 lines since the branch point (the same file this PR edits for tool registration). So I merged the PR head into origin/main@6a99ed150 first — clean auto-merge, no conflicts — and ran everything against that merged tree. Node v22, TypeScript 5.8.3.

1. Unit suites — 118 / 118 pass

npx vitest run  (5 workflow suites)
 ✓ workflow-sandbox.test.ts              (79)
 ✓ workflow-orchestrator.test.ts         (17)
 ✓ workflow.test.ts                      (14)
 ✓ config.workflow-registration.test.ts   (4)
 ✓ config.workflows.test.ts               (4)
 Tests  118 passed (118)    Duration 32.4s

2. Typecheck — all 13 changed files clean

tsc --noEmit on packages/core reports zero diagnostics in any file this PR touches.
(The only diagnostic emitted is a pre-existing environmentContext.test.ts(599,1) TS1005 — a file this PR does not modify; its blob is byte-identical to origin/main (b4768393). Reproduces in isolation, so it's a standing main property unrelated to this PR — flagging separately for awareness.)

3. Real runtime exercise (node:vm, driven in tmux) — 33 / 33 checks pass

I drove the actual WorkflowOrchestrator + createWorkflowSandbox with real model-style scripts and an injected mock dispatch (no LLM needed), so this exercises the real sandbox, not mocks:

Area Verified
Sequential agent() A → B call order; B's prompt receives A's output (chaining); args / phase / log captured in order; runId = wf_<hex>; script return value passed through verbatim
Determinism Date.now(), new Date(), Math.random() all throw (resume-safety guarantee)
P1 gating parallel(), pipeline(), workflow(), agent({schema|model|isolation|agentType}), budget.spent() all throw with clear "scheduled for P2/P3/P5" messages; an unknown agent() opt (typo scema) is rejected by the runtime allowlist
Sandbox isolation globalThis.constructor.constructor('return process')(), thrown-Error realm escape, and agent-result realm escape are all blocked (process is not defined); process / require are undefined inside the sandbox
export const meta leading meta is stripped and the body runs; an unbalanced-brace meta is refused (throws, no silent truncation)
Robustness non-JSON args (a function) is rejected; a script failure preserves phases / logs and surfaces a clean message (no realm-leaked Error: prefix); the async wall-clock timeout fires at 800 ms and aborts the linked abortOnTimeout controller

4. Gating (off by default) — confirmed by the config suites

Not registered unless isWorkflowsEnabled(); QWEN_CODE_ENABLE_WORKFLOWS=1 enables it; QWEN_CODE_DISABLE_WORKFLOWS=1 is a kill-switch that overrides; ToolNames.WORKFLOW is in EXCLUDED_TOOLS_FOR_SUBAGENTS, so a subagent can't recursively invoke the workflow.

Verdict

Functionally verified at runtime. The node:vm sandbox, sequential dispatch, determinism guarantees, P1 forward-compat stubs, and the host-realm isolation boundary all behave as designed on the merged tree. No functional blockers found; safe to merge from a runtime-verification standpoint. (Scope: behavior/runtime verification — not a full line-by-line security audit of the sandbox.)

Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts
Comment thread packages/core/src/agents/runtime/workflow-sandbox.ts
Comment thread packages/core/src/agents/runtime/workflow-orchestrator.ts
Comment thread packages/core/src/tools/workflow/workflow.test.ts

@DragonnZhang DragonnZhang left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: PR #4732 — feat(core): Workflow tool P1

Deterministic checks

Check Result
TypeScript (tsc --noEmit) PASS (clean)
ESLint (8 changed source files) PASS (clean)
Tests (5 workflow test files, 118 tests) PASS (118/118)

Overall assessment

This is a well-engineered PR with thorough security hardening, comprehensive test coverage, and clean three-layer architecture (WorkflowTool / WorkflowOrchestrator / createWorkflowSandbox). The six rounds of iterative security fixes addressing node:vm prototype-escape vectors are impressive — the PoC regression tests in workflow-sandbox.test.ts are some of the best sandbox-hardening test coverage I have seen in a PR of this scope.

No blocking issues found. Five non-blocking observations below.

Findings (5 non-blocking)

F1. [Security - Suggestion] No script size cap in validateToolParamValues

validateToolParamValues checks only that script is a non-empty string but imposes no maximum length. While mitigated by the ask permission gate and model output token limits, a defense-in-depth cap (e.g., 256KB) would prevent a hostile or hallucinating model from submitting a multi-megabyte script that pressures the vm parser or the stripExportMeta brace walker.

// packages/core/src/tools/workflow/workflow.ts — validateToolParamValues
const MAX_SCRIPT_LENGTH = 256_000; // ~256KB
if (params.script.length > MAX_SCRIPT_LENGTH) {
  return `WorkflowTool: script exceeds max length of ${MAX_SCRIPT_LENGTH} characters.`;
}

Not blocking — the opt-in feature gate and ask permission provide meaningful mitigation already.

F2. [Security - Informational] Async microtask leak after wall-clock timeout

After Promise.race rejects on wall-clock timeout, an in-script async microtask loop (e.g., (async () => { while(true) await 0 })()) continues consuming microtasks on the host event loop. node:vm provides no mechanism to halt async execution once started. The PR documents this limitation and correctly identifies worker_threads isolation as the proper fix for a future phase.

Acceptable for P1 given the opt-in gate and the 30-minute default cap. Should be tracked as a known limitation for the issue (#4721).

F3. [Security - Informational] No memory cap for vm context

node:vm does not enforce a memory limit. A script like const a=[]; while(true) a.push(new ArrayBuffer(1e8)) can OOM the host process. The PR documents this and suggests --max-old-space-size as an operator mitigation. Same disposition as F2 — acceptable for P1, tracked for future worker_threads migration.

F4. [Correctness - Nit] Copyright header inconsistency

config.workflows.test.ts and config.workflow-registration.test.ts carry Copyright 2025 Google LLC while the other six new workflow source/test files carry Copyright 2025 Qwen. Cosmetic only — worth aligning if there is a project convention on copyright holder.

F5. [Performance - Informational] Per-call dynamic import in dispatch closure

createProductionDispatch does await import('./agent-headless.js') inside every dispatch invocation. Node caches module resolution after the first import, so subsequent calls resolve in a single microtask tick. Negligible overhead in practice. No action needed — the dynamic import pattern is load-bearing for test mockability.

Reverse audit (things NOT changed that might need attention)

  • EXCLUDED_TOOLS_FOR_SUBAGENTS in agent-core.ts: Correctly updated with ToolNames.WORKFLOW. Anti-recursion test in config.workflow-registration.test.ts locks this in.
  • tool-names.ts: Both ToolNames.WORKFLOW and ToolDisplayNames.WORKFLOW added. No migration mapping needed (new tool).
  • index.ts barrel export: Changed to export type only — avoids eager loading of the node:vm module chain. Correct.
  • config.ts: Feature gate follows the established isForkSubagentEnabled pattern (env var + settings + kill switch). Registration uses registerLazy. Correct.

Summary

Axis Verdict
Correctness PASS — architecture is sound, cross-realm Error handling is correct, AbortController chain (caller -> child -> dispatch -> subagent) is well-designed
Security PASS with notes — six rounds of prototype-escape hardening with PoC regression tests; node:vm limitations are honestly documented; opt-in gate is the primary defense
Performance PASS — no hot-path concerns; log/phase caps bound memory; wall-clock timer is unref'd
Test coverage STRONG — 118 tests across 5 files covering security PoCs, edge cases, error paths, and injection seams

LGTM with optional improvements (F1 script size cap being the most actionable).

…#4732 R7 F4)

Both files were derived from a template (the stale `config-session-env.test.ts`
reference cleaned up in R1 T21) and retained the upstream `Copyright 2025
Google LLC` header. The other six new workflow source/test files in this
PR carry `Copyright 2025 Qwen`. Align for same-PR consistency.

Per DragonnZhang R7 F4. No behavior change; header text only.
@LaZzyMan

LaZzyMan commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator Author

R6 + R7 summary — HEAD a008fd648

Two reviews in this batch, both positive on PR readiness:

@wenshao — runtime verification (top-level comment above)

Thanks for the end-to-end local verification on the merged tree (118/118 unit + 33/33 real-runtime checks + sandbox isolation PoCs). The verdict "✅ Functionally verified at runtime. Safe to merge from a runtime-verification standpoint" is the most rigorous external check this PR has received and addresses any "did the unit tests actually exercise the integration?" doubt directly.

The 4 inline findings (T45-T48) are all [Suggestion]-class and have been replied to individually:

# File Disposition
T45 sandbox.ts:478 (Object.keys allowlist) ❌ out of scope — explicit throws cover P1 surface (you noted this); preemptive hardening belongs in P2/P3 PR
T46 sandbox.ts:36 (stripExportMeta leading comments) ❌ out of scope — reverses my R4 T33 design decision (clean SyntaxError as constraint signal)
T47 orchestrator.ts:46 (stale cause JSDoc) ❌ out of scope — R6+ inline-doc polish defaults to overthinking; folded into future doc-pass
T48 workflow.test.ts:57 (external-abort layer test) ❌ out of scope — R6+ test-coverage additions default to overthinking; helper-contract tests already in utils/abortController.test.ts

@DragonnZhang — LGTM with optional improvements

Thanks for the independent code review. The "LGTM with optional improvements" plus the deterministic checks (tsc clean / eslint clean / 118/118) is great signal alongside @wenshao's runtime verification.

Disposition of F1-F5:

# Disposition
F1 (script size cap) ❌ out of scope — hostile-provider hardening; you noted opt-in gate + ask permission are the meaningful mitigation already
F2 (async microtask leak) ✅ documented as known P1 limitation in #4721 (#4721) — worker_threads migration tracked for future phase
F3 (vm memory cap) ✅ documented as known P1 limitation in #4721 alongside F2
F4 (copyright header inconsistency) fixed in a008fd648 — both config.workflows.test.ts and config.workflow-registration.test.ts now Copyright 2025 Qwen to match the other 6 PR-new files
F5 (per-call dynamic import) ✅ acknowledged — your assessment ("No action needed — dynamic import is load-bearing for test mockability") is correct

Status: HEAD a008fd648, CI 8/8 green, all 46 review threads across R1-R7 resolved, known P1 limitations documented in parent #4721. Pending: re-review or CHANGES_REQUESTED dismissal to unblock merge.

@qwen-code-ci-bot

Copy link
Copy Markdown
Collaborator

Qwen Code review did not complete successfully: Qwen review timed out after 55 minutes. See workflow logs.

`Workflow subagent did not complete (terminate mode: ${mode}).`,
);
}
return subagent.getFinalText();

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] subagent.getFinalText() can return '' (empty string) on GOAL mode — the dispatch silently returns "" with no warning.

Looking at agent-core.ts, finalText starts as '' and is only updated when a round produces text output. If the model returns GOAL with no text (tool-only execution, content-filtered response, empty output), the script receives "" from await agent("..."). A subsequent JSON.parse(agentResult) or string operation would throw a confusing error with no indication that the root cause was an empty agent response.

Suggested change
return subagent.getFinalText();
const text = subagent.getFinalText();
if (!text) {
throw new Error(
`Workflow subagent '${opts?.label ?? 'workflow-agent'}' completed with GOAL mode but returned empty text. This usually indicates a model-side issue (content filter, empty response, or tool-only execution).`,
);
}
return text;

— qwen3.7-max via Qwen Code /review

protected override validateToolParamValues(
params: WorkflowParams,
): string | null {
if (typeof params.script !== 'string' || params.script.length === 0) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] No upper bound on script size. The only check is script.length === 0 — a multi-megabyte model-generated script is accepted and passed to new vm.Script(), which compiles synchronously and is NOT covered by the 30s vm timeout (that only applies to runInContext).

A model hallucination loop could produce a 5MB+ script that blocks the event loop during V8 compilation with no timeout and no abort signal.

Suggested change
if (typeof params.script !== 'string' || params.script.length === 0) {
if (typeof params.script !== 'string' || params.script.length === 0) {
return 'WorkflowTool: `script` parameter is required and must be a non-empty string.';
}
const MAX_SCRIPT_LENGTH = 256_000;
if (params.script.length > MAX_SCRIPT_LENGTH) {
return `WorkflowTool: script is ${params.script.length} chars, exceeds limit of ${MAX_SCRIPT_LENGTH}. This usually indicates a model generation loop.`;
}

— qwen3.7-max via Qwen Code /review

return /[{[(,;:=!&|?+\-*/%^~<>]/.test(prev);
}

import * as vm from 'node:vm';

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] import * as vm from 'node:vm' appears after ~120 lines of code (two exported functions: stripExportMeta and isRegexContext). Every other file in the codebase — including the two sibling files added in this PR — clusters all imports at the top, immediately after the license header.

ESM hoists imports so this works at runtime, but it is visually inconsistent. Move to line 7, directly below the license header.

— qwen3.7-max via Qwen Code /review

return this.workflowsEnabled;
}

setWorkflowsEnabled(enabled: boolean): void {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] workflowsEnabled is the only mutable (non-readonly) feature flag on Config. Every other *Enabled field (cronEnabled, forkSubagentEnabled, computerUseEnabled, experimentalZedIntegration) is private readonly. The setWorkflowsEnabled setter is the only public setter for any feature flag on the entire Config class.

This breaks Config's post-construction immutability contract. If a caller mutates this flag mid-session, isWorkflowsEnabled() could return a different value than what the tool registry was built with (constructed once at startup via registerLazy).

Make workflowsEnabled readonly like its siblings. Tests that need workflowsEnabled: true can pass it via ConfigParameters (as the existing "honors workflowsEnabled passed via ConfigParameters" test already demonstrates).

— qwen3.7-max via Qwen Code /review

signal?: AbortSignal,
): WorkflowAgentDispatch {
return async (prompt, opts) => {
const { AgentHeadless, ContextState } = await import('./agent-headless.js');

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] await import('./agent-headless.js') is inside the returned closure, so every agent() call re-executes the dynamic import(). Node's module cache prevents re-loading from disk, but each call still goes through the async module-resolution pipeline. Every other consumer (subagent-manager.ts, background-agent-resume.ts) uses a static import.

Hoist to the factory level so it resolves once per createProductionDispatch call:

Suggested change
const { AgentHeadless, ContextState } = await import('./agent-headless.js');
export function createProductionDispatch(
config: Config,
signal?: AbortSignal,
): WorkflowAgentDispatch {
const modulePromise = import('./agent-headless.js');
return async (prompt, opts) => {
const { AgentHeadless, ContextState } = await modulePromise;

This preserves the test-mock property (dynamic import lets mocks swap the module) while eliminating repeated resolution.

— qwen3.7-max via Qwen Code /review

const llmText = (result.llmContent as Array<{ text: string }>)[0]!.text;
expect(llmText).toBe('(workflow returned no value)');
});
});

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] Missing test for dispatchController.abort() in the finally block at workflow.ts:174. Every test passes new AbortController().signal but never inspects whether the internal child controller created via createChildAbortController is actually aborted after execution.

A regression that removes the finally { dispatchController.abort() } block would pass all existing tests. In production this leaks: every successful workflow would leave the dispatch controller un-aborted, meaning any straggler subagent keeps burning tokens until its own max_time_minutes limit.

Add a test that captures the dispatch signal and asserts signal.aborted === true after execute() resolves on both success and failure paths.

— qwen3.7-max via Qwen Code /review

@wenshao

wenshao commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

Verification Report — local runtime build & drive

Verdict: PASS ✅ (one documentation/wiring note below)

Built the branch from a clean checkout and drove the real bundled CLI in tmux, exercising the Workflow tool through the model at its actual surface (tool-call → node:vm sandbox → sequential agent() dispatch). This is runtime observation, not a test re-run.

Environment

  • Linux (Debian 13, kernel 6.12), Node v22.22.2, npm 10.9.7 — complements the macOS-only matrix in the PR (Linux was marked ⚠️ not tested).
  • Worktree off PR head a008fd648; real npm ci + npm run bundledist/cli.js (Qwen Code v0.17.1, glm-4.7 backend). Build clean.

What was verified (all at the live CLI surface)

1. Gating — off by default
/tools with no env var → 29 tools, no Workflow. ✅

2. Gating — opt-in via env var
QWEN_CODE_ENABLE_WORKFLOWS=1/tools now lists Workflow (30 tools). ✅

3. Kill switch
QWEN_CODE_ENABLE_WORKFLOWS=1 QWEN_CODE_DISABLE_WORKFLOWS=1Workflow absent again. The disable flag correctly wins over enable. ✅

4. Happy path — single sequential agent
Script phase('Greet'); log('starting'); const reply = await agent('…PONG…'); …; return { agentSaid: reply, ok: true }. Tool prompted for approval (getDefaultPermission: 'ask' confirmed), then returned:

{ "runId": "wf_31ebe7f9247fa960", "phases": ["Greet","Wrap"],
  "logs": ["starting","agent replied: PONG"],
  "result": { "agentSaid": "PONG", "ok": true } }

Real subagent dispatched and its value flowed back into the script. The LLM received the unwrapped result; the full metadata went to the display payload. ✅

5. Headline feature — two agents in order
Script with phase('A'); one = await agent('…1'); phase('B'); two = await agent('…2'); …; return {first, second}

{ "runId": "wf_3663a65961a670d4", "phases": ["A","B"],
  "logs": ["got 1 then 2"], "result": { "first": "1", "second": "2" } }

Two subagents executed sequentially, phases and log captured in order. ✅

Probes (off the happy path)

  • 🔍 Date.now()Date.now() / new Date() are unavailable in workflow scripts (breaks resume)… — exact determinism error, message only, no stack leak. ✅
  • 🔍 parallel([...])parallel() is not supported in P1. … Concurrent fan-out is scheduled for P2. — clean P2 stub, not a ReferenceError. ✅
  • 🔍 agent('hi', {schema:{…}})agent({schema}) is not supported in P1. … scheduled for P3. — clean P3 stub. ✅
  • 🔍 Realm escapeargs.constructor.constructor('return process')() with args={x:1}process is not defined. The vm-realm Function constructor cannot reach the host process. Sandbox boundary holds. ✅
  • 🔍 Syntax error — malformed return (((Unexpected token '}', surfaced as a graceful tool error (no crash). ✅

Findings

  • ⚠️ opt-in via setting is not wired in the bundled CLI. The PR text says workflows are enabled "via setting or QWEN_CODE_ENABLE_WORKFLOWS=1". The env-var path works (verified live). But there is no experimental.workflows field in settingsSchema.ts and no line mapping workflowsEnabled from settings in packages/cli/src/config/config.ts — its siblings cronEnabled (settings.experimental?.cron) and emitToolUseSummaries are wired there; workflows is not. Config.workflowsEnabled / setWorkflowsEnabled() exist for programmatic embedders, but a user editing settings.json cannot currently turn the tool on through the CLI. Not a blocker for P1 (the security-critical default-off property holds and the env var works), but either the one-line settings mapping should be added or the description should say "env var only" until then.
  • 👍 Error hygiene is consistently good: every failure path surfaced only the message (no vm stack frames) to both the LLM and the display, matching the duck-typed extractErrorMessage intent.
  • 👍 The subagent-recursion guard is real — WORKFLOW is in EXCLUDED_TOOLS_FOR_SUBAGENTS, so the spawned agent() can't re-enter Workflow.

Out of scope (per PR, not exercised)

pipeline/workflow/budget stubs, resume, /workflows UI, background queue — all deferred to P2–P7 and explicitly not in this PR.


Verified on Linux via a tmux-driven bundled CLI against the real model backend. Evidence = the tool-result payloads captured above, reproduced verbatim from the running app.

@tanzhenxin tanzhenxin left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second approval based on the converged review history: the multi-round sandbox-escape findings were all addressed, wenshao's runtime verification passed (118/118 unit, 33/33 runtime, isolation PoCs), and the feature is off by default so the blast radius is contained.

@tanzhenxin tanzhenxin enabled auto-merge (squash) June 10, 2026 02:40
@tanzhenxin tanzhenxin disabled auto-merge June 10, 2026 02:42
@LaZzyMan LaZzyMan merged commit 7757d87 into main Jun 10, 2026
12 of 13 checks passed
wenshao pushed a commit that referenced this pull request Jun 10, 2026
…+ maxTurns wiring + color allowlist (CC 2.1.168 parity) (#4842)

* docs: add declarative agents port design doc for #4821

* feat(core): add declarative agent frontmatter schema constants and 9 new fields

Adds agent-frontmatter-schema.ts as the single source of truth for the CC 2.1.168
declarative-agent enum constants (EFFORT_VALUES, PERMISSION_MODE_VALUES,
MEMORY_VALUES, ISOLATION_VALUES, COLOR_VALUES) and lenient parsers (parseEffort,
parseMaxTurns, parseBackground, parseStringOrArray) that mirror DL7's warn-and-drop
posture. Also adds a permissionModeToApprovalMode bridge to map CC's permission
modes onto qwen-code's existing approvalMode semantics.

Extends SubagentConfig with 9 optional fields carried verbatim from CC frontmatter:
permissionMode, effort, maxTurns, skills, initialPrompt, memory, isolation,
mcpServers, hooks. Runtime semantics for the metadata-only fields are deferred
to follow-up PRs; this lands the data carrier.

Refs #4821 #4721 #4732

* feat(core): parse and serialize 9 new declarative agent frontmatter fields

Extends parseSubagentContent and serializeSubagent to round-trip the 9 CC
2.1.168 fields previously added to the SubagentConfig type. Parsing follows
DL7's lenient warn-and-drop posture (vs the existing strict throw posture
used for approvalMode), so a Claude Code agent file with an invalid optional
field still parses with that field dropped — matching CC behavior so users
can drop CC agent files into .qwen/agents/ unchanged.

Adds a permissionMode → approvalMode bridge: when frontmatter has
permissionMode but no approvalMode, the bridge resolves the approvalMode at
parse time using the same mapping as claude-converter.ts. If both are set,
approvalMode wins (already-explicit values take precedence over inferred
ones).

Tests:
- 22 new parser cases covering happy paths, invalid drops, permissionMode
  bridge precedence, color allowlist (with auto sentinel preserved), and
  loose-validation passthrough for mcpServers / hooks.
- 9 new serializer cases asserting round-trip and omit-when-unset behavior.

Refs #4821 #4721 #4732

* feat(core): promote top-level maxTurns to runConfig.max_turns at convert time

Wires the new top-level `maxTurns` field into the existing runtime configuration
pipeline so the value declared in agent frontmatter actually limits the agent's
turn budget at run time. When both the top-level `maxTurns` and the legacy
nested `runConfig.max_turns` are set, the top-level field wins (more specific
and matches the CC frontmatter shape upstream agent files use). When only the
nested field is set, behavior is unchanged.

Refs #4821 #4721 #4732

* refactor(core): use shared permissionMode bridge and parseStringOrArray in claude-converter

Replaces the inline claudeToQwenMode table and the local parseStringOrArray
helper in claude-converter.ts with the shared exports from
agent-frontmatter-schema.ts. One source of truth for the CC ↔ qwen mapping
keeps the two import paths (.qwen/agents/*.md frontmatter and Claude plugin
import) in sync — when the upstream CC schema changes, only one table needs
updating.

Adds explicit tests for the bridge mapping (six known modes + unknown
fallback). The fallback now happens at the call site rather than inside the
table, but the observable behavior is unchanged.

Refs #4821

* docs(subagents): document CC-compatible frontmatter fields

Adds a Claude Code Compatibility Fields section to the user-facing subagents
reference, covering the 9 new frontmatter fields landed in this PR. The intent
is for users with existing Claude Code agents to know they can drop the files
into `.qwen/agents/` and have them parse identically.

Refs #4821

* refactor(core): reuse parseBackground in declarative agent loader

Replaces the inline boolean-or-string lenient parse for the background field
with the shared parseBackground util added in the schema module. Same observable
behavior; one fewer place to keep in sync when the upstream shape changes.

Refs #4821

* fix(core): address adversarial review of declarative agents PR

Four self-review findings caught and fixed before opening the PR:

1. **mcpServers/hooks round-trip claim was broken.** The local yaml-parser only
   formats one level of nesting, so serializing an agent with nested mcpServers
   or hooks would mangle the value into '[object Object]'. The fields are still
   carried verbatim in memory (read path is fine), but the serializer no longer
   emits them — losing the field through '/agents' edits is strictly safer than
   corrupting it. Documented the limitation in docs/users/features/sub-agents.md.

2. **approvalMode throw vs permissionMode lenience asymmetry.** The pre-existing
   approvalMode parser threw on invalid values, killing the entire agent file.
   With the new permissionMode bridge, a CC-imported file with 'permissionMode:
   bypassPermissions' + 'approvalMode: tpyo' would reject the file instead of
   dropping the typo and using the bridge. Demoted approvalMode to the same
   DL7-parity warn-and-drop posture all the new fields use; the bridge now runs
   when approvalMode is invalid.

3. **parseEffort missed numeric strings.** CC's DL7 falls back to parseInt for
   non-enum effort strings so 'effort: "5"' (quoted YAML) round-trips like
   'effort: 5'. Added the same fallback and tests for floats / partial numeric
   strings / valid numeric strings.

4. **convertAgentFiles stripped 6/9 of the new fields.** The Claude plugin
   import path read frontmatter into a fixed ClaudeAgentConfig shape, so
   'effort', 'maxTurns', 'initialPrompt', 'memory', 'isolation', 'mcpServers',
   and 'background' were silently dropped when installing a CC plugin agent.
   Build the rewritten frontmatter from the original keys first, then overlay
   the converter's transformations for the keys it owns — unknown CC fields
   now passthrough verbatim, future-proofing against later CC additions.

Refs #4821 #4721 #4732

* fix(core): round-2 self-review fixes for declarative agents PR

Second adversarial review surfaced 5 more findings; all fixed:

1. **convertAgentFiles passthrough corrupted nested objects.** Round-1 fixed
   serializeSubagent to skip emitting mcpServers/hooks (the local yaml-parser
   collapses nested values to '[object Object]'). The plugin-import passthrough
   I added to claude-converter.ts had the same mangle bug — a CC plugin agent
   with nested mcpServers got written to disk as 'mcpServers:\n  - [object
   Object]'. Now skips both fields via a NESTED_FIELDS_NOT_ROUND_TRIPPABLE set.

2. **parseEffort accepted 0 and negative integers.** Docs say 'positive integer'
   and parseMaxTurns already rejects <= 0. parseEffort didn't, accepting
   '0' / '-5' / 0 / -1 as valid efforts. Aligned both helpers and added tests
   for the boundary values.

3. **Function name collision: permissionModeToApprovalMode.** packages/core/src/
   tools/agent/agent.ts has a module-private permissionModeToApprovalMode that
   maps the qwen PermissionMode enum to ApprovalMode enum (different domain
   entirely). My new exported helper used the same name; IDE auto-import would
   silently return undefined for every qwen enum value. Renamed the export to
   claudePermissionModeToApprovalMode and updated 4 callers.

4. **color: 'auto' round-trip asymmetry.** Round-1 added a test pinning that
   the parser preserves the legacy 'auto' sentinel. But serializeSubagent
   already had a pre-existing 'skip emit when auto' branch, so a parse →
   serialize → parse cycle dropped the sentinel. The CLI helpers
   (shouldShowColor / getColorForDisplay) already treat 'auto' identically to
   undefined, so normalizing 'auto' to undefined at parse time has no
   downstream effect AND makes round-trip cleanly idempotent.

5. **Converter approvalMode precedence diverged from loader.** When a CC source
   file had both 'permissionMode: bypassPermissions' AND 'approvalMode: default',
   the convertClaudeAgentConfig path wrote 'approvalMode: yolo' (bridge wins).
   The loader's rule is 'approvalMode wins over bridge'. Extended
   ClaudeAgentConfig with an approvalMode field and gated the bridge emit on
   'source approvalMode is unset', aligning convert and load precedence.

Refs #4821 #4721 #4732

* fix(core): round-3 self-review fixes — drift, API hygiene, converter contract

Three quality findings from the round-3 adversarial pass (after rounds 1+2
fixed 9 issues each). All low-severity but cheap; the runtime was correct,
the file-on-disk + public-API surface needed tightening.

1. **Stale runConfig.max_turns duplicated on every write.** Top-level maxTurns
   was promoted in 9b903ee but the serializer still emitted runConfig.max_turns
   verbatim. A file with both fields kept both indefinitely. The runtime
   already prefers top-level (so behavior is correct), but a third-party tool
   or future viewer reading the legacy nested field would act on stale data.
   Prune nested max_turns from the serialized runConfig when top-level
   maxTurns is set; drop the runConfig block entirely if pruning leaves it
   empty. Three new tests pin the prune, the drop-empty-block path, and the
   no-top-level fallback.

2. **14 internal symbols leaked into the public package API.** The PR
   re-exported every helper from agent-frontmatter-schema.ts via
   subagents/index.ts. grep against packages/cli + packages/vscode-ide-companion
   finds zero cross-package consumers — both internal users (subagent-manager,
   claude-converter) already import directly from the schema file. Locking
   constant names like EFFORT_VALUES and parseEffort in the public API would
   constrain follow-up PRs (e.g. when js-yaml lands and the schema shape
   changes). Removed the entire export block; left a comment explaining the
   choice so a future PR knows the bar for re-introducing it.

3. **convertClaudeAgentConfig silently dropped a standalone approvalMode.**
   Round-2 gated the permissionMode bridge on !claudeAgent.approvalMode for
   precedence parity with the loader, but never added the explicit copy for
   approvalMode-only callers. Result: caller passes {approvalMode: 'plan'} →
   output has no approvalMode at all. The in-tree convertAgentFiles path
   recovered via the unowned-key passthrough, but exported direct-call surface
   needs the explicit copy. Two new tests pin both the standalone case and
   the precedence-when-both case.

Also: pinned yaml-parser limitation tests (added in the round-3 e2e
independent check) document that the lightweight parser cannot round-trip
nested mcpServers/hooks; documentation + types comments updated to call out
the carve-out so users know the field works only for flat forms until
js-yaml lands.

Refs #4821 #4721 #4732

* refactor(core): shrink declarative agents PR to vertically-sliced v1

Reduces this PR to ship only the fields that have an end-to-end runtime path
today: permissionMode (bridges to existing approvalMode), maxTurns (wires
into runConfig.max_turns), and a tightened color allowlist. Everything else
gets deferred to follow-up PRs that first land the prerequisite infra they
need.

Removed:
- effort (no model-layer effort param in qwen providers)
- mcpServers / hooks (need nested-aware YAML parser — yaml-parser.ts is
  hand-rolled and only handles 1 level)
- memory (qwen's auto-memory has no user/project/local scope distinction)
- isolation (workflow PR #4732 owns the runtime)
- initialPrompt (needs --agent CLI flag, no main-session-agent infra)
- skills (needs SkillManager consumption of config.skills)

This drops 7 fields from SubagentConfig + types.ts, deletes their parser /
serializer / test code, deletes the unused enum constants and helpers from
agent-frontmatter-schema.ts (EFFORT_VALUES, EFFORT_ALIASES, MEMORY_VALUES,
ISOLATION_VALUES, parseEffort, isMemory, isIsolation), and trims the user
docs to the supported field set with a pointer to the design doc for the
deferred fields.

Why ship the design doc alongside such a narrow v1: the full
reverse-engineering record in docs/declarative-agents-port.md (DL7/Ig5/GN/_Y
constants, error messages, schema parity decisions, coordination matrix with
workflow PR #4732) stays load-bearing for the follow-up PRs. A status table
at the top discloses what's deferred and why.

Net: -559 LOC across 7 files. Final PR is permissionMode bridge + maxTurns
wiring + color allowlist + design reference doc + the yaml-parser
nested-limitation pin tests added during round-3 review.

Refs #4821 #4721 #4732

* refactor(core): revert round-X residuals to hit v1's actual minimum LOC

The previous shrink left round-1/2/3 fixes in place that were originally
for the wider 9-field scope; with the carry-only fields gone they're no
longer needed. Reverting them all back to pre-PR behaviour:

- approvalMode strict throw (round-1 demoted to lenient; defer the
  symmetry fix to a separate PR if wanted)
- runConfig.max_turns prune on serialize (round-3 cleanup; defer)
- color: 'auto' normalised to undefined (round-2 round-trip fix; defer)
- claude-converter NESTED_FIELDS_NOT_ROUND_TRIPPABLE + CONVERTER_OWNED_KEYS
  + passthrough loop (round-1/2 — all for the deleted carry-only fields)
- ClaudeAgentConfig.approvalMode + precedence gating (round-3)
- parseBackground in shared schema module (the dedup wasn't pulling weight)
- parseStringOrArray in shared schema module (same)

The shared bridge claudePermissionModeToApprovalMode + the disambiguated
name stay — converter and loader use the same map and the name collision
risk is real.

Net: -336 LOC vs the prior shrink commit.

* docs: sync user-facing docs to actual v1 behaviour

The previous shrink left two stale claims in docs/users/features/sub-agents.md
that referenced behaviour reverted in the round-X-revert commit:

- maxTurns row claimed 'the legacy nested value is pruned from the on-disk
  file on save to avoid two sources of truth' — the prune was reverted
- color row claimed 'auto is accepted on read and normalised to undefined
  for round-trip parity' — the normalisation was reverted; auto is preserved
  as-is for backward compat

Updated both rows to match what the parser actually does now.

The design doc (docs/declarative-agents-port.md) is unchanged: its top-line
disclaimer already labels the body as reference material for follow-up PRs,
so the references to the wider plan (D7 bridge, P1-P4 phases) are
accurate-as-reference even though some details are deferred.

* fix(core): use Map.get in claudePermissionModeToApprovalMode (prototype-chain footgun)

The PERMISSION_MODE_TO_APPROVAL_MODE table was a plain object accessed with
`[key]`. Calling claudePermissionModeToApprovalMode('__proto__') walked the
prototype chain and returned Object.prototype — a non-string value that
violated the declared return type. The current loader path was safe because
isPermissionMode() filtered prototype-key strings before the bridge ran, but
the exported function itself was a latent footgun for any future caller
that didn't know to pre-filter.

Switched the lookup table to a Map so prototype keys cannot be reached.
Added explicit tests for __proto__, constructor, hasOwnProperty, toString.

Refs #4821

* fix(core): address round-2 review on declarative agents PR #4842

Two findings:

1. `serializeSubagent` wrote both `permissionMode` and the bridge-derived
   `approvalMode` into the same frontmatter block. On the next load the parser
   takes `approvalMode` (explicit wins over bridge), so `permissionMode`
   silently became dead frontmatter — a user editing it later in the file
   would be ignored. Skip the `permissionMode` emit when `approvalMode` is
   also being written; `permissionMode` still round-trips when it's the
   user's only intent. Two new tests pin both branches.

2. `subagents/index.ts` had a NOTE comment referencing `EFFORT_VALUES` and
   `parseEffort` as example schema helpers that intentionally aren't
   re-exported. Those symbols were removed during the v1 scope shrink
   (637b8b7) and don't exist in the current module. Updated the comment to
   reference `claudePermissionModeToApprovalMode`, `parseMaxTurns`, and
   `isPermissionMode`, which are the actual current contents.

Sibling-drift note: the same dual-emit pattern exists for `maxTurns` vs
`runConfig.max_turns` — round-X revert (70a876d) explicitly deferred the
`runConfig.max_turns` prune. Not addressed here per that earlier decision.

Refs #4842
wenshao pushed a commit that referenced this pull request Jun 13, 2026
…'worktree'}) (#4721) (#5034)

* feat(core): Workflow P3 — agent({schema, agentType, model, isolation:'worktree'}) (#4721)

Adds the P3 dispatch options to the workflow runtime, completing the
contract qwen-code's workflow tool matches against upstream Claude Code
2.1.168. P1/P2 stubs (workflow-sandbox.ts:508-527) are replaced with
production paths routed through `SubagentManager.createAgentHeadless` so
per-call model overrides go through `buildRuntimeContentGeneratorView`
(provider routing), per-agent MCP servers / hooks get isolated
lifecycles, and worktree-isolated subagents run against a rebound Config.

- agent({agentType: 'X'}) resolves against the declarative-agents
  registry (#4842 + #4996) via findSubagentByName; unresolved names throw
  "agent({agentType}): agent type 'X' not found" verbatim from upstream.
- agent({model: 'qwen3-max'}) is threaded into SubagentConfig.model so
  the runtime view sees it (modelConfigOverrides alone would only swap
  the model name within the existing provider's view).
- Workflow's disallowed-tool floor [SendMessage, ExitPlanMode] is unioned
  with the agentType's own disallowedTools so a permissive agentType
  cannot re-enable them for a workflow subagent.
- agent({isolation: 'worktree'}) provisions a fresh worktree via
  GitWorktreeService.createUserWorktree (slug agent-<7hex>, mirrors
  AgentTool 1849-1963), rebinds cwd/getTargetDir/getFileService/
  getWorkspaceContext on a prototype-chained Config override, and on
  completion auto-removes the worktree if clean or preserves the path +
  branch (appended to the result string) when the subagent left changes.
  Parent-dirty trees are refused with a clear error to avoid silently
  running the subagent against a stale HEAD.
- agent({isolation: 'remote'}) throws "agent({isolation:'remote'}) is
  not available in this build" verbatim (upstream 2.1.168 parity).
- agent({schema: S}) injects a per-call SyntheticOutputTool (existing
  tools/syntheticOutput.ts, AJV-backed) into a fresh per-subagent
  ToolRegistry built via rebuildToolRegistryOnOverride, then watches
  AgentEventEmitter TOOL_CALL/TOOL_RESULT events for `structured_output`
  invocations. A successful call's args are captured as the dispatch
  return value (object, not string); after two failed attempts the
  third failure aborts the dispatch and throws "subagent completed
  without calling StructuredOutput (after 2 in-conversation nudges)"
  verbatim. No agent-core.ts changes — the entire 2-nudge counter
  lives in the dispatch layer so the shared subagent loop is unaffected.

The sandbox's agent() wrapper now revives per-call object returns into
the vm realm (JSON round-trip inside the vm runInContext block), closing
the same T1/T8/T14 host-prototype-escape vector that P2's per-element
revival closed for parallel/pipeline. Two new sandbox security tests
(constructor-chain probe + non-JSON-serializable collapse) regress this.

WorkflowAgentResult widens from `string` to `string | object`; the
fast-path (no agentType/model/isolation/schema) is preserved byte-for-byte
to keep P1/P2 zero-overhead.

Tests: 159 workflow-suite tests + 217 adjacent (subagents / syntheticOutput /
agent-override) all green. Real-LLM E2E follow-up planned (mirroring P2's
13/13 qwen3-max validation).

Related #4721 (parent design — multi-phase, not closed by this PR)
Related #4732 (P1 merged) #4947 (P2 merged) #4842 #4996 (declarative agents)

* chore(core): P3 self-review R1 — align worktree suffix wording + 6 test gaps

R1 of pre-push adversarial self-review on PR #5034 surfaced 6 confirmed
findings across 6 diverse lenses (correctness / security / reuse-altitude
/ self-invariant / consumer-breakage / test-gaps). Each finding faced 2
independent skeptics defaulting to refuted=true; 6 survived majority
challenge.

Source code:
- Worktree-preserved suffix wording now matches AgentTool's
  formatWorktreeSuffix (agent.ts:1700-1719) verbatim, including the
  `git worktree add <path> <branch>` recovery hint for the directory-
  removed-but-branch-preserved race.

Test gaps closed:
- schema-mode success after 1 nudge (round-2 args captured)
- schema-mode success after 2 nudges (round-3 args captured)
- schema-mode + agentType together — floor disallowedTools still unioned
- schema-mode caller-abort takes priority over the StructuredOutput
  terminal error (signal.aborted check at workflow-orchestrator.ts:489-490)
- override path dispose() runs in finally on the success path
- override path dispose() runs in finally on the terminate-mode-error path

Declined R1 finding: negative tests for invalid opt types (schema/model/
agentType passed null/number/empty-string). Adding upfront type
validation is scope creep — upstream does not, P1/P2 do not, and the
workflow tool is model-authored where these inputs are extremely
unlikely. Existing AJV / SubagentManager downstream errors are descriptive
enough. Will revisit if R2 makes a stronger case.

166/166 tests pass (workflow suite + adjacent + workflow-orchestrator).
typecheck + lint clean across packages/core, packages/cli,
integration-tests, sdk, webui.

* chore(core): P3 self-review R2 — vm-realm opts revive + error-msg sanitize + 12 tests

R2 of pre-push adversarial self-review on PR #5034. 6 diverse-lens
finders (60 agents, ~2.5M tokens, 24 min) over the R1-fix-applied
code, with 2 independent skeptics defaulting to refuted=true.
12 confirmed survivors after adversarial verify; decisions below.

Security (FIX):
- agent() wrapper in workflow-sandbox.ts now JSON-revives agentOpts
  inside the vm runInContext block BEFORE passing them to the host
  dispatch. Closes a Proxy/inherited-getter escape that P3 introduced
  along with the user-supplied schema object: a script could have
  wrapped agentOpts.schema in a Proxy whose getter ran host-side code
  during SyntheticOutputTool construction / AJV compile. Same
  mechanism as args / parallel-result revival.
- runOverridePath now sanitizes opts.agentType through
  sanitizeForErrorMessage() (control chars → space) before
  interpolation into the "agent type 'X' not found" error message.
  Prevents a model-authored agentType containing CRLF / NUL from
  fragmenting a single-line error across log records / OTLP fields.

Reuse-altitude (FIX):
- Added JSDoc block to WorkflowWorktreeIsolation interface
  documenting each field's role for cleanup.

Test gaps (FIX, 12 new tests):
- agentType control-char sanitization regression
- dispose() runs in finally when subagent.execute throws
- isolation:'worktree' provision error branches (5):
  nested parent / git unavailable / not a git repo / parent dirty /
  createUserWorktree returns failure
- isolation:'worktree' cleanup branches (3):
  removeUserWorktree fails / branchPreserved race / removeUserWorktree
  throws — each preserves the worktree (or branch) with the right
  user-facing suffix
- combinations (2): model + isolation:'worktree' threads model AND
  provisions worktree; schema + isolation:'worktree' returns
  structured payload verbatim (preserved suffix only on string return)

Test infrastructure: vi.mock'd GitWorktreeService at the module level
(partial mock; preserves the existing exports the unrelated
worktreeCleanup.ts depends on) with a per-test beforeEach reset.

Declined R2 findings (kept the R1 line):
- [major] Schema parameter upfront validation: same scope-creep
  decline as R1. Upstream doesn't do it; AJV's downstream error is
  descriptive enough.
- [major] Worktree provision extracted to shared util with AgentTool:
  agreed in principle but out of P3 scope. A separate refactor PR
  should land that with AgentTool maintainers in the loop.

178/178 tests pass (workflow + adjacent suites). typecheck + lint
clean across packages/core, packages/cli, integration-tests, sdk,
webui.

* fix(core): address wenshao R1+R2 review on Workflow P3 (PR #5034)

Round 1 (15:41) + Round 2 (17:24) review from wenshao surfaced 7 inline
findings across schema-mode dispatch correctness, worktree cleanup
coverage, and error attribution. Each fix is paired with a regression
test that was RED before the change landed.

T0 [Critical] Worktree leak when schema setup throws after provision
  workflow-orchestrator.ts: outer try MOVED to start immediately after
  provisionWorkflowWorktree. Previously the try opened only after
  createSchemaConfigOverride / createSchemaModeState / signal listener
  attachment — so any throw in those three (broken MCP server during
  the per-call ToolRegistry rebuild was the trigger wenshao cited)
  orphaned the just-provisioned worktree under .qwen/worktrees/.
  Test: "isolation:'worktree' + schema setup throws → worktree is
  still cleaned up" — simulates createToolRegistry failure during
  createSchemaConfigOverride; asserts removeUserWorktree was called.

T1 [Critical] / T4 [H1] agentType + schema silently dead-ended
  workflow-orchestrator.ts: schema-mode augmented config now (a)
  appends ToolNames.STRUCTURED_OUTPUT to baseConfig.tools when the
  allowlist is restricted (no '*' and doesn't already contain it), so
  prepareTools / getFunctionDeclarationsFiltered doesn't filter
  structured_output out of the subagent's surface; (b) preserves the
  resolved agentType's persona by APPENDING the schema-contract
  instruction block instead of replacing the systemPrompt outright.
  Replace remains only on the ephemeral no-agentType path where
  baseConfig.systemPrompt IS WORKFLOW_SUBAGENT_SYSTEM_PROMPT (schema
  variant is its strict superset; avoids two near-identical prompts).
  Tests: structured_output appears in the allowlist alongside the
  agentType's existing tools; persona prompt is contained in the
  effective systemPrompt.

T2 [Suggestion] / T5 [M1] Parent-abort listener leaked per schema call
  workflow-orchestrator.ts: named listener stored at outer scope,
  removed in the outer finally regardless of how the dispatch ended.
  Previous `{ once: true }` only auto-removed on actual parent abort;
  the happy-path schema dispatch — success capture / 3-failure abort
  fires the CHILD controller without the parent ever aborting — left
  the listener stuck on the per-run signal. With N schema calls per
  workflow N listeners + N child-controller closures accumulated.
  Test: 5 sequential schema dispatches over the same parent signal
  end with zero live listeners.

T6 [M2] Terminate mode misdiagnosed as nudge exhaustion
  workflow-orchestrator.ts: schema path now distinguishes
  terminateMode before attributing failure to schema mode. TIMEOUT /
  MAX_TURNS / ERROR throw the existing "did not complete (terminate
  mode: X)" message that the non-schema path uses. Only the actual
  schema-failure cases produce schema wording, and those are split:
  attempts > 2 keeps the upstream-verbatim "(after 2 in-conversation
  nudges)" wording; attempts === 0 throws an accurate "no validation
  attempt — model produced plain-text content" instead of misleadingly
  citing nudges that never happened. (The existing 0-call test was
  updated to match the new accurate message; the 3-failure test
  retains the verbatim wording.)
  Tests: parametric over TIMEOUT/MAX_TURNS/ERROR asserting "did not
  complete"; companion test pinning the verbatim wording to the
  3-failure path.

T3 [Suggestion] Schema-mode JSON revival sentinel — clarified
  workflow-sandbox.ts: added a block comment documenting that the
  JSON-round-trip + null-on-throw is a SECURITY backstop (errors-as-data
  convention from parallel/pipeline) rather than a contract path —
  unreachable in production schema mode because the host return is
  LLM tool_call args, always JSON-serializable. No behavior change.

Tests: 75/75 orchestrator + 111/111 sandbox/tool/limiter green.
typecheck + lint clean across packages/core and packages/cli.

R1+R2 self-review commits (e1c5ec7 / 62624a9) precede this commit
on the same branch — they predate wenshao's review and address
distinct findings; reviewer L1 (worktree-lifecycle unit coverage) is
already closed by R2's 11 worktree tests.
doudouOUC pushed a commit that referenced this pull request Jun 15, 2026
…'worktree'}) (#4721) (#5034)

* feat(core): Workflow P3 — agent({schema, agentType, model, isolation:'worktree'}) (#4721)

Adds the P3 dispatch options to the workflow runtime, completing the
contract qwen-code's workflow tool matches against upstream Claude Code
2.1.168. P1/P2 stubs (workflow-sandbox.ts:508-527) are replaced with
production paths routed through `SubagentManager.createAgentHeadless` so
per-call model overrides go through `buildRuntimeContentGeneratorView`
(provider routing), per-agent MCP servers / hooks get isolated
lifecycles, and worktree-isolated subagents run against a rebound Config.

- agent({agentType: 'X'}) resolves against the declarative-agents
  registry (#4842 + #4996) via findSubagentByName; unresolved names throw
  "agent({agentType}): agent type 'X' not found" verbatim from upstream.
- agent({model: 'qwen3-max'}) is threaded into SubagentConfig.model so
  the runtime view sees it (modelConfigOverrides alone would only swap
  the model name within the existing provider's view).
- Workflow's disallowed-tool floor [SendMessage, ExitPlanMode] is unioned
  with the agentType's own disallowedTools so a permissive agentType
  cannot re-enable them for a workflow subagent.
- agent({isolation: 'worktree'}) provisions a fresh worktree via
  GitWorktreeService.createUserWorktree (slug agent-<7hex>, mirrors
  AgentTool 1849-1963), rebinds cwd/getTargetDir/getFileService/
  getWorkspaceContext on a prototype-chained Config override, and on
  completion auto-removes the worktree if clean or preserves the path +
  branch (appended to the result string) when the subagent left changes.
  Parent-dirty trees are refused with a clear error to avoid silently
  running the subagent against a stale HEAD.
- agent({isolation: 'remote'}) throws "agent({isolation:'remote'}) is
  not available in this build" verbatim (upstream 2.1.168 parity).
- agent({schema: S}) injects a per-call SyntheticOutputTool (existing
  tools/syntheticOutput.ts, AJV-backed) into a fresh per-subagent
  ToolRegistry built via rebuildToolRegistryOnOverride, then watches
  AgentEventEmitter TOOL_CALL/TOOL_RESULT events for `structured_output`
  invocations. A successful call's args are captured as the dispatch
  return value (object, not string); after two failed attempts the
  third failure aborts the dispatch and throws "subagent completed
  without calling StructuredOutput (after 2 in-conversation nudges)"
  verbatim. No agent-core.ts changes — the entire 2-nudge counter
  lives in the dispatch layer so the shared subagent loop is unaffected.

The sandbox's agent() wrapper now revives per-call object returns into
the vm realm (JSON round-trip inside the vm runInContext block), closing
the same T1/T8/T14 host-prototype-escape vector that P2's per-element
revival closed for parallel/pipeline. Two new sandbox security tests
(constructor-chain probe + non-JSON-serializable collapse) regress this.

WorkflowAgentResult widens from `string` to `string | object`; the
fast-path (no agentType/model/isolation/schema) is preserved byte-for-byte
to keep P1/P2 zero-overhead.

Tests: 159 workflow-suite tests + 217 adjacent (subagents / syntheticOutput /
agent-override) all green. Real-LLM E2E follow-up planned (mirroring P2's
13/13 qwen3-max validation).

Related #4721 (parent design — multi-phase, not closed by this PR)
Related #4732 (P1 merged) #4947 (P2 merged) #4842 #4996 (declarative agents)

* chore(core): P3 self-review R1 — align worktree suffix wording + 6 test gaps

R1 of pre-push adversarial self-review on PR #5034 surfaced 6 confirmed
findings across 6 diverse lenses (correctness / security / reuse-altitude
/ self-invariant / consumer-breakage / test-gaps). Each finding faced 2
independent skeptics defaulting to refuted=true; 6 survived majority
challenge.

Source code:
- Worktree-preserved suffix wording now matches AgentTool's
  formatWorktreeSuffix (agent.ts:1700-1719) verbatim, including the
  `git worktree add <path> <branch>` recovery hint for the directory-
  removed-but-branch-preserved race.

Test gaps closed:
- schema-mode success after 1 nudge (round-2 args captured)
- schema-mode success after 2 nudges (round-3 args captured)
- schema-mode + agentType together — floor disallowedTools still unioned
- schema-mode caller-abort takes priority over the StructuredOutput
  terminal error (signal.aborted check at workflow-orchestrator.ts:489-490)
- override path dispose() runs in finally on the success path
- override path dispose() runs in finally on the terminate-mode-error path

Declined R1 finding: negative tests for invalid opt types (schema/model/
agentType passed null/number/empty-string). Adding upfront type
validation is scope creep — upstream does not, P1/P2 do not, and the
workflow tool is model-authored where these inputs are extremely
unlikely. Existing AJV / SubagentManager downstream errors are descriptive
enough. Will revisit if R2 makes a stronger case.

166/166 tests pass (workflow suite + adjacent + workflow-orchestrator).
typecheck + lint clean across packages/core, packages/cli,
integration-tests, sdk, webui.

* chore(core): P3 self-review R2 — vm-realm opts revive + error-msg sanitize + 12 tests

R2 of pre-push adversarial self-review on PR #5034. 6 diverse-lens
finders (60 agents, ~2.5M tokens, 24 min) over the R1-fix-applied
code, with 2 independent skeptics defaulting to refuted=true.
12 confirmed survivors after adversarial verify; decisions below.

Security (FIX):
- agent() wrapper in workflow-sandbox.ts now JSON-revives agentOpts
  inside the vm runInContext block BEFORE passing them to the host
  dispatch. Closes a Proxy/inherited-getter escape that P3 introduced
  along with the user-supplied schema object: a script could have
  wrapped agentOpts.schema in a Proxy whose getter ran host-side code
  during SyntheticOutputTool construction / AJV compile. Same
  mechanism as args / parallel-result revival.
- runOverridePath now sanitizes opts.agentType through
  sanitizeForErrorMessage() (control chars → space) before
  interpolation into the "agent type 'X' not found" error message.
  Prevents a model-authored agentType containing CRLF / NUL from
  fragmenting a single-line error across log records / OTLP fields.

Reuse-altitude (FIX):
- Added JSDoc block to WorkflowWorktreeIsolation interface
  documenting each field's role for cleanup.

Test gaps (FIX, 12 new tests):
- agentType control-char sanitization regression
- dispose() runs in finally when subagent.execute throws
- isolation:'worktree' provision error branches (5):
  nested parent / git unavailable / not a git repo / parent dirty /
  createUserWorktree returns failure
- isolation:'worktree' cleanup branches (3):
  removeUserWorktree fails / branchPreserved race / removeUserWorktree
  throws — each preserves the worktree (or branch) with the right
  user-facing suffix
- combinations (2): model + isolation:'worktree' threads model AND
  provisions worktree; schema + isolation:'worktree' returns
  structured payload verbatim (preserved suffix only on string return)

Test infrastructure: vi.mock'd GitWorktreeService at the module level
(partial mock; preserves the existing exports the unrelated
worktreeCleanup.ts depends on) with a per-test beforeEach reset.

Declined R2 findings (kept the R1 line):
- [major] Schema parameter upfront validation: same scope-creep
  decline as R1. Upstream doesn't do it; AJV's downstream error is
  descriptive enough.
- [major] Worktree provision extracted to shared util with AgentTool:
  agreed in principle but out of P3 scope. A separate refactor PR
  should land that with AgentTool maintainers in the loop.

178/178 tests pass (workflow + adjacent suites). typecheck + lint
clean across packages/core, packages/cli, integration-tests, sdk,
webui.

* fix(core): address wenshao R1+R2 review on Workflow P3 (PR #5034)

Round 1 (15:41) + Round 2 (17:24) review from wenshao surfaced 7 inline
findings across schema-mode dispatch correctness, worktree cleanup
coverage, and error attribution. Each fix is paired with a regression
test that was RED before the change landed.

T0 [Critical] Worktree leak when schema setup throws after provision
  workflow-orchestrator.ts: outer try MOVED to start immediately after
  provisionWorkflowWorktree. Previously the try opened only after
  createSchemaConfigOverride / createSchemaModeState / signal listener
  attachment — so any throw in those three (broken MCP server during
  the per-call ToolRegistry rebuild was the trigger wenshao cited)
  orphaned the just-provisioned worktree under .qwen/worktrees/.
  Test: "isolation:'worktree' + schema setup throws → worktree is
  still cleaned up" — simulates createToolRegistry failure during
  createSchemaConfigOverride; asserts removeUserWorktree was called.

T1 [Critical] / T4 [H1] agentType + schema silently dead-ended
  workflow-orchestrator.ts: schema-mode augmented config now (a)
  appends ToolNames.STRUCTURED_OUTPUT to baseConfig.tools when the
  allowlist is restricted (no '*' and doesn't already contain it), so
  prepareTools / getFunctionDeclarationsFiltered doesn't filter
  structured_output out of the subagent's surface; (b) preserves the
  resolved agentType's persona by APPENDING the schema-contract
  instruction block instead of replacing the systemPrompt outright.
  Replace remains only on the ephemeral no-agentType path where
  baseConfig.systemPrompt IS WORKFLOW_SUBAGENT_SYSTEM_PROMPT (schema
  variant is its strict superset; avoids two near-identical prompts).
  Tests: structured_output appears in the allowlist alongside the
  agentType's existing tools; persona prompt is contained in the
  effective systemPrompt.

T2 [Suggestion] / T5 [M1] Parent-abort listener leaked per schema call
  workflow-orchestrator.ts: named listener stored at outer scope,
  removed in the outer finally regardless of how the dispatch ended.
  Previous `{ once: true }` only auto-removed on actual parent abort;
  the happy-path schema dispatch — success capture / 3-failure abort
  fires the CHILD controller without the parent ever aborting — left
  the listener stuck on the per-run signal. With N schema calls per
  workflow N listeners + N child-controller closures accumulated.
  Test: 5 sequential schema dispatches over the same parent signal
  end with zero live listeners.

T6 [M2] Terminate mode misdiagnosed as nudge exhaustion
  workflow-orchestrator.ts: schema path now distinguishes
  terminateMode before attributing failure to schema mode. TIMEOUT /
  MAX_TURNS / ERROR throw the existing "did not complete (terminate
  mode: X)" message that the non-schema path uses. Only the actual
  schema-failure cases produce schema wording, and those are split:
  attempts > 2 keeps the upstream-verbatim "(after 2 in-conversation
  nudges)" wording; attempts === 0 throws an accurate "no validation
  attempt — model produced plain-text content" instead of misleadingly
  citing nudges that never happened. (The existing 0-call test was
  updated to match the new accurate message; the 3-failure test
  retains the verbatim wording.)
  Tests: parametric over TIMEOUT/MAX_TURNS/ERROR asserting "did not
  complete"; companion test pinning the verbatim wording to the
  3-failure path.

T3 [Suggestion] Schema-mode JSON revival sentinel — clarified
  workflow-sandbox.ts: added a block comment documenting that the
  JSON-round-trip + null-on-throw is a SECURITY backstop (errors-as-data
  convention from parallel/pipeline) rather than a contract path —
  unreachable in production schema mode because the host return is
  LLM tool_call args, always JSON-serializable. No behavior change.

Tests: 75/75 orchestrator + 111/111 sandbox/tool/limiter green.
typecheck + lint clean across packages/core and packages/cli.

R1+R2 self-review commits (e1c5ec7 / 62624a9) precede this commit
on the same branch — they predate wenshao's review and address
distinct findings; reviewer L1 (worktree-lifecycle unit coverage) is
already closed by R2's 11 worktree tests.
LaZzyMan added a commit that referenced this pull request Jun 16, 2026
…5094)

* feat(core): Workflow P4a — extractAndStripMeta + meta on RunOutcome (#4721)

First half of P4 (per the refined #4721 plan). Extracts the script's
`export const meta = {...}` declaration into a typed object so the
workflow tool's display payload, and the future /workflows command +
phase-tree UI, can read it without re-parsing the script source. The
other half of P4 (slash command + KIND_NAMES extension + phase-tree
UI + WorkflowTaskRegistry) is queued as a follow-up PR.

Architecture: reuse the P1 brace-walker (zero-dep, no parser deps) to
locate the meta object literal's source range, then evaluate the literal
inside a fresh `vm.createContext(Object.create(null))` — null-prototyped
globalThis, no host bridge (no `args` / `process` / `require` / workflow-
sandbox globals). The vm realm still exposes its OWN intrinsics
(`Object` / `Math` / `Date` / `JSON`), which is fine: meta extraction is
one-shot at tool invocation, not replayed on resume. validateMeta walks
the eval result field-by-field and copies into a fresh host-realm plain
object — no JSON round-trip needed because every contract field is a
primitive.

User-visible additions:
- `extractAndStripMeta(source)` exported from workflow-sandbox.ts
- `WorkflowMeta` interface (`{ name, description, whenToUse?, phases?: Array<{title, detail?, model?}> }`) — verbatim shape from upstream Claude Code 2.1.168
- `WorkflowSandbox.getMeta()` accessor alongside `getPhases()` / `getLogs()`
- `WorkflowRunOutcome.meta: WorkflowMeta | null` (non-breaking add)
- `WorkflowExecutionError.meta: WorkflowMeta | null` so the failure
  display shows the workflow's name / description / phases even when
  the script body throws
- `WorkflowTool.execute` adds `meta` to the returnDisplay payload when
  present (omitted when the script had no meta)

Error messages verbatim from upstream where applicable:
- `meta.name must be a non-empty string`
- `meta.description must be a non-empty string`

Refactor: P1's `stripExportMeta` is preserved as a thin wrapper around
a new `findMetaBlockBounds` helper that both old and new functions
share. All 86 existing sandbox tests pass unchanged (no behavior
regression in the strip path).

Tests:
- 11 new `extractAndStripMeta` unit tests covering happy path, optional
  fields, missing-required validation, malformed shape, vm-eval failure,
  null-prototype globalThis (no `args` / `process` / `require`), and
  unbalanced braces
- 3 new `createWorkflowSandbox.getMeta()` integration tests
- 3 new `WorkflowOrchestrator` outcome.meta tests (null path, parsed path,
  meta-survives-body-throw on the error path)
- 3 new `WorkflowTool` display payload tests (meta in payload, omitted
  when absent, present on failure path)

Suite: 207/207 workflow + adjacent regression green; typecheck +
lint clean on packages/core. (Pre-existing acp test type errors in
packages/cli are unrelated; CI will confirm.)

Related #4721 (parent design — multi-phase, not closed by this PR)
Related #4732 (P1) #4947 (P2) #5034 (P3) — all merged
P4b follow-up: /workflows command + TaskKind workflow union + BackgroundTasksPill KIND_NAMES + phase-tree UI + WorkflowTaskRegistry

* test(core): close P4a adversarial-review gaps + add real-LLM E2E (#4721)

After PR #5094 opened without an E2E run, ran a 3-lens adversarial
review (correctness / security / completeness) of the meta-extraction
assertion strength against extractAndStripMeta and the meta-on-outcome
threading path. All 3 reviewers refuted the claim that the existing
assertions catch realistic regressions. Triage:

- 18 of 24 findings already covered by workflow-sandbox.test.ts
  (string-with-brace, comments-inside-meta, phases[].model,
  missing-description error text, args/process/require unreachability,
  Promise/Math.constructor escape, etc.)
- 4 findings (regex literal / template literal / `/m` flag / spread)
  are host-side parse-path branches the brace walker handles
  structurally but without explicit negative tests
- 2 truly novel gaps closed here:

  1. HIGH × 3 lenses: a regression in validateMeta that returns the
     vm-realm `raw` value directly (skipping the host-realm copy at
     workflow-sandbox.ts:283-294) would re-open T1/T8/T14 realm
     escape via outcome.meta.constructor.constructor('return process')().
     Vitest toEqual is structural and does NOT check prototype
     identity, so every prior assertion in the suite would still pass.
     Add returned-meta + phases array + phase entries prototype-
     identity check in workflow-sandbox.test.ts; mirror end-to-end
     in the live test's scenario A.

  2. MEDIUM: meta-shaped result collision — if a script returns
     `{ name, description, phases }`, the safeStringifyDisplayPayload
     spread must keep `meta` and `result` distinct at the top level.
     Add a workflow.test.ts case that returns a meta-shaped object and
     asserts both display.meta and display.result hold their own
     distinct values.

Also add the real-LLM E2E harness at workflow-p4a-meta-live.live.test.ts
(6 scenarios: meta+agent, no-meta, malformed-meta short-circuit,
body-throw with meta preservation, parallel() fan-out with meta phases,
pipeline() multi-stage with meta phases). The suite is gated by
DASHSCOPE_API_KEY — describe.skip when absent, so CI without the
env shows 0 tests in this file rather than failing. Verified locally
6/6 against qwen3-coder-plus via DashScope OpenAI-compatible endpoint.

Final test count: 129/129 (89 sandbox + 34 tool + 6 live).

* fix(core): P4a meta-literal Promise crash + live-test typecheck + prettier (#4721)

Round 3 review fixes:

1. **(Critical, wenshao R1)** A Promise — typically from `import('node:fs')`
   inside a meta literal — used to crash the host process. `runInContext`
   evaluates the literal synchronously and returns; `validateMeta` drops
   the non-contract field silently; the workflow returns its result;
   THEN the dangling unhandled rejection terminates the process under
   Node's default `--unhandled-rejections=throw`, decoupled from the run
   that triggered it. Wenshao reproduced on Node 22.22 with:

       export const meta = { name:'x', description:'d', extra: import('node:fs') }
       return 1
       → run returns 1, process exits with code 1.

   Mitigation: after `vm.Script(...).runInContext(...)`, walk the eval
   result recursively, call `.catch(() => {})` on any thenable to mark
   the rejection handled, and throw an explicit
   "meta values must not be Promises" so the malformed meta is rejected
   before validation continues. Recursion covers `phases[]` entries
   embedding `import()` below the top level. Two RED-first regression
   tests in workflow-sandbox.test.ts (top-level + nested-in-phases).

2. **(Critical, wenshao R2/R3)** `tsc --noEmit` failed with 11 errors in
   the new live E2E test file, blocking CI Lint + all 3 Test jobs:

   - TS2459 (L36): `WorkflowAgentOpts` is exported from
     `workflow-sandbox.js`, not from `workflow-orchestrator.js` —
     fixed import path.
   - TS2322 (L108/165/220): typing `liveDispatch` as
     `WorkflowAgentDispatch` widens the return to `string | object`,
     which doesn't fit `lastText: string`. Dropped the type annotation;
     the inferred `Promise<string>` is still assignment-compatible with
     `WorkflowAgentDispatch` (string ⊂ string | object).
   - TS2345 (×6): `WorkflowRunRequest.args` is required (`args: unknown`,
     not optional). Added `args: undefined` to every `orch.run({ script })`
     call.

3. Prettier: `--write` on the 4 touched files. R2 also flagged this;
   pre-commit lint-staged would normally cover it but the live test
   file's TS errors short-circuited it.

Final local verification:
- `tsc --noEmit`: 0 errors
- 209/209 tests pass across workflow-sandbox + workflow-orchestrator +
  workflow.test.ts + live (6 scenarios against qwen3-coder-plus via DashScope)

* feat(core+cli): Workflow P4b — /workflows command + phase-tree UI + WorkflowRunRegistry (#4721)

P4b completes phase P4 of the Dynamic Workflows port. P4a (already on
this branch, commits 5b56c39 / 55c23a0 / 402df8f) locked the
meta contract: outcome.meta / err.meta / display payload. P4b adds the
consumer side — visible workflow runs in the TUI.

## Core (4 changes, 1 new file)

- `TaskKind` widened in `packages/core/src/agents/tasks/types.ts` from
  3 → 4 variants (adds `'workflow'`). `TaskState` union picks up
  `WorkflowTask` automatically.
- New `WorkflowRunRegistry` (`packages/core/src/agents/workflow-run-
  registry.ts`) — sibling of `BackgroundTaskRegistry` /
  `BackgroundShellRegistry` / `MonitorRegistry`. Same register / cancel /
  get / list / on('statusChange') shape; per-kind state holds runId,
  meta, current phase, phase history, dispatch counters, recent logs.
  Eviction: `MAX_RETAINED_TERMINAL_WORKFLOWS = 10` mirrors monitor cap.
- `Config.getWorkflowRunRegistry()` exposed via the same Object.create
  override pattern as the other registries.
- `WorkflowOrchestratorEmitter` interface added to workflow-sandbox.ts —
  fires `phaseStarted` (from sandbox safePhase), `agentDispatched` /
  `agentCompleted` (from orchestrator countedDispatch), and
  `logAppended` (from sandbox safeLog). Defensive try/catch around
  every emit so a subscriber error never bubbles into the script.
  Orchestrator accepts optional `runId` in WorkflowRunRequest so
  callers can pre-generate the id and register the run BEFORE run()
  resolves.
- `WorkflowTool` now registers the run with the registry at execute()
  start, wires the emitter to the registry's update methods + the
  tool's _updateOutput callback, flips `canUpdateOutput` to `true`
  for live phase-tree rendering, and routes terminals to
  registry.complete / fail / cancel (cancel on signal.aborted so
  user intent stays distinct from script bugs).
- `WorkflowRunRegistry` exported from core/index.ts.

## CLI (5 changes, 2 new files)

- `BackgroundTasksPill.tsx` `KIND_NAMES` gains `workflow:
  { singular, plural }`. Counts accumulator + sort order updated:
  `shell → agent → monitor → workflow → dream` (user-initiated
  before system-initiated).
- `useBackgroundTaskView.ts` subscribes to the workflow registry
  alongside the existing three; `entryId` switch adds
  `case 'workflow': return entry.runId`; cleanup unsubscribes.
- `BackgroundTasksDialog.tsx` adds `WorkflowDetailBody` (inline,
  matches MonitorDetailBody style) — renders workflow name,
  description, status, runtime, current phase, agent dispatch
  counts (M/N), the phase tree (capped at MAX_VISIBLE_PHASES=20
  with "+N more above"), and the log tail (capped at
  MAX_VISIBLE_LOG_LINES=10). rowLabel switch surfaces
  `[workflow] <name> · <phase> (M/N)`. DetailBody + statusVerb
  switches gain workflow cases.
- `BackgroundTaskViewContext.tsx` cancelSelected: `case 'workflow':
  registry.cancel(runId, Date.now())`. Idempotent with the
  WorkflowTool's signal.aborted catch path.
- `BackgroundTasksDialog.test.tsx` entryId mock gains workflow case.

## New /workflows slash command

- `packages/cli/src/ui/commands/workflowsCommand.ts` + tests.
  Bare `/workflows` lists active + completed runs (running first,
  then terminal by endTime DESC). `/workflows <runId>` opens a
  per-run detail dump (meta block, status, runtime, phase tree,
  recent logs, errors).
- Gated by `Config.isWorkflowsEnabled()` in BuiltinCommandLoader —
  command vanishes from typeahead when the flag is off. Already-
  defined env-var overrides (`QWEN_CODE_ENABLE_WORKFLOWS` opt-in,
  `QWEN_CODE_DISABLE_WORKFLOWS` kill switch) inherited for free.
- Interactive mode adds a "Tip: focus the Background tasks pill"
  redirect; non-interactive / acp modes omit the tip since they
  have no dialog.

## Scope deferrals

- **ACP daemon protocol widening** (acp-bridge bridgeTypes / status /
  tasksSnapshot) is deferred to a follow-up PR. Workflows remain
  invisible to SDK + web-shell consumers in P4b; the CLI-internal
  surface is complete.
- **Phase-tree token rollup** (per-phase token totals in the detail
  body) needs P5's budget tracker. The infrastructure (registry
  records, emitter fire sites) is ready for the column when P5 lands.
- **Save / inspect subcommands** (`/workflows save <runId>` to
  materialize a script) are future enhancements; the slash command
  ships with list + detail only.

## Verification

- 223/223 tests pass on the workflow surface — 14 new registry tests,
  6 real-LLM E2E scenarios against qwen3-coder-plus (DashScope),
  plus all existing P3/P4a tests continuing to pass with the new
  emitter wiring.
- 73/73 CLI tests pass across BackgroundTasksPill, BackgroundTasks
  Dialog, useBackgroundTaskView, workflowsCommand.
- `tsc --noEmit` clean (0 P4b errors in core + cli).
- `prettier --check` + `eslint` clean on all touched files.

* fix(core): bound rejectThenablesInMeta against cyclic meta input (#4721)

Round 4 review fix.

**(Suggestion, wenshao R4)** The R3 thenable walker recursed without a
cycle guard. A meta literal that builds a cyclic object via spread
overflows the call stack:

    export const meta = {
      name: 'x',
      description: 'y',
      ...(function () { const a = {}; a.self = a; return a; })(),
    }

vm-eval returns the cyclic object cleanly; `rejectThenablesInMeta`
walks `Object.values(...)` and recurses into `a.self === a` forever,
producing `RangeError: Maximum call stack size exceeded`. The
walker exists to reject Promises before they leave a dangling
rejection, but the walk itself must terminate on any shape vm-eval
can return — not just on the happy-path acyclic shape.

Fix: thread an optional `seen = new WeakSet<object>()` parameter,
early-return on `seen.has(value)`. Bounds the recursion against
both cycles AND shared subgraphs (where the same node is reached
through multiple keys), and keeps the walker O(N) on the eval'd
size.

Two RED-first regression tests:
- Direct self-reference via spread: `{ ...{self: itself} }`
- Cycle reached through nested arrays/objects: `{ ...{items: [{ref: outer}]} }`

Both previously threw RangeError; both now succeed (validateMeta
silently drops the non-contract `self` / `items` fields, so the
returned meta is just `{ name, description }` — only reachable if
the walker terminates first).

`validateMeta` does NOT recurse into nested objects (it walks the
top-level contract fields + iterates `phases[]` one level deep with
direct property access), so no sibling drift — only the thenable
walker needed the guard.

* test(cli): stub isWorkflowsEnabled in BuiltinCommandLoader mock config (#4721)

CI fix for c3f9d84 (Workflow P4b).

P4b added a gated `workflowsCommand` to BuiltinCommandLoader:

    this.config?.isWorkflowsEnabled() ? workflowsCommand : null,

The optional-chain only guards `config` being null/undefined — once
config is truthy, `.isWorkflowsEnabled()` invokes the method directly.
The existing `mockConfig` in BuiltinCommandLoader.test.ts stubbed
`isLspEnabled` / `getFolderTrust` / `getManagedAutoMemoryEnabled`
but never `isWorkflowsEnabled`, so the new call hit `undefined()` and
threw `TypeError: this.config?.isWorkflowsEnabled is not a function`.
10 tests (×3 OS) red on `e9ad07683` for this single reason.

Add `isWorkflowsEnabled: vi.fn().mockReturnValue(false)` to the mock,
matching the existing pattern. Default to `false` so the loader does
not add `workflowsCommand` to the assertions that count exact builtin
output — those tests are unchanged.

* test(core): cover P4b registry integration + orchestrator emitter (#4721)

Round 5 fix for the two Critical findings on the P4b commit.

## workflow.test.ts — registry integration seam (+3 tests)

\`fakeConfig()\` returns \`{}\`, so \`config.getWorkflowRunRegistry?.()\`
short-circuits to undefined in every existing test. The whole P4b
integration path inside \`WorkflowTool.execute()\` — \`register()\` on
start, the emitter closure firing into the registry, post-run
\`complete()\`, catch-arm \`fail()\` / \`cancel()\` branching — is
never exercised.

Add a \`configWithRegistry()\` helper that builds a config holding a
real \`WorkflowRunRegistry\` and returns the registry handle for
inspection. Three new tests pin:

- **success path**: registry entry transitions to \`completed\` with
  meta synthesised from \`meta.name\` (the tool fast-tracks
  description = meta.name when default = runId), correct phases
  array, agent counts \`1/1\`, script result mirrored, \`endTime\` set.
- **failure path**: registry entry transitions to \`failed\`, error
  message recorded verbatim, phases up to the throw preserved.
- **abort path**: pre-aborted signal causes the catch arm to record
  \`cancelled\` (not \`failed\`) so the dialog distinguishes
  user-initiated stops from script bugs.

## workflow-orchestrator.test.ts — emitter callbacks (+3 tests)

The \`emitter\` field on \`WorkflowRunRequest\` and its firing sites
(sandbox \`safePhase\` / \`safeLog\`, orchestrator \`countedDispatch\`
before + after) are the only channel keeping the registry record in
sync with the live run. Three new tests pin:

- **happy-path ordering**: with all four callbacks wired to an event
  log, the script \`phase('Plan') → log('starting') → agent →
  phase('Build') → agent\` emits seven events in expected order
  with expected payloads (\`label\` threaded through both
  \`agentDispatched\` and \`agentCompleted\`).
- **rejection path**: dispatch throwing \`dispatch-boom\` fires
  \`agentCompleted(label, 'dispatch-boom')\` — pins the symmetric
  emit-on-throw at workflow-orchestrator.ts:1124 catch arm.
- **defensive try/catch**: every callback throwing should be
  swallowed so orchestration still completes. Pins each emit site's
  try/catch wrapper individually.

Total: 6 new tests, all green. typecheck 0, prettier clean.

* fix(core+cli): R7 review fixes — 6 substantive + tmux re-verified (#4721)

Wenshao's R7 review (with real build + tmux verification on the
merged state) approved the PR but surfaced 6 valid findings. All
fixed, all RED-first tested, all verified end-to-end against
qwen3-coder-plus via DashScope.

## 1. Dialog-cancel drops accumulated logs (Critical)

`registry.setRecentLogs(...)` previously guarded `status === 'running'`
only. Dialog-initiated cancel marks `status='cancelled'` synchronously
BEFORE the tool's catch arm tries to write logs — so cancelled runs
always showed an empty Logs section in the dialog. Allow the write
after a `'cancelled'` transition too; keep `'completed'` and `'failed'`
as final-state rejects. 2 regression tests: cancel-then-logs-still-
writes, and complete/fail-still-reject.

## 2. WorkflowRunRegistry missing session-reset wiring (Critical)

Sibling drift miss from P4b: `BackgroundTaskRegistry` /
`BackgroundShellRegistry` / `MonitorRegistry` all exposed `reset()` +
`abortAll()`, and `backgroundWorkUtils` (`hasBlockingBackgroundWork`,
`resetBackgroundStateForSessionSwitch`) wired all three. `Workflow
RunRegistry` had neither. Result: `/clear` and session-resume ran
while a workflow was mid-run (orphaned dispatch loop), and terminal
rows leaked from session to session in the pill / dialog /
`/workflows` list.

Added `hasRunningEntries()`, `reset()` (drops entries, no controller
touch), `abortAll()` (cancels every running entry + aborts its
controller). Wired into both `backgroundWorkUtils` helpers + updated
their tests for the 4-sibling shape.

## 3. Phase dedup inconsistency — sandbox vs registry (Critical)

`phase('X'); phase('X')` previously yielded `outcome.phases = ['X','X']`
(sandbox `safePhase` unconditional push) but `entry.phases = ['X']`
(registry `onPhaseStarted` collapsed). The same run showed different
phase counts in the terminal `returnDisplay` JSON vs the live UI.
The `agent({phase})` wrapper already deduped (`__b.lastPhase()`); my
docstring on `safePhase` claimed it deduped too, but it didn't.

Fix at the sandbox layer (single source of truth): `safePhase` skips
when `phases[last] === t`. Registry-side dedup is now redundant but
harmless (defense in depth, doesn't double-collapse). Updated test
in workflow-sandbox.test.ts pins
`phase('X'); phase('X'); phase('Y'); phase('X')` → `['X','Y','X']`.

## 4. /workflows tip pointed at non-functional path (UX/docs)

`/workflows` tip text said *"focus the Background tasks pill in the
footer (use ↓ from an empty composer) and press Enter for the
interactive dialog with phase tree + live updates."* But
`setPillFocused(true)` doesn't exist anywhere in the codebase
(confirmed by wenshao's grep, and reproducible on my own tmux runs
where `↓ Enter` never opened the dialog — I previously misattributed
to a tmux limitation). The dialog IS reachable through other paths
but the tip's specific instructions are wrong.

Soften the tip to point at the actually-working text-mode detail
view: `Tip: use /workflows <runId> for the per-run detail view
(name, description, phase tree, recent logs).` — same information,
working instructions.

## 5. `runId` validation comment was aspirational (docs)

Comment on `workflow-orchestrator.ts` `run()` claimed *"validates
the shape (`wf_<hex>`)"* but the code is `const runId = req.runId
?? generateRunId();` — no validation. Caller (`WorkflowTool`) does
use the same `wf_<8hex>` generator as `generateRunId()`, so the
behavior is safe in practice. Fixed the comment to describe what
the code actually does (trusts the caller, no validation).

## 6. Duplicate extractAndStripMeta test (test hygiene)

Two tests at workflow-sandbox.test.ts:227 and :242 used identical
source `{ name: args.x, description: 'd' }` — copilot R1 originally
flagged this, I declined as bot finding, wenshao re-confirmed. The
intent was to pin two distinct things: (a) generic unknown
identifier throws, (b) the bridge global `args` specifically is
not reachable. Updated the first test to use `totallyUnknown` (a
genuine unknown name) and kept the second as the explicit `args`
regression — now the two tests pin different things.

## Verification

- 231/231 core workflow tests pass (registry + sandbox + orchestrator
  + tool) — +6 new from R7 RED-first regression tests
- 81/81 CLI ripple tests pass (pill + dialog + hook + command +
  backgroundWorkUtils)
- tsc 0 errors on core + cli (after rebuilding core dist for the
  new registry methods)
- prettier + eslint clean on all touched files
- **tmux re-verified end-to-end** against qwen3-coder-plus via
  DashScope on a fresh `npm run bundle`:
  - Phase dedup: `phase("Phase A"); phase("Phase A")` →
    `outcome.phases = ["Phase A", "Phase B"]` (was `["Phase A",
    "Phase A", "Phase B"]` pre-fix) — confirmed both in the tool
    result JSON AND in `/workflows wf_xxx · 2 phases`
  - New `/workflows` tip text rendered as expected, no more
    advertising broken pill focus path
  - `/workflows <runId>` detail dump still works: name, status,
    runtime, phases tree, agent counts

* test(core): R7 dialog-cancel race integration test (#4721)

The unit test in workflow-run-registry.test.ts pins the setRecentLogs
guard widening in isolation. This integration test stands up the full
production wiring — real WorkflowTool, real WorkflowRunRegistry, real
sandbox, real emitter — and reproduces the exact dialog-cancel race
that the R7 fix targets:

1. Start execute() with a controllable dispatch that hangs until
   externally rejected.
2. Wait for the run to register + the dispatch to be in flight + at
   least one log() call to have accumulated.
3. Simulate the dialog: call registry.cancel(runId) directly. This
   is the exact entry point cancelSelected() uses in
   BackgroundTaskViewContext.cancelSelected for kind='workflow'.
4. Cascade the dispatch rejection (the production path: the registry's
   abortController abort propagates through dispatchController → the
   orchestrator's limiter → the in-flight dispatch).
5. Await execute() — the tool's catch arm runs setRecentLogs with the
   accumulated logs.
6. Assert: status='cancelled' AND recentLogs contains the script's
   log('before agent dispatch') entry.

RED→GREEN verified: temporarily reverted the setRecentLogs guard to
the pre-R7 single-state form, the test failed with
`AssertionError: expected 0 to be greater than 0` (recentLogs was
empty because the guard rejected). Restored fix, test passes.

Production reachability note: the dialog itself is not currently
reachable through the TUI pill focus chain (setPillFocused(true)
does not exist anywhere in the codebase — wenshao R7 verification
finding #1, pre-existing infra gap out of P4 scope). This
integration test is the closest available real-scenario
verification for the fix without modifying out-of-scope code; the
test drives the EXACT registry.cancel + dispatch rejection +
catch-arm sequence the dialog would trigger.

* test(cli): stub getWorkflowRunRegistry in clearCommand + useResumeCommand mocks (#4721)

CI fix for b53bc4e (the R7 fix commit).

R7 (commit b53bc4e) wired WorkflowRunRegistry's new reset() /
abortAll() / hasRunningEntries() methods into backgroundWorkUtils.ts
(hasBlockingBackgroundWork + resetBackgroundStateForSessionSwitch).
clearCommand.ts and useResumeCommand.ts call both utils, but their
mock configs didn't stub getWorkflowRunRegistry — so once the util
started calling it, every clearCommand test threw
`TypeError: config.getWorkflowRunRegistry is not a function`. CI
ubuntu/macos/windows × 10 tests × clearCommand.test.ts went red.

This is the same sibling-drift miss as the BuiltinCommandLoader fix
in 2491911 (R7 pre-cursor): I added a new Config method, every
existing mock that ships a Config-shaped object goes stale until
stubbed.

Fix: add the same `getWorkflowRunRegistry` stub shape to:
- clearCommand.test.ts: 3 sites (default mock + non-interactive mock
  + blocked-background mock)
- useResumeCommand.test.ts: 4 sites (all 4 mock configs)

Stub shape mirrors the 3 sibling registries' interfaces exposed via
backgroundWorkUtils: `hasRunningEntries`, `reset`, `abortAll`.
Returns false / vi.fn() so default behavior matches "no workflow
running" + "reset is observable". Existing tests that exercise the
blocked-background path (hasUnfinalizedTasks: true) keep working
because workflow's hasRunningEntries=false still allows the
agent-side block to trigger.

Verification: 24/24 in clearCommand+useResumeCommand pass locally;
the 7-file P4 + impacted suite pass 105/105 (registries + commands +
hooks + dialog + pill + workflowsCommand + backgroundWorkUtils).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants