docs: v2 content audit - source-of-truth resync, tools catalogue, troubleshooting #35
…ubleshooting

Reviewed every user-facing page from the persona of an engineer who wants to create a zombie. Reconciled against zombiectl/src + src/errors/ + src/zombie/.

CLI surface fixed:
- workspace add takes optional [name], not <repo-url> (workspace.js:27-37)
- tenant provider section added then removed per BYOK descope
- credential add heading uses safe --data=@- form
- install --from now correctly described as read-and-upload, not write
- zombiectl events example uses concrete zmb_2041 + zombiectl list pointer

Quickstart:
- new "Create a workspace" step (was missing; first-run hit ERR_NO_WORKSPACE)
- skill install via curl from usezombie.sh/skills.md (replaces broken npx form)
- BYOK setup step removed; sign-in copy reflects platform-managed default
- op inject example for credential rotation with op signin caveat
- workflow_run filter visibility note linking authoring

Error codes resynced against error_registry.zig + error_entries.zig:
- removed retired UZ-ZMB-007 (now UZ-VAULT-002)
- corrected UZ-ZMB-010 title to "Zombie already stopped or killed"
- added UZ-ZMB-011 (SKILL.md/TRIGGER.md name disagree)
- added Vault section (UZ-VAULT-001/002)
- added Memory section (UZ-MEM-001/002/003)
- added Integration grants section (UZ-GRANT-001/002/003, all 403)
- fictional UZ-TRIGGER-001 in cli/flags replaced with real UZ-ZMB-008
- WORKSPACE_NOT_FOUND in api-reference/introduction → canonical UZ-WORKSPACE-001
- memory.mdx: UZ-MEM-001 corrected (it's scope-denied, not backend-down → UZ-MEM-003)
- Credentials section reframed as inference-path errors with cross-link to UZ-VAULT and UZ-ZMB-003 since BYOK setup is no longer documented

New pages:
- zombies/tools.mdx: 24 user-callable tools grouped into 9 categories, descriptions sourced from nullclaw tool_description constants. Authoring page's inline catalogue (which surfaced only 3 of 24) replaced with a pointer. Pushover dropped from catalogue (env vars not provisioned on hosted v2; tool silently no-ops for end users).
- zombies/troubleshooting.mdx: 6 symptom decision-tree mapping to wire codes and CLI diagnostics. Every command grounded in cli/zombiectl.mdx surface; no invented flags.

Authoring + webhooks:
- trigger.type table expanded: webhook | cron | api with use-when guidance. chain removed from user-facing surface (parser-accepted but no runtime dispatch; verified zero .chain => switch arms in worker/http source). Continuation note clarifies they're runtime-emitted, not declared.
- webhooks: natively normalized providers table (slack/github/linear) derived from webhook_verify.zig PROVIDER_REGISTRY. Generic-HMAC subsection references jira + agentmail (only providers with test fixtures in config_helpers.zig).

Workspace + zombie overview:
- workspace add corrected in 3 places to optional [name]
- mermaid case alignment in zombies/overview lifecycle diagram

DX additions:
- Mintlify feedback block enabled in docs.json (thumbsRating, suggestEdits, raiseIssue); readers can flag wrong content without leaving the site
- Discord link surfaced in index Explore grid + troubleshooting "When to escalate" with concrete fast-vs-durable framing

Out of scope (parked, not in this commit):
- Custom-zombie tutorial (no tested non-platform-ops example yet)
- Dashboard page (Mission Control referenced but not separately documented)
- Migration guide v1 → v2 (deferred to v3 milestone)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…--from

There's no dry-run: the call uploads on success. Reframed as "catch schema errors before committing" with the no-`--validate-only` caveat kept inline.

Addresses Greptile P2 on PR #35.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Greptile P1 caught two webhook codes cited in troubleshooting that weren't in
the registry. Both exist in src/errors/error_registry.zig:117-118 and
error_entries.zig:149-155 — they got missed in the original resync sweep.
Also fixed my own bug: troubleshooting listed UZ-WH-020 as HTTP 400, but
source says .unauthorized (401). Description now matches the runtime detail
string ("No webhook credential is configured for this zombie's source"),
and UZ-WH-030 size limit pinned to the actual 1 MiB cap from source.
Addresses Greptile P1 on PR #35.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
…rb (remove → delete)

Greptile flagged four leftover repo-binding statements in workspaces/managing.mdx and workspaces/overview.mdx that contradicted the new tenant-scoped-container model from the v2 audit sweep. While verifying against zombiectl/src/commands/workspace.js I also caught:

1. The CLI verb is `workspace delete` (workspace.js:182), NOT `workspace remove`. `remove` isn't even an alias, so five doc references would have errored at the user's terminal. Fixed across managing.mdx, overview.mdx, cli/zombiectl.mdx.
2. The actual `workspace list` output (workspace.js:90-101) has three columns: ACTIVE marker, WORKSPACE (id), NAME. The doc's sample showed REPO/STATUS/PLAN columns with `acme/backend`-style row content, which was completely fictional.
3. The actual `workspace show` fields (workspace.js:145-148) are workspace_id, active, name, created_at. The doc claimed "repo URL" and "default branch."

Changes:
- managing.mdx: list output rewritten to match real CLI columns; show + delete sections rewritten to match real fields; Warning explains the practical consequence (zombie_ids regenerate, upstream webhooks must be re-wired)
- overview.mdx: intro and "What a workspace contains" no longer mention GitHub App binding or repository connection; lifecycle step 4 says "Deleted" with the UZ-WH-001 consequence for stale upstream webhooks
- cli/zombiectl.mdx: command-groups table + section heading + example all switch from `workspace remove` to `workspace delete`; show fields corrected

Addresses Greptile findings 1-4 on PR #35.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Greptile round 3: all four workspace findings applied in

Source-of-truth pre-checks

Before fixing, I verified actual CLI behaviour against

Fixes applied (commit 2629540)

Finding #1: Replaced fictional
The user-facing --help table in zombiectl/src/program/io.js is curated and omits several dispatched commands. Switching the doc baseline from --help to the actual dispatcher (routes.js + commands/*.js) caught nine errors and three missing surfaces.

Wrong verbs (would fail at user's terminal):
- agent create → agent add (agent.js:16)
- grant revoke → grant delete (grant.js:22)

Removed (don't exist anywhere in source):
- workspace upgrade-scale (plan changes are dashboard-only)
- workspace billing (replaced by tenant-level `billing show`)
- "up" command was never a thing (already absent, table claim removed)
- --follow flag: neither logs nor events streams; was wishful

Added (exist in routes.js but were never documented):
- stop / resume / kill / delete zombie lifecycle as a four-row table with reversibility column. Killing without delete keeps the webhook URL reserved (returns same response shape but routes nowhere new); delete releases the URL and any upstream pointing at it starts hitting UZ-WH-001.
- billing show: tenant wallet balance + charge history, replaces the fictional `workspace billing` reference. Per-zombie spend stays via `logs --json` aggregated client-side.
- ZOMBIE_API_KEY env var (machine-bound service auth from agent add) + NO_COLOR: both real per io.js:64-67, missing from configuration.mdx.

Logs vs events split clarified:
- logs <zombie_id> is the simple tail in --help (--limit, --cursor, --json)
- events <zombie_id> is the power filter surface (--actor, --since, --cursor, --limit, --json). Wired in routes.js:23 and zombie.js:32 but intentionally omitted from the curated --help. Documented per user's source-of-truth-over-help directive, because the troubleshooting decision tree depends on actor filtering to disambiguate webhook failure shapes.

Files changed: cli/zombiectl.mdx (heavy rewrite of zombie + agent + grant + workspaces sections, new billing section), zombies/running.mdx (new lifecycle table + logs/events split), quickstart, webhooks, overview, managing, cli/install (events --follow → logs everywhere), billing/plans (drop upgrade-scale CLI), billing/budgets (workspace billing → billing show), cli/configuration (env var additions).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
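The stop / resume / kill / delete ladder described in this commit can be read as a small state machine. The following is a minimal sketch under stated assumptions, not the zombiectl or runtime implementation: the `State` enum, `TRANSITIONS` table, `apply`, and `webhook_url_reserved` are hypothetical names, and only the edges named in this PR are modeled, so the real runtime may accept transitions this sketch rejects.

```python
from enum import Enum


class State(Enum):
    ALIVE = "alive"
    STOPPED = "stopped"
    KILLED = "killed"
    DELETED = "deleted"


# Only the edges named in this PR are modeled; the real runtime may accept more.
TRANSITIONS = {
    (State.ALIVE, "stop"): State.STOPPED,     # reversible via resume
    (State.STOPPED, "resume"): State.ALIVE,
    (State.ALIVE, "kill"): State.KILLED,      # terminal, but the URL stays reserved
    (State.KILLED, "delete"): State.DELETED,  # releases the webhook URL
}


def apply(state: State, command: str) -> State:
    """Return the next state, or raise if this sketch has no such edge."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(f"{command!r} not modeled from {state.value!r}") from None


def webhook_url_reserved(state: State) -> bool:
    # Stated in the commit for killed vs deleted; assumed for alive and stopped.
    return state is not State.DELETED
```

Keeping the edges in one table makes the reversibility column easy to audit: `stop` has an inverse edge (`resume`), while `kill` and `delete` do not.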
Round 4 — CLI surface resync against the dispatcher source-of-truth ( The user-facing Wrong verbs (would have errored at the user's terminal)
Removed (don't exist anywhere in source)
Added (exist in
…install rejection

Greptile P1 surfaced a deeper bug in my own original claim. tools.mdx Validation section had asserted that `zombiectl install --from <path>` rejects unknown tool names with UZ-TOOL-005. Source check (tool_bridge.zig:178-184) shows install does not reject: unknown names get logged at warn level and silently skipped during tool building. UZ-TOOL-005 is registered as .bad_request in error_entries.zig:165 but no code path emits it as an HTTP response.

Net: UZ-TOOL-005 is a server-side observability signal, not a wire code returned to a webhook caller, and definitely not something a user will see during install.

tools.mdx Validation section rewritten to describe the actual behaviour: typos pass install, surface as silent skips at runtime, audit via zombiectl logs. Calls out UZ-TOOL-004 (Tool not attached) as the runtime guard the agent actually hits when it tries to invoke a missing tool.

troubleshooting.mdx §3 wire-code table replaces UZ-TOOL-005 with UZ-TOOL-004 (the real runtime surface), plus a Note explaining that typos surface as log warnings rather than wire codes, with a pointer to the Validation section.

This resolves Greptile's "stale section placement" finding by fixing the underlying mis-claim that produced it: UZ-TOOL-005 isn't an install-time error, so it never belonged in §1, and isn't a wire-stage error, so it doesn't belong in §3 either.

Addresses Greptile P1 #r3190015354 on PR #35.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
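The warn-and-skip behaviour this commit describes can be illustrated with a short sketch. It is written in Python for brevity and is an assumption-laden illustration, not a port of tool_bridge.zig: `build_tools`, the `registry` mapping, and the logger name are all hypothetical.

```python
import logging

logger = logging.getLogger("tool_build_sketch")  # hypothetical logger name


def build_tools(declared, registry):
    """Build the tools a zombie declares, skipping names the registry lacks.

    Returns (built, skipped). Nothing is raised for an unknown name, which is
    why a typo passes install and only shows up later in the logs.
    """
    built, skipped = {}, []
    for name in declared:
        factory = registry.get(name)
        if factory is None:
            # Warn-level log only; no error is surfaced to the caller.
            logger.warning("unknown tool %r skipped", name)
            skipped.append(name)
            continue
        built[name] = factory()
    return built, skipped
```

With a registry that knows only `http_request`, declaring `["http_request", "web_fetsh"]` yields one built tool and one skipped name, with no exception at any point.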
Greptile R6 P1+P1. Both findings valid; verifying against source surfaced deeper bugs.

Issue 1 (network.allow per tool):
- http_request: gated by network.allow: (PolicyHttpRequestTool wraps and enforces via tool_builders.zig:184-200)
- web_search: platform-managed provider chain (searxng/duckduckgo/etc), bypasses network.allow:
- web_fetch: built with empty allowed_domains in tool_builders.zig:225, which NullClaw treats as allow-all (web_fetch.zig:23). Currently unconstrained on hosted v2; known limitation.

Network section table now has a dedicated "network.allow: enforced?" column with honest answers per row, plus a Note that calls out the three different stories.

Issue 2 (browser tool overlap):
- browser action 'screenshot' explicitly redirects to standalone tool (browser.zig:42)
- browser actions 'click'/'type'/'scroll' fail with "CDP not available" (browser.zig:47-52)
- browser action 'open' shells out to OS browser, which is useless on a headless cloud executor
- browser_open: empty allowed_domains in tool_builders.zig:252, browser_open.zig:50-51 fails closed on empty. Non-functional on hosted v2 today (same shape as Pushover earlier).
- screenshot: captures executor display, which doesn't exist on hosted v2 cloud workers

Browser section rewrites the table to: (a) acknowledge the three tools are independent (declaring 'browser' doesn't unlock 'screenshot' or 'browser_open'), (b) add an "actually works on hosted v2" column documenting the headless-executor limitations honestly, (c) add a Warning explaining the declaration semantics so users don't pick the wrong set.

Addresses Greptile #r3190015354 follow-up findings on PR #35.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Round 6 — both Issue 1 —
| Tool | Builder (tool_builders.zig) | Effective behaviour |
|---|---|---|
| `http_request` | Lines 184-200: wrapped in `PolicyHttpRequestTool`, which enforces `policy.network_policy.allow` | Gated by `network.allow:` |
| `web_search` | Lines 210-219: uses platform-managed provider chain (searxng/duckduckgo/brave/firecrawl/tavily/perplexity/exa/jina), no allowlist field exposed | Bypasses `network.allow:` entirely |
| `web_fetch` | Line 225: built with `allowed_domains = &.{}` (empty), and NullClaw's web_fetch.zig:23 documents `// empty = allow all` | Currently unconstrained on hosted v2 |

web_fetch's lack of host filter is a real gotcha. An agent declaring web_fetch can fetch arbitrary HTTPS URLs regardless of network.allow:. The Network table now has a dedicated "network.allow: enforced?" column with the honest answer per row, plus a Note that calls out the three different stories.
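To make the difference between the gated and unconstrained cases concrete, here is a minimal sketch of the two allowlist semantics. It is illustrative only: `host_matches`, `http_request_allowed`, and `web_fetch_allowed` are hypothetical names, and the exact-or-subdomain matching rule is an assumption rather than the behaviour of `PolicyHttpRequestTool` or NullClaw. `web_search` is not modeled because it exposes no allowlist field at all.

```python
def host_matches(host, domains):
    # Assumption: exact match or subdomain match; the real matcher may differ.
    return any(host == d or host.endswith("." + d) for d in domains)


def http_request_allowed(host, network_allow):
    # Gated: the user's network.allow: list is the only source of permission.
    return host_matches(host, network_allow)


def web_fetch_allowed(host, allowed_domains=()):
    # Empty list means allow-all, and the hosted builder passes an empty list.
    return not allowed_domains or host_matches(host, allowed_domains)
```

Under these assumptions, a host missing from `network.allow:` is refused by `http_request_allowed` but accepted by `web_fetch_allowed` when called with the empty default, which is the gotcha described above.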
Issue 2 — Browser tools
Greptile spotted the overlap, but the deeper finding is that all three browser tools are partially or fully non-functional on hosted v2:
- `browser`: schema advertises six actions (`open`, `screenshot`, `click`, `type`, `scroll`, `read`). Source check (browser.zig:37-56):
  - `screenshot` action: explicitly returns "Use the screenshot tool instead", redirecting to the standalone tool
  - `click`/`type`/`scroll`: always fail with "requires CDP (Chrome DevTools Protocol) which is not available"
  - `open`: shells out to `open`/`xdg-open`/`cmd.exe start`, which needs a display that a headless cloud executor doesn't have
  - `read`: works (curl-based, capped at 8 KB), but `web_fetch` is strictly better
- `screenshot`: captures executor's screen. The hosted v2 executor is headless; there's no display.
- `browser_open`: non-functional on hosted v2. Same shape as Pushover from earlier. Platform builder passes empty `allowed_domains` (tool_builders.zig:252). The underlying tool fails closed on empty (browser_open.zig:50-51: "No allowed_domains configured for browser_open"). The tool is effectively unreachable until the platform wires the user's `network.allow:` through to it.

The Browser section now has an "actually works on hosted v2" column for each tool, plus a Warning making the declaration semantics explicit:
> Declaring `browser` does NOT unlock `screenshot` or `browser_open` capabilities. Each tool is independent; the sub-action names overlap by accident.

Decisions for you
Two of these tools have zero working code path for end users on hosted v2:
- `browser_open`: same shape as Pushover (which we already dropped). Empty allowlist, fails closed, no path for users to fix it. Recommend dropping from the catalogue until the platform wires the allowlist through.
- `screenshot`: needs a display the cloud executor doesn't have. Recommend dropping unless self-hosting is on the near-term roadmap.

browser has one working path (read) that's strictly inferior to web_fetch. Could go either way — keep with the warning, or drop the whole section.
For now I've documented the limitations rather than removing the tools, so a curious reader gets the honest story. Tell me whether to drop any of these like we did with Pushover.
Pattern note
Round 6 is the second time a Greptile-flagged surface review has uncovered a deeper "tool builder passes empty allowlist; underlying tool fails closed" bug (first was Pushover env vars, now browser_open). Worth a runtime fix: the executor should either pass the user's network.allow: through to all network-aware tools, or refuse to dispatch tools whose builder produces a guaranteed-fail configuration. Source-side issue, not docs.
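One way to picture the second option (refusing to dispatch tools whose builder output can never succeed) is a preflight check over the built tool set. This is a hedged sketch of the idea only, not a proposal for the executor's actual API: `BuiltTool`, its `fails_closed_on_empty` field, `guaranteed_to_fail`, and `partition` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuiltTool:
    name: str
    allowed_domains: tuple
    fails_closed_on_empty: bool  # True for browser_open-style tools


def guaranteed_to_fail(tool: BuiltTool) -> bool:
    # A tool that rejects every call when its allowlist is empty, and was
    # built with an empty allowlist, has no working code path.
    return tool.fails_closed_on_empty and not tool.allowed_domains


def partition(tools):
    """Split built tools into (dispatchable, refused) before any dispatch."""
    ok = [t for t in tools if not guaranteed_to_fail(t)]
    refused = [t for t in tools if guaranteed_to_fail(t)]
    return ok, refused
```

A check like this would flag the `browser_open` configuration at build time. It would not catch the Pushover case, which fails for a different reason (unprovisioned env vars), so a real guard would need a broader notion of "guaranteed to fail" than the allowlist alone.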
Same shape as the earlier Pushover removal. The platform builder passes empty allowed_domains (tool_builders.zig:252) and the underlying tool fails closed on empty (browser_open.zig:50-51 returns "No allowed_domains configured for browser_open"). Zero working code path for end users today; advertising it in the catalogue is misleading.

Removed the row from zombies/tools.mdx Browser section and updated the Warning to drop browser_open from the declaration-semantics callout. Tool count: 24 → 23 in tools.mdx and authoring.mdx.

If/when the platform wires the user's network.allow: through to browser_open's allowed_domains, the row can come back.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Decision applied: dropped

Changes:

Source-side note for whoever owns the executor: two tools now silently dropped from the user-facing catalogue solely because the builder produces a guaranteed-fail configuration. The Pushover case (no env var provisioning) and
…veat

Greptile R7 P1: I cited UZ-TOOL-004 in tools.mdx and troubleshooting.mdx without adding it to the registry. Same pattern as the earlier UZ-WH-020/030 finding: body prose references that don't resolve in error-codes.mdx.

Source check turned up the same issue I hit with UZ-TOOL-005: registered in error_registry.zig:123 + error_entries.zig:163 (.bad_request, 400), but no code path emits it as a wire response. Grep returns only the registry/entry definitions. The user-facing failure for "agent tried to call a tool not in tools:" is a NullClaw "no such tool" message in the activity stream, not a UZ-TOOL-004 wire code.

Three fixes:

1. error-codes.mdx Tool section: add UZ-TOOL-004 row alongside UZ-TOOL-005, plus a Note explaining both codes are registered for meaning but user-facing failures land in the activity stream rather than as wire responses today. Treats the registry as authoritative for semantics while being honest about emission surface.
2. troubleshooting.mdx §3 wire-code table: drop the UZ-TOOL-004 row (it's not a wire code today). Expand the existing UZ-TOOL-005 Note to cover both codes: typos and missing-tool invocations both surface in the activity stream, not as wire responses.
3. tools.mdx Validation: drop the false "runtime guard the agent actually hits" claim about UZ-TOOL-004. Reframe honestly as "underlying agent runtime returns no-such-tool error in-prompt" with a pointer to the error-codes registry.

Pattern: this is the third round catching "registered codes that aren't emitted." The pattern's worth a runtime-side note: the executor team might want to either start emitting these as wire codes (so the registry matches user-facing reality) or mark them as internal-only.

Addresses Greptile #r3190187792 on PR #35.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Reviewed every user-facing page from the persona of an engineer who wants to create a zombie. Reconciled docs against `zombiectl/src/` + `src/errors/` + `src/zombie/` source-of-truth. Added the two missing-but-needed pages (tools catalogue, troubleshooting). Surfaced the existing Discord channel and enabled per-page feedback.

DX scorecard: ~4/10 → ~7/10 across the eight DX dimensions. Biggest jumps: error-message correctness (3→8; the prior state cited fictional codes like `UZ-TRIGGER-001` and used inconsistent schemes like `WORKSPACE_NOT_FOUND`) and DX measurement (2→7; Mintlify feedback widgets enabled).
What changed
CLI surface alignment with `zombiectl/src/commands/*.js`
Quickstart: new "Create a workspace" step (without it, first-run hit `ERR_NO_WORKSPACE` with no diagnostic context); skill install via `curl https://usezombie.sh/skills.md` (replaces broken `npx skills add` form); BYOK setup removed per descope; `op inject` example for credential rotation with the silent-failure caveat.
Error codes resynced against `src/errors/error_registry.zig` + `error_entries.zig` + `error_entries_runtime.zig`
New pages
Authoring + webhooks
DX additions
What's deliberately not in this PR
Test plan
🤖 Generated with Claude Code
Greptile Summary
This PR performs a comprehensive v2 documentation audit across 21 files, reconciling every user-facing page against the `zombiectl` source, error registry, and tool catalogue. Error codes are resynced from `src/errors/`, CLI commands are corrected to match `zombiectl/src/commands/*.js`, and two new pages (`zombies/tools.mdx`, `zombies/troubleshooting.mdx`) fill the most significant documentation gaps.

- `UZ-WH-020/030`, `UZ-TOOL-004`, `UZ-ZMB-011`, `UZ-GRANT-001-003`, `UZ-VAULT-001-002`, `UZ-MEM-001-003`; retires `UZ-ZMB-007`; replaces fictional `UZ-TRIGGER-001` and `WORKSPACE_NOT_FOUND` with canonical codes throughout.
- `workspace add [name]` (not `<repo-url>`), `workspace delete` (not `remove`), `agent add` (not `create`), `grant delete` (not `revoke`), `billing show` introduced; stop/resume/kill/delete irreversibility ladder documented; `--follow` flag removal noted with polling workaround.

Confidence Score: 4/5

Safe to merge with one fix: `UZ-VAULT-001` in the §1 install-rejected table points users at an error code that `zombiectl install --from` never emits, leaving them unable to match the row during a real install failure.

The `UZ-VAULT-001` row in `zombies/troubleshooting.mdx` §1 is framed as a wire code from `zombiectl install --from --json` output, but the inline note explicitly states it comes from a prior `credential add` call rather than from install. A user debugging a failed install who searches their JSON output for a matching code will not find this code and will never land on the row, defeating the diagnostic purpose of the decision tree.

zombies/troubleshooting.mdx (§1 install-rejected table) and zombies/tools.mdx (terminology).
Important Files Changed
`UZ-VAULT-001` is listed in the §1 install-rejected table but the inline note says it's not returned by `install`, making that row unreachable via the described diagnostic path.

`npx skills add` with curl install, removes BYOK step, adds op inject credential rotation example with silent-failure caveat.

Flowchart

    %%{init: {'theme': 'neutral'}}%%
    flowchart TD
        Install["zombiectl install --from path"] --> Alive["Alive\nwebhook live"]
        Alive -->|event arrives| Processing["Processing\nagent reasoning"]
        Processing -->|stage exits| Alive
        Alive -->|zombiectl stop| Stopped["Stopped\npausable"]
        Stopped -->|zombiectl resume| Alive
        Alive -->|zombiectl kill| Killed["Killed\nterminal - row/history/URL persist"]
        Killed -->|zombiectl delete| Deleted["Deleted\nrow + URL gone → UZ-WH-001"]
        Killed -->|zombiectl install --from path| Alive

Comments Outside Diff (5)
workspaces/overview.mdx, line 6-16

This PR updated lifecycle step 1 (line 36) to say workspaces are "not bound to a specific GitHub repo at creation time," but the page intro (line 6: "It binds a GitHub repository to usezombie via the usezombie GitHub App") and the "What a workspace contains" list (line 12: "Repository connection: a 1:1 binding to a GitHub repo via the usezombie GitHub App") still describe the old GitHub-App-binding model. Step 4 of the lifecycle (line 39) also retains "GitHub App connection uninstalled for that repo." A reader of this page now gets three contradictory descriptions of what a workspace is.

workspaces/managing.mdx, line 28-35

`workspace list` output still shows a REPO column after the no-repo-binding change

The "Add a workspace" section on this same page was updated in this PR to say "There is no repo URL; workspaces are tenant-scoped containers," yet the `workspace list` output table (lines 28-35) still shows a `REPO` column populated with `acme/backend`, `acme/frontend`, and `acme/infra`, and the prose above it says the command "Displays … connected repository." A user reading straight down the page gets a direct contradiction: workspaces have no repo URL, but `workspace list` shows one. The sample output and the description need to be updated to match the new model (e.g. replace the `REPO` column with `NAME` or drop it, and update the description to remove "connected repository").

workspaces/managing.mdx, line 28-35

`workspace list` description and sample output still reference the old repo-binding model

Line 28 says the command displays "status, connected repository, and plan tier," and the sample output block (lines 30-35) still shows a `REPO` column populated with `acme/backend`, `acme/frontend`, and `acme/infra`. This directly contradicts the updated text on line 12 of this same file ("There is no repo URL; workspaces are tenant-scoped containers"). A reader who follows the page top-to-bottom gets three statements in conflict. The description and sample output need to drop the `REPO` column (e.g. replace with `NAME`) and remove the "connected repository" wording.

workspaces/managing.mdx, line 59-78

`workspace show` and `Remove` sections still describe the old repo-binding / GitHub App model

`workspace show` (line 59) says it prints "repo URL" and "default branch"; neither concept exists in the new model where workspaces are not bound to a repo at creation. The `Remove` section (line 74) says "Uninstalls the GitHub App connection for that repository," and the Warning (line 77) says "reinstall the GitHub App to reconnect"; both are leftovers from the old flow. A user who hits a removal error and reads this for guidance will take the wrong remediation path.

workspaces/overview.mdx, line 6-16

Line 6 says a workspace "binds a GitHub repository to usezombie via the usezombie GitHub App." Line 12 lists "Repository connection: a 1:1 binding to a GitHub repo via the usezombie GitHub App" as a first-class attribute of a workspace. Both statements contradict lifecycle step 1 (line 36 of this file), which was updated in this PR to say workspaces are not bound to a specific GitHub repo at creation time. The intro sentence and the "What a workspace contains" bullet need to be updated to reflect the new model: workspace as a tenant-scoped container, with repo webhooks wired later per-zombie.
Reviews (7): Last reviewed commit: "docs(error-codes): add UZ-TOOL-004, docu..."