Lifecycle orchestration for Claude Code — sets the rules, Superpowers executes.
12 skills that add lifecycle management, strategic decisions, and quality gates
on top of built-in Superpowers. Drop a config file, get project-aware workflows.
Claude Code's built-in Superpowers handles tactics well — TDD, debugging, planning, code review. But it lacks:
- Lifecycle management — no task sizing, no project tracking, no session start/end discipline
- Strategic decisions — no multi-perspective debate, no dual-agent research
- Quality gates — no unified pipeline enforcing test → scan → completeness → review
- Self-improvement — no knowledge capture, no retrospectives, no "what's next?" recommendations
- Project-specific context — no mechanism to inject your project's tools, conventions, and known gotchas
Autopilot adds lifecycle orchestration and strategic intelligence on top of Superpowers:
| Skill | What It Does | Superpowers Handles |
|---|---|---|
| dev-flow | Sizes tasks (S/L/H), sets session rules for config injection and quality gates, manages project tracking | Planning, TDD, debugging, code review (tactical execution) |
| survey | Dual-agent research (researcher + skeptic) | — (no equivalent) |
| think-tank | 6-role debate for strategic decisions | brainstorming (different level — requirements exploration) |
| think-tank-dialectic | Hegelian dialectic for irreversible / high-stakes decisions with LOW consensus. 4 functional roles + 2 adversarial roles (Popper falsifier + Munger inverter). NOT a "better think-tank": a different tool for a different situation | — (no equivalent) |
| ceo-agent | Autonomous execution with CEO-level judgment | — (no equivalent) |
| quality-pipeline | Unified quality gate: test → scan → completeness → review | verification-before-completion (partial) |
| finish-flow | Size-aware closing forcing function — TaskCreates discrete L-5 / H-9 / Fix / S-Lite sub-tasks so nothing gets silently compressed | — (no equivalent) |
| project-lifecycle | Plan → bootstrap → structure → archive | finishing-a-development-branch (partial) |
| learn | Auto-records knowledge from failures; knowledge health audit | — (no equivalent) |
| retro | Engineering retrospective from git history | — (no equivalent) |
| next | Scan all work sources, recommend highest-priority task | — (no equivalent) |
| audit | Systematic comparison between implementations | — (no equivalent) |
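
The quality-pipeline row fixes a stage order (test → scan → completeness → review). A minimal Python sketch of such an ordered gate follows. The `run_quality_gate` name, the stage callables, and the stop-at-first-failure behavior are assumptions for illustration, not the plugin's implementation.

```python
from typing import Callable

# Hypothetical shape: each stage is a (name, check) pair where check() returns True on success.
Stage = tuple[str, Callable[[], bool]]


def run_quality_gate(stages: list[Stage]) -> tuple[bool, list[str]]:
    """Run stages in order and stop at the first failing stage (assumed behavior)."""
    log: list[str] = []
    for name, check in stages:
        passed = check()
        log.append(f"{name}: {'pass' if passed else 'FAIL'}")
        if not passed:
            # Later stages are skipped so a failure is never masked by a later pass.
            return False, log
    return True, log


if __name__ == "__main__":
    demo = [
        ("test", lambda: True),
        ("scan", lambda: True),
        ("completeness", lambda: True),
        ("review", lambda: True),
    ]
    print(run_quality_gate(demo))
```

In a real project each callable would wrap the project's own test, scan, and review commands.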
Autopilot doesn't compete with Superpowers — it sets the rules that Superpowers operates within:
autopilot:dev-flow sets session rules:
→ "When debugging, read .claude/debug-config.md for project context"
→ "Before committing, run autopilot:quality-pipeline"
→ "On session end, update project tracking"
superpowers skills execute within those rules:
→ systematic-debugging (with project config in context)
→ test-driven-development (with project test conventions)
→ writing-plans (within dev-flow's sizing framework)
Autopilot provides three distinct cognitive modes for different situations:
dev-flow — Guided Development (default)
The entry point for all development tasks. Evaluates task size and routes accordingly:
You: "Add WebSocket compression"
Claude (with dev-flow):
1. Size assessment: L (crosses network + protocol + client modules)
2. → Creates plan, project directory, feature branch
3. → Implements phase by phase, quality gate each phase
4. → Archives project on completion
Claude (without dev-flow):
→ Starts grep-ing the codebase immediately
→ No plan, no phases, no quality gates
dev-flow also handles the session lifecycle — health checks on startup, knowledge review, goal alignment on context continuation. You don't invoke these separately; dev-flow absorbs them.
ceo-agent — Autonomous Execution
When you want outcomes, not involvement. The agent becomes CEO; you become the Board.
You: "CEO mode. Handle the reconnect system. Level 3, you decide everything."
CEO startup:
1. OKR — concrete success criteria (not vague "make it work")
2. Involvement level — how often to report (every step / phase / just results)
3. Scope mode — Expand / Selective / Hold / Reduce
4. No-go zones — what's absolutely off-limits
Then: autonomous execution within DOA (Delegation of Authority)
The CEO agent applies 10 cognitive patterns from great CEOs (Bezos's two-way doors, Munger's inversion reflex, Jobs's focus as subtraction) and follows the Boil the Lake principle — AI makes completeness nearly free, so always choose the complete implementation over shortcuts.
CEO cannot self-audit. Like corporate governance, quality-pipeline and code-review run independently.
think-tank — Multi-Perspective Debate
For strategic decisions where a single perspective isn't enough. 6 roles debate in parallel:
You: "Should we rewrite the auth system or patch it?"
Think Tank assembles:
- CTO (technical feasibility)
- Product Director (user impact)
- QA Lead (risk assessment)
- Security Architect (threat model)
- Customer Advocate (user experience)
- Operations (deployment/maintenance)
Output: Decision Brief with consensus, dissenting views, and recommendation
user task
    │
    ▼
dev-flow ───────────────────────────────────────────────┐
    │                                                   │
    ├─ S (small): implement → quality-pipeline → commit │
    │                                                   │
    └─ L (large): project-lifecycle (bootstrap)         │
    │      → per-phase implement                        │
    │      → quality-pipeline per phase                 │
    │      → project-lifecycle (archive)                │
    │                                                   │
    ├─ needs research? ──→ survey                       │
    ├─ strategic decision? ──→ think-tank               │
    ├─ user says "handle it"? ──→ ceo-agent             │
    │                                                   │
    └─ session end ──→ learn (capture knowledge)        │
                       retro (periodic review)          │
                                                        │
    what's next? ──→ next (scan → rank → recommend)     │
                                                        │
◄───────────────────────────────────────────────────────┘
| Decision Type | Use This |
|---|---|
| Technical choice (X library vs Y) | survey — external research with dual perspective |
| Strategic choice (should we? how big? what first?) | think-tank — internal multi-role debate |
| User wants outcome, not involvement | ceo-agent — autonomous execution |
| User wants to participate | dev-flow — guided workflow with checkpoints |
/plugin marketplace add cookys/autopilot
/plugin install autopilot@autopilot

That's it. All 12 skills are available immediately as autopilot:dev-flow, autopilot:survey, etc.
Skills work out of the box with sensible defaults. For project-specific behavior, drop a markdown file into your project's .claude/ directory. The skill reads it at invocation time via Claude Code's !`command` preprocessor.
┌─────────────────────────────────┐
│ Plugin (shared, read-only)      │    Autopilot skills live here.
│ ~/.claude/plugins/cache/        │    Same for all projects.
│   autopilot/skills/dev-flow/    │
│           └── SKILL.md ─────────┼──┐
└─────────────────────────────────┘  │
                                     │ At invocation, SKILL.md runs:
                                     │ !`cat .claude/dev-flow-config.md`
                                     │
┌─────────────────────────────────┐  │
│ Your Project (per-repo)         │  │
│ my-project/.claude/             │◄─┘ Reads from YOUR project's
│   ├── dev-flow-config.md        │    .claude/ directory
│   ├── quality-gate-config.md    │
│   ├── skill-routing.md          │    These files are plain markdown.
│   └── team-config.md            │    No schema. No YAML. Natural language.
└─────────────────────────────────┘
The !`command` syntax is a Claude Code preprocessor: it runs a shell command and inlines the output into the skill body before the LLM sees it. This means:
- No config file? Silent pass-through — the skill works normally without extra noise. Zero friction.
- Config is natural language. A markdown file is more expressive than YAML — you can write rules, exceptions, and rationale in prose.
- Config is project-local. Each repo has its own .claude/ directory. The same autopilot plugin adapts to a C++ game server, a React app, or a Rust CLI, all through different config files.
- Session rules inject config for ALL activities. dev-flow sets rules like "when debugging, read .claude/debug-config.md", so even Superpowers' debugging skill gets your project context.
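
The pass-through behavior described in the list above can be sketched in Python. This only illustrates the idea (an absent config leaves the skill body unchanged); the real mechanism is the shell preprocessor, and `load_project_config`, `render_skill`, and the "Project config" heading are hypothetical names introduced here.

```python
import tempfile
from pathlib import Path


def load_project_config(project_root: Path, name: str) -> str:
    """Return the text of <project_root>/.claude/<name>, or "" when it is absent."""
    config_path = project_root / ".claude" / name
    try:
        return config_path.read_text(encoding="utf-8")
    except FileNotFoundError:
        # A missing config is not an error: the skill runs with its defaults.
        return ""


def render_skill(skill_body: str, project_root: Path, name: str) -> str:
    """Append project config to the skill body only when a config file exists."""
    config = load_project_config(project_root, name)
    if not config:
        return skill_body
    return f"{skill_body}\n\n## Project config\n{config}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # No .claude/ directory exists yet, so the body comes back unchanged.
        print(render_skill("SKILL body", Path(tmp), "dev-flow-config.md"))
```

Because the config is inlined as plain text, the same function works for any of the per-project files without a schema.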
| Config File | Customizes | Template |
|---|---|---|
| .claude/dev-flow-config.md | Size rules, quality gates, build commands, special rules | template |
| .claude/finish-flow-config.md | L-5 / H-9 closing sequence overrides (merge target, archive proc, per-size quality gate) | template |
| .claude/quality-gate-config.md | Test, scan, and review commands | template |
| .claude/project-lifecycle-config.md | Project paths, bootstrap/archive scripts | template |
| .claude/next-config.md | Work source paths for the next skill | template |
| .claude/team-config.md | Team role templates for your tech stack | template |
| .claude/test-strategy-config.md | Test commands, pyramid ratios, coverage thresholds | template |
| .claude/debug-config.md | Project-specific debug tools and log paths | template |
| .claude/profiling-config.md | Profiling tools and metrics collection | template |
| .claude/skill-routing.md | Map keywords to your project's domain skills | template |
| .claude/model-routing-config.md | Subagent model/mode per role (planner, reviewer, etc.) | template |
# Dev Flow — TWGameServer Config
## Size Rules
- **S**: single module, no interface change → direct commit
- **L**: 3+ modules, public interface, Feature Flag → plan + project + PR
## Quality Gate
- S: `node .claude/scripts/quality-pipeline.js --size S`
- L: `node .claude/scripts/quality-pipeline.js --size L` per phase
## Build & Deploy
- Build: `../deploy/scripts/dev.sh build`
- Build+Restart: `../deploy/scripts/dev.sh br`
## Special Rules
- Must run E2E before committing if game logic changed
- Proto changes require recompiling the SDK

# Skill Routing
| Keyword | Invoke |
|---------|--------|
| MJ / mahjong | `twgs-game-dev` → references/mj.md |
| crash / core dump | `twgs-debug` |
| proto / protobuf | `twgs-protobuf` |
| stress / 10K | `twgs-stress-test` |

This lets autopilot's dev-flow automatically invoke your project's domain-specific skills when it encounters relevant keywords, bridging the generic workflow layer with project-specific knowledge.
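
To make the keyword mapping concrete, here is a small Python sketch of the lookup the routing table implies. In practice the model reads the markdown table directly; the `ROUTES` structure, the `route` function, and the whole-word matching rule are assumptions made for this example.

```python
import re

# Keyword groups and skill names copied from the sample routing table above.
ROUTES = [
    (("mj", "mahjong"), "twgs-game-dev"),
    (("crash", "core dump"), "twgs-debug"),
    (("proto", "protobuf"), "twgs-protobuf"),
    (("stress", "10k"), "twgs-stress-test"),
]


def route(message: str) -> list[str]:
    """Return the skills whose keywords appear as whole words in the message."""
    text = message.lower()
    matched = []
    for keywords, skill in ROUTES:
        # Word boundaries are an assumption; they keep "mj" from matching inside other words.
        if any(re.search(r"\b" + re.escape(k) + r"\b", text) for k in keywords):
            matched.append(skill)
    return matched


if __name__ == "__main__":
    print(route("Server crash after proto change"))
```

A message can match several rows, in which case every matching skill is returned in table order.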
Add to your project's .claude/settings.json so team members get prompted to install:
Claude Code plugins are pinned to a specific commit at install time. /plugin update may not detect new versions. To get the latest:
/plugin uninstall autopilot@autopilot
/plugin marketplace remove autopilot
/plugin marketplace add cookys/autopilot
/plugin install autopilot@autopilot

See anthropics/claude-code#31462 for details.
Why a plugin, not copy-paste skills?
Copy-pasted skills drift within weeks. A plugin gives you a single source of truth: update once, and everyone picks it up via /plugin update (or via the reinstall sequence above when the pinned commit is not refreshed).
Why 12 skills + 14 hooks, not more skills?
v2 removed 4 skills (debug, test-strategy, team, profiling) that overlapped with built-in Superpowers. Their methodology is now handled by Superpowers; their project-specific config injection is now handled by dev-flow's Session Rules. v2.2 added think-tank-dialectic as a different tool (not an upgrade) for irreversible decisions. v2.5 added 14 hooks for runtime enforcement — discipline that was previously only in markdown rules. Hooks and skills serve different layers: skills set rules at conversation time; hooks enforce them at tool-call time.
Why !`command` injection, not config files?
In the Claude Code world, "configuration" is natural language. A markdown file read at invocation time is more expressive than YAML, requires no schema, and degrades gracefully when absent.
How does it work with Superpowers?
Autopilot is the rule-setter; Superpowers is the executor. They coexist through a layered triggering design:
Layer 1 — CLAUDE.md routing table (project-level)
"New feature planning → autopilot:dev-flow"
"Technical research → autopilot:survey"
Maps project context to skills. Written by the user.
Layer 2 — using-superpowers skill (session-level)
"Check skills BEFORE any response. Even 1% chance = invoke."
This is what makes skill triggering work. Without it,
the model answers directly and never checks skills.
Layer 3 — Skill description (skill-level)
"Use when: 'compare X with Y', 'check X against Y'..."
User-intent trigger phrases help the model match
the user's words to the right skill.
All three layers must work together. Layer 2 (Superpowers' using-superpowers) creates the habit of checking skills; Layer 1 (CLAUDE.md) provides project-specific routing; Layer 3 (descriptions) provides the semantic match. Autopilot never dispatches to or wraps Superpowers skills — they share the session, not a call chain.
Why do descriptions use quoted trigger phrases?
Skill descriptions serve Layer 3 — they're the last-mile match between user intent and skill selection. We write them in user-intent language ("what should I work on", "get it done", "let's debate this") rather than internal mechanics ("global work recommender", "autonomous execution mode") because the model matches user messages against descriptions. The closer the description mirrors what users actually say, the more reliable the trigger.
Autopilot ships three read-only methodology agents (v2.4.0) that carry Three Red Lines discipline into agent-level execution. Autopilot skills dispatch them automatically; you rarely invoke them directly.
| Agent | Purpose | Model | Dispatched by |
|---|---|---|---|
| autopilot:reviewer | Pre-commit / pre-merge review, security audit, plan critique. Severity-tiered findings with file:line citations and ✅ Verified Clean section | opus | quality-pipeline, ceo-agent, finish-flow |
| autopilot:debugger | Evidence-first root-cause analysis. 5-phase methodology with PUA trigger on 2+ failures. Produces Proposed Fix as diff, never applies patches | opus | quality-pipeline (round-trip), ceo-agent, dev-flow |
| autopilot:planner | Six-element Task Prompt decomposition for L-size work (goal / scope / input / output / acceptance / boundaries). Cannot write code | sonnet | dev-flow, think-tank |
All three are physically read-only — their tools frontmatter excludes Edit and Write, so Claude Code mechanically prevents them from patching files. They produce findings, proposals, or plans, and hand off to the calling skill via a unified ### Handoff section with an enum-based Next consumer field.
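
The read-only guarantee rests on the agent's `tools` frontmatter. The Python sketch below shows how one might check it, assuming a comma-separated `tools:` line in the leading frontmatter block. The sample tool lists and the helper names are hypothetical, and treating an absent `tools` entry as "not read-only" is an assumption (such an agent may inherit write tools).

```python
WRITE_TOOLS = {"Edit", "Write"}


def parse_tools(agent_markdown: str) -> set[str]:
    """Extract the comma-separated `tools:` entry from the leading frontmatter block."""
    lines = agent_markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        key, _, value = line.partition(":")
        if key.strip() == "tools":
            return {tool.strip() for tool in value.split(",") if tool.strip()}
    return set()


def is_read_only(agent_markdown: str) -> bool:
    """True only when tools are listed explicitly and none of them can modify files."""
    tools = parse_tools(agent_markdown)
    return bool(tools) and not (tools & WRITE_TOOLS)


if __name__ == "__main__":
    sample = "---\nname: reviewer\ntools: Read, Grep, Glob\n---\nReview instructions...\n"
    print(is_read_only(sample))
```

A check like this could run in CI to catch an agent definition that accidentally regains Edit or Write.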
The three agents carry autopilot's Three Red Lines into the agent layer:
- Closure: every finding has impact + fix direction, no open-ended output
- Fact-driven: every claim cites file_path:line_number; "probably" / "likely" are violations
- Exhaustiveness: full checklists run; clean items explicitly listed; silent omission is a violation
See agents/README.md for dispatch boundary, unified Output Contract, enum grammar, and the "autopilot methodology / voltagent role / project-specific" layer cake.
Autopilot is self-sufficient for methodology and lifecycle — install autopilot alone and you get all 12 skills + 3 methodology agents. For role-specialized work (language experts, database admins, Kubernetes specialists, frontend designers), we recommend installing voltagent alongside:
/plugin install voltagent@...
Autopilot and voltagent are orthogonal by design:
| Layer | What it does | Where to look |
|---|---|---|
| Methodology | Three Red Lines discipline, evidence-first debugging, six-element Task Prompts, lifecycle orchestration | autopilot (this plugin) |
| Role | Language experts, infra specialists, domain experts (80+ agents) | voltagent |
| Project | Your tech stack's pitfalls, team conventions, domain-specific agents | <project>/.claude/agents/ |
Dispatch boundary:
- Going through an autopilot skill (quality-pipeline, dev-flow, ceo-agent) auto-dispatches autopilot:reviewer / :debugger / :planner to carry methodology discipline into every invocation
- Directly invoking an agent via the Agent tool: voltagent's role agents (voltagent-qa-sec:code-reviewer, voltagent-lang:rust-engineer, voltagent-data-ai:postgres-pro, etc.) are usually the better primary choice because they have broader domain coverage
Two workflows, two dispatch paths, zero overlap in practice.
Autopilot does not runtime-detect voltagent. Autopilot skills name autopilot agents directly in their prose. If you want a different reviewer for a specific task, invoke the alternate explicitly via the Agent tool — that is a user-layer choice, not graceful degradation inside autopilot skills.
Autopilot ships 14 hooks that enforce development discipline at the Claude Code runtime layer — no self-discipline required.
The 8 Tier A hooks below activate automatically when the plugin is installed. All are non-destructive and safe for any project.
| Hook | Event | What It Does |
|---|---|---|
| large-file-warner | PreToolUse/Read | Warns at 500KB, hard blocks at 2MB. Bypasses if offset/limit specified |
| suggest-compact | PostToolUse/Write\|Edit | Counts tool calls per session; suggests /compact at 50, then every 25 |
| cost-tracker | Stop | Logs token usage + estimated USD cost to ~/.claude/metrics/costs.jsonl |
| audit-log | PostToolUse/Bash | Logs bash commands with auto secret redaction |
| session-summary | Stop | Writes git status + recent commits to ~/.claude/sessions/ |
| log-error | PostToolUse/* | Detects error keywords in tool output, appends to ~/.claude/error-log.md |
| commit-secret-scan | PreToolUse/Bash | Scans staged content for API keys/tokens. Hard blocks git commit if found |
| branch-protection | PreToolUse/Bash | Blocks force push and direct commits on main/master (anchored regex, env override) |
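
Two of the rows above are plain threshold rules, sketched here in Python. The numbers come from the table; the function names, the 1024-byte kilobyte, and the inclusive comparisons are assumptions, and the shipped hooks may differ.

```python
from typing import Optional

WARN_BYTES = 500 * 1024        # 500KB warning threshold (1KB = 1024 bytes is an assumption)
BLOCK_BYTES = 2 * 1024 * 1024  # 2MB hard-block threshold


def large_file_decision(size_bytes: int, offset: Optional[int] = None,
                        limit: Optional[int] = None) -> str:
    """Return "allow", "warn", or "block" for a Read of a file of the given size."""
    if offset is not None or limit is not None:
        # A partial read bypasses the size check, as the table describes.
        return "allow"
    if size_bytes >= BLOCK_BYTES:
        return "block"
    if size_bytes >= WARN_BYTES:
        return "warn"
    return "allow"


def should_suggest_compact(tool_calls: int) -> bool:
    """Suggest /compact at the 50th tool call, then at every 25th call after that."""
    if tool_calls < 50:
        return False
    return (tool_calls - 50) % 25 == 0


if __name__ == "__main__":
    print(large_file_decision(600 * 1024))
    print([n for n in range(1, 101) if should_suggest_compact(n)])
```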
The 6 Tier B hooks below are opt-in. Enable them individually by copying entries from settings.example.json to your settings.json.
| Hook | Event | What It Does |
|---|---|---|
| config-protection | PreToolUse/Write\|Edit | Blocks edits to linter/formatter config files |
| check-console | Stop | Warns about console.log in modified JS/TS files |
| accumulator + batch-format | PostToolUse + Stop | Batch Prettier + tsc on all edited files at session end |
| test-runner | PostToolUse/Write\|Edit | Auto-runs sibling test files on edit (vitest/jest) |
| design-quality | PostToolUse/Write\|Edit | Warns on generic template UI patterns |
| mcp-health | PreToolUse + PostToolUseFailure | Exponential backoff for unhealthy MCP servers |
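
For readers unfamiliar with the backoff used by mcp-health, a minimal Python sketch of capped exponential backoff follows. The 1-second base, 60-second cap, and `backoff_delay` name are hypothetical; the README does not state the hook's actual parameters.

```python
def backoff_delay(failures: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retrying a server after `failures` consecutive failures."""
    if failures <= 0:
        return 0.0  # healthy server: no delay
    # The delay doubles with each consecutive failure and is capped to bound the wait.
    return min(cap, base * 2 ** (failures - 1))


if __name__ == "__main__":
    print([backoff_delay(n) for n in range(8)])
```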
commit-secret-scan and audit-log share a unified secret pattern module (hooks/_shared/secret-patterns.js) covering: OpenAI, Anthropic, GitHub (PAT/OAuth/App), AWS, Google API, Slack, Stripe tokens + inline --token/password/Authorization patterns.
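
Sharing one pattern list between a blocking scanner and a redacting logger can be sketched as below. This Python example is illustrative only: the three patterns are widely published token shapes chosen for the demo, not the contents of hooks/_shared/secret-patterns.js, and the function names are hypothetical.

```python
import re

# Illustrative patterns only; the real shared module covers more providers.
SECRET_PATTERNS = [
    re.compile(r"ghp_[A-Za-z0-9]{36}"),   # GitHub personal access token shape
    re.compile(r"AKIA[0-9A-Z]{16}"),      # AWS access key ID shape
    re.compile(r"(?<=--token[ =])\S+"),   # value following an inline --token flag
]


def contains_secret(text: str) -> bool:
    """Scanner view: True if any pattern matches (a commit hook would block here)."""
    return any(pattern.search(text) for pattern in SECRET_PATTERNS)


def redact(text: str) -> str:
    """Logger view: replace every match before the command is written to a log."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


if __name__ == "__main__":
    print(redact("deploy --token abc123 --force"))
```

Keeping both views on one list means a pattern added for the scanner is redacted in logs as well.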
- Disable a Tier A hook: set autopilot.<hookName> = false in settings.json
- Custom protected branches: set the AUTOPILOT_PROTECTED_BRANCHES env var or autopilot.protectedBranches in settings
- Disable cost tracking: set autopilot.costTracker = false
- gstack: Garry Tan's skill suite for Claude Code. The CEO agent's cognitive patterns (Bezos doors, Munger inversion, Jobs subtraction), Boil the Lake completeness principle, and scope mode system are adapted from gstack's plan-ceo-review skill.
- Council of High Intelligence: 0xNyk's 18-thinker multi-persona deliberation skill. The think-tank-dialectic skill's enforcement mechanisms (Dissent Quota, Counterfactual Trigger at >70%, Problem Restate Gate, Minority Report as first-class verdict section, Epistemic Diversity Scorecard) are adapted from Council's 7-step protocol and agent frontmatter conventions. The key meta-insight, that every thinking style must carry its own fail-safe, comes from observing that 100% of Council's 18 agents have a Grounding Protocol section with self-constraining hard rules.
- Agora: Professor Li's 6-room, 31-thinker extension of Council. The think-tank-dialectic skill's Hegelian Arc structure (Thesis → Antithesis → Synthesis with forced non-compromise synthesis proposal), Adaptive Depth Gate, Tacit Knowledge Extraction protocol (Polanyi), and "different tool, not better tool" framing are adapted from Agora's 8-step deliberation protocol and the /forge engineering room's verdict template.
- my-claude-devteam: NYCU-Chung's 12-agent + 15-hook engineering team plugin for Claude Code. The v2.4.0 methodology agents (reviewer/debugger/planner) absorb the Three Red Lines discipline (closure / fact-driven / exhaustiveness), six-element Task Prompt contract, evidence-first debug methodology, PUA stress-mode trigger, and physical tool-restriction pattern (read-only methodology agents) from devteam's P7/P9/P10 framework. The v2.5.0 hooks layer absorbs 14 of devteam's 15 hooks (8 default-on Tier A + 6 opt-in Tier B) with Ship A review adjustments: anchored branch-protection regex (C1 fix), unified secret-patterns module (mi1 fix), cost-tracker opt-out, and 8/8 Tier A testing coverage. The layered split (autopilot owns methodology, voltagent owns role specialization) is a deliberate divergence from devteam's all-in-one approach to stay orthogonal to the voltagent role-agent ecosystem.
To contribute or customize skills locally:
# 1. Install once via the normal flow (required)
/plugin marketplace add cookys/autopilot
/plugin install autopilot@autopilot
# 2. Clone and switch to dev mode
git clone git@github.com:cookys/autopilot.git ~/projects/autopilot
cd ~/projects/autopilot
./scripts/dev-setup.sh

Dev mode symlinks the plugin cache to your local clone. Edits to skills/ take effect immediately after /reload-plugins, with no reinstall needed.
Push/pull works normally across machines. Each machine runs step 1 once, then dev-setup.sh once.
Note: Dev mode sets version: "dev" in the plugin registry. To revert to the release version, run /plugin update autopilot@autopilot.
The plugin cache lives at ~/.claude/plugins/cache/autopilot/autopilot/. Each entry is either a versioned directory (snapshot from install/update) or a symlink to a local clone:
~/.claude/plugins/cache/autopilot/autopilot/
├── develop -> ~/projects/autopilot # symlink — live edits, /reload-plugins to sync
└── 2.0.0 # snapshot — created by install or reload
Symlink naming: The cache directory name does not need to be a semver string. You can use semantic names like develop, nightly, or local to distinguish dev symlinks from release snapshots. Claude Code resolves the symlink and reads plugin.json inside for the actual version.
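
To see which cache entries are live dev symlinks and which are release snapshots, a short Python sketch along these lines may help. The `classify_cache_entries` helper and its labels are hypothetical; it inspects a directory you pass in and deletes nothing.

```python
import tempfile
from pathlib import Path


def classify_cache_entries(cache_dir: Path) -> dict[str, str]:
    """Label each entry in a plugin cache directory as a dev symlink or a snapshot."""
    labels: dict[str, str] = {}
    for entry in sorted(cache_dir.iterdir()):
        # Check is_symlink first: is_dir follows symlinks and would also be True.
        if entry.is_symlink():
            labels[entry.name] = "dev symlink"
        elif entry.is_dir():
            labels[entry.name] = "snapshot"
    return labels


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache, tempfile.TemporaryDirectory() as clone:
        (Path(cache) / "2.0.0").mkdir()
        (Path(cache) / "develop").symlink_to(clone, target_is_directory=True)
        print(classify_cache_entries(Path(cache)))
```

Entries labeled "snapshot" are the candidates for the manual cleanup described next.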
Stale cache cleanup: After upgrading, old version directories may linger. Remove them manually:
rm -rf ~/.claude/plugins/cache/autopilot/autopilot/<old-version>

| Branch | Purpose | plugin.json version |
|---|---|---|
| main | Stable releases, tagged (e.g. v1.4.5) | Matches latest tag |
| develop | Next version development | Next major/minor (e.g. 2.0.0) |
When developing: checkout develop, symlink points to it, /reload-plugins picks up changes. Remember to bump plugin.json version before tagging a release.
/plugin update autopilot@autopilot

Distilled from 100+ completed projects using AI-driven development.
MIT — see LICENSE for details.
Marketplace entry for .claude/settings.json (used by the team-install step above):

    {
      "extraKnownMarketplaces": {
        "autopilot": {
          "source": {
            "source": "github",
            "repo": "cookys/autopilot"
          }
        }
      }
    }