Autopilot

Lifecycle orchestration for Claude Code — sets the rules, Superpowers executes.
12 skills that add lifecycle management, strategic decisions, and quality gates
on top of built-in Superpowers. Drop a config file, get project-aware workflows.

English | 正體中文

The Problem

Claude Code's built-in Superpowers handles tactics well — TDD, debugging, planning, code review. But it lacks:

Lifecycle management — no task sizing, no project tracking, no session start/end discipline
Strategic decisions — no multi-perspective debate, no dual-agent research
Quality gates — no unified pipeline enforcing test → scan → completeness → review
Self-improvement — no knowledge capture, no retrospectives, no "what's next?" recommendations
Project-specific context — no mechanism to inject your project's tools, conventions, and known gotchas

The Solution

Autopilot adds lifecycle orchestration and strategic intelligence on top of Superpowers:

Skill	What It Does	Superpowers Handles
dev-flow	Sizes tasks (S/L/H), sets session rules for config injection and quality gates, manages project tracking	Planning, TDD, debugging, code review (tactical execution)
survey	Dual-agent research (researcher + skeptic)	— (no equivalent)
think-tank	6-role debate for strategic decisions	brainstorming (different level — requirements exploration)
think-tank-dialectic	Hegelian dialectic for irreversible / high-stakes decisions with LOW consensus. 4 职能 + 2 adversarial roles (Popper falsifier + Munger inverter). NOT a "better think-tank" — a different tool for a different situation	— (no equivalent)
ceo-agent	Autonomous execution with CEO-level judgment	— (no equivalent)
quality-pipeline	Unified quality gate: test → scan → completeness → review	verification-before-completion (partial)
finish-flow	Size-aware closing forcing function — TaskCreates discrete L-5 / H-9 / Fix / S-Lite sub-tasks so nothing gets silently compressed	— (no equivalent)
project-lifecycle	Plan → bootstrap → structure → archive	finishing-a-development-branch (partial)
learn	Auto-records knowledge from failures; knowledge health audit	— (no equivalent)
retro	Engineering retrospective from git history	— (no equivalent)
next	Scan all work sources, recommend highest-priority task	— (no equivalent)
audit	Systematic comparison between implementations	— (no equivalent)

The Rule-Setter Model

Autopilot doesn't compete with Superpowers — it sets the rules that Superpowers operates within:

autopilot:dev-flow sets session rules:
  → "When debugging, read .claude/debug-config.md for project context"
  → "Before committing, run autopilot:quality-pipeline"
  → "On session end, update project tracking"

superpowers skills execute within those rules:
  → systematic-debugging (with project config in context)
  → test-driven-development (with project test conventions)
  → writing-plans (within dev-flow's sizing framework)

Key Features

Three Modes of Operation

Autopilot provides three distinct cognitive modes for different situations:

dev-flow — Guided Development (default)

The entry point for all development tasks. Evaluates task size and routes accordingly:

You: "Add WebSocket compression"

Claude (with dev-flow):
  1. Size assessment: L (crosses network + protocol + client modules)
  2. → Creates plan, project directory, feature branch
  3. → Implements phase by phase, quality gate each phase
  4. → Archives project on completion

Claude (without dev-flow):
  → Starts grep-ing the codebase immediately
  → No plan, no phases, no quality gates

dev-flow also handles the session lifecycle — health checks on startup, knowledge review, goal alignment on context continuation. You don't invoke these separately; dev-flow absorbs them.

ceo-agent — Autonomous Execution

When you want outcomes, not involvement. The agent becomes CEO; you become the Board.

You: "CEO mode. Handle the reconnect system. Level 3, you decide everything."

CEO startup:
  1. OKR — concrete success criteria (not vague "make it work")
  2. Involvement level — how often to report (every step / phase / just results)
  3. Scope mode — Expand / Selective / Hold / Reduce
  4. No-go zones — what's absolutely off-limits

Then: autonomous execution within DOA (Delegation of Authority)

The CEO agent applies 10 cognitive patterns from great CEOs (Bezos's two-way doors, Munger's inversion reflex, Jobs's focus as subtraction) and follows the Boil the Lake principle — AI makes completeness nearly free, so always choose the complete implementation over shortcuts.

CEO cannot self-audit. Like corporate governance, quality-pipeline and code-review run independently.

think-tank — Multi-Perspective Debate

For strategic decisions where a single perspective isn't enough. 6 roles debate in parallel:

You: "Should we rewrite the auth system or patch it?"

Think Tank assembles:
  - CTO (technical feasibility)
  - Product Director (user impact)
  - QA Lead (risk assessment)
  - Security Architect (threat model)
  - Customer Advocate (user experience)
  - Operations (deployment/maintenance)

Output: Decision Brief with consensus, dissenting views, and recommendation

How They Work Together

 user task
    │
    ▼
 dev-flow ──────────────────────────────────────────────┐
    │                                                    │
    ├─ S (small): implement → quality-pipeline → commit   │
    │                                                    │
    └─ L (large): project-lifecycle (bootstrap)          │
         │         → per-phase implement                 │
         │         → quality-pipeline per phase           │
         │         → project-lifecycle (archive)          │
         │                                               │
         ├─ needs research? ──→ survey                   │
         ├─ strategic decision? ──→ think-tank            │
         ├─ user says "handle it"? ──→ ceo-agent         │
         │                                               │
         └─ session end ──→ learn (capture knowledge)    │
                            retro (periodic review)      │
                                                         │
 what's next? ──→ next (scan → rank → recommend)         │
                                                         │
 ◄───────────────────────────────────────────────────────┘

Skill Boundaries

Decision Type	Use This
Technical choice (X library vs Y)	`survey` — external research with dual perspective
Strategic choice (should we? how big? what first?)	`think-tank` — internal multi-role debate
User wants outcome, not involvement	`ceo-agent` — autonomous execution
User wants to participate	`dev-flow` — guided workflow with checkpoints

Install

/plugin marketplace add cookys/autopilot
/plugin install autopilot@autopilot

That's it. All 12 skills are available immediately as autopilot:dev-flow, autopilot:survey, etc.

Cross-Repository Configuration (Injection)

Skills work out of the box with sensible defaults. For project-specific behavior, drop a markdown file into your project's .claude/ directory — the skill reads it at invocation time via Claude Code's !command`` preprocessor.

How Injection Works

┌─────────────────────────────────┐
│  Plugin (shared, read-only)     │   Autopilot skills live here.
│  ~/.claude/plugins/cache/       │   Same for all projects.
│  autopilot/skills/dev-flow/     │
│           └── SKILL.md ─────────┼──┐
└─────────────────────────────────┘  │
                                     │  At invocation, SKILL.md runs:
                                     │  !`cat .claude/dev-flow-config.md`
                                     │
┌─────────────────────────────────┐  │
│  Your Project (per-repo)        │  │
│  my-project/.claude/            │◄─┘  Reads from YOUR project's
│    ├── dev-flow-config.md       │     .claude/ directory
│    ├── quality-gate-config.md   │
│    ├── skill-routing.md         │     These files are plain markdown.
│    └── team-config.md           │     No schema. No YAML. Natural language.
└─────────────────────────────────┘

The !command`` syntax is a Claude Code preprocessor — it runs a shell command and inlines the output into the skill body before the LLM sees it. This means:

No config file? Silent pass-through — the skill works normally without extra noise. Zero friction.
Config is natural language. A markdown file is more expressive than YAML — you can write rules, exceptions, and rationale in prose.
Config is project-local. Each repo has its own .claude/ directory. The same autopilot plugin adapts to a C++ game server, a React app, or a Rust CLI — all through different config files.
Session rules inject config for ALL activities. dev-flow sets rules like "when debugging, read .claude/debug-config.md" — so even Superpowers' debugging skill gets your project context.

Available Config Files

Config File	Customizes	Template
`.claude/dev-flow-config.md`	Size rules, quality gates, build commands, special rules	template
`.claude/finish-flow-config.md`	L-5 / H-9 closing sequence overrides (merge target, archive proc, per-size quality gate)	template
`.claude/quality-gate-config.md`	Test, scan, and review commands	template
`.claude/project-lifecycle-config.md`	Project paths, bootstrap/archive scripts	template
`.claude/next-config.md`	Work source paths for the next skill	template
`.claude/team-config.md`	Team role templates for your tech stack	template
`.claude/test-strategy-config.md`	Test commands, pyramid ratios, coverage thresholds	template
`.claude/debug-config.md`	Project-specific debug tools and log paths	template
`.claude/profiling-config.md`	Profiling tools and metrics collection	template
`.claude/skill-routing.md`	Map keywords to your project's domain skills	template
`.claude/model-routing-config.md`	Subagent model/mode per role (planner, reviewer, etc.)	template

Example: C++ Game Server Config

# Dev Flow — TWGameServer Config

## Size Rules
- **S**: single module, no interface change → direct commit
- **L**: 3+ modules, public interface, Feature Flag → plan + project + PR

## Quality Gate
- S: `node .claude/scripts/quality-pipeline.js --size S`
- L: `node .claude/scripts/quality-pipeline.js --size L` per phase

## Build & Deploy
- Build: `../deploy/scripts/dev.sh build`
- Build+Restart: `../deploy/scripts/dev.sh br`

## Special Rules
- Commit 前必須跑 E2E if 改了遊戲邏輯
- Proto 改動要重編譯 SDK

Example: Skill Routing

# Skill Routing

| Keyword | Invoke |
|---------|--------|
| MJ / mahjong | `twgs-game-dev` → references/mj.md |
| crash / core dump | `twgs-debug` |
| proto / protobuf | `twgs-protobuf` |
| stress / 10K | `twgs-stress-test` |

This lets autopilot's dev-flow automatically invoke your project's domain-specific skills when it encounters relevant keywords — bridging the generic workflow layer with project-specific knowledge.

Team Setup

Add to your project's .claude/settings.json so team members get prompted to install:

{
  "extraKnownMarketplaces": {
    "autopilot": {
      "source": { "source": "github", "repo": "cookys/autopilot" }
    }
  }
}

Known Limitation

Claude Code plugins are pinned to a specific commit at install time. /plugin update may not detect new versions. To get the latest:

/plugin uninstall autopilot@autopilot
/plugin marketplace remove autopilot
/plugin marketplace add cookys/autopilot
/plugin install autopilot@autopilot

See anthropics/claude-code#31462 for details.

Design Philosophy

Why a plugin, not copy-paste skills? Copy-pasted skills drift within weeks. A plugin gives you a single source of truth — update once, everyone gets it via /plugin update.

Why 12 skills + 14 hooks, not more skills? v2 removed 4 skills (debug, test-strategy, team, profiling) that overlapped with built-in Superpowers. Their methodology is now handled by Superpowers; their project-specific config injection is now handled by dev-flow's Session Rules. v2.2 added think-tank-dialectic as a different tool (not an upgrade) for irreversible decisions. v2.5 added 14 hooks for runtime enforcement — discipline that was previously only in markdown rules. Hooks and skills serve different layers: skills set rules at conversation time; hooks enforce them at tool-call time.

Why !command`` injection, not config files? In the Claude Code world, "configuration" is natural language. A markdown file read at invocation time is more expressive than YAML, requires no schema, and degrades gracefully when absent.

How does it work with Superpowers?

Autopilot is the rule-setter; Superpowers is the executor. They coexist through a layered triggering design:

Layer 1 — CLAUDE.md routing table (project-level)
  "新功能規劃 → autopilot:dev-flow"
  "技術調研 → autopilot:survey"
  Maps project context to skills. Written by the user.

Layer 2 — using-superpowers skill (session-level)
  "Check skills BEFORE any response. Even 1% chance = invoke."
  This is what makes skill triggering work. Without it,
  the model answers directly and never checks skills.

Layer 3 — Skill description (skill-level)
  "Use when: 'compare X with Y', 'check X against Y'..."
  User-intent trigger phrases help the model match
  the user's words to the right skill.

All three layers must work together. Layer 2 (Superpowers' using-superpowers) creates the habit of checking skills; Layer 1 (CLAUDE.md) provides project-specific routing; Layer 3 (descriptions) provides the semantic match. Autopilot never dispatches to or wraps Superpowers skills — they share the session, not a call chain.

Why do descriptions use quoted trigger phrases?

Skill descriptions serve Layer 3 — they're the last-mile match between user intent and skill selection. We write them in user-intent language ("what should I work on", "get it done", "let's debate this") rather than internal mechanics ("global work recommender", "autonomous execution mode") because the model matches user messages against descriptions. The closer the description mirrors what users actually say, the more reliable the trigger.

Methodology Agents

Autopilot ships three read-only methodology agents (v2.4.0) that carry Three Red Lines discipline into agent-level execution. Autopilot skills dispatch them automatically; you rarely invoke them directly.

Agent	Purpose	Model	Dispatched by
`autopilot:reviewer`	Pre-commit / pre-merge review, security audit, plan critique. Severity-tiered findings with `file:line` citations and `✅ Verified Clean` section	opus	`quality-pipeline`, `ceo-agent`, `finish-flow`
`autopilot:debugger`	Evidence-first root-cause analysis. 5-phase methodology with PUA trigger on 2+ failures. Produces `Proposed Fix` as diff, never applies patches	opus	`quality-pipeline` (round-trip), `ceo-agent`, `dev-flow`
`autopilot:planner`	Six-element Task Prompt decomposition for L-size work (goal / scope / input / output / acceptance / boundaries). Cannot write code	sonnet	`dev-flow`, `think-tank`

All three are physically read-only — their tools frontmatter excludes Edit and Write, so Claude Code mechanically prevents them from patching files. They produce findings, proposals, or plans, and hand off to the calling skill via a unified ### Handoff section with an enum-based Next consumer field.

The three agents carry autopilot's Three Red Lines into the agent layer:

Closure — every finding has impact + fix direction, no open-ended output
Fact-driven — every claim cites file_path:line_number; "probably" / "likely" are violations
Exhaustiveness — full checklists run; clean items explicitly listed; silent omission is a violation

See agents/README.md for dispatch boundary, unified Output Contract, enum grammar, and the "autopilot methodology / voltagent role / project-specific" layer cake.

Recommended Companions

Autopilot is self-sufficient for methodology and lifecycle — install autopilot alone and you get all 12 skills + 3 methodology agents. For role-specialized work (language experts, database admins, Kubernetes specialists, frontend designers), we recommend installing voltagent alongside:

/plugin install voltagent@...

Autopilot and voltagent are orthogonal by design:

Layer	What it does	Where to look
Methodology	Three Red Lines discipline, evidence-first debugging, six-element Task Prompts, lifecycle orchestration	autopilot (this plugin)
Role	Language experts, infra specialists, domain experts (80+ agents)	voltagent
Project	Your tech stack's pitfalls, team conventions, domain-specific agents	`<project>/.claude/agents/`

Dispatch boundary:

Going through an autopilot skill (quality-pipeline, dev-flow, ceo-agent) auto-dispatches autopilot:reviewer / :debugger / :planner to carry methodology discipline into every invocation
Directly invoking an agent via the Agent tool — voltagent's role agents (voltagent-qa-sec:code-reviewer, voltagent-lang:rust-engineer, voltagent-data-ai:postgres-pro, etc.) are usually the better primary choice because they have broader domain coverage

Two workflows, two dispatch paths, zero overlap in practice.

Autopilot does not runtime-detect voltagent. Autopilot skills name autopilot agents directly in their prose. If you want a different reviewer for a specific task, invoke the alternate explicitly via the Agent tool — that is a user-layer choice, not graceful degradation inside autopilot skills.

Hooks (v2.5.0)

Autopilot ships 14 hooks that enforce development discipline at the Claude Code runtime layer — no self-discipline required.

Tier A — Default-On (8 hooks)

These activate automatically when the plugin is installed. All are non-destructive and safe for any project.

Hook	Event	What It Does
large-file-warner	PreToolUse/Read	Warns at 500KB, hard blocks at 2MB. Bypasses if `offset`/`limit` specified
suggest-compact	PostToolUse/Write\|Edit	Counts tool calls per session; suggests `/compact` at 50, then every 25
cost-tracker	Stop	Logs token usage + estimated USD cost to `~/.claude/metrics/costs.jsonl`
audit-log	PostToolUse/Bash	Logs bash commands with auto secret redaction
session-summary	Stop	Writes git status + recent commits to `~/.claude/sessions/`
log-error	PostToolUse/*	Detects error keywords in tool output, appends to `~/.claude/error-log.md`
commit-secret-scan	PreToolUse/Bash	Scans staged content for API keys/tokens. Hard blocks `git commit` if found
branch-protection	PreToolUse/Bash	Blocks force push and direct commits on `main`/`master` (anchored regex, env override)

Tier B — Opt-In (6 hooks)

Enable individually by copying entries from settings.example.json to your settings.json.

Hook	Event	What It Does
config-protection	PreToolUse/Write\|Edit	Blocks edits to linter/formatter config files
check-console	Stop	Warns about `console.log` in modified JS/TS files
accumulator + batch-format	PostToolUse + Stop	Batch Prettier + tsc on all edited files at session end
test-runner	PostToolUse/Write\|Edit	Auto-runs sibling test files on edit (vitest/jest)
design-quality	PostToolUse/Write\|Edit	Warns on generic template UI patterns
mcp-health	PreToolUse + PostToolUseFailure	Exponential backoff for unhealthy MCP servers

Secret Detection

commit-secret-scan and audit-log share a unified secret pattern module (hooks/_shared/secret-patterns.js) covering: OpenAI, Anthropic, GitHub (PAT/OAuth/App), AWS, Google API, Slack, Stripe tokens + inline --token/password/Authorization patterns.

Override

Disable a Tier A hook: set autopilot.<hookName> = false in settings.json
Custom protected branches: set AUTOPILOT_PROTECTED_BRANCHES env var or autopilot.protectedBranches in settings
Disable cost tracking: set autopilot.costTracker = false

Inspired By

gstack — Garry Tan's skill suite for Claude Code. The CEO agent's cognitive patterns (Bezos doors, Munger inversion, Jobs subtraction), Boil the Lake completeness principle, and scope mode system are adapted from gstack's plan-ceo-review skill.
Council of High Intelligence — 0xNyk's 18-thinker multi-persona deliberation skill. The think-tank-dialectic skill's enforcement mechanisms (Dissent Quota, Counterfactual Trigger at >70%, Problem Restate Gate, Minority Report as first-class verdict section, Epistemic Diversity Scorecard) are adapted from Council's 7-step protocol and agent frontmatter conventions. The key meta-insight — every thinking style must carry its own fail-safe — comes from observing that 100% of Council's 18 agents have a Grounding Protocol section with self-constraining hard rules.
Agora — Professor Li's 6-room, 31-thinker extension of Council. The think-tank-dialectic skill's Hegelian Arc structure (Thesis → Antithesis → Synthesis with forced non-compromise synthesis proposal), Adaptive Depth Gate, Tacit Knowledge Extraction protocol (Polanyi), and "different tool, not better tool" framing are adapted from Agora's 8-step deliberation protocol and the /forge engineering room's verdict template.
my-claude-devteam — NYCU-Chung's 12-agent + 15-hook engineering team plugin for Claude Code. The v2.4.0 methodology agents (reviewer / debugger / planner) absorb the Three Red Lines discipline (closure / fact-driven / exhaustiveness), six-element Task Prompt contract, evidence-first debug methodology, PUA stress-mode trigger, and physical tool-restriction pattern (read-only methodology agents) from devteam's P7/P9/P10 framework. The v2.5.0 hooks layer absorbs 14 of devteam's 15 hooks (8 default-on Tier A + 6 opt-in Tier B) with Ship A review adjustments: anchored branch-protection regex (C1 fix), unified secret-patterns module (mi1 fix), cost-tracker opt-out, and 8/8 Tier A testing coverage. The layered split — autopilot owns methodology, voltagent owns role specialization — is a deliberate divergence from devteam's all-in-one approach to stay orthogonal to the voltagent role-agent ecosystem.

Development

To contribute or customize skills locally:

# 1. Install once via the normal flow (required)
/plugin marketplace add cookys/autopilot
/plugin install autopilot@autopilot

# 2. Clone and switch to dev mode
git clone git@github.com:cookys/autopilot.git ~/projects/autopilot
cd ~/projects/autopilot
./scripts/dev-setup.sh

Dev mode symlinks the plugin cache to your local clone. Edits to skills/ take effect immediately after /reload-plugins — no reinstall needed.

Push/pull works normally across machines. Each machine runs step 1 once, then dev-setup.sh once.

Note: Dev mode sets version: "dev" in the plugin registry. To revert to the release version, run /plugin update autopilot@autopilot.

Cache directory layout

The plugin cache lives at ~/.claude/plugins/cache/autopilot/autopilot/. Each entry is either a versioned directory (snapshot from install/update) or a symlink to a local clone:

~/.claude/plugins/cache/autopilot/autopilot/
├── develop -> ~/projects/autopilot   # symlink — live edits, /reload-plugins to sync
└── 2.0.0                             # snapshot — created by install or reload

Symlink naming: The cache directory name does not need to be a semver string. You can use semantic names like develop, nightly, or local to distinguish dev symlinks from release snapshots. Claude Code resolves the symlink and reads plugin.json inside for the actual version.

Stale cache cleanup: After upgrading, old version directories may linger. Remove them manually:

rm -rf ~/.claude/plugins/cache/autopilot/autopilot/<old-version>

Branch workflow

Branch	Purpose	`plugin.json` version
`main`	Stable releases, tagged (e.g. `v1.4.5`)	Matches latest tag
`develop`	Next version development	Next major/minor (e.g. `2.0.0`)

When developing: checkout develop, symlink points to it, /reload-plugins picks up changes. Remember to bump plugin.json version before tagging a release.

Update (marketplace users)

/plugin update autopilot@autopilot

Name		Name	Last commit message	Last commit date
Latest commit History 67 Commits
.claude-plugin		.claude-plugin
.claude		.claude
agents		agents
docs		docs
evals		evals
hooks		hooks
project-config-template		project-config-template
references		references
scripts		scripts
skill-creator-workspace		skill-creator-workspace
skills		skills
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
README.zh-TW.md		README.zh-TW.md
settings.example.json		settings.example.json

Folders and files

Latest commit

History

Repository files navigation

Autopilot

The Problem

The Solution

The Rule-Setter Model

Key Features

Three Modes of Operation

How They Work Together

Skill Boundaries

Install

Cross-Repository Configuration (Injection)

How Injection Works

Available Config Files

Example: C++ Game Server Config

Example: Skill Routing

Team Setup

Known Limitation

Design Philosophy

Methodology Agents

Recommended Companions

Hooks (v2.5.0)

Tier A — Default-On (8 hooks)

Tier B — Opt-In (6 hooks)

Secret Detection

Override

Inspired By

Development

Cache directory layout

Branch workflow

Update (marketplace users)

Changelog

Origin

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages