Issue 6: Agent-prompt and mini-skill format + seed prompt #625

@gewenyu99

Epic: Task-queue orchestrator runner · Depends on: #631 (the agents content type
this is served from) and #624 (the executor consumes what this defines; the
frontmatter fields can be agreed in parallel) · PR: #619, PostHog/context-mill#181

Why

The core requirement is to separate WHAT from HOW.

  • An agent prompt says what a task should do. It carries the artifacts: which
    model to run on, the goal, how to tell if it succeeded, which tools it may use,
    and which mini-skills to load.
  • A mini-skill says how to do the work. It is the procedural knowledge.

Both are markdown files with frontmatter, served from context-mill. Skills already
exist there. Agent prompts are a new content type, agents, that sits parallel to
skills (#631). Mini-skills are skills. The wizard loads an agent by type the same
way it loads a skill today (constants.ts content URLs, the fetch path in
wizard-tools.ts, localhost:8765 for local dev). Both kinds of file are authored on
a clearly-named experiment branch.

The agent-prompt file (the WHAT)

A markdown file with frontmatter. The frontmatter holds the artifacts the executor
needs to configure the run. The body is the instruction the agent reads, with the
success criteria in plain text. The agent reads the criteria, does the work, and
reports the outcome through complete_task.

Each task runs as a fresh, stateless agent, so the body is self-contained. It does
not assume any memory of earlier tasks beyond the handoffs it is given. This
mirrors the convention hogai's TaskTool uses for its one-shot subagents
(ee/hogai/tools/task.py): each run is stateless, and the prompt carries
everything the agent needs.

---
type: instrument-events
model: claude-sonnet-4-6     # cheapest model that succeeds; cheap is the default
skills: [instrument-events]  # mini-skills to load (the HOW)
allowedTools: [Read, Edit, Grep, Glob, Bash]
disallowedTools: [enqueue_task]   # a leaf task does not seed more work
dependsOn: [init]
---

## Goal
Add at least one `posthog.capture` call at a meaningful user interaction.

## How you know you succeeded
A `posthog.capture` call exists in the app source and fires on a real user
action, not on page load. If there is no good interaction to instrument, say so
and fail the task with a reason.

## Handoff
When you finish, call `complete_task` with what you did, what your goal was, and
what the next agent should know.
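
The split between frontmatter and body in the example above can be sketched as a small parser. This is a minimal illustration, not the wizard's loader: AgentRunConfig, stripComment, parseInlineList, and parseAgentPrompt are hypothetical names, and the sketch assumes flat `key: value` lines with inline `[a, b]` lists and trailing `# comments`, as in the example, rather than full YAML.

```typescript
// Hypothetical runtime shape; the keys mirror the frontmatter in the example above.
interface AgentRunConfig {
  type: string;
  model: string;
  skills: string[];
  allowedTools: string[];
  disallowedTools: string[];
  dependsOn: string[];
}

// Drop a trailing "# comment" from a frontmatter value and trim whitespace.
function stripComment(value: string): string {
  const hash = value.indexOf(" #");
  return (hash === -1 ? value : value.slice(0, hash)).trim();
}

// Turn an inline list such as "[Read, Edit]" into ["Read", "Edit"].
function parseInlineList(value: string): string[] {
  return value
    .replace(/^\[|\]$/g, "")
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

// Split an agent-prompt file into its run config (frontmatter) and prompt text (body).
function parseAgentPrompt(markdown: string): { config: AgentRunConfig; body: string } {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(markdown);
  if (!match) throw new Error("agent prompt has no frontmatter block");

  const fields = new Map<string, string>();
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // skip blank or malformed lines
    fields.set(line.slice(0, colon).trim(), stripComment(line.slice(colon + 1)));
  }

  // Assumption: type and model are required; list fields default to empty.
  const type = fields.get("type");
  const model = fields.get("model");
  if (!type || !model) throw new Error("frontmatter must set both type and model");

  const list = (key: string): string[] => parseInlineList(fields.get(key) ?? "");
  return {
    config: {
      type,
      model,
      skills: list("skills"),
      allowedTools: list("allowedTools"),
      disallowedTools: list("disallowedTools"),
      dependsOn: list("dependsOn"),
    },
    body: (match[2] ?? "").trim(),
  };
}
```

Treating missing list fields as empty keeps a minimal prompt (type and model only) valid. A real loader would more likely use a YAML library than this hand-rolled line parser.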

The mini-skill file (the HOW)

A markdown SKILL.md with frontmatter, the format context-mill already uses for
skills (name, description, then the procedure). The agent prompt names it in
skills:, and it is delivered through the existing skill pipeline. This is how
skills work today. We are authoring small, single-purpose ones for the experiment.

What the wizard side needs

  1. A loader that reads an agent-prompt markdown file and parses its frontmatter
    into the runtime shape the executor uses: model, allowedTools,
    disallowedTools, skills, dependsOn. The body (goal, success criteria,
    handoff instruction) becomes the prompt text the agent receives. The loader also
    renders the relevant upstream handoffs (see #623, the orchestrator MCP tools)
    into a short "Context from previous steps" section appended to the prompt, with
    the handoff objects turned into text. This is the real implementation of the
    resolver interface that #624 defines and tested with an inline stub.
  2. A registry keyed by type, used by the executor and by enqueue_task
    validation, populated from the available agent-prompt files.
  3. The seed prompt for the integrate-posthog orchestrator, itself an agent-prompt
    markdown file. It tells the orchestrator to inspect the repo quickly and seed
    the queue fast: a brief glance rather than a long plan. It
    seeds through enqueue_task, keeps tasks small and discrete so they finish fast
    and stream progress, and runs on a cheap model. The first task is runnable as
    soon as seeding lands.
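
Items 1 and 2 can be pictured as a registry keyed by type plus a resolver that appends rendered handoffs to the body. The sketch below is illustrative only: the Handoff field names, the AgentPromptRegistry class, and the exact wording of the rendered section are assumptions, since the real shapes come from #623 and #624.

```typescript
// Hypothetical handoff shape, loosely following "what you did, what your goal
// was, and what the next agent should know".
interface Handoff {
  from: string;    // type of the upstream task
  goal: string;    // what that task was trying to do
  summary: string; // what it actually did
  notes: string;   // what the next agent should know
}

// Hypothetical parsed agent prompt; only the fields this sketch needs.
interface AgentPrompt {
  type: string;
  model: string;
  body: string;
}

class AgentPromptRegistry {
  private readonly byType = new Map<string, AgentPrompt>();

  constructor(prompts: AgentPrompt[]) {
    for (const prompt of prompts) this.byType.set(prompt.type, prompt);
  }

  // enqueue_task validation could use this to reject unknown task types.
  has(type: string): boolean {
    return this.byType.has(type);
  }

  get(type: string): AgentPrompt {
    const prompt = this.byType.get(type);
    if (!prompt) throw new Error(`unknown agent type: ${type}`);
    return prompt;
  }
}

// Turn handoff objects into a short text section; empty input yields no section.
function renderHandoffs(handoffs: Handoff[]): string {
  if (handoffs.length === 0) return "";
  const lines = handoffs.map(
    (h) => `- ${h.from}: goal was "${h.goal}". Did: ${h.summary}. Note: ${h.notes}`
  );
  return ["## Context from previous steps", ...lines].join("\n");
}

// Resolve a task type to the model and the full prompt text the agent receives.
function resolveTaskPrompt(
  registry: AgentPromptRegistry,
  type: string,
  handoffs: Handoff[]
): { model: string; prompt: string } {
  const agent = registry.get(type);
  const context = renderHandoffs(handoffs);
  return {
    model: agent.model,
    prompt: context ? `${agent.body}\n\n${context}` : agent.body,
  };
}
```

Because the registry is an in-memory map, resolving a type is a plain lookup. Any fetching from context-mill would happen before the registry is constructed.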

Key files

  • the loader and registry in src/lib/programs/orchestrator/ that fetch and parse
    the agent-prompt markdown, reusing the existing skill fetch path
    (wizard-tools.ts downloadSkill and fetchSkillMenu)
  • the agent-prompt and mini-skill markdown files, in context-mill on the
    experiment branch

Acceptance criteria

  • An agent-prompt markdown file parses into model, tools, skills, and deps from
    frontmatter, and its body becomes the agent's prompt.
  • Success criteria live in the markdown body as plain text. The agent reports
    done or failed through complete_task.
  • Each agent prompt sets a model and defaults to the cheapest viable tier. The
    orchestrator seed prompt runs on a cheap model.
  • Mini-skills load through the existing skill mechanism, verified via the
    system:init tools log (agent-interface.ts:1589).
  • The seed prompt, run against a test app, produces a valid seeded queue
    (verified in #626, the end-to-end walking skeleton with stub tasks).

As built (current)

Implemented in agent-prompt-loader.ts (loader, registry, resolver) and wired in
orchestrator-runner.ts. Key decisions beyond the original plan:

  • The client injects the basics; agent prompts are lean intent. The /agents
    body carries only goal + success criteria (plus genuine task specifics like the
    seed's graph). The wizard injects the I/O contract around it — who the agent is,
    that it reports via complete_task with a handoff, the project context
    (id/key/host), and a pointer to the framework's reference EXAMPLE.md. Authors
    never restate that. Two injectors: assembleSeedPrompt, assembleTaskPrompt.
  • Frontmatter label. A short human title for the queue panel; the seed can
    also override per task via enqueue_task's label.
  • Tool-name expansion. Orchestrator tools named short in frontmatter
    (enqueue_task) expand to mcp__posthog-wizard__* so that disallowedTools
    actually blocks them.
  • Registry pre-fetched once at startup; its types drive enqueue_task
    validation. Resolving a task to its run config is then synchronous.
  • Handoff + input context rendered into each task prompt from upstream
    dependencies and the task's own inputs.
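
The tool-name expansion described above might look like the following sketch. The prefix comes from the note above. The ORCHESTRATOR_TOOLS set and the expandToolNames helper are hypothetical, and the set lists only the two orchestrator tools named in this issue.

```typescript
// Prefix taken from the "mcp__posthog-wizard__*" pattern mentioned above.
const MCP_PREFIX = "mcp__posthog-wizard__";

// Assumption: only the orchestrator tools named in this issue; the real set may differ.
const ORCHESTRATOR_TOOLS = new Set<string>(["enqueue_task", "complete_task"]);

// Expand short orchestrator tool names to their fully qualified form and leave
// everything else (built-in tools, already-qualified names) untouched.
function expandToolNames(names: string[]): string[] {
  return names.map((name) =>
    ORCHESTRATOR_TOOLS.has(name) ? `${MCP_PREFIX}${name}` : name
  );
}
```

Under these assumptions, a frontmatter entry of `disallowedTools: [enqueue_task]` would reach the executor as `mcp__posthog-wizard__enqueue_task`, while `Read` or `Bash` pass through unchanged.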
