feat(harness): inject skills as user-role messages instead of system-prompt blocks (Hermes parity)

## Summary

Hermes injects skill content as **user-role messages** rather than appending it to the system prompt. The CrowClaw harness currently embeds matched skills as `<skill>` XML blocks **inside** the system prompt (`packages/core/src/prompt-builder.ts:47`), which means the system prompt mutates per turn whenever the skill match set changes. This invalidates Anthropic prompt-cache and OpenAI prefix-cache hits on the (longer, otherwise stable) system message.

Adopt the Hermes pattern verbatim — keep accuracy, do not work around it.

## Why this matters (no shortcut)

- Anthropic `cache_control: ephemeral` caching has a 5-minute lifetime and bills cache hits at a 90% input-token discount, but **only if the prefix is byte-identical**. Today, `matchSkillManifests()` runs per turn and the resulting `<skill>` blocks change → system prompt changes → cache miss.
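- For a 12-iteration session with skills attaching/detaching, every turn on which the skill set changes re-bills the system prompt at the full input rate instead of the discounted cached rate: up to ~10× the intended cost for those tokens, which forfeits the savings the prompt-caching design was meant to deliver.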
- Workarounds (e.g., "freeze skills per session") sacrifice the per-turn relevance score Hermes specifically values. Don't take them.
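
The cache-miss chain above can be illustrated with a small sketch. The bootstrap text, skill names, and both builder functions are hypothetical stand-ins, not the actual `buildSystemPrompt()`; the only point is that embedding per-turn skill blocks changes the prompt hash, while a skill-free system prompt does not.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for the stable part of the system prompt.
const BOOTSTRAP = "You are the CrowClaw agent. Follow the persona and tool rules.";

const sha256 = (text: string): string =>
  createHash("sha256").update(text, "utf8").digest("hex");

// Simplified model of the current behavior: skills live inside the system prompt.
function systemPromptWithSkills(skillNames: string[]): string {
  const blocks = skillNames.map((n) => `<skill name="${n}"></skill>`).join("\n");
  return blocks ? `${BOOTSTRAP}\n${blocks}` : BOOTSTRAP;
}

// Simplified model of the proposed behavior: matched skills are ignored here.
function stableSystemPrompt(_skillNames: string[]): string {
  return BOOTSTRAP;
}

// Two consecutive turns with different (made-up) skill match sets.
const turn1 = ["git-review"];
const turn2 = ["git-review", "sql-tuning"];

const embeddedChanged =
  sha256(systemPromptWithSkills(turn1)) !== sha256(systemPromptWithSkills(turn2));
const stableChanged =
  sha256(stableSystemPrompt(turn1)) !== sha256(stableSystemPrompt(turn2));

console.log({ embeddedChanged, stableChanged });
```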

## Source (Hermes)

- `NousResearch/hermes-agent` — `AGENTS.md` documents skills-as-user-messages caching rationale: skill content is rendered into a user-role turn at invocation time, separate from the system prompt's stable bootstrap.
- `agentskills.io` standard: skill payloads carry their own envelope and are caller-injected, not provider-system-prompted.

## Scope

### Files to change

| File | Change |
|---|---|
| `packages/core/src/prompt-builder.ts` | Remove the `<skill>` block emission from `buildSystemPrompt()`. Keep `matchedSkills` parameter accepted but ignored in the system-prompt path. |
| `packages/core/src/index.ts` | In both `run()` and `runStreaming()`, after recall + skill match, render one `<skill name="..." tools="..."><description>...</description><instructions>...</instructions></skill>` block per matched skill and place all blocks in a single synthetic user-role message immediately before the actual user message, so they share one cache key. |
| `packages/core/src/index.ts` | Track the injected skill turn so `recordTurn()` does not persist the skill payload to session state (these are ephemeral injection artifacts, like memory recall). |
| `tests/agent-loop.test.ts` (or similar) | New test: when the same user message is sent twice, the system prompt is byte-identical across calls (verify via SHA-256 hash). Today this test would fail when skills match differently. |
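
A minimal sketch of the injection step described in the table, under stated assumptions: the `MatchedSkill` fields, the `ChatMessage` shape, the `ephemeral` flag, and the helper names `renderSkill` and `buildTurnMessages` are all hypothetical and may not match what `core` actually exports.

```typescript
// Hypothetical shapes; the real MatchedSkill type in core may differ.
interface MatchedSkill {
  name: string;
  tools: string[];
  description: string;
  instructions: string;
}

interface ChatMessage {
  role: "user" | "assistant";
  content: string;
  // Assumed flag marking injection artifacts that must not be persisted.
  ephemeral?: boolean;
}

// Escape attribute values only; skill bodies are passed through unchanged.
const escAttr = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");

function renderSkill(skill: MatchedSkill): string {
  return (
    `<skill name="${escAttr(skill.name)}" tools="${escAttr(skill.tools.join(","))}">` +
    `<description>${skill.description}</description>` +
    `<instructions>${skill.instructions}</instructions>` +
    `</skill>`
  );
}

// History, then one combined skill turn (if any), then the real user message.
function buildTurnMessages(
  history: ChatMessage[],
  skills: MatchedSkill[],
  userText: string,
): ChatMessage[] {
  const out: ChatMessage[] = [...history];
  if (skills.length > 0) {
    out.push({
      role: "user",
      content: skills.map(renderSkill).join("\n"),
      ephemeral: true,
    });
  }
  out.push({ role: "user", content: userText });
  return out;
}
```

Joining all blocks into one message keeps the number of injected turns constant at zero or one, regardless of how many skills match.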

### Backend contract changes
- The shape of provider request bodies changes: `messages[]` may now contain a skill-injection user turn immediately before the current user message. No public route signature changes.
- The `MatchedSkill[]` array remains exported from `core` for plugin/observability consumers.
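
As a rough illustration of the new request shape, here is a sketch of an Anthropic-style body in which the system prompt carries the `cache_control` marker and the skill payload travels in `messages[]`. The `buildRequest` helper, the `MODEL_ID` placeholder, and the `max_tokens` value are assumptions for illustration; the harness's real provider adapter may assemble the body differently.

```typescript
// Assumed minimal subset of an Anthropic Messages API request body.
interface TextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

interface RequestBody {
  model: string;
  max_tokens: number;
  system: TextBlock[];
  messages: { role: "user" | "assistant"; content: string }[];
}

// Hypothetical helper: skillTurn is the rendered skill payload, or null if nothing matched.
function buildRequest(
  systemPrompt: string,
  skillTurn: string | null,
  userText: string,
): RequestBody {
  const messages: RequestBody["messages"] = [];
  if (skillTurn !== null) {
    messages.push({ role: "user", content: skillTurn });
  }
  messages.push({ role: "user", content: userText });
  return {
    model: "MODEL_ID", // placeholder, not a real model name
    max_tokens: 1024, // arbitrary value for the sketch
    // The system block is independent of the skill match set.
    system: [{ type: "text", text: systemPrompt, cache_control: { type: "ephemeral" } }],
    messages,
  };
}
```

With this layout, only `messages[]` differs between a turn with skills and a turn without; the `system` array serializes identically in both cases.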

### What must NOT change
- `matchSkillManifests()` scoring algorithm and the 3-skill cap.
- The `skillTokenBudget` (default 16k) — applied to the new user-turn block.
- The on-disk skill format (SKILL.md frontmatter).
- Persona prompt placement (still at the top of the system prompt).
- Memory recall placement (still as a `<recalled-context>` system message).

## Acceptance criteria

- [ ] When the same user message is sent on two consecutive turns and the same skill is matched, the system prompt sent to the provider is byte-identical (SHA-256 hash equality).
- [ ] When skill matches change between turns, the system prompt stays byte-identical and the skill payload appears in a user-role turn instead.
- [ ] Anthropic provider's `cache_control` marker stays effective on the system prompt across skill-match changes.
- [ ] No skill content is persisted to `SessionState.messages` (verified by inspecting the message log after a turn).
- [ ] Existing skill-driven behavior (skill instructions actually shape the agent's reply) continues to pass current tests after the new test above is added.
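
The persistence criterion could be checked along these lines. This is a sketch only: the `ephemeral` flag, the `TurnMessage` shape, and this simplified `recordTurn` signature are assumptions, and the real `SessionState` in `core` may look different.

```typescript
interface TurnMessage {
  role: "user" | "assistant";
  content: string;
  // Assumed marker set on injected skill turns.
  ephemeral?: boolean;
}

interface SessionState {
  messages: TurnMessage[];
}

// Hypothetical recordTurn: persists this turn's messages except ephemeral artifacts.
function recordTurn(state: SessionState, sent: TurnMessage[], reply: string): void {
  for (const m of sent) {
    if (!m.ephemeral) {
      state.messages.push({ role: m.role, content: m.content });
    }
  }
  state.messages.push({ role: "assistant", content: reply });
}

// Inspect the message log after a turn: no skill envelope may remain.
function containsSkillPayload(state: SessionState): boolean {
  return state.messages.some((m) => m.content.includes("<skill "));
}
```

A test in the style of `tests/agent-loop.test.ts` would run one turn with a matched skill and assert that `containsSkillPayload` returns `false`.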

## Performance target

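- Anthropic prompt-cache hit rate on system prompt: ≥ 90% across a 10-turn session where skill matches vary. Measured from the `usage` object in the Anthropic API response as `cache_read_input_tokens` divided by total input tokens (`input_tokens + cache_creation_input_tokens + cache_read_input_tokens`).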
- Average input-token cost per turn after the second turn drops to ≤ 20% of an uncached call (with ~95% of input tokens in the cacheable prefix billed at a 90% discount, the expected cost is about 0.95 × 0.10 + 0.05 ≈ 14.5%).
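
Both targets can be computed from per-turn usage counters, as in the sketch below. It assumes that total input is the sum of the three input-side counters in the Anthropic `usage` object; the helper names are hypothetical, and cache-write surcharges are ignored for simplicity.

```typescript
// Input-side counters from the Anthropic Messages API `usage` object.
interface Usage {
  input_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
}

// Fraction of all input tokens that were served from the cache.
function cacheHitRate(u: Usage): number {
  const total =
    u.input_tokens + u.cache_creation_input_tokens + u.cache_read_input_tokens;
  return total === 0 ? 0 : u.cache_read_input_tokens / total;
}

// Input cost relative to a fully uncached call, given the share of tokens
// in the cached prefix and the discount applied to cache hits.
function relativeInputCost(cacheableFraction: number, discount: number): number {
  return cacheableFraction * (1 - discount) + (1 - cacheableFraction);
}

// Example with made-up numbers: 9,500 cached tokens out of 10,000.
const exampleRate = cacheHitRate({
  input_tokens: 500,
  cache_creation_input_tokens: 0,
  cache_read_input_tokens: 9500,
});
const exampleCost = relativeInputCost(0.95, 0.9);
console.log(exampleRate.toFixed(3), exampleCost.toFixed(3));
```

With a 95% cacheable prefix and a 90% discount, the relative cost comes out near 0.145, which sits inside the ≤ 20% target.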

## Out of scope

- Measuring or tuning the caching benefit of skills-as-user-messages for non-Anthropic providers (OpenAI prefix-cache works with the same mechanism but isn't measured here).
- Adding new skill formats or extending `MatchedSkill` shape.
- Skill ranking changes.

Labels: `enhancement`, `priority/critical`, `perf`, `source/hermes`


feat(harness): inject skills as user-role messages instead of system-prompt blocks (Hermes parity) #230
