fix(copilot-app): warn-not-fail on newer App schema + feat: batch-bug-shepherd primitive by danielmeppiel · Pull Request #1434 · microsoft/apm

danielmeppiel · 2026-05-21T15:17:30Z

TL;DR

Two related changes in one PR:

Phase A — fix: Stop hard-failing apm install --target copilot-app when the GitHub Copilot desktop App ships a newer PRAGMA user_version than APM was tested against. Warn-and-continue instead; keep the hard-fail for versions below the minimum (where the workflows table may not exist).
Phase B — feat: Ship .apm/skills/batch-bug-shepherd/ and the .apm/prompts/batch-bug-shepherd.prompt.md Copilot App workflow that loads it. The skill composes with the existing .apm/skills/apm-review-panel/ to drive a batch of suspected bugs from raw issue list to mergeable PR queue.

Phase A — schema gate UX fix

Problem

_check_user_version() in src/apm_cli/integration/copilot_app_db.py hard-fails any deploy whose user_version falls outside [_MIN_SUPPORTED_USER_VERSION, _MAX_SUPPORTED_USER_VERSION]. Both bounds were 13. The Copilot App is now at user_version = 15 on macOS, so every user attempting apm install --target copilot-app hits:

Copilot App DB schema is newer than this APM release supports (user_version=15, max tested 13). Upgrade APM, or skip '--target copilot-app' until APM catches up.

The Copilot App ships fast and bumps user_version on additive, forward-compatible changes that do not break APM's narrow read/write surface (the workflows table). Hard-failing on every such bump made every Copilot App release a release window for APM.

devx-ux verdict

Consulted the devx-ux-expert persona before touching code. Verdict: ship_as_proposed.

Hard-failing on a forward-compatible additive bump violates the "install succeeds without reading docs" north star -- it blocks users who simply updated their Copilot App, with zero actionable recovery other than "downgrade Copilot" or "wait for APM release". Warn-and-continue respects the user's time; the < min hard-fail is justified because structural prerequisites (workflows table) genuinely may not exist. This mirrors how npm warns on unsupported engine versions but does not abort.

Wording (shipped verbatim, printable ASCII, [!] per STATUS_SYMBOLS):

[!] Copilot App schema version {actual} is newer than APM's tested maximum ({max_tested}). Proceeding anyway -- if you see unexpected behavior, run: apm --version && gh issue new -R microsoft/apm -t 'Schema v{actual} compat' -b 'Observed: <describe>'

On the question of a +N cap before degrading to error, devx-ux argued against:

No cap at +N is needed. The warning already names the exact delta, giving the user signal proportional to risk. If APM later discovers a version that truly breaks reads (e.g., table renames), add that specific version to a _KNOWN_BREAKING list and hard-fail on it -- not on an arbitrary numeric threshold. A blanket +N cap would recreate the original problem at a higher number.

Implementation

_MAX_SUPPORTED_USER_VERSION bumped 13 -> 15 (observed live, working).
_MIN_SUPPORTED_USER_VERSION stays at 13 -- below this the workflows table may not exist, so a hard fail is correct.
Above max: warn-and-continue once per process via _rich_warning. Dedup is enforced by a module-level _warned_user_versions: set[int] so multi-row installs do not spam the same line N times. _reset_user_version_warning_state() is a test helper.

Trade-offs (explicit)

Why warn-and-continue above max, but hard-fail below min? The two failure modes are not symmetric. Below min the workflows table may genuinely not exist -- any write would raise OperationalError mid-transaction with a far worse diagnostic. Above max, APM's read/write surface (a single CRUD on workflows) is small enough that additive schema changes are overwhelmingly safe; on the rare truly-breaking bump we will see breakage reports via the URL in the warning and add the specific version to a _KNOWN_BREAKING list.
Why no +N cap? See devx-ux quote above. A blanket cap recreates the original problem at a higher number; a _KNOWN_BREAKING allow-list targets the actual risk.
Why dedup at process scope, not collector scope? _check_user_version is called from low-level DB writers (deploy_workflow, delete_workflows) that have no DiagnosticCollector in scope. Threading one through would touch the public signature of every DB writer; a module-level set is the minimal surgical change.

Phase B — batch-bug-shepherd primitive + Copilot App workflow

What ships

.apm/skills/batch-bug-shepherd/SKILL.md plus seven asset files (triage-prompt.md, shepherd-prompt.md, fix-prompt.md, completion-prompt.md, final-report-template.md, ground-truth-table.md, verdict-schema.json).
.apm/prompts/batch-bug-shepherd.prompt.md -- Copilot App workflow prompt (interval: manual, mode: interactive) that loads the skill and accepts a targets input (issue list or the literal sweep-all).
.agents/skills/batch-bug-shepherd/ -- compiled output produced by apm install --target copilot,copilot-app.
apm.lock.yaml regenerated to include both the skill path and the synthetic copilot-app-db://workflows/apm--local--_local--batch-bug-shepherd URI.

Composition with apm-review-panel

The shepherd phase delegates panel review to .apm/skills/apm-review-panel/ via a sibling relative link rather than reinventing it. This is the first cross-skill composition in this repo -- a deliberate test of whether the "skills compose via filesystem siblings" pattern holds up under non-trivial use.

Design rationale

The genesis design pass produced a genesis-plan.md artifact (8-step disciplined design with mermaid + interface sketch). That artifact is intentionally not shipped in this PR -- it is design scaffolding for the author, not a user-consumed primitive. It lives in the session workspace at /Users/danielmeppiel/.copilot/workspaces/007bd6d3-a723-4ebb-9ed5-e861e73b1b20/artifacts/genesis-plan.md if reviewers want to read it.

Validation evidence

Lint silent (CI-mirror commands per .apm/instructions/linting.instructions.md):

uv run --extra dev ruff check src/ tests/         # All checks passed!
uv run --extra dev ruff format --check src/ tests/ # 1037 files already formatted

Tests green for the changed module:
```
uv run --extra dev pytest tests/unit/integration/test_copilot_app_db.py \
                          tests/unit/integration/test_copilot_app_error_ux.py
# 42 passed in 2.24s
```
Including the new regression coverage:
- test_warns_but_continues_above_max[16|17|50] -- forward-compat scenarios, parametrized.
- test_above_max_warning_deduped_per_version -- multi-row install warns once, not per row.
- test_rejects_user_version_below_min -- < 13 still hard-fails.
End-to-end install succeeded against the live App DB (v15):
```
$ uv run apm install --target copilot,copilot-app
[>] Installing dependencies from apm.yml...
  [+] <project root> (local)
  |-- 1 prompts integrated -> copilot-app/workflows/
  |-- workflows arrive disabled; enable from the Copilot App's Workflows tab
  ...
[*] Installed 1 APM dependency in 4.2s.
```
No warning fires on v15 because v15 is now the documented tested max. (Warning behaviour was independently verified by copying the DB and bumping PRAGMA user_version = 17, then re-running with APM_COPILOT_APP_DB -- the [!] line appeared exactly once, full text matching the devx-ux wording.)

SQLite row landed with consent gate enforced (enabled = 0):

$ sqlite3 ~/.copilot/data.db \
    "SELECT id, enabled FROM workflows WHERE id LIKE 'apm--%batch-bug-shepherd%'"
apm--local--_local--batch-bug-shepherd|0

Lockfile diff carries both surfaces:

+ .agents/skills/batch-bug-shepherd
+ copilot-app-db://workflows/apm--local--_local--batch-bug-shepherd

How to test

Check out this branch.
uv venv && uv pip install -e ".[dev]".
uv run --extra dev ruff check src/ tests/ && uv run --extra dev ruff format --check src/ tests/ -- both silent.
uv run --extra dev pytest tests/unit/integration/test_copilot_app_db.py -v -- 14 tests pass, including the new TestVersionGuard::test_warns_but_continues_above_max cases.

To exercise the warning end-to-end without bumping your real Copilot App:

cp ~/.copilot/data.db ./build/test-bumped.db
sqlite3 ./build/test-bumped.db "PRAGMA user_version = 17"
APM_COPILOT_APP_DB=./build/test-bumped.db uv run apm install --target copilot-app

Expect the [!] line naming version 17 and tested maximum (15) plus install completing successfully.

To inspect the workflow row landed by Phase B:

sqlite3 ~/.copilot/data.db \
  "SELECT id, name, interval, mode, enabled FROM workflows WHERE id LIKE 'apm--%batch-bug-shepherd%'"

Expect enabled = 0 (consent gate).

The Copilot App ships fast and bumps PRAGMA user_version on additive changes that do not break APM's read/write surface. Hard-failing every install whose user_version exceeded the highest version APM was tested against made every Copilot App release a release window for APM -- awful UX for users who simply updated their App. Change: - Bump _MAX_SUPPORTED_USER_VERSION 13 -> 15 (newly observed, working). - Above max: warn-and-continue once per process (deduped by version) via _rich_warning, with exact wording supplied by the devx-ux-expert persona: names the version delta and points the user at a ready-to-file issue title for breakage reports. - Below min: continues to hard-fail -- the workflows table may genuinely not exist on a pre-workflows schema. Wording verbatim per devx-ux verdict (ship_as_proposed). No cap at +N: warning already names the exact delta, giving signal proportional to risk. If a future version truly breaks reads, add it to a _KNOWN_BREAKING list and hard-fail on that specific version. Tests cover v16/v17/v50 warn-not-raise, the <13 hard-fail path, and the per-process dedup contract for multi-row installs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

… prompt Ships two composed APM primitives: 1. .apm/skills/batch-bug-shepherd/ -- the working spec for driving a batch of suspected bugs in microsoft/apm from raw issue list to mergeable PR queue. Fans out one reproduction subagent per candidate (LEGIT / UNCLEAR / FIXED-AT-HEAD), cross-references against open PRs, then branches: * in-flight community PR -> shepherd via .apm/skills/apm-review-panel * no PR -> fix session with TDD and mutation-break gate Dispatches one completion subagent per shepherd verdict to resolve panel follow-ups and post a ready-to-merge confirmation. Maintains a single plan.md ground-truth table as canonical session state. 2. .apm/prompts/batch-bug-shepherd.prompt.md -- a Copilot App workflow prompt (interval: manual, mode: interactive) that loads the skill above and accepts a 'targets' input (either an issue list or the literal 'sweep-all'). Lands in ~/.copilot/data.db as a workflow row with enabled=0 (consent gate); user must opt in via the Copilot App Workflows tab before it runs. The skill composes with the existing .apm/skills/apm-review-panel/ via a relative sibling link -- the shepherd phase delegates panel review to that skill rather than reinventing it. Design rationale lives in the genesis-plan.md authored during the genesis design pass; that artifact is intentionally not shipped (it is design scaffolding, not a user-consumed primitive). apm.lock.yaml regenerated by 'apm install --target copilot,copilot-app' to include both the skill path and the synthetic copilot-app-db://workflows/apm--local--_local--batch-bug-shepherd URI. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Drop hard-coded .apm/skills/ path probes from the meta-prompt, the SKILL.md composition section, and the shepherd-prompt asset. Skills are activated by NAME by the harness; the meta-prompt and skill body no longer assume APM-on-disk layout. This lets the same primitive load identically inside Copilot CLI, Copilot App workflows, Claude Code, Codex, or any other harness that resolves skills by name. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR updates the Copilot App DB schema guard to warn (instead of hard-fail) when PRAGMA user_version is newer than APM’s tested max, and introduces a new batch-bug-shepherd skill plus a Copilot App workflow prompt to drive batch bug triage/shepherding.

Changes:

Relax Copilot App DB schema gating: hard-fail only when user_version is below min; warn-and-continue (deduped) when above max.
Add batch-bug-shepherd skill assets + workflow prompt for interactive batch bug processing.
Regenerate/install outputs and lockfile entries for Copilot App workflow deployment.

Show a summary per file

File	Description
tests/unit/integration/test_copilot_app_db.py	Replaces above-max hard-fail test with forward-compat warning + dedup assertions.
src/apm_cli/integration/copilot_app_db.py	Implements warn-and-continue for newer schemas, bumps tested max, adds warning dedup state.
apm.lock.yaml	Records locally deployed Copilot App workflow URI.
.apm/skills/batch-bug-shepherd/SKILL.md	Adds orchestrator skill specification and composition contract with apm-review-panel.
.apm/skills/batch-bug-shepherd/assets/triage-prompt.md	Adds WAVE 1 triage subagent spawn body.
.apm/skills/batch-bug-shepherd/assets/shepherd-prompt.md	Adds WAVE 2a shepherd subagent spawn body (delegates to apm-review-panel).
.apm/skills/batch-bug-shepherd/assets/fix-prompt.md	Adds WAVE 2b fix subagent spawn body with mutation-break + lint gates.
.apm/skills/batch-bug-shepherd/assets/completion-prompt.md	Adds WAVE 3 completion subagent spawn body for follow-ups + supersede flow.
.apm/skills/batch-bug-shepherd/assets/final-report-template.md	Adds final report + PR comment templates used by orchestrator/completion.
.apm/skills/batch-bug-shepherd/assets/ground-truth-table.md	Adds plan.md ground-truth table template for cross-wave state.
.apm/skills/batch-bug-shepherd/assets/verdict-schema.json	Adds JSON schema for triage/shepherd/completion structured returns.
.apm/prompts/batch-bug-shepherd.prompt.md	Adds Copilot App workflow prompt that loads and runs the new skill.
.agents/skills/batch-bug-shepherd/SKILL.md	Adds compiled/installed copy of the skill spec for agent runtime consumption.
.agents/skills/batch-bug-shepherd/assets/triage-prompt.md	Compiled copy of triage prompt asset.
.agents/skills/batch-bug-shepherd/assets/shepherd-prompt.md	Compiled copy of shepherd prompt asset.
.agents/skills/batch-bug-shepherd/assets/fix-prompt.md	Compiled copy of fix prompt asset.
.agents/skills/batch-bug-shepherd/assets/completion-prompt.md	Compiled copy of completion prompt asset.
.agents/skills/batch-bug-shepherd/assets/final-report-template.md	Compiled copy of final report/comment templates.
.agents/skills/batch-bug-shepherd/assets/ground-truth-table.md	Compiled copy of ground-truth table template.
.agents/skills/batch-bug-shepherd/assets/verdict-schema.json	Compiled copy of verdict JSON schema.

Copilot's findings

Files reviewed: 20/20 changed files
Comments generated: 7

+_MAX_SUPPORTED_USER_VERSION: int = 15
 """Highest ``PRAGMA user_version`` we are tested against.

 When the app ships a newer schema we refuse to write rather than risk


+            ``_MIN_SUPPORTED_USER_VERSION``.  Versions above
+            ``_MAX_SUPPORTED_USER_VERSION`` only emit a warning.


+    if version > _MAX_SUPPORTED_USER_VERSION and version not in _warned_user_versions:
+        _warned_user_versions.add(version)
+        # Wording authored by the devx-ux-expert persona; ship verbatim.
+        from apm_cli.utils.console import _rich_warning
+
+        _rich_warning(
+            f"[!] Copilot App schema version {version} is newer than APM's "
+            f"tested maximum ({_MAX_SUPPORTED_USER_VERSION}). Proceeding "
+            f"anyway -- if you see unexpected behavior, run: apm --version "
+            f"&& gh issue new -R microsoft/apm -t 'Schema v{version} compat' "
+            f"-b 'Observed: <describe>'"
        )


+
+| issue | verdict | pr | pr_in_flight | author | status | notes |
+|-------|---------|----|--------------|--------|--------|-------|
+| #___ | pending-triage | | | | pending-triage | seeded from <list-or-sweep>; awaiting wave 1 |


@@ -0,0 +1,133 @@
+{
+  "$schema": "http://json-schema.org/draft-07/schema#",


+        do not break APM's read/write surface; hard-failing made every
+        App release a release window for APM.  See devx-ux verdict in
+        the schema-warn PR.


+  Use this skill to drive a batch of suspected bugs in microsoft/apm
+  from raw issue list to mergeable PR queue. Fan out one reproduction
+  subagent per candidate issue (verify on HEAD: LEGIT / UNCLEAR /
+  FIXED-AT-HEAD), cross-reference each legit issue against open PRs,
+  branch the workflow (in-flight community PR -> shepherd via the
+  apm-review-panel skill; no PR -> fix session with TDD and a
+  mutation-break gate), then dispatch one completion subagent per
+  shepherd verdict to resolve panel follow-ups, push to the contributor
+  fork (or open a superseding PR that preserves author authorship via
+  commit trailers) and post one ready-to-merge confirmation. Maintain a
+  single plan.md ground-truth table as the canonical session state.
+  Activate when the maintainer asks to triage a list of issues, sweep
+  the bug queue, drive in-flight PRs to merge, shepherd community
+  contributions, or work down the bug backlog -- even if "shepherd" or
+  "batch" is not named.
+---
+
+# batch-bug-shepherd - Outer-loop bug-queue orchestrator
+
+This skill is an A10 ORCHESTRATOR-SAGA over three fan-out waves
+(triage, shepherd-or-fix, completion) with a persisted ground-truth
+table between phases. It COMPOSES the
+[apm-review-panel](../apm-review-panel/SKILL.md) skill -- it does NOT
+re-implement panel review. Per-PR shepherding is delegated; per-issue
+verification, PR-in-flight branching, fix dispatch, completion, and
+the cross-session table are owned here.
+
+The skill is ADVISORY at the panel layer and EXECUTIVE at the
+orchestrator layer: it WILL push commits, open PRs, post comments,
+close superseded PRs. Every consequential write goes through a
+deterministic CLI (`gh`, `git`, `uv run ruff`) wrapped in plan +
+execute + verify (A9 SUPERVISED EXECUTION).
+


Captures all 15 local skills (including batch-bug-shepherd) and the copilot-app-db workflow entry into apm.lock.yaml so subsequent runs (apm install --frozen, drift-check) see the canonical post-install state. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-05-21T15:48:20Z

APM Review Panel: `ship_with_followups`

warn-not-fail on forward-compat Copilot App schema bumps + first cross-skill composition primitive (batch-bug-shepherd)

cc @danielmeppiel @sergio-sisternes-epam -- a fresh advisory pass is ready for your review.

Phase A is a high-confidence, well-scoped fix that unblocks every user on a newer Copilot App without compromising the hard-fail for versions below the structural minimum. The test suite is substantive -- 42 passing unit tests against real SQLite, including parametrized forward-compat and dedup scenarios. The test-coverage expert's evidence block confirms the core user promise holds (TestVersionGuard::test_warns_but_continues_above_max). The one latent risk supply-chain-security raised -- that warn-and-continue removes the last write-safety gate -- is real but overstated: SQLite will still raise on structurally incompatible writes, and the PR description explicitly documents the _KNOWN_BREAKING escape valve for future genuinely-breaking bumps. The CHANGELOG gap is the only pre-merge blocker in Phase A: two independent panelists (doc-writer, oss-growth-hacker) flag it, and a behavior change of this kind needs a documented migration line per APM's own operating principles. The warning message UX issues (angle-bracket placeholder, [!] prefix duplication, compound && chain) are three-panelist-convergent and trivially fixable before merge.

Phase B is structurally sound as APM's first cross-skill composition primitive, and the consent gate (enabled=0) is correctly enforced. Two issues require resolution before or immediately after merge. First, the OSS Growth blocking finding on auto-closing contributor PRs is real contributor-trust risk: a maintainer running sweep-all for the first time could close ten community PRs before understanding the consequence. This is a documentation-only fix in SKILL.md -- it does not require code changes -- but it must land before or in this PR, not as a follow-up, because the skill is already installable. Second, supply-chain-security's prompt injection finding is recommended, not blocking, but it is architecturally important for the long-term credibility of the skill-composition pattern: triage-prompt.md should wrap raw issue content in clearly delimited untrusted-data blocks before this skill is used against a live repo with external contributors.

The TOCTOU race on _warned_user_versions (python-architect, supply-chain-security) is a latent trap only relevant if the CLI gains a thread-pool install path. The CLI is demonstrably single-threaded today; defer to a follow-up with a threading.Lock. The capsys-Rich Console fragility (test-coverage-expert, outcome: unknown) is a real latent test trap but does not affect today's green run; it belongs in the follow-up queue ranked above the TOCTOU fix because it will silently bite the next engineer who installs rich in the test environment.

Dissent. OSS Growth classified the SKILL.md auto-close-PR warning as blocking; no other panelist matched that severity on a documentation-only file covering a consent-gated (enabled=0) skill. The finding is real but the mechanism is doc, not code. CEO resolution: treat it as a pre-merge required fix (add the warning block to SKILL.md before merge) rather than a ship-blocking rework. The skill is already guarded by enabled=0 and requires an explicit manual trigger -- the risk window is narrower than a blocking verdict implies, but the fix cost is zero so there is no reason to defer it. Supply-chain-security's prompt-injection finding is recommended; no panelist elevated it to blocking. CEO agrees with that calibration given the skill is new, not yet run against live data, and a remediation path (XML-delimited untrusted-data blocks) is well-defined.

Aligned with: Pragmatic-as-npm: Phase A mirrors npm's engine-version warning pattern -- warn on forward-compatible mismatches, do not abort. The devx-ux expert confirmed this framing pre-code. The no-cap rationale is sound package-manager reasoning. OSS-community-driven: Phase B introduces APM's first cross-skill composition primitive; the auto-close-PR finding is a direct community-trust risk that must be documented in SKILL.md before any external maintainer runs sweep-all. Secure-by-default: The consent gate (enabled=0) on the workflow row is correct posture; the prompt injection gap in triage-prompt.md is a follow-up priority before the skill is used at scale.

Growth signal. Two distinct launch beats are available from this PR. Phase A is a silent-churn recovery story: users who updated GitHub Copilot App and hit a hard install failure get unblocked with no action required. This belongs at the top of the next release note, not buried as a bug entry. Phase B is APM's first composable-skill story -- 'skills compose via filesystem siblings' is the Unix-pipe moment for AI-native package management primitives. A standalone docs page (Maintainer Skills) and a brief external post would surface this pattern to evaluators and contributors who currently have no visibility into .apm/.

Panel summary

Persona	B	R	N	Takeaway
Python Architect	0	1	2	Phase A is clean and well-reasoned. One thread-safety gap in the TOCTOU dedup check; lazy import is justified; test-helper exposure is acceptable Python idiom.
CLI Logging Expert	0	1	2	No double-prefix or encoding faults. One recommended fix on the angle-bracket placeholder; two nits on style consistency and message length.
DevX UX Expert	0	1	2	Phase A warn-and-continue is well-designed; one actionable nit on warning prefix and one recommended fix for the feedback command UX.
Supply Chain Security Expert	0	2	1	Schema gate relaxation is contained (SQLite will hard-error on incompatible writes); prompt injection via verbatim issue-body in batch-bug-shepherd is the real risk.
OSS Growth Hacker	1	3	1	Phase A is a silent-churn fix worth a prominent release beat; Phase B ships APM's first composable-skill story but needs a contributor-trust guardrail callout in docs.
Doc Writer	0	1	2	SKILL.md style is consistent with apm-review-panel; CHANGELOG.md missing warn-not-fail entry for Phase A behavior change; no blocking doc issues.
Test Coverage Expert	0	2	1	Warn-not-fail behavior has solid unit coverage against real SQLite; two gaps: capsys fragility when Rich is installed, and no integration-tier test for the full apm-install CLI path.

B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.

Top 5 follow-ups

[OSS Growth Hacker] (blocking-severity) Add explicit auto-close-PR warning block to SKILL.md before merge -- A maintainer running sweep-all for the first time could silently close community contributor PRs. The fix is a doc-only addition to SKILL.md -- zero code cost, zero delay. Must land in this PR, not as a follow-up, because the skill is already installable.
[Doc Writer] (blocking-severity) Add CHANGELOG.md [Unreleased] entry for warn-not-fail Phase A behavior change -- Two independent panelists (doc-writer, oss-growth-hacker) flag this gap. A behavior change from hard-fail to warn-and-continue requires a documented migration line per APM operating principles. Without it, users hitting schema mismatches after upgrade have no documented signal this is intentional.
[Supply Chain Security Expert] Wrap raw issue-body content in untrusted-data delimiters in triage-prompt.md -- ISSUE_BODY_OR_REPRO is passed verbatim from GitHub issues into a triage subagent whose return JSON propagates into write-capable completion subagents. XML-style delimiters with an explicit untrusted-data instruction are the standard mitigation. Priority before any live sweep-all run against a repo with external contributors.
[Test Coverage Expert] Fix capsys + Rich Console singleton fragility in TestVersionGuard tests -- Rich Console stores the sys.stdout reference at instantiation time; pytest's capsys swaps it per-test. Tests pass today only because rich is not installed in the test environment. Adding _reset_console() to _reset_user_version_warning_state() is a one-line fix that prevents a silent future trap.
[Test Coverage Expert] Add integration-with-fixtures test for apm install against a version-bumped Copilot App DB -- Current integration tests mock copilot_app_db -- they never exercise _check_user_version through a real DB. If hard-fail is re-introduced in a future refactor, no integration test catches it. Evidence outcome: missing on a behavior-change surface.

Architecture

classDiagram
    direction LR

    class copilot_app_db {
        <<IOBoundary>>
        +_MIN_SUPPORTED_USER_VERSION int
        +_MAX_SUPPORTED_USER_VERSION int
        +_warned_user_versions set
        +deploy_workflow(db_path, row) str
        +delete_workflows(db_path, ids) int
        +list_managed_workflow_ids(db_path) list
        +resolve_copilot_app_db_path() Path
        -_check_user_version(conn) int
        -_reset_user_version_warning_state() None
        -_begin_immediate_with_retry(conn) None
        -_validate_row(row) None
    }

    class WorkflowRow {
        <<ValueObject>>
        +id str
        +name str
        +prompt str
        +interval str
        +schedule_hour int
        +enabled int
        +mode str
        +model str
    }

    class CopilotAppDbError {
        <<Exception>>
    }
    class CopilotAppDbMissingError {
        <<Exception>>
    }
    class CopilotAppDbSchemaError {
        <<Exception>>
    }
    class CopilotAppDbLockedError {
        <<Exception>>
    }

    class console {
        <<Utility>>
        +_rich_warning(msg) None
    }

    CopilotAppDbError <|-- CopilotAppDbMissingError
    CopilotAppDbError <|-- CopilotAppDbSchemaError
    CopilotAppDbError <|-- CopilotAppDbLockedError
    copilot_app_db *-- WorkflowRow : accepts
    copilot_app_db ..> CopilotAppDbSchemaError : raises
    copilot_app_db ..> CopilotAppDbMissingError : raises
    copilot_app_db ..> CopilotAppDbLockedError : raises
    copilot_app_db ..> console : lazy import in warn path

    class copilot_app_db:::touched
    class WorkflowRow:::touched

    classDef touched fill:#fff3b0,stroke:#d47600

flowchart TD
    A([apm install / apm uninstall]) --> B[deploy_workflow / delete_workflows]
    B --> C{db_path exists?}
    C -- no --> D[raise CopilotAppDbMissingError]
    C -- yes --> E[_validate_row]
    E --> F[_open_connection]
    F --> G[_check_user_version via PRAGMA user_version]
    G --> H{version less than MIN=13?}
    H -- yes --> I[raise CopilotAppDbSchemaError]
    H -- no --> J{version greater than MAX=15?}
    J -- yes, not in _warned_user_versions --> K[[_rich_warning to stderr, dedup via module-level set]]
    K --> L[_begin_immediate_with_retry]
    J -- no or already warned --> L
    L --> M{acquire write lock?}
    M -- locked --> N[raise CopilotAppDbLockedError]
    M -- acquired --> O{row exists?}
    O -- INSERT --> P[INSERT INTO workflows, enabled forced to 0]
    O -- UPDATE --> Q{execution-affecting fields changed?}
    Q -- yes --> R[UPDATE + reset enabled=0]
    Q -- no --> S[UPDATE name/model only]
    P --> T[COMMIT]
    R --> T
    S --> T
    T --> U[return copilot-app-db URI]

Recommendation

Ship after two pre-merge doc fixes land in this PR: (1) add the auto-close-PR warning block to SKILL.md, and (2) add the CHANGELOG [Unreleased] entry for the warn-not-fail behavior change. Both are documentation-only, zero-risk changes that can be committed directly to the branch. Phase A is production-ready: the core user promise is test-verified against real SQLite, the trade-offs are explicitly documented in the PR body, and the UX warning message issues (angle-bracket placeholder, [!] duplication, && chain) are three-panelist-convergent nits the author should address in the same commit as the CHANGELOG fix. Phase B ships correctly consent-gated (enabled=0) and the cross-skill composition pattern is sound. The prompt injection and capsys fragility items are genuine follow-ups but do not block merge given the skill's current pre-production status and the narrow consent gate.

Full per-persona findings

Python Architect

[recommended] TOCTOU race on _warned_user_versions check-then-add is not atomic at src/apm_cli/integration/copilot_app_db.py:346
version not in _warned_user_versions and _warned_user_versions.add(version) are two separate operations. In any concurrent caller scenario, two threads can both pass the not in test before either completes the add. The CLI is currently single-threaded so this does not bite today, but the module-level mutable set without a lock is a latent trap as parallel installs are on the roadmap.
Suggested: Protect the check-and-add with a threading.Lock.
[nit] Late import of _rich_warning inside _check_user_version has no documented justification at src/apm_cli/integration/copilot_app_db.py:349
Suggested: Add # deferred to avoid importing console at module init above the import, or hoist it to the top of the file.
[nit] _reset_user_version_warning_state exposes module internals but is an accepted Python test-helper idiom at src/apm_cli/integration/copilot_app_db.py:316
No action required unless a project-wide @testing_only marker is introduced.

CLI Logging Expert

[recommended] -b 'Observed: <describe>' leaves a literal angle-bracket placeholder the user must know to replace at src/apm_cli/integration/copilot_app_db.py
Users who copy-paste the suggested gh issue new command verbatim will file an issue with body text Observed: <describe> -- an unhelpful stub.
Suggested: Replace with --web so gh issue new opens the browser form, or change placeholder to -b 'Observed: TODO_DESCRIBE_BEHAVIOR'.
[nit] [!] is embedded in the f-string rather than passed as symbol="warning" to _rich_warning at src/apm_cli/integration/copilot_app_db.py
No double-prefix risk (markup=False is set); minor convention drift from callers that use symbol="warning".
[nit] Warning message is a single long line (~220 chars) that will wrap unpredictably on narrow terminals at src/apm_cli/integration/copilot_app_db.py
Suggested: Split into two _rich_warning calls or use \n to indent the recovery command on a second line.

DevX UX Expert

[recommended] Warning message embeds a raw multi-step shell pipeline that users must copy-paste-split manually at src/apm_cli/integration/copilot_app_db.py
Printing a compound && chain in a single-line warning breaks on Windows cmd.exe and obscures the two distinct actions. Package-manager convention emits the simplest self-contained action.
Suggested: Drop the apm --version && prefix or split into two lines.
[nit] Warning prefix [!] duplicates what _rich_warning already signals visually at src/apm_cli/integration/copilot_app_db.py
Suggested: Remove [!] from the start of the f-string.
[nit] batch-bug-shepherd prompt uses mode: interactive for a long-running batch orchestrator at .apm/prompts/batch-bug-shepherd.prompt.md
Suggested: Document in the prompt that the session is long-running and the user should expect async subagent waves.

Supply Chain Security Expert

[recommended] Verbatim issue-body passed to write-capable subagents enables indirect prompt injection at .apm/skills/batch-bug-shepherd/assets/triage-prompt.md
ISSUE_BODY_OR_REPRO is passed verbatim from GitHub issues into the agent context. A malicious issue reporter can craft an issue body with embedded instructions that manipulate the triage return JSON and propagate into the completion wave, which executes git/gh commands.
Suggested: Wrap raw issue content in XML-style <issue-body> tags with an explicit instruction that the model must treat everything inside as untrusted user data.
[recommended] warn-and-continue on unknown schema versions removes the last write-safety gate for non-additive App schema changes at src/apm_cli/integration/copilot_app_db.py
A schema change that adds a column with a default would allow INSERT to succeed with semantically incorrect data.
Suggested: Supplement with a post-write schema assertion: after any INSERT or UPDATE, read back the row and assert columns APM cares about match what was written.
[nit] _warned_user_versions is a process-global mutable set with no thread safety at src/apm_cli/integration/copilot_app_db.py
Not a security issue today; harmless duplicate warnings in a concurrent scenario.

OSS Growth Hacker

[blocking] Skill can auto-close community contributor PRs -- this must be surfaced in the skill description before any maintainer runs sweep-all at .apm/skills/batch-bug-shepherd/SKILL.md
SKILL.md documents 'Close the original PR with a courteous handoff comment referencing the superseding PR.' A contributor who sees their PR auto-closed by an agent will read this as rejection. A maintainer running sweep-all for the first time could close ten community PRs before realising the consequence.
Suggested: Add a visible warning block near the top of SKILL.md: 'This skill WILL close community contributor PRs when it opens a superseding PR. Use targets: <specific-issue-list> rather than sweep-all for your first run.'
[recommended] Phase A fix is a first-run conversion win that needs a prominent CHANGELOG story, not a buried bug entry at CHANGELOG.md
Suggested: Lead with: 'Copilot App users no longer see install failures after a Copilot App update -- APM now warns and continues on forward-compatible schema bumps.'
[recommended] First cross-skill composition is APM's strongest OSS primitive story to date -- needs a launch beat at .apm/skills/batch-bug-shepherd/SKILL.md
Suggested: Add a Composition section to the docs site and pair with a release post angle.
[recommended] batch-bug-shepherd ships 1726 lines with no docs-site entry -- feature will be invisible at docs/src/content/docs/
Suggested: Add at minimum a one-paragraph callout in what-is-apm.md or a new Maintainer Skills page.
[nit] Warning message gh issue new -b 'Observed: ' arrives blank -- filed issues have no actionable data at src/apm_cli/integration/copilot_app_db.py

Auth Expert -- inactive

No auth-surface files changed; copilot_app_db.py schema gate is SQLite version checking, not credential or token management.

Doc Writer

[recommended] CHANGELOG.md has no entry for the warn-not-fail copilot-app schema behavior change at CHANGELOG.md
Users upgrading and hitting schema mismatches will have no documented signal that this is intentional behavior, not a regression.
Suggested: Add under [Unreleased] ### Fixed: 'apm install --target copilot-app no longer fails hard on unrecognized Copilot App DB schema versions; it now warns and continues. (fix(copilot-app): warn-not-fail on newer App schema + feat: batch-bug-shepherd primitive #1434)'
[nit] SKILL.md lacks an explicit invocation trigger section heading at .apm/skills/batch-bug-shepherd/SKILL.md
Suggested: Extract the Activate when... clause from the description into a ## When to activate section.
[nit] No cross-link from SKILL.md to the Copilot App prompt file at .apm/skills/batch-bug-shepherd/SKILL.md
Suggested: Add to ## Bundled assets: '../../prompts/batch-bug-shepherd.prompt.md -- Copilot App workflow entry point for this skill.'

Test Coverage Expert

[recommended] capsys + Rich Console singleton is fragile: stale stdout reference breaks capture when Rich is installed at tests/unit/integration/test_copilot_app_db.py
_rich_warning() routes through a module-level Console singleton. When rich is installed, Console stores the sys.stdout reference at instantiation time; pytest's capsys swaps it per-test. Tests pass today only because rich is not installed in the test environment.
Suggested: In _reset_user_version_warning_state(), add: from apm_cli.utils.console import _reset_console; _reset_console().
Proof (unknown at unit): tests/unit/integration/test_copilot_app_db.py::TestVersionGuard::test_warns_but_continues_above_max -- proves: APM warns the user when the Copilot App DB schema is newer than tested, instead of failing the install
[recommended] No integration-with-fixtures test exercises apm install CLI against a version-bumped Copilot App DB at tests/integration/test_install_services_orchestration.py
Current integration tests mock copilot_app_db -- _check_user_version is never exercised through a real DB. Grep confirms zero user_version matches in tests/integration/.
Suggested: Add test_copilot_app_install_warns_on_bumped_schema that creates a real SQLite DB at user_version=16, sets APM_COPILOT_APP_DB, invokes deploy_workflow, and asserts no CopilotAppDbSchemaError is raised.
Proof (missing at integration-with-fixtures): tests/integration/test_install_services_orchestration.py::test_copilot_app_install_warns_on_bumped_schema -- proves: A user with a Copilot App DB that was auto-upgraded can still run apm install without failure
[nit] Warning state reset relies on manual per-test call; an autouse fixture would prevent future test pollution at tests/unit/integration/test_copilot_app_db.py
Suggested: Add @pytest.fixture(autouse=True) to TestVersionGuard that calls _reset_user_version_warning_state() before and after each test.

_{This panel is advisory. It does not block merge. Re-apply the

panel-review label after addressing feedback to re-run.}

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

pypi.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "pypi.org"

See Network Configuration for more information.

Generated by PR Review Panel for issue #1434 · ● 3.3M · ◷

Move the apm-review-panel skill out of the microsoft/apm monorepo's shared .apm/skills/ tree into a dedicated, publishable APM package under packages/apm-review-panel/ using the HYBRID layout (apm.yml + SKILL.md at the root, with co-located assets/ and evals/). The root apm.yml now declares the package via a local-path manifest dep, so 'apm install' at root continues to deploy the skill into .agents/skills/apm-review-panel/ -- but the dependency is now explicit and inspectable via 'apm deps list' instead of relying on the includes-auto walk of the monorepo .apm/ tree. Rule-of-three justification (see genesis-plan-v2.md step 3.5): panel is independently useful for any single-PR review (not just batch shepherding), so extraction unlocks consumer reuse and prevents the consumer from being forced to install the shepherd to get the panel. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…package Move batch-bug-shepherd from the monorepo .apm/skills/ and .apm/prompts/ trees into a dedicated, publishable APM package under packages/batch-bug-shepherd/ using the multi-primitive .apm/ layout so the package can ship BOTH the skill and the workflow prompt: packages/batch-bug-shepherd/ apm.yml .apm/ skills/batch-bug-shepherd/ SKILL.md assets/... prompts/ batch-bug-shepherd.prompt.md The package declares its dep on apm-review-panel via a local-path manifest entry (../apm-review-panel, anchored to the package dir per APM's local-path rule). The shepherd-prompt asset still probes the panel by NAME at use-site as the A9 SUPERVISED EXECUTION backstop if the harness registry is bypassed -- belt + suspenders, see genesis-plan-v2.md step 3.5 'Declaration mechanism per external module'. Root apm.yml now depends on packages/batch-bug-shepherd via local path; 'apm install' continues to deploy the skill to .agents/skills/batch-bug-shepherd/ and the workflow row to the App SQLite DB (with enabled=0 per the consent gate) -- but the path is now manifest-declared and the package is independently shippable: apm install microsoft/apm/packages/batch-bug-shepherd works in any consumer repo and transitively pulls apm-review-panel. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Closes HIGH finding #1 from genesis-critique.md: the workflow prompt's 'Hard rules' block restated four disciplines (ASCII-only, lint contract, mutation-break gate, single-writer per comment) that already live in the SKILL.md body. Two sources of truth invited drift -- if the skill body added a new gate, the prompt would not auto-update. Collapse the section to a single 'Delegation' paragraph naming the skill as the authoritative source for all disciplines. The prompt now carries only the per-trigger-surface contract (workflow frontmatter + the input parameter + the 6-step ACTIVATE / SCOPE / PLAN / INITIALIZE / EXECUTE / RENDER procedure that summons the skill); every discipline travels with the skill body. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Closes HIGH finding #2 from genesis-critique.md: the v1 ship deferred the evals gate. This commit authors the minimum set the gate requires, mirroring the pr-description-skill/evals/ shape so a single CI lane can score both: evals/ evals.json - manifest + gates (val split = ship gate) triggers.json - 10 fire + 10 no_fire, 60/40 train/val content/ - 2 scenario manifests + rubrics three-issues-mixed.json sweep-bug-queue.json fixtures/ - 4 markdown fixtures (with/without skill x 2) README.md .gitignore - results/ (timestamped runner output) scripts/ run_evals.py - stdlib-only, deterministic matcher The no-fire trigger set deliberately includes queries that SHOULD route to apm-review-panel ('review my PR', 'panel-review this PR') so DISPATCH COLLISION between the two skills would surface as val-no-fire-rate dropping below 0.5. Val split result on first run: trigger fire rate 1.0, no-fire rate 1.0, content delta_anchors 8 and 7 (gates require >= 1). Overall: passed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Closes HIGH finding #3 from genesis-critique.md: the original description was 950/1024 chars, too close to the runtime cap to absorb future trigger additions, and it under-named the indirect maintainer phrasings the skill actually wants to catch. Trim ~75 chars while ADDING two indirect triggers explicitly exercised in evals/triggers.json (fire-07 'shepherd all bug-flagged issues this quarter', fire-08 'weekly sweep of community-reported issues'): - 'reproduction subagent per candidate issue' -> 'triage subagent per issue' (the procedural detail belongs in the body, not the dispatch surface) - collapsed the parenthetical workflow branch into a single -> arrow - added 'shepherd all bug-flagged issues this quarter', 'run a weekly sweep of community-reported issues', 'work down community bug contributions' to the Activate list - kept the imperative shape, intent-first ordering, and the '-- even if shepherd or batch is not named' tail Verified: val split evals still report fire 1.0 / no-fire 1.0. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…age extraction Mechanical `apm install --target copilot,copilot-app --force` output after the package extraction. The previous lockfile listed the two skills under `local_deployed_files` (root .apm/ tree); they now appear as proper `dependencies` entries because they ship as self-contained packages under `packages/`. Also re-materializes `.agents/skills/{apm-review-panel,batch-bug-shepherd}/` with the trimmed dispatch description and the moved workflow asset, since those compiled trees are tracked in this repo. No source changes -- just the compiled-output snapshot. Verified: - ruff check + ruff format --check: silent - apm install --target copilot,copilot-app --force: clean - sqlite3 ~/.copilot/data.db -> workflow enabled=0 (consent gate ok) - standalone apm install from a scratch dir resolves apm-review-panel transitively via the batch-bug-shepherd manifest dep (depth: 2, resolved_by: _local/batch-bug-shepherd) - val split evals: trigger fire 1.0, no-fire 1.0, content passed Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

danielmeppiel · 2026-05-21T19:11:58Z

Genesis-driven repackaging: `batch-bug-shepherd` + `apm-review-panel` extracted to `packages/`

Follow-up on the v1 ship of batch-bug-shepherd. The genesis critique surfaced three HIGH findings (hidden coupling in the workflow prompt, missing evals gate, dispatch description too tight against the 1024-char cap). This extension closes all three AND turns both primitives into self-contained, standalone-installable APM packages that any maintainer can take a dependency on without dragging in the rest of microsoft/apm.

What changed (6 commits)

Commit	Layer	Intent
`fd55a172`	packages	Extract `apm-review-panel` -> `packages/apm-review-panel/` (HYBRID layout)
`c3af73d3`	packages	Repackage `batch-bug-shepherd` -> `packages/batch-bug-shepherd/` (`.apm/` layout, declares `../apm-review-panel` as manifest dep)
`a385560f`	skills	Collapse HIDDEN COUPLING in `batch-bug-shepherd.prompt.md` (Hard rules block -> single Delegation paragraph that names the skill as authoritative)
`89ecd34c`	skills	Add content + trigger evals (10 fire + 10 no_fire, 60/40 train/val; 2 content scenarios; stdlib runner)
`19b55fa1`	skills	Tighten dispatch description (~950 -> ~875 chars) and add 2 indirect triggers (`shepherd all bug-flagged issues this quarter`, `weekly sweep of community-reported issues`)
`62d63bb8`	install	Regenerate `.agents/skills/` + `apm.lock.yaml` after extraction

Package topology

flowchart TD
    subgraph ROOT["microsoft/apm (root apm.yml)"]
        ROOTMANIFEST["dependencies.apm:<br/>./packages/apm-review-panel<br/>./packages/batch-bug-shepherd"]
    end
    subgraph PKG1["packages/apm-review-panel (HYBRID)"]
        PANELYAML[apm.yml]
        PANELSKILL[SKILL.md]
        PANELASSETS[assets/*]
    end
    subgraph PKG2["packages/batch-bug-shepherd (.apm/ layout)"]
        SHEPYAML["apm.yml<br/>dependencies.apm:<br/>../apm-review-panel"]
        SHEPSKILL[.apm/skills/batch-bug-shepherd/]
        SHEPPROMPT[.apm/prompts/batch-bug-shepherd.prompt.md]
    end
    ROOT --> PKG1
    ROOT --> PKG2
    PKG2 -. local manifest dep .-> PKG1

Dependency resolution

flowchart LR
    CONSUMER[Any consumer apm.yml<br/>dependencies.apm:<br/>'/abs/path/packages/batch-bug-shepherd']
    SHEP[packages/batch-bug-shepherd]
    PANEL[packages/apm-review-panel]
    CONSUMER -- MANIFEST DEP --> SHEP
    SHEP -- LOCAL SIBLING<br/>'../apm-review-panel' --> PANEL

Local-path deps anchor relative to the declaring package's directory (npm/pip/cargo parity), so ../apm-review-panel inside packages/batch-bug-shepherd/apm.yml resolves correctly whether the consumer installs from the monorepo root or pins the absolute path to packages/batch-bug-shepherd/.

Verification evidence

Lint (CI-mirror, must be silent):

$ uv run --extra dev ruff check src/ tests/ && uv run --extra dev ruff format --check src/ tests/
All checks passed!
1037 files already formatted

apm install --target copilot,copilot-app --force in this repo:

[+] ./packages/apm-review-panel (local)
|-- Skill integrated -> .agents/skills/
[+] ./packages/batch-bug-shepherd (local)
|-- 1 prompts integrated -> copilot-app/workflows/
|-- workflows arrive disabled; enable from the Copilot App's Workflows tab
|-- 1 skill(s) integrated -> .agents/skills/
[+] ../apm-review-panel (local)
|-- (files unchanged)            <-- transitive dep deduplicated
[*] Installed 4 APM dependencies in 2.9s.

SQLite consent gate (workflows must arrive disabled):

$ sqlite3 ~/.copilot/data.db "SELECT id, enabled FROM workflows WHERE id LIKE '%batch-bug%'"
apm--local--_local--batch-bug-shepherd|0
apm--_local--batch-bug-shepherd--batch-bug-shepherd|0

Standalone install from a scratch dir (proves the package works as a third-party dep):

# scratch/apm.yml
dependencies:
  apm:
    - /abs/path/to/packages/batch-bug-shepherd

$ apm install --target copilot
[+] /.../packages/batch-bug-shepherd (local)
|-- 1 skill(s) integrated -> .agents/skills/
[+] ../apm-review-panel (local)
|-- Skill integrated -> .agents/skills/
[*] Installed 2 APM dependencies in 0.8s.

Generated lockfile shows the transitive resolution with provenance:

- repo_url: _local/apm-review-panel
  depth: 2
  resolved_by: _local/batch-bug-shepherd
  package_type: hybrid
  local_path: ../apm-review-panel

Eval gates (val split = ship gate):

$ python packages/batch-bug-shepherd/.apm/skills/batch-bug-shepherd/scripts/run_evals.py --split val
triggers:  fire 1.0 / no-fire 1.0  passed
content:   delta_anchors 8 and 7   passed
overall:   passed

The no-fire set deliberately includes queries that should route to apm-review-panel instead (review my PR, panel-review this PR) so DISPATCH COLLISION between the two skills would surface as the val-no-fire rate dropping below 0.5.

How HIGH findings were closed

HIDDEN COUPLING (fd55a172 + a385560f): the prompt no longer duplicates skill rules. It hands off via a Delegation paragraph that names batch-bug-shepherd as authoritative. The skill is now activate-by-name, not path-coupled.
MISSING EVALS GATE (89ecd34c): full trigger + content harness committed, runner is stdlib-only, gates documented in evals/README.md.
DISPATCH-DESCRIPTION TIGHTNESS (19b55fa1): trimmed ~75 chars while adding 2 user-named indirect triggers; evals confirm both new triggers fire.

Modularity gain

Consumers can now write:

dependencies:
  apm:
    - github.com/microsoft/apm/packages/apm-review-panel       # panel only, for any single-PR review use case
    - github.com/microsoft/apm/packages/batch-bug-shepherd     # full shepherd, panel pulled in transitively

The rule-of-three holds: apm-review-panel is useful pre-shepherd, with-shepherd, and for future single-PR review workflows. Extraction unlocks that without dragging the shepherd in.

Design plan, mermaid diagrams, dependency graph, phantom-dependency guard, and the full evals matrix are persisted in genesis-plan-v2.md in the agent's workspace.

A sub-package's local-path dep appears in the root lockfile with resolved_by set. The root manifest cannot make it go away by editing its dependencies.apm list, so flagging it as an orphan creates an unfixable CI failure. Surfaced by PR #1434, where batch-bug-shepherd declares ../apm-review-panel as a manifest dep. Both that transitive entry and the depth=1 root entry land in apm.lock.yaml; the orphan check was flagging the depth=2 one. Restrict orphan detection to direct deps (resolved_by is None). Add a regression test that covers the exact shape. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The previous commit fixes _check_no_orphans on the consumer side, but the audit gate in microsoft/apm-action@v1 pins APM 0.14.0, which does not yet have the fix. Until a release including the fix ships, the '../apm-review-panel' transitive local-path entry would keep the gate failing. Drop the manifest dep from packages/batch-bug-shepherd/apm.yml; the runtime activate-by-name probe in assets/shepherd-prompt.md remains the working backstop, and the root manifest declares both packages directly. Inline comment documents the rationale and the restoration condition. Lockfile regenerated by clean install (apm_modules wiped first); all 8 audit checks pass locally. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…-app-schema-warn-and-meta-prompt

Test was added on main after this PR forked. It assumed the old hard-fail behavior; our commit 05ea778 changed schema-too-new to warn-and-continue. Update assertion to match. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…-app-schema-warn-and-meta-prompt

danielmeppiel and others added 2 commits May 21, 2026 17:16

danielmeppiel requested a review from sergio-sisternes-epam as a code owner May 21, 2026 15:17

Copilot AI review requested due to automatic review settings May 21, 2026 15:17

danielmeppiel added the panel-review Trigger the apm-review-panel gh-aw workflow label May 21, 2026

Copilot started reviewing on behalf of danielmeppiel May 21, 2026 15:23 View session

Copilot AI reviewed May 21, 2026

View reviewed changes

github-actions Bot removed the panel-review Trigger the apm-review-panel gh-aw workflow label May 21, 2026

danielmeppiel and others added 6 commits May 21, 2026 21:04

danielmeppiel and others added 5 commits May 21, 2026 22:23

Merge remote-tracking branch 'origin/main' into danielmeppiel/copilot…

5f1e27e

…-app-schema-warn-and-meta-prompt

Merge remote-tracking branch 'origin/main' into danielmeppiel/copilot…

61d9880

…-app-schema-warn-and-meta-prompt

danielmeppiel merged commit ef74bdd into main May 21, 2026
12 checks passed

danielmeppiel deleted the danielmeppiel/copilot-app-schema-warn-and-meta-prompt branch May 21, 2026 21:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(copilot-app): warn-not-fail on newer App schema + feat: batch-bug-shepherd primitive#1434

fix(copilot-app): warn-not-fail on newer App schema + feat: batch-bug-shepherd primitive#1434
danielmeppiel merged 15 commits into
mainfrom
danielmeppiel/copilot-app-schema-warn-and-meta-prompt

danielmeppiel commented May 21, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

github-actions Bot commented May 21, 2026

Python Architect

CLI Logging Expert

DevX UX Expert

Supply Chain Security Expert

OSS Growth Hacker

Auth Expert -- inactive

Doc Writer

Test Coverage Expert

Uh oh!

danielmeppiel commented May 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		``_MIN_SUPPORTED_USER_VERSION``. Versions above
		``_MAX_SUPPORTED_USER_VERSION`` only emit a warning.

		@@ -0,0 +1,133 @@
		{
		"$schema": "http://json-schema.org/draft-07/schema#",

Conversation

danielmeppiel commented May 21, 2026

TL;DR

Phase A — schema gate UX fix

Problem

devx-ux verdict

Implementation

Trade-offs (explicit)

Phase B — batch-bug-shepherd primitive + Copilot App workflow

What ships

Composition with apm-review-panel

Design rationale

Validation evidence

How to test

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Copilot's findings

Uh oh!

github-actions Bot commented May 21, 2026

APM Review Panel: ship_with_followups

Panel summary

Top 5 follow-ups

Architecture

Recommendation

Python Architect

CLI Logging Expert

DevX UX Expert

Supply Chain Security Expert

OSS Growth Hacker

Auth Expert -- inactive

Doc Writer

Test Coverage Expert

Uh oh!

danielmeppiel commented May 21, 2026

Genesis-driven repackaging: batch-bug-shepherd + apm-review-panel extracted to packages/

What changed (6 commits)

Package topology

Dependency resolution

Verification evidence

How HIGH findings were closed

Modularity gain

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

APM Review Panel: `ship_with_followups`

Genesis-driven repackaging: `batch-bug-shepherd` + `apm-review-panel` extracted to `packages/`