
Integrate copilot runtime #2

Merged
danielmeppiel merged 9 commits into main from integrate-copilot-runtime
Sep 25, 2025


@danielmeppiel
Collaborator

Documentation and User Guidance Updates:

  • Updated all setup instructions and examples in README.md, docs/getting-started.md, docs/cli-reference.md, docs/runtime-integration.md, and related files to recommend and default to apm runtime setup copilot instead of Codex CLI. This includes new explanations, usage examples, and troubleshooting steps for Copilot CLI.

Copilot CLI Runtime Integration:

  • Added a new script scripts/runtime/setup-copilot.sh to automate installation, configuration, and environment setup for the GitHub Copilot CLI, including token detection, MCP config directory creation, and prerequisite checks for Node.js and npm versions.

Token Management and CLI Logic:

  • Updated src/apm_cli/core/token_manager.py to clarify token precedence for Copilot CLI, remove unused npm token handling, and document the new recommended token environment variables.
  • Adjusted the CLI (src/apm_cli/cli.py) to prioritize Copilot CLI when installing MCP dependencies, ensuring Copilot is checked and suggested before Codex or VSCode.

These changes make Copilot CLI the default and best-supported AI runtime for APM, streamline the onboarding experience, and ensure all documentation and automation scripts are consistent and up to date.

- Add src/apm_cli/adapters/client/copilot.py: MCP client adapter for Copilot CLI
- Add src/apm_cli/runtime/copilot_runtime.py: Runtime adapter for Copilot CLI execution
- Add scripts/runtime/setup-copilot.sh: Copilot CLI installation script

These are zero-risk additions as they are new files that don't modify existing code.
Ready for Phase 2: Runtime infrastructure integration.
- Update runtime factory to register CopilotRuntime as first priority
- Add copilot to supported runtimes in runtime manager
- Update runtime preference order: copilot → codex → llm
- Add npm-based removal logic for copilot runtime
- Export CopilotRuntime in __init__.py

Low risk changes that integrate core Copilot files into runtime system.
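
The preference order above (copilot → codex → llm) can be illustrated with a minimal sketch. It assumes each runtime is detected by looking for a binary of the same name on PATH; `pick_runtime` and `RUNTIME_PREFERENCE` are hypothetical names for illustration and are not the actual runtime factory API.

```python
import shutil
from typing import Callable, Optional, Sequence

# Preference order as stated in the commit message above.
RUNTIME_PREFERENCE = ("copilot", "codex", "llm")


def _on_path(name: str) -> bool:
    """Assumption: a runtime counts as available if its binary is on PATH."""
    return shutil.which(name) is not None


def pick_runtime(
    is_available: Callable[[str], bool] = _on_path,
    preference: Sequence[str] = RUNTIME_PREFERENCE,
) -> Optional[str]:
    """Return the first available runtime in preference order, else None."""
    for name in preference:
        if is_available(name):
            return name
    return None
```

Passing the availability check as a parameter keeps the ordering logic testable without any runtime actually being installed.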
- Verified 'apm install --runtime' option includes copilot first
- Confirmed 'apm runtime setup copilot' command works
- Verified runtime status shows copilot as highest priority
- Runtime detection logic already prioritizes copilot correctly
- Error messages already mention copilot CLI installation

No additional changes needed - CLI integration already complete from clean-main branch.
- Verified existing tests already support copilot runtime
- Added comprehensive test_copilot_runtime.py with 12 test cases
- Tests cover runtime detection, initialization, execution, error handling
- All existing runtime factory and detection tests pass with copilot
- Integration tests already handle copilot in multi-runtime scenarios

Low risk additions that provide comprehensive test coverage for Copilot runtime.
- Replace references from Codex to GitHub Copilot in README, CLI reference, and getting started guides.
- Modify setup scripts to install GitHub Copilot CLI with MCP configuration.
- Update token management to reflect the removal of GITHUB_NPM_PAT.
- Adjust integration tests to verify Copilot setup.
- Enhance example scripts in apm.yml for Copilot usage.
…e instantiation and enhance runtime info retrieval with mocked subprocess output.
danielmeppiel merged commit 55839cb into main on Sep 25, 2025
15 checks passed
danielmeppiel deleted the integrate-copilot-runtime branch on February 27, 2026 09:42
sergio-sisternes-epam referenced this pull request in sergio-sisternes-epam/apm Mar 2, 2026
- Use LockFile.read() instead of raw yaml.safe_load() in _collect_transitive_mcp_deps (#1)
- Guard against mcp:null in get_mcp_dependencies() (#2)
- Remove inline MCP installation pipeline, defer to follow-up PR (#3/microsoft#7)
- Remove redundant import builtins in _deduplicate_mcp_deps (microsoft#10)
- Add tests for mcp:null, mcp:[], root-over-transitive dedup order (microsoft#9)
- Remove tests for deleted inline pipeline functions
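
The `mcp: null` guard mentioned above might look roughly like the sketch below. It assumes the manifest has already been parsed into a dict with MCP entries under `dependencies.mcp`; the function name and the manifest layout are assumptions for illustration, not the referenced repository's code.

```python
from typing import Any, Dict, List


def get_mcp_dependencies_sketch(manifest: Dict[str, Any]) -> List[Any]:
    """Return the MCP dependency list, treating a missing or null key as empty.

    YAML parses `mcp:` with no value (or `mcp: null`) to None, so iterating
    over the raw value would fail without this guard.
    """
    dependencies = manifest.get("dependencies") or {}
    mcp = dependencies.get("mcp")
    if mcp is None:
        return []
    if not isinstance(mcp, list):
        raise ValueError("dependencies.mcp must be a list")
    return list(mcp)
```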
danielmeppiel added a commit that referenced this pull request Mar 31, 2026
- Narrow except Exception to except ImportError for lazy marketplace import (comment #1)
- Fix provenance key mismatch: use dep identity instead of canonical for lockfile lookup (comment #2)
- Include subdir in git-subdir source resolution with path traversal validation (comment #3)
- Include relative path in relative source resolution with traversal validation (comment #4)
- Sanitize marketplace name in cache file paths to prevent path traversal (comment #5)
- Fix docs: stale-if-error, not stale-while-revalidate (comment #6)
- Consolidate CHANGELOG entries into single line with (#503) (comment #7)
- Remove unused _SUPPORTED_SOURCE_TYPES set (comment #8)
- Let auth errors propagate in _auto_detect_path instead of swallowing (comment #9)
- Validate marketplace --name against [a-zA-Z0-9._-]+ charset (comment #10)
- Fix doc examples to use identifier-compatible names (comments #11, #12)
- Update tests to match corrected resolver behavior, add traversal tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
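
The `--name` charset check from comment #10 can be sketched as a full-string regex match. The pattern is taken from the commit message; the function name is hypothetical.

```python
import re

# Charset quoted in the commit message: [a-zA-Z0-9._-]+
_MARKETPLACE_NAME_RE = re.compile(r"[a-zA-Z0-9._-]+")


def is_valid_marketplace_name(name: str) -> bool:
    """True when the whole name consists only of the allowed characters."""
    return _MARKETPLACE_NAME_RE.fullmatch(name) is not None
```

Note that a charset check alone still accepts a name such as `..`, since dots are allowed characters, so a separate traversal check on cache paths (comment #5) remains necessary.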
danielmeppiel added a commit that referenced this pull request Mar 31, 2026
Bug #1 - Format incompatibility with awesome-copilot marketplace:
  - Parser now accepts 'source' key (Copilot CLI) as type discriminator
    fallback when 'type' key is absent, normalizing to 'type' for resolvers
  - GitHub source resolver now accepts 'path' field (Copilot CLI) as
    virtual subdirectory, same as 'subdir' in git-subdir sources
  - Path traversal validation applied to 'path' field
  - Fixes: 8 of 62 plugins in awesome-copilot that use github source
    objects with 'source'+'path' keys instead of 'type'+'subdir'

Bug #2 - Lockfile provenance never written:
  - Root cause: install passed raw marketplace refs (NAME@MARKETPLACE)
    as only_packages, but DependencyReference.parse() can't parse those,
    so identity filtering removed all deps -> 'already installed'
  - Fix: use validated_packages (canonical owner/repo strings) instead
    of raw click argument for only_pkgs

Both bugs verified fixed via E2E tests against real marketplaces:
  - github/awesome-copilot (62 plugins)
  - anthropics/skills (3 plugins)
  - microsoft/azure-skills (1 plugin)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
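
As an illustration of the Bug #1 fix, the sketch below normalizes a plugin source object so that a `source` key stands in for a missing `type`, and a `path` key stands in for a missing `subdir`. The function name, the `repo` key used in the usage example, and the exact normalization (moving rather than copying keys) are assumptions, not the parser's actual behaviour.

```python
from typing import Any, Dict


def normalize_source_entry(entry: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `entry` using 'type'/'subdir' as the canonical keys."""
    normalized = dict(entry)
    # 'source' acts as the discriminator only when 'type' is absent.
    if "type" not in normalized and "source" in normalized:
        normalized["type"] = normalized.pop("source")
    # 'path' is treated like 'subdir' when no 'subdir' is given.
    if "subdir" not in normalized and "path" in normalized:
        normalized["subdir"] = normalized.pop("path")
    subdir = normalized.get("subdir")
    if subdir is not None:
        # Simple traversal check: reject any '..' segment.
        if ".." in str(subdir).replace("\\", "/").split("/"):
            raise ValueError(f"path traversal in subdir: {subdir!r}")
    return normalized
```

For example, `{"source": "github", "repo": "github/awesome-copilot", "path": "plugins/x"}` would come out with `type` and `subdir` keys in place of `source` and `path`.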
danielmeppiel added a commit that referenced this pull request Mar 31, 2026
…covery + governance (#503)

* Initial plan

* Initial plan for marketplace integration

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* feat: marketplace integration core implementation

- Add marketplace/ package: models, errors, registry, client, resolver
- Add marketplace CLI commands: add, list, browse, update, remove, search
- Add lockfile provenance fields: discovered_via, marketplace_plugin_name
- Add install hook for NAME@MARKETPLACE syntax pre-parse intercept
- Wire marketplace commands in cli.py

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* docs: add marketplace integration guide and CLI reference

- Create guides/marketplaces.md covering marketplace concepts,
  registration, browsing, search, install syntax, provenance tracking,
  and cache behavior
- Add apm marketplace and apm search command sections to cli-commands.md
- Update apm install arguments to include NAME@MARKETPLACE syntax
- Update plugins.md Finding Plugins section with marketplace cross-refs

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* docs: fix marketplace.json format and lockfile field names to match implementation

- Use array-based plugins format matching models.py parser expectations
- Use discovered_via and marketplace_plugin_name matching lockfile.py fields
- Document both Copilot CLI (repository/ref) and Claude Code (source) formats

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* docs: fix git-subdir and relative source descriptions to match resolver

- git-subdir uses separate repo and subdir fields
- Relative string sources resolve to marketplace repo subdirectory

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* feat: add marketplace unit tests and docs

- 114 unit tests across 8 test files covering all marketplace modules
- New marketplace guide at docs/src/content/docs/guides/marketplaces.md
- Updated CLI reference with marketplace and search commands
- Updated plugins guide with marketplace integration section
- CHANGELOG entry for marketplace feature

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* refactor: address code review feedback

- Use List[MarketplacePlugin] from typing instead of lowercase generic
- Eliminate duplicated condition in install.py marketplace intercept
- Restructure control flow for clarity

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* fix: address all 12 PR review comments on marketplace integration

- Narrow except Exception to except ImportError for lazy marketplace import (comment #1)
- Fix provenance key mismatch: use dep identity instead of canonical for lockfile lookup (comment #2)
- Include subdir in git-subdir source resolution with path traversal validation (comment #3)
- Include relative path in relative source resolution with traversal validation (comment #4)
- Sanitize marketplace name in cache file paths to prevent path traversal (comment #5)
- Fix docs: stale-if-error, not stale-while-revalidate (comment #6)
- Consolidate CHANGELOG entries into single line with (#503) (comment #7)
- Remove unused _SUPPORTED_SOURCE_TYPES set (comment #8)
- Let auth errors propagate in _auto_detect_path instead of swallowing (comment #9)
- Validate marketplace --name against [a-zA-Z0-9._-]+ charset (comment #10)
- Fix doc examples to use identifier-compatible names (comments #11, #12)
- Update tests to match corrected resolver behavior, add traversal tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Copilot CLI format compatibility and marketplace provenance bugs

Bug #1 - Format incompatibility with awesome-copilot marketplace:
  - Parser now accepts 'source' key (Copilot CLI) as type discriminator
    fallback when 'type' key is absent, normalizing to 'type' for resolvers
  - GitHub source resolver now accepts 'path' field (Copilot CLI) as
    virtual subdirectory, same as 'subdir' in git-subdir sources
  - Path traversal validation applied to 'path' field
  - Fixes: 8 of 62 plugins in awesome-copilot that use github source
    objects with 'source'+'path' keys instead of 'type'+'subdir'

Bug #2 - Lockfile provenance never written:
  - Root cause: install passed raw marketplace refs (NAME@MARKETPLACE)
    as only_packages, but DependencyReference.parse() can't parse those,
    so identity filtering removed all deps -> 'already installed'
  - Fix: use validated_packages (canonical owner/repo strings) instead
    of raw click argument for only_pkgs

Both bugs verified fixed via E2E tests against real marketplaces:
  - github/awesome-copilot (62 plugins)
  - anthropics/skills (3 plugins)
  - microsoft/azure-skills (1 plugin)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: scope marketplace search to QUERY@MARKETPLACE format

Search now requires QUERY@MARKETPLACE (e.g. apm search security@skills)
to eliminate name collisions across marketplaces. Added search_marketplace()
client function for single-marketplace search.

- Rejects bare queries without @ — clear error with usage example
- Validates marketplace exists before searching
- Updated docs/guides/marketplaces.md with new syntax
- 7 test cases: format validation, unknown marketplace, results, no results

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
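
A minimal sketch of the QUERY@MARKETPLACE parsing described above, assuming the marketplace name follows the last `@` in the argument. `parse_scoped_query` is a hypothetical helper name, and the error text is illustrative rather than the CLI's real message.

```python
from typing import Tuple


def parse_scoped_query(expression: str) -> Tuple[str, str]:
    """Split 'QUERY@MARKETPLACE' into (query, marketplace).

    Bare queries, and expressions with an empty side, are rejected.
    """
    query, separator, marketplace = expression.rpartition("@")
    if not separator or not query or not marketplace:
        raise ValueError(
            "Expected QUERY@MARKETPLACE, for example: apm search security@skills"
        )
    return query, marketplace
```

Checking that the marketplace is actually registered would be a separate step after parsing.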

* docs: update CLI reference and plugins guide for scoped search syntax

Align all documentation with QUERY@MARKETPLACE search format.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: use centralized path_security for marketplace traversal checks

Replace 3 ad-hoc '..' in x.split('/') checks in marketplace/resolver.py
with validate_path_segments() from utils/path_security.py. Add
defense-in-depth validate_path_segments() call to _sanitize_cache_name()
in client.py.

This ensures marketplace code uses the same cross-platform path safety
utilities (backslash normalization, single-dot rejection) as the rest
of APM.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
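
The two properties named above (backslash normalization and single-dot rejection) can be sketched as follows. This is not the real `validate_path_segments()` from utils/path_security.py, whose signature is not shown here; the name carries a `_sketch` suffix to make that explicit.

```python
def validate_path_segments_sketch(path: str) -> None:
    """Raise ValueError if any segment of `path` is '.' or '..'.

    Backslashes are normalized to forward slashes first so that
    Windows-style separators cannot hide a traversal segment.
    """
    for segment in path.replace("\\", "/").split("/"):
        if segment in (".", ".."):
            raise ValueError(f"unsafe path segment {segment!r} in {path!r}")
```

Compared with an ad-hoc `'..' in x.split('/')` check, this also catches `a\..\b` and `./a`.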

* docs: add path safety rule to copilot-instructions.md

Directs contributors to use validate_path_segments() and
ensure_path_within() from utils/path_security.py instead of
ad-hoc traversal checks.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>
Co-authored-by: danielmeppiel <dmeppiel@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
danielmeppiel added a commit that referenced this pull request Apr 22, 2026
* feat(policy): W1 foundations for install-time policy enforcement (#827)

Wave 1 of issue #827 implementation. Lays the foundations the install
pipeline gate (W2) will plug into. No behaviour change yet — install
still does NOT enforce policy until W2 wires the gate phase.

What's in:
- policy_checks: new public seam run_dependency_policy_checks(deps,
  lockfile=, policy=, mcp_deps=, effective_target=) accepting a
  resolved dep set; old run_policy_checks(project_root, policy) is now
  a thin wrapper. Honours require_resolution: project-wins for
  version-pin mismatches only. Latent isinstance(allow, list) bug
  fixed for schema's Tuple[str, ...].
- policy/discovery: cache stores merged effective policy with chain
  metadata + fingerprint. Atomic writes via temp + os.replace, with
  pid+thread_id suffix to prevent concurrent-writer collision.
  MAX_STALE_TTL=7d ceiling on cache reuse. PolicyFetchResult expanded
  to express 9 outcomes (found, absent, cached_stale,
  cache_miss_fetch_fail, malformed, disabled, garbage_response,
  no_git_remote, empty).
- diagnostics: CATEGORY_POLICY constant + per-category renderer wired
  into render_summary().
- command_logger: InstallLogger.policy_resolved/violation/disabled
  with per-class actionable error wording (auth/unreachable/malformed/
  blocked).
- tests/fixtures/policy/: 14 policy fixtures + 7 project fixtures
  (denied-direct, denied-transitive, required-missing,
  required-version-mismatch, mcp-denied, target-mismatch,
  unpacked-bundle) covering W4 live matrix scenarios L2/L4/L13 and
  rubber-duck findings I5/I6/I7/N14/C2.
- docs: 12-section Install-time enforcement guide skeleton in both
  enterprise/policy-reference.md and packages/apm-guide skill mirror.
  10 sections filled; sections 7 (snippets) and 10 (error table)
  stubbed for W3-docs-final once W2 lands and W4 captures live output.

Tests:
- tests/unit: 4878 passed (1 pre-existing unrelated MCP failure
  deselected). Includes 41 logger + 29 policy-seam + 38 cache + 21
  fixture-load new tests.

Refs: #827
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
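
The atomic cache write described for policy/discovery (temp file plus os.replace, with a pid and thread-id suffix) can be sketched like this. The function name and the JSON payload format are assumptions; only the temp-then-replace pattern and the suffix scheme come from the commit message.

```python
import json
import os
import threading
from pathlib import Path
from typing import Any


def atomic_write_json(target: Path, payload: Any) -> None:
    """Write `payload` as JSON so readers never observe a partial file."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # A pid + thread-id suffix gives each concurrent writer its own temp file.
    suffix = f".{os.getpid()}.{threading.get_ident()}.tmp"
    tmp = target.with_name(target.name + suffix)
    try:
        with open(tmp, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace is atomic when source and destination share a filesystem.
        os.replace(tmp, target)
    finally:
        if tmp.exists():
            tmp.unlink()
```

Creating the temp file next to the target keeps both paths on the same filesystem, which is what makes the final rename atomic.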

* feat(install): W2A policy enforcement at install time (#827)

Wave 2A wires the three install-time enforcement sites planned for #827:

1. **Pipeline gate phase** (src/apm_cli/install/phases/policy_gate.py):
   New phase running between resolve and targets. Discovers org policy,
   resolves the inheritance chain via resolve_policy_chain, persists the
   merged effective policy + chain refs to cache (chain_refs threading
   per C1 amendment), then calls run_dependency_policy_checks against
   the resolved deps. Routes 9 discovery outcomes (found, absent,
   cached_stale, cache_miss_fetch_fail, malformed, disabled,
   garbage_response, no_git_remote, empty). Block-mode violations raise
   PolicyViolationError to halt the pipeline cleanly.

2. **--mcp branch preflight** (src/apm_cli/policy/install_preflight.py
   + commands/install.py:1091-1125):
   apm install --mcp does NOT enter the install pipeline. New shared
   helper run_policy_preflight() runs discovery + dep checks for any
   non-pipeline command site. Wired into --mcp BEFORE _run_mcp_install
   so denied servers never reach the integrator. Also exports
   PolicyBlockError for callers.

3. **install <pkg> snapshot+rollback** (commands/install.py):
   apm install <pkg> mutates apm.yml BEFORE the pipeline runs. We now
   snapshot apm.yml as raw bytes (not parsed YAML, to avoid round-trip
   drift on whitespace / key-order / comments), and on ANY pipeline
   failure (policy block, download error, etc.) restore byte-for-byte
   via tempfile + os.replace atomic write. Logs '[i] apm.yml restored
   to its previous state.' and exits non-zero.

InstallContext gains policy_fetch, policy_enforcement_active, no_policy.

Tests: +68 new tests, 4946 unit tests pass total.
- test_policy_gate_phase.py: 27 (covers all 9 outcomes)
- test_mcp_preflight_policy.py: 22 (escape hatches, allow/deny, transport,
  self-defined, trust_transitive, discovery outcomes, return shape)
- test_install_pkg_policy_rollback.py: 19 (byte-equal restore, comments
  preserved, --no-policy bypass, download error rollback, snapshot
  unit tests)

W2B (dry-run, target-aware, escape-hatch CLI flag) and C2 panel review
follow.

Refs: #827
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
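
The snapshot-and-rollback behaviour for `apm install <pkg>` can be sketched as a context manager that captures the manifest as raw bytes and restores it on failure. The helper name is hypothetical, and catching `Exception` (rather than every `BaseException`) is an assumption of this sketch.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def restore_on_failure(manifest: Path) -> Iterator[None]:
    """Restore `manifest` byte-for-byte if the wrapped block raises."""
    snapshot = manifest.read_bytes()  # raw bytes: no YAML round-trip drift
    try:
        yield
    except Exception:
        fd, tmp_name = tempfile.mkstemp(
            dir=str(manifest.parent), prefix=manifest.name + "."
        )
        try:
            with os.fdopen(fd, "wb") as handle:
                handle.write(snapshot)
            os.replace(tmp_name, manifest)  # atomic swap back
        finally:
            if os.path.exists(tmp_name):
                os.unlink(tmp_name)
        raise  # the caller still sees the original failure
```

Because the snapshot is never parsed, comments, key order, and whitespace survive the rollback unchanged. A `sys.exit()` inside the wrapped block raises `SystemExit`, which is not an `Exception`, so it would skip this handler.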

* feat(policy): W2B install enforcement - escape hatch, dry-run preview, target-aware check (#827)

W2B completes the enforcement surface:

* policy_target_check.py - new pipeline phase after targets that re-runs
  target/compilation checks with the resolved effective_target. Filters
  to TARGET_CHECK_IDS only to avoid double-emitting dep violations from
  the gate phase. Honors CLI --target override (I6 fix scenario).

* --no-policy escape hatch on apm install / install <pkg> / install --mcp
  / update. APM_POLICY_DISABLE=1 env var equivalent. Both route through
  ctx.no_policy and emit always-visible warnings via
  InstallLogger.policy_disabled() noting that apm audit --ci still fails.

* --dry-run policy preview. run_policy_preflight gains dry_run=True kwarg.
  Emits '[!] Would be blocked by policy: <dep> -- <reason>' (block) or
  '[!] Policy warning: <dep> -- <reason>' (warn) before the would-install
  table. Never raises, never mutates. Direct manifest deps only (resolver
  doesn't run in dry-run; documented limitation).

InstallRequest, InstallService, InstallContext threaded with no_policy.
LOC budget on install.py raised 1625 -> 1650 with documented rationale.

Tests: 5003 unit pass (+57 W2B: 17 target_check + 24 no_policy_flag +
16 dry_run_policy). Full suite green vs main baseline.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(policy): C2 panel fixes - transitive MCP enforcement, shared chain discovery, dry-run cap, drop apm update --no-policy (#827)

C2 panel checkpoint surfaced 4 fixes (S1+B1+D2 BLOCKER/PASS-WITH-CONCERN, D1
DevX). All landed; full suite 5032 pass.

S1 (Supply Chain BLOCKER) - transitive MCP enforcement:
  Transitive MCP servers from APM packages were bypassing install-time policy.
  The pipeline gate phase only sees direct apm.yml deps; transitive MCP servers
  are merged later via MCPIntegrator.collect_transitive() and written to
  runtime configs (.copilot/mcp.json, .cursor/mcp.json) with no policy check.
  This defeated #827 on the most security-critical dep category.
  Fix: second run_policy_preflight() call in commands/install.py after the
  transitive merge, before MCPIntegrator.install(). On block: abort MCP config
  writes, exit non-zero. APM packages remain installed (gate phase approved
  them). 15 new unit tests in test_transitive_mcp_policy.py.

B1 (Architect, partial) - shared chain-aware discovery:
  Extract discover_policy_with_chain() into policy/discovery.py so both
  policy_gate.py and install_preflight.py walk the same inheritance chain.
  Closes the gap where --mcp / --dry-run paths could resolve a different
  effective policy than the pipeline path. Gate-phase keeps its 9-outcome
  routing; only the discovery seam moved. 10 new tests in
  test_chain_discovery_shared.py.

D2 (DevX UX) - dry-run noise cap:
  install_preflight._DRY_RUN_PREVIEW_LIMIT = 5. Long deny lists now show
  5 lines per severity bucket + tail '[!] ... and N more would be blocked
  by policy. Run apm audit for full report.' 4 new tests.

D1 (DevX UX) - drop apm update --no-policy:
  apm update is the CLI self-updater (refreshes the apm binary), not a
  dependency refresh. The flag was accepted but unused. Removed the option
  and flipped the test to assert the flag is now rejected.

LOC budget on install.py raised 1650 -> 1675 with documented justification.

Tests: 5032 unit pass (+29 new: 15 transitive_mcp + 10 chain_discovery_shared
+ 4 dry_run_noise_cap). 1 pre-existing MCP test deselected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
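
The D2 dry-run noise cap can be sketched as below for the block-severity bucket. The line formats are the ones quoted in the commit messages; the function name and the `(dep, reason)` input shape are assumptions.

```python
from typing import List, Sequence, Tuple

DRY_RUN_PREVIEW_LIMIT = 5  # value quoted in the commit message


def preview_blocked(
    violations: Sequence[Tuple[str, str]],
    limit: int = DRY_RUN_PREVIEW_LIMIT,
) -> List[str]:
    """Render at most `limit` block lines, then one overflow tail line."""
    lines = [
        f"[!] Would be blocked by policy: {dep} -- {reason}"
        for dep, reason in violations[:limit]
    ]
    remaining = len(violations) - limit
    if remaining > 0:
        lines.append(
            f"[!] ... and {remaining} more would be blocked by policy. "
            "Run apm audit for full report."
        )
    return lines
```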

* docs+test(policy): W3 - integration matrix, docs final fill, CHANGELOG, growth (#827)

W3 phase complete. All 5 parallel workstreams landed.

Tests:
  tests/integration/test_policy_install_e2e.py - 17 e2e scenarios I1..I17
  Covers all 9 PolicyFetchResult outcomes + all 6 violation classes via
  CliRunner-driven full-pipeline flows. Mocks discover_policy_with_chain
  at both seams (policy_gate + install_preflight). Uses _build_policy()
  helper for frozen-dataclass safe construction.

Docs:
  docs/src/content/docs/enterprise/policy-reference.md
    sec 7: 8 verbatim CLI snippets (success, block, warn, --no-policy,
    APM_POLICY_DISABLE, --dry-run with overflow tail, install <pkg>
    rollback, transitive MCP block)
    sec 10: outcome table (9 fetch outcomes) + violation table (6 classes)
    Added explicit JSON/SARIF non-goal callout (C1 amendment).
  packages/apm-guide/.apm/skills/apm-usage/governance.md
    Same content, leaner skill version, links back to docs for full text.

CHANGELOG.md:
  Added: --no-policy / APM_POLICY_DISABLE escape hatch, --dry-run preview,
    install <pkg> rollback
  Changed: pipeline gains policy_gate + policy_target_check phases, shared
    chain discovery + atomic cache + MAX_STALE_TTL
  Security (headline): apm install enforces apm-policy.yml; transitive MCP
    checked before runtime config write

Follow-up issue #829 filed: policy.fetch_failure: warn|block schema knob.

Tests: 5049 pass (5032 unit + 17 integration). 1 pre-existing MCP test
deselected.

PR body drafted at session-state/files/pr-body-827.md. Growth strategy
entry + asciinema script staged in WIP (gitignored).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(policy): C3 fixes - direct MCP enforcement, malformed posture, warn-mode coverage, doc drift (#827)

C3 final panel + rubber-duck found 5 issues. All fixed.

#1 (CRITICAL) - Direct MCP deps in apm.yml bypassed enforcement:
  ctx.direct_mcp_deps now populated in pipeline.py from
  apm_package.get_mcp_dependencies() before policy_gate runs. policy_gate
  reads direct_mcp_deps (not the dead mcp_deps_to_install) and passes them
  to run_dependency_policy_checks. install.py:1496 second preflight guard
  drops 'and transitive_mcp' so direct-only MCP installs are also caught.

#2 (CRITICAL) - Malformed policy handling inconsistent + broke rollback:
  policy_gate.py replaced sys.exit(1) on malformed with fail-open warn
  (matches install_preflight + cache_miss_fetch_fail/garbage_response
  posture). sys.exit was bypassing the rollback handler in install.py for
  apm install <pkg>. CEO mandate: malformed = warn, fail-closed knob is
  follow-up #829.

#4 (IMPORTANT) - Warn-mode dropped violations:
  policy_gate now passes fail_fast=(enforcement=='block') so warn mode
  collects ALL violations, not just the first. Also emits warnings for
  passed=True checks with non-empty details (project-wins version-pin
  mismatches were silently dropped).

#3 (IMPORTANT) - Chain inheritance is 1-level, not multi-level:
  discover_policy_with_chain only walks one parent. Toned down docs in
  policy-reference.md and governance.md with explicit caution callout.
  Filed follow-up #831 for proper recursive walk + cycle detection.

#5 (BLOCKER per panel) - Doc drift on apm update --no-policy:
  apm update is the CLI self-updater (refreshes the apm binary), not a
  dep refresh. Removed all mentions from both docs. apm deps update is
  the dep-refresh surface (runs install pipeline, gate applies); --no-policy
  is NOT exposed there today.

Tests: 5059 pass (5049 baseline + 10 new: 6 unit gate + 4 integration
I18/I19/I20). New integration tests cover real direct-MCP block, real
malformed fail-open, warn-mode multi-violation. I16 class renamed to
TestI16GarbageResponsePolicy to fix mislabeling.

Follow-ups: #829 (fetch_failure schema knob), #831 (multi-level chain).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(policy): in-PR resolution of #834 (warn-mode rendering) and #831 (recursive extends chain) (#827)

Originally filed as follow-ups during C3, moved in-PR per reviewer
request so #832 ships a complete enforcement story.

#834 - Warn-mode policy violations did not render in the install
summary. Root cause: pipeline created a fresh DiagnosticCollector for
install_result.diagnostics while InstallLogger.policy_violation()
pushed warnings into logger.diagnostics. Two collectors, one rendered.
Fix: when a logger is present, reuse logger.diagnostics so policy
records flow through render_summary() (block mode unaffected - it
aborts inline before summary).

#831 - extends: chain only supported one level (parent). Inheritance
machinery (resolve_policy_chain, detect_cycle, MAX_CHAIN_DEPTH=5) was
already N-deep capable; discovery never wired it. Fix: rewrite
_resolve_and_persist_chain as iterative depth-first walk, leaf-first;
cycle detection via inheritance.detect_cycle; honor MAX_CHAIN_DEPTH=5
with explicit pre-append check; partial-chain warning when a mid-chain
ref fails to fetch ('Policy chain incomplete: <ref> unreachable, using
<N> of <M> policies'); single cache write at leaf with full chain
fingerprint.

Tests: +1 unit (warn-render), +5 unit (3-level full, cycle, depth
limit, partial chain, single-level regression), +1 integration
(TestI21ThreeLevelExtendsChain). 5044 unit pass.

Docs: enterprise/policy-reference.md and apm-usage/governance.md
chain-depth callouts updated.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
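
The iterative, leaf-first `extends:` walk with cycle detection, a depth limit, and partial-chain handling can be sketched as follows. `walk_extends_chain` and the `fetch` callback are hypothetical; the real code uses `inheritance.detect_cycle` and `_resolve_and_persist_chain`, which are not reproduced here.

```python
from typing import Callable, List, Optional, Tuple

MAX_CHAIN_DEPTH = 5  # limit quoted in the commit message


def walk_extends_chain(
    leaf_ref: str,
    fetch: Callable[[str], Optional[dict]],
) -> Tuple[List[str], Optional[str]]:
    """Walk `extends` links from the leaf upward.

    Returns (chain, unreachable_ref). `unreachable_ref` is None for a
    complete chain, or the ref that could not be fetched (partial chain).
    """
    chain: List[str] = []
    seen = set()
    ref: Optional[str] = leaf_ref
    while ref is not None:
        if ref in seen:
            raise ValueError(f"cycle detected at {ref!r}")
        # Depth is checked before appending, so the chain never exceeds the cap.
        if len(chain) >= MAX_CHAIN_DEPTH:
            raise ValueError(f"policy chain deeper than {MAX_CHAIN_DEPTH}")
        policy = fetch(ref)
        if policy is None:
            return chain, ref  # caller can warn: chain incomplete
        seen.add(ref)
        chain.append(ref)
        ref = policy.get("extends")
    return chain, None
```

A caller could turn a non-None second element into the 'Policy chain incomplete' warning quoted above.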

* docs(changelog): record in-PR resolution of #834 and #831 under #827

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(policy): address review-panel pre-merge findings (#827)

- Security F1 (HIGH): pin extends: chain to leaf policy host; disable
  HTTP redirects in _fetch_from_url and _fetch_github_contents. Closes
  cross-host credential leak vector via git credential fill fallback
  and SSRF/Referer-leak vector via 30x redirects.
  raw.githubusercontent.com is treated as distinct from github.com
  (strict pin).
- Logging C1+C2 + UX F1/F2/F4/F5/F9: extract
  InstallLogger.policy_discovery_miss() canonical helper covering all
  7 discovery outcomes; route both policy_gate and install_preflight
  through it. absent now verbose-only; no_git_remote downgraded to
  [i]; garbage_response gets distinct wording (no VPN/firewall noise);
  cached_stale and cache_miss_fetch_fail messages now state
  enforcement posture explicitly; violation messages dedupe dep_ref
  prefix; wire _policy_reason_blocked into block-severity
  policy_violation as dim secondary line.
- Docs: remove [Planned] banner from policy-reference; update
  enforcement tables (policy-reference + governance skill) to reflect
  install-time blocking; document --no-policy / APM_POLICY_DISABLE in
  cli-commands.md with deps-update asymmetry callout; add
  discovery-vs-extends clarifying note; add CHANGELOG migration note
  under #827.

Tests: 5053 -> 5068 (+15 logging, +9 security host-pin).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
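
The strict host pin from Security F1 can be sketched as an exact hostname comparison between the leaf policy URL and each `extends:` target. The function name is hypothetical, and the sketch covers only the host comparison, not the redirect handling.

```python
from urllib.parse import urlsplit


def same_policy_host(leaf_url: str, parent_url: str) -> bool:
    """True only when both URLs have the same non-empty hostname.

    The comparison is exact: subdomains and sibling hosts do not match,
    so raw.githubusercontent.com is distinct from github.com.
    """
    leaf_host = urlsplit(leaf_url).hostname
    parent_host = urlsplit(parent_url).hostname
    if not leaf_host or not parent_host:
        return False
    return leaf_host == parent_host
```

`urlsplit(...).hostname` is already lowercased, so the comparison is case-insensitive with respect to the input URLs.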

* feat(policy): ship enterprise hardening pack on top of #827

Five enterprise hardening items shipped in-PR per CISO-arbitrated panel
verdict + CTO threat-model deep dive (PR #832 comments 4294087760 +
4294115069). Closes #829.

1. policy.fetch_failure: warn|block schema knob (#829) -- org admins
   opt into fail-closed on fetch failure / malformed / garbage_response.
   Default 'warn' preserves backwards compat.
2. apm.yml policy.fetch_failure_default: warn|block -- project-side
   complement so a project can lock down behavior even when no policy
   is reachable to read the org-side knob from.
3. apm policy status diagnostic command -- show discovery outcome,
   source, enforcement, cache age, extends chain, effective rule
   counts, and hash-pin state. --json for SIEM ingestion.
   Trust-but-verify tool that makes fail-open acceptable.
4. apm.yml policy.hash: 'sha256:...' consumer-side pin -- closes the
   garbage_response compromised-intermediary vector by verifying raw
   policy bytes against a project-pinned digest. Equivalent of pip
   --require-hashes for the policy itself. ALWAYS fail-closed on
   mismatch, regardless of fetch_failure setting (a hash mismatch is
   an explicit pin violation, not a fetch failure). sha384/sha512
   accepted; md5/sha1 rejected (collision-resistant only).
5. apm audit --ci auto-discovers org policy when --policy-source is
   not provided; --no-policy flag added to skip. Closes the
   audit/install asymmetry that left CI blind to sideloaded primitives.

Tests: 5068 -> 5157 (+89: hash pin 31, fetch_failure knob, audit
auto-discovery, policy status command, plus updates to existing
discovery tests for the new expected_hash kwarg threading).

Docs: policy-reference §9.5 (fetch_failure), §9.6 (hash pin),
§9.7 (apm policy status), §9.8 (audit auto-discovery); governance.md
skill mirrors all of the above; cli-commands.md gets policy status +
audit --no-policy. CHANGELOG entries under [Unreleased] Added /
Added (Security).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
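
The consumer-side hash pin from item 4 can be sketched with hashlib. The sketch assumes the pin is written as `<algorithm>:<hex digest>` and is checked against the raw policy bytes; `verify_policy_pin` is a hypothetical name, and how the real code reports a mismatch is not shown here.

```python
import hashlib
import hmac

# Algorithms the commit message lists as accepted; md5 and sha1 are rejected.
ALLOWED_ALGORITHMS = frozenset({"sha256", "sha384", "sha512"})


def verify_policy_pin(raw_policy: bytes, pin: str) -> bool:
    """Return True when `raw_policy` hashes to the digest named in `pin`."""
    algorithm, separator, expected_hex = pin.partition(":")
    if not separator or algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError(f"unsupported or malformed hash pin: {pin!r}")
    actual_hex = hashlib.new(algorithm, raw_policy).hexdigest()
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(
        actual_hex.encode("ascii"),
        expected_hex.lower().encode("utf-8"),
    )
```

A caller would treat a `False` result as fail-closed regardless of the `fetch_failure` setting, matching the behaviour described above.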

* docs(policy): address doc-writer review BLOCKERs (#827)

- policy-reference.md: remove stale 'planned fetch_failure knob' paragraph
  that contradicted the §9.5 entry shipped in the same PR; add Linux
  hash-compute one-liner alongside the macOS shasum example.
- cli-commands.md: add 'apm policy status' command section under a new
  'apm policy' family (synopsis, --policy-source/--no-cache/--json,
  exit-code note, examples). Add --no-policy flag to 'apm audit' options
  list. Reword --policy SOURCE description to reflect that --ci now
  auto-discovers when --policy is omitted. Update audit examples to
  match (drop the now-redundant '--policy org' from auto-discovery
  example, add explicit --no-policy variant).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(policy): address doc-writer HIGH+LOW findings (#827)

- manifest-schema.md: add policy: block to schema diagram + new
  section 3.9 documenting fetch_failure_default, hash, hash_algorithm
- policy-reference.md: add fetch_failure: warn to canonical schema
  YAML and a fetch_failure entry under Top-level fields; lift apm
  policy status and apm audit --ci auto-discovery into proper
  numbered subsections (9.7 / 9.8) so anchors match the skill mirror
- governance.md: surface install-time enforcement with link to
  policy-reference#install-time-enforcement
- ci-policy-setup.md: annotate Step 3 noting apm audit --ci
  auto-discovers and --policy org is now an explicit override
- security.md: add Compromised policy intermediary row to attack
  surface comparison, linked to policy.hash consumer-side pin
- cli-commands.md: split --no-policy into 2-line nested bullet
  separating behaviour from env-var equivalence
- apm-guide skill mirror: add fetch_failure: warn to schema overview
  to keep skill aligned with policy-reference

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(policy): address PR review panel logging+arch findings (#827)

BLOCKING:
- command_logger.policy_discovery_miss: gate no_git_remote info
  message on verbose mode; previously emitted on every install in a
  non-git directory

Architecture:
- New install/errors.py with canonical PolicyViolationError;
  PolicyBlockError kept as re-exported alias to preserve test patches
- New policy/outcome_routing.py::route_discovery_outcome
  consolidating the 9-outcome routing table; policy_gate.py and
  install_preflight.py now delegate instead of duplicating
- pipeline.py: catch PolicyViolationError before bare Exception so
  policy block messages are not double-nested in RuntimeError
- commands/install.py: isinstance(PolicyViolationError) branch in
  the legacy handler for the same reason
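
The handler-ordering point in the last two bullets can be sketched as follows. It assumes a stand-in `PolicyViolationError` class and a hypothetical `run_step` wrapper; the real pipeline code is not shown in this log.

```python
class PolicyViolationError(Exception):
    """Stand-in for the canonical error class described above."""


def run_step(step):
    """Run one pipeline step; only non-policy failures get wrapped."""
    try:
        step()
    except PolicyViolationError:
        # Must come before the broad handler so the block message
        # reaches the user unwrapped.
        raise
    except Exception as exc:
        raise RuntimeError(f"install failed: {exc}") from exc
```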

Logging UX:
- install_preflight: empty check.details now falls back to
  [check.name] so the block message is never blank
- _extract_dep_ref helper replaces detail.split(":")[0] with
  defensive parsing that falls back to check.name

Security:
- discovery._get_cache_dir asserts containment vs project_root
  (resolves symlinks) instead of an unguarded join
- Removed dead no_policy= kwarg from discover_policy_with_chain;
  env-var defence-in-depth retained on the call site
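
A containment check of the kind described for `discovery._get_cache_dir` might look like this. The function name and the default relative path are hypothetical, chosen only for illustration; `Path.is_relative_to` requires Python 3.9+.

```python
from pathlib import Path


def contained_cache_dir(project_root: Path, relative: str = ".apm/policy-cache") -> Path:
    """Return a cache dir under project_root, refusing paths that escape it."""
    root = Path(project_root).resolve()
    # resolve() follows symlinks, so a link pointing outside the root is caught too.
    candidate = (root / relative).resolve()
    if not candidate.is_relative_to(root):
        raise ValueError(f"cache dir {candidate} escapes project root {root}")
    return candidate
```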

Tests: +tests/unit/policy/test_pr_832_findings.py covering all 8
  findings; install_logger split into silent/verbose cases. 5176
  unit tests pass, 0 regressions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(policy): use urllib.parse for host assertions to silence CodeQL (#827)

CodeQL's py/incomplete-url-substring-sanitization rule fired 6 times
on test_extends_host_pin.py because bare 'host' in msg substring
checks could in theory match a host appearing at an arbitrary URL
position (path, query, userinfo). The assertions are correct in
practice -- they assert on production error messages of known
format -- but the pattern is not safe in general.

Replace each substring check with a precise extractor:

- _assert_extends_host_in_message / _assert_leaf_host_in_message:
  regex-anchor on the production 'extends host: <h>' / 'leaf host:
  <h>' tokens, then exact-compare the captured group.
- _assert_redirect_target_host: regex-extract the redirect target
  URL after 'to ', then urllib.parse.urlparse(...).hostname compare.

No production-code changes; all 9 host-pin tests still pass.
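
For illustration, the two extractor styles could be written like this. The helper names drop the leading underscore and the sample message wording is assumed; only the 'extends host: <h>' token and the 'to <url>' shape come from the description above.

```python
import re
from urllib.parse import urlparse


def assert_extends_host_in_message(msg: str, expected_host: str) -> None:
    """Anchor on the 'extends host: <h>' token, then compare exactly."""
    match = re.search(r"extends host: (\S+)", msg)
    assert match is not None, f"no 'extends host:' token in {msg!r}"
    assert match.group(1) == expected_host


def assert_redirect_target_host(msg: str, expected_host: str) -> None:
    """Extract the URL after 'to ' and compare its parsed hostname."""
    match = re.search(r"\bto (https?://\S+)", msg)
    assert match is not None, f"no redirect target URL in {msg!r}"
    assert urlparse(match.group(1)).hostname == expected_host
```

A bare `'github.com' in msg` check would pass for `https://evil.example.com/github.com/x`; comparing the parsed hostname does not.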

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(policy,audit): address PR #832 DevX UX blockers

- audit --no-policy help text rewritten to describe positive
  behaviour first ("Skip org policy discovery and enforcement"
  instead of the negative "Skip auto-discovery ... in --ci mode"),
  so apm audit --help no longer hides the primary effect behind a
  caveat. Aligns the code with the docs.

- apm policy status --check flag added: exits 1 when outcome is
  not 'found' (i.e. policy unresolvable / absent / disabled /
  fetch-failed), 0 otherwise. Default behaviour unchanged (always
  exit 0) so the diagnostic remains safe for human and SIEM use,
  while CI authors get the npm audit / pip check style contract
  via a single flag.

Updates cli-commands.md, policy-reference.md, and CHANGELOG.md to
document the new flag and exit-code table. Adds TestStatusCheckFlag
covering the found / unresolvable / discovery-exception / json
combinations.
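
The exit-code contract for --check reduces to a small mapping. A sketch, with `status_exit_code` as a hypothetical helper name:

```python
def status_exit_code(outcome: str, check: bool = False) -> int:
    """Exit code for 'apm policy status' as described above."""
    if check and outcome != "found":
        # --check turns any non-'found' outcome into a CI failure.
        return 1
    # Default: diagnostic only, always exit 0.
    return 0
```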

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
danielmeppiel pushed a commit that referenced this pull request May 4, 2026
CEO panel recommended landing two in-PR follow-ups before merge:

1. Recovery hint in drift output (cli-logging + devx-ux convergence):
   render_drift_text now appends '[i] Run apm install to re-sync
   deployed files with the lockfile.' so users see WHAT and HOW in one
   message. Honors Message Writing Rule #4 'Include the fix'.

2. Doc-sync (doc-writer + devx-ux convergence):
   - reference/cli-commands.md: add --no-drift to audit options table;
     amend --ci description to mention drift contribution.
   - integrations/ci-cd.md: replace bash 'git status --porcelain'
     workaround under 'Verify Deployed Primitives' with 'apm audit --ci'
     one-liner; update 'We dogfood this' callout text.
   - getting-started/quick-start.md: retarget stale cross-ref from the
     now-superseded ci-cd anchor to the new drift-detection guide.
   - guides/drift-detection.md: drop the self-contradictory case #2 in
     'When to use --no-drift' (strip-mode is auto-skipped, not opt-out).
   - CHANGELOG.md: compress verbose entry to one Keep-a-Changelog line
     pointing readers to the guide for detail.

Tracked as follow-up issues (CEO call):
- supply-chain: verify cache content matches lockfile resolved_commit
  before drift replay trusts it (commit-SHA pinning bypass on shared
  CI caches).
- test-coverage: inverse-normalization unit test asserting BOM/CRLF/
  Build-ID guards do NOT mask real content drift (safety invariant).

Lint clean. 45 drift tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
danielmeppiel added a commit that referenced this pull request May 5, 2026
* feat(drift): Phase A infra - guards + diagnostic category

- Add _ReadOnlyProjectGuard context manager (utils/guards.py): snapshots
  stat of protected paths, raises ProtectedPathMutationError on any
  mutation. Defense-in-depth above the scratch-root remap.
- Add CATEGORY_DRIFT + drift() recording method to DiagnosticCollector.
- Add drift_count property and _render_drift_group renderer that groups
  by kind (modified/unintegrated/orphaned) with stable section header
  for machine consumers.
- Tests: 7 unit tests covering happy path, mutation, creation, deletion,
  missing-tolerated, exception-not-masked, single-file protected path.
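
A stat-snapshot guard along these lines could be sketched as below. This is an assumption-laden illustration rather than the contents of utils/guards.py: `read_only_guard` is a hypothetical name, and the snapshot here records (mtime_ns, size) per file.

```python
from contextlib import contextmanager
from pathlib import Path

_MISSING = object()


class ProtectedPathMutationError(RuntimeError):
    """Raised when a protected path changed while the guard was active."""


def _snapshot(paths):
    snap = {}
    for root in map(Path, paths):
        if not root.exists():
            snap[root] = None  # missing paths are tolerated
            continue
        files = [root] if root.is_file() else sorted(p for p in root.rglob("*") if p.is_file())
        for f in files:
            st = f.stat()
            snap[f] = (st.st_mtime_ns, st.st_size)
    return snap


@contextmanager
def read_only_guard(paths):
    """Fail if anything under `paths` is created, deleted, or modified."""
    before = _snapshot(paths)
    yield  # an exception in the body propagates as-is and is not masked
    after = _snapshot(paths)
    changed = sorted(
        str(p)
        for p in set(before) | set(after)
        if before.get(p, _MISSING) != after.get(p, _MISSING)
    )
    if changed:
        raise ProtectedPathMutationError("protected paths mutated: " + ", ".join(changed))
```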

Refs #1071. Phase A of WIP/drift/06-final-plan.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(drift): Phase B+C - replay engine + audit CLI wiring

Implements the drift detection feature per WIP/drift/06-final-plan.md
(closes #1071 scope alignment with #898).

Engine (Phase B):
- src/apm_cli/install/drift.py: ReplayConfig, DriftFinding, CheckLogger,
  CacheMissError, normalization helpers (build-id strip, line endings,
  BOM), run_replay() (cache-only), diff_scratch_against_project(),
  text/json/sarif renderers, atexit scratch cleanup.
- src/apm_cli/install/services.py: scratch_root kwarg with
  ensure_path_within defense-in-depth guard for replay isolation.
- src/apm_cli/policy/ci_checks.py: _check_drift() wrapper returning
  (CheckResult, list[DriftFinding]); graceful CacheMissError handling.
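
The normalization helpers exist to suppress false positives before diffing. A rough sketch of the idea follows; the Build-ID line format used here is a guess made purely for illustration, and `normalize_for_diff` is a hypothetical name.

```python
import re

_UTF8_BOM = b"\xef\xbb\xbf"
# Hypothetical marker format; the real Build-ID line may look different.
_BUILD_ID_LINE = re.compile(rb"^<!-- Build ID: [^\n]*-->\n?", re.MULTILINE)


def normalize_for_diff(data: bytes) -> bytes:
    """Strip BOM, unify line endings, and drop the build-id line."""
    if data.startswith(_UTF8_BOM):
        data = data[len(_UTF8_BOM):]
    data = data.replace(b"\r\n", b"\n")
    return _BUILD_ID_LINE.sub(b"", data)
```

Two files count as equal only if their normalized bytes match, so real content changes still surface as drift.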

CLI surface (Phase C):
- src/apm_cli/commands/audit.py: --no-drift opt-out flag with mutex
  against --strip/--file via UsageError. Drift wired into both
  _audit_ci_gate (--ci) and _audit_content_scan (bare project audit)
  paths, default-on per ADR-02. JSON/SARIF/text renderers integrated;
  --no-drift warning gated to text mode (stdout cleanliness).

Tests:
- tests/unit/install/test_drift.py: 13 unit tests (normalization,
  diff cases, renderers).
- Legacy --ci tests opt out of drift via batch --no-drift injection
  (fixture parity, not a behavior change).

7597 unit tests pass; lint clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(drift): Phase D - integration + e2e + perf coverage (43 tests)

Implements the locked test matrix for issue #1071 drift
detection. Floor of 43 tests across three new files closes the
'ULTRA HARDENING OF HELL' coverage requirement.

New files:
- tests/integration/test_drift_check.py (32 tests):
  * Section A: 9 drift cases (modified/unintegrated/orphaned + CRLF/
    BOM/Build-ID false-positive guards)
  * Section B: 4 past-PR regressions (#1067, #882, #889, source-deleted)
  * Section C: 7 edges (no/corrupt lockfile, untracked governed,
    no-write contract, idempotency)
  * Section D: 3 multi-target (copilot/claude/cursor)
  * Section E: 9 default-on / --no-drift opt-out (mutex, stderr
    routing, JSON suppression)
- tests/integration/test_drift_check_e2e.py (10 tests):
  full install->mutate->audit loop with mix_stderr=False, air-gap
  proof, JSON/SARIF stability, 30s smoke
- tests/unit/install/test_drift_perf.py (1 test):
  100 primitives replay+diff under 5s

Engine fix surfaced by tests:
- src/apm_cli/install/drift.py: run_replay now reads apm.yml's target
  field via parse_target_field and passes it to resolve_targets.
  Without this, multi-target projects (copilot+claude+cursor) replayed
  only the auto-detected primary target, falsely reporting secondary
  target deployments as orphaned. Helper _read_apm_yml_target() added.

CI wiring:
- scripts/test-integration.sh: two new blocks in run_e2e_tests()
  invoking the integration + e2e suites before the final success log.
  Both safe to run without GITHUB_APM_PAT (cache-only, mocked network).

Verification: 56 drift-domain tests pass; full repo lint clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(drift): CHANGELOG + Starlight guide + apm-usage skill + ci.yml note

- CHANGELOG.md: Added [Unreleased] entry under Added describing the
  default-on drift detection in apm audit, the three failure modes it
  catches, false-positive guards, --no-drift opt-out + mutex semantics,
  and the JSON/SARIF integration shape. Closes #1071, supersedes #898.
- docs/src/content/docs/guides/drift-detection.md (NEW, sidebar order 7):
  Full user-facing guide -- what drift means, how the cache-only replay
  works (with mermaid diagram), exit-code matrix, when to use --no-drift,
  output formats, and the CI single-line gate that replaces the legacy
  git status --porcelain script.
- packages/apm-guide/.apm/skills/apm-usage/commands.md: Extended the
  audit row with --no-drift flag and added a paragraph documenting the
  drift-by-default behavior, three failure modes, false-positive
  normalization, and JSON/SARIF integration. Aligns the skill that
  ships in apm-guide with the new CLI surface (per
  apm-keep-docs-up-to-date.instructions.md rule 4).
- .github/workflows/ci.yml: Annotated Gate B (legacy bash drift check)
  with a comment marking it redundant once apm-action ships a CLI with
  default-on drift detection (this PR's release). Kept as
  defense-in-depth fallback until then.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(drift): address panel feedback - recovery hint + doc-sync

CEO panel recommended landing two in-PR follow-ups before merge:

1. Recovery hint in drift output (cli-logging + devx-ux convergence):
   render_drift_text now appends '[i] Run apm install to re-sync
   deployed files with the lockfile.' so users see WHAT and HOW in one
   message. Honors Message Writing Rule #4 'Include the fix'.

2. Doc-sync (doc-writer + devx-ux convergence):
   - reference/cli-commands.md: add --no-drift to audit options table;
     amend --ci description to mention drift contribution.
   - integrations/ci-cd.md: replace bash 'git status --porcelain'
     workaround under 'Verify Deployed Primitives' with 'apm audit --ci'
     one-liner; update 'We dogfood this' callout text.
   - getting-started/quick-start.md: retarget stale cross-ref from the
     now-superseded ci-cd anchor to the new drift-detection guide.
   - guides/drift-detection.md: drop the self-contradictory case #2 in
     'When to use --no-drift' (strip-mode is auto-skipped, not opt-out).
   - CHANGELOG.md: compress verbose entry to one Keep-a-Changelog line
     pointing readers to the guide for detail.

Tracked as follow-up issues (CEO call):
- supply-chain: verify cache content matches lockfile resolved_commit
  before drift replay trusts it (commit-SHA pinning bypass on shared
  CI caches).
- test-coverage: inverse-normalization unit test asserting BOM/CRLF/
  Build-ID guards do NOT mask real content drift (safety invariant).

Lint clean. 45 drift tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(drift): address Copilot review - exit-code contract + types + diagnostics

Bare 'apm audit' is advisory (exit 0 on drift); 'apm audit --ci' is
the gate (exit 1). Closes the regression introduced when content-scan
escalation accidentally also escalated drift findings.

Also addresses inline review:
- A2: vacuous ASCII-encoding assertion now scopes per-line
- A4: tuple[float, int] -> tuple[int, int] in guards.py
- A5: type-annotated _check_drift signature
- A6: clarified DRIFT_ORPHANED comment
- A7: CHANGELOG references PR + closes
- A3: CacheMiss message now drift-specific (no --no-cache confusion)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(drift): link drift detection guide from README security section

Per oss-growth: surfaces drift detection alongside content security
and lockfile integrity in the conversion-critical Production-grade
section, so a reader scanning for 'why APM' sees the supply-chain
story end-to-end.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(drift): cache pin marker for stale-cache detection

apm install drops a .apm-pin JSON marker into each cached package
root recording the resolved_commit; apm audit verifies it before
running drift replay. Catches the 'teammate bumped lockfile, did
not reinstall' + 'shared CI runner reused stale apm_modules'
scenarios that would otherwise silently produce misleading drift
output.

LockfileBuilder syncs markers UNCONDITIONALLY (even when the
lockfile YAML is unchanged and even when no install happens), so
existing users self-heal on their next 'apm install'.

This is stale-cache detection, NOT cryptographic integrity --
defending against active cache tampering requires content-addressed
hashes, which is deferred.

Schema (v1): {schema_version: 1, resolved_commit: <sha>}
Marker file: <install_path>/.apm-pin
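
Writing and verifying a v1 marker could look like the sketch below. Only the file name and the two schema fields come from the description above; `write_pin`, `verify_pin`, and `CachePinError` are hypothetical names.

```python
import json
from pathlib import Path

MARKER_NAME = ".apm-pin"
SCHEMA_VERSION = 1


class CachePinError(Exception):
    """Hypothetical error for a missing, unreadable, or stale marker."""


def write_pin(install_path: Path, resolved_commit: str) -> None:
    payload = {"schema_version": SCHEMA_VERSION, "resolved_commit": resolved_commit}
    (Path(install_path) / MARKER_NAME).write_text(json.dumps(payload), encoding="utf-8")


def verify_pin(install_path: Path, expected_commit: str) -> None:
    marker = Path(install_path) / MARKER_NAME
    try:
        data = json.loads(marker.read_text(encoding="utf-8"))
    except FileNotFoundError as exc:
        raise CachePinError(f"no {MARKER_NAME} in {install_path}") from exc
    except (OSError, ValueError) as exc:  # JSONDecodeError is a ValueError
        raise CachePinError(f"unreadable {MARKER_NAME}: {exc}") from exc
    if not isinstance(data, dict) or data.get("schema_version") != SCHEMA_VERSION:
        raise CachePinError("unsupported marker schema")
    if data.get("resolved_commit") != expected_commit:
        # Stale cache: lockfile moved on but the package was not reinstalled.
        raise CachePinError("cached package does not match lockfile resolved_commit")
```

As the commit notes, this detects a stale cache only; it does not prove the cached content is untampered.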

Coverage:
- 14 unit tests in test_cache_pin.py (positive + every error path
  + skip rules + idempotent re-run + self-heal regression)
- 1 integration test in test_drift_check_e2e.py exercising the
  full install -> mark -> verify flow against a synthetic cache

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address panel follow-ups C1-C5 on PR #1137

C1 (supply-chain): Fail closed on unpinned remote deps
- cache_pin.find_unpinned_remote_deps() helper + stderr warning in
  sync_markers_for_lockfile
- drift._materialize_install_path raises CacheMissError for remote
  deps with resolved_commit=None (was silent fail-open)
- Replaced silent-skip test with warning assertion + new helper test

C2 (architecture): Wire _ReadOnlyProjectGuard into run_replay
- run_replay() now wraps the deps loop with _ReadOnlyProjectGuard
  on governed root dirs + apm.lock.yaml + AGENTS.md
- Regression test: monkeypatched leaky integrator triggers
  ProtectedPathMutationError

C3 (cli-logging-ux): Stderr message on swallowed CacheMissError
- audit._audit_content_scan emits '[!] drift check could not run:
  <msg>' to stderr when drift_failed and no findings (covers cache
  miss, missing lockfile, cache-pin error)
- Integration test e10 asserts stderr message in bare-audit path

C4 (docs): Baseline-check phrasing + CHANGELOG link
- governance-guide, ci-cd, cli-commands now read '7 baseline checks
  plus integration drift detection'
- CHANGELOG drift-detection link points to docs site URL

C5 (oss-growth): User-promise framing
- CHANGELOG drift entry leads with the user promise (forgotten
  installs + hand-edits) before mechanism
- drift-detection.md gains a 'Try it now' block at the top
- Before/after CI comparison promoted to its own subsection with
  explicit framing of what the bash workaround missed

Verification: ruff check + format silent; 7621 unit tests + 27 drift
integration tests green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(changelog): trim drift entry to single 'so what?' line

Collapse the two added entries (drift + cache-pin markers) into one
short line that answers the developer 'so what?' and points to the
Drift Detection guide for the full mechanism + opt-out + cache-pin
details. Per maintainer feedback: the previous entries were too long
for a CHANGELOG.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Daniel Meppiel <copilot-rework@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
danielmeppiel pushed a commit to tillig/apm that referenced this pull request May 14, 2026
CI:
- ruff format on test_agents_compiler_coverage.py and
  test_claude_formatter.py (Lint job was failing on format --check)
- Fix broken docs anchor #claude-code-deduplication in
  producer/compile.md (Deploy Docs / build was failing on
  starlight-links-validator)

PR microsoft#1146 review-panel follow-ups:
- Doc Writer #1: add explicit <a id> anchor before the
  :::note[]:::callout so the table link resolves (Starlight does
  not auto-generate anchors from callout directives)
- Doc Writer #2: rewrite the dedup note to attribute
  .claude/rules/ population to BOTH apm install AND apm compile
  (a producer who only runs compile hits dedup on the second run
  and the docs previously gave them no explanation)
- CLI Logging #4 / DevX UX: dry-run preview now appends a
  '(instructions section skipped: .claude/rules/ already
  populated...)' line whenever skip_instructions fires, so
  scripted consumers and novice users no longer see a bare
  'Would generate 0 files' with no why
- Test Coverage #3: add tests/integration/
  test_install_compile_claude_dedup_e2e.py -- exercises the
  install->compile pipeline through the real CLI to lock in the
  cross-module dedup contract; second test pins the
  compile-alone twice path

Verified locally:
- ruff check + ruff format --check both silent
- Full unit suite: 8311 passed
- npm run build inside docs/: 'All internal links are valid'

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
sergio-sisternes-epam pushed a commit that referenced this pull request May 19, 2026
sergio-sisternes-epam added a commit that referenced this pull request May 19, 2026
- Use LockFile.read() instead of raw yaml.safe_load() in _collect_transitive_mcp_deps (#1)
- Guard against mcp:null in get_mcp_dependencies() (#2)
- Remove inline MCP installation pipeline, defer to follow-up PR (#3/#7)
- Remove redundant import builtins in _deduplicate_mcp_deps (#10)
- Add tests for mcp:null, mcp:[], root-over-transitive dedup order (#9)
- Remove tests for deleted inline pipeline functions
sergio-sisternes-epam pushed a commit that referenced this pull request May 19, 2026
…covery + governance (#503)

* Initial plan

* Initial plan for marketplace integration

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* feat: marketplace integration core implementation

- Add marketplace/ package: models, errors, registry, client, resolver
- Add marketplace CLI commands: add, list, browse, update, remove, search
- Add lockfile provenance fields: discovered_via, marketplace_plugin_name
- Add install hook for NAME@MARKETPLACE syntax pre-parse intercept
- Wire marketplace commands in cli.py

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* docs: add marketplace integration guide and CLI reference

- Create guides/marketplaces.md covering marketplace concepts,
  registration, browsing, search, install syntax, provenance tracking,
  and cache behavior
- Add apm marketplace and apm search command sections to cli-commands.md
- Update apm install arguments to include NAME@MARKETPLACE syntax
- Update plugins.md Finding Plugins section with marketplace cross-refs

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* docs: fix marketplace.json format and lockfile field names to match implementation

- Use array-based plugins format matching models.py parser expectations
- Use discovered_via and marketplace_plugin_name matching lockfile.py fields
- Document both Copilot CLI (repository/ref) and Claude Code (source) formats

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* docs: fix git-subdir and relative source descriptions to match resolver

- git-subdir uses separate repo and subdir fields
- Relative string sources resolve to marketplace repo subdirectory

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* feat: add marketplace unit tests and docs

- 114 unit tests across 8 test files covering all marketplace modules
- New marketplace guide at docs/src/content/docs/guides/marketplaces.md
- Updated CLI reference with marketplace and search commands
- Updated plugins guide with marketplace integration section
- CHANGELOG entry for marketplace feature

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* refactor: address code review feedback

- Use List[MarketplacePlugin] from typing instead of lowercase generic
- Eliminate duplicated condition in install.py marketplace intercept
- Restructure control flow for clarity

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* fix: address all 12 PR review comments on marketplace integration

- Narrow except Exception to except ImportError for lazy marketplace import (comment #1)
- Fix provenance key mismatch: use dep identity instead of canonical for lockfile lookup (comment #2)
- Include subdir in git-subdir source resolution with path traversal validation (comment #3)
- Include relative path in relative source resolution with traversal validation (comment #4)
- Sanitize marketplace name in cache file paths to prevent path traversal (comment #5)
- Fix docs: stale-if-error, not stale-while-revalidate (comment #6)
- Consolidate CHANGELOG entries into single line with (#503) (comment #7)
- Remove unused _SUPPORTED_SOURCE_TYPES set (comment #8)
- Let auth errors propagate in _auto_detect_path instead of swallowing (comment #9)
- Validate marketplace --name against [a-zA-Z0-9._-]+ charset (comment #10)
- Fix doc examples to use identifier-compatible names (comments #11, #12)
- Update tests to match corrected resolver behavior, add traversal tests
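
The --name charset check from comment #10 is a one-line full match. A sketch, with `is_valid_marketplace_name` as a hypothetical helper name:

```python
import re

_NAME_RE = re.compile(r"[a-zA-Z0-9._-]+")


def is_valid_marketplace_name(name: str) -> bool:
    """True if the whole name matches the allowed charset."""
    # fullmatch, not match: 'ok/evil' must not pass on its 'ok' prefix.
    return _NAME_RE.fullmatch(name) is not None
```

Note that this charset alone still admits `.` and `..`, which is presumably why the cache-path sanitisation in comment #5 is a separate step.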

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Copilot CLI format compatibility and marketplace provenance bugs

Bug #1 - Format incompatibility with awesome-copilot marketplace:
  - Parser now accepts 'source' key (Copilot CLI) as type discriminator
    fallback when 'type' key is absent, normalizing to 'type' for resolvers
  - GitHub source resolver now accepts 'path' field (Copilot CLI) as
    virtual subdirectory, same as 'subdir' in git-subdir sources
  - Path traversal validation applied to 'path' field
  - Fixes: 8 of 62 plugins in awesome-copilot that use github source
    objects with 'source'+'path' keys instead of 'type'+'subdir'

Bug #2 - Lockfile provenance never written:
  - Root cause: install passed raw marketplace refs (NAME@MARKETPLACE)
    as only_packages, but DependencyReference.parse() can't parse those,
    so identity filtering removed all deps -> 'already installed'
  - Fix: use validated_packages (canonical owner/repo strings) instead
    of raw click argument for only_pkgs

Both bugs verified fixed via E2E tests against real marketplaces:
  - github/awesome-copilot (62 plugins)
  - anthropics/skills (3 plugins)
  - microsoft/azure-skills (1 plugin)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: scope marketplace search to QUERY@MARKETPLACE format

Search now requires QUERY@MARKETPLACE (e.g. apm search security@skills)
to eliminate name collisions across marketplaces. Added search_marketplace()
client function for single-marketplace search.

- Rejects bare queries without @ with a clear error and usage example
- Validates marketplace exists before searching
- Updated docs/guides/marketplaces.md with new syntax
- 7 test cases: format validation, unknown marketplace, results, no results
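
Splitting the scoped query is straightforward. A sketch, assuming the split happens on the last '@' (the commit does not say which) and using a hypothetical `parse_scoped_query` helper:

```python
def parse_scoped_query(raw: str) -> tuple[str, str]:
    """Split 'QUERY@MARKETPLACE' into (query, marketplace)."""
    query, sep, marketplace = raw.rpartition("@")
    if not sep or not query or not marketplace:
        raise ValueError(
            f"expected QUERY@MARKETPLACE (e.g. security@skills), got {raw!r}"
        )
    return query, marketplace
```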

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: update CLI reference and plugins guide for scoped search syntax

Align all documentation with QUERY@MARKETPLACE search format.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: use centralized path_security for marketplace traversal checks

Replace 3 ad-hoc '..' in x.split('/') checks in marketplace/resolver.py
with validate_path_segments() from utils/path_security.py. Add
defense-in-depth validate_path_segments() call to _sanitize_cache_name()
in client.py.

This ensures marketplace code uses the same cross-platform path safety
utilities (backslash normalization, single-dot rejection) as the rest
of APM.
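
To show why a centralized check beats `'..' in x.split('/')`, here is a rough sketch of a segment validator with the two properties named above. It is not the code in utils/path_security.py; `check_path_segments` is a hypothetical stand-in.

```python
def check_path_segments(path: str) -> None:
    """Reject '.' and '..' segments, treating backslashes as separators."""
    # Normalize first so 'a\\..\\b' cannot slip past a '/'-only split.
    normalized = path.replace("\\", "/")
    for segment in normalized.split("/"):
        if segment in (".", ".."):
            raise ValueError(f"unsafe path segment {segment!r} in {path!r}")
```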

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add path safety rule to copilot-instructions.md

Directs contributors to use validate_path_segments() and
ensure_path_within() from utils/path_security.py instead of
ad-hoc traversal checks.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>
Co-authored-by: danielmeppiel <dmeppiel@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
sergio-sisternes-epam pushed a commit that referenced this pull request May 19, 2026
* feat(policy): W1 foundations for install-time policy enforcement (#827)

Wave 1 of issue #827 implementation. Lays the foundations the install
pipeline gate (W2) will plug into. No behaviour change yet: install
still does NOT enforce policy until W2 wires the gate phase.

What's in:
- policy_checks: new public seam run_dependency_policy_checks(deps,
  lockfile=, policy=, mcp_deps=, effective_target=) accepting a
  resolved dep set; old run_policy_checks(project_root, policy) is now
  a thin wrapper. Honours require_resolution: project-wins for
  version-pin mismatches only. Latent isinstance(allow, list) bug
  fixed for schema's Tuple[str, ...].
- policy/discovery: cache stores merged effective policy with chain
  metadata + fingerprint. Atomic writes via temp + os.replace, with
  pid+thread_id suffix to prevent concurrent-writer collision.
  MAX_STALE_TTL=7d ceiling on cache reuse. PolicyFetchResult expanded
  to express 9 outcomes (found, absent, cached_stale,
  cache_miss_fetch_fail, malformed, disabled, garbage_response,
  no_git_remote, empty).
- diagnostics: CATEGORY_POLICY constant + per-category renderer wired
  into render_summary().
- command_logger: InstallLogger.policy_resolved/violation/disabled
  with per-class actionable error wording (auth/unreachable/malformed/
  blocked).
- tests/fixtures/policy/: 14 policy fixtures + 7 project fixtures
  (denied-direct, denied-transitive, required-missing,
  required-version-mismatch, mcp-denied, target-mismatch,
  unpacked-bundle) covering W4 live matrix scenarios L2/L4/L13 and
  rubber-duck findings I5/I6/I7/N14/C2.
- docs: 12-section Install-time enforcement guide skeleton in both
  enterprise/policy-reference.md and packages/apm-guide skill mirror.
  10 sections filled; sections 7 (snippets) and 10 (error table)
  stubbed for W3-docs-final once W2 lands and W4 captures live output.
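
The atomic cache write mentioned under policy/discovery (temp file + os.replace, with a pid+thread_id suffix) can be sketched as follows. `atomic_write_json` and the temp-file naming are illustrative assumptions, not the actual implementation.

```python
import json
import os
import threading
from pathlib import Path


def atomic_write_json(target: Path, payload: dict) -> None:
    """Write JSON so readers see either the old file or the new one, never a partial."""
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    # pid + thread id keeps concurrent writers from sharing a temp file.
    tmp = target.with_name(f"{target.name}.{os.getpid()}.{threading.get_ident()}.tmp")
    try:
        tmp.write_text(json.dumps(payload), encoding="utf-8")
        os.replace(tmp, target)  # atomic when both paths are on the same filesystem
    finally:
        tmp.unlink(missing_ok=True)  # only does work if the write or replace failed
```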

Tests:
- tests/unit: 4878 passed (1 pre-existing unrelated MCP failure
  deselected). Includes 41 logger + 29 policy-seam + 38 cache + 21
  fixture-load new tests.

Refs: #827
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(install): W2A policy enforcement at install time (#827)

Wave 2A wires the three install-time enforcement sites planned for #827:

1. **Pipeline gate phase** (src/apm_cli/install/phases/policy_gate.py):
   New phase running between resolve and targets. Discovers org policy,
   resolves the inheritance chain via resolve_policy_chain, persists the
   merged effective policy + chain refs to cache (chain_refs threading
   per C1 amendment), then calls run_dependency_policy_checks against
   the resolved deps. Routes 9 discovery outcomes (found, absent,
   cached_stale, cache_miss_fetch_fail, malformed, disabled,
   garbage_response, no_git_remote, empty). Block-mode violations raise
   PolicyViolationError to halt the pipeline cleanly.

2. **--mcp branch preflight** (src/apm_cli/policy/install_preflight.py
   + commands/install.py:1091-1125):
   apm install --mcp does NOT enter the install pipeline. New shared
   helper run_policy_preflight() runs discovery + dep checks for any
   non-pipeline command site. Wired into --mcp BEFORE _run_mcp_install
   so denied servers never reach the integrator. Also exports
   PolicyBlockError for callers.

3. **install <pkg> snapshot+rollback** (commands/install.py):
   apm install <pkg> mutates apm.yml BEFORE the pipeline runs. We now
   snapshot apm.yml as raw bytes (not parsed YAML, to avoid round-trip
   drift on whitespace / key-order / comments), and on ANY pipeline
   failure (policy block, download error, etc.) restore byte-for-byte
   via tempfile + os.replace atomic write. Logs '[i] apm.yml restored
   to its previous state.' and exits non-zero.
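
The snapshot+rollback in item 3 can be sketched as a context manager. This is a minimal illustration under assumed names (`restore_on_failure` is hypothetical); it shows the raw-bytes snapshot and the tempfile + os.replace restore, not the logging or exit handling.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def restore_on_failure(manifest: Path):
    """Restore `manifest` byte-for-byte if the wrapped block raises."""
    manifest = Path(manifest)
    snapshot = manifest.read_bytes()  # raw bytes: comments and whitespace survive
    try:
        yield
    except BaseException:
        # Write the snapshot next to the manifest, then swap it in atomically.
        fd, tmp = tempfile.mkstemp(dir=manifest.parent, prefix=manifest.name + ".")
        with os.fdopen(fd, "wb") as fh:
            fh.write(snapshot)
        os.replace(tmp, manifest)
        raise
```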

InstallContext gains policy_fetch, policy_enforcement_active, no_policy.

Tests: +68 new tests, 4946 unit tests pass total.
- test_policy_gate_phase.py: 27 (covers all 9 outcomes)
- test_mcp_preflight_policy.py: 22 (escape hatches, allow/deny, transport,
  self-defined, trust_transitive, discovery outcomes, return shape)
- test_install_pkg_policy_rollback.py: 19 (byte-equal restore, comments
  preserved, --no-policy bypass, download error rollback, snapshot
  unit tests)

W2B (dry-run, target-aware, escape-hatch CLI flag) and C2 panel review
follow.

Refs: #827
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(policy): W2B install enforcement - escape hatch, dry-run preview, target-aware check (#827)

W2B completes the enforcement surface:

* policy_target_check.py - new pipeline phase after targets that re-runs
  target/compilation checks with the resolved effective_target. Filters
  to TARGET_CHECK_IDS only to avoid double-emitting dep violations from
  the gate phase. Honors CLI --target override (I6 fix scenario).

* --no-policy escape hatch on apm install / install <pkg> / install --mcp
  / update. APM_POLICY_DISABLE=1 env var equivalent. Both route through
  ctx.no_policy and emit always-visible warnings via
  InstallLogger.policy_disabled() noting that apm audit --ci still fails.

* --dry-run policy preview. run_policy_preflight gains dry_run=True kwarg.
  Emits '[!] Would be blocked by policy: <dep> -- <reason>' (block) or
  '[!] Policy warning: <dep> -- <reason>' (warn) before the would-install
  table. Never raises, never mutates. Direct manifest deps only (resolver
  doesn't run in dry-run; documented limitation).

InstallRequest, InstallService, InstallContext threaded with no_policy.
LOC budget on install.py raised 1625 -> 1650 with documented rationale.

Tests: 5003 unit pass (+57 W2B: 17 target_check + 24 no_policy_flag +
16 dry_run_policy). Full suite green vs main baseline.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(policy): C2 panel fixes - transitive MCP enforcement, shared chain discovery, dry-run cap, drop apm update --no-policy (#827)

C2 panel checkpoint surfaced 4 fixes (S1+B1+D2 BLOCKER/PASS-WITH-CONCERN, D1
DevX). All landed; full suite 5032 pass.

S1 (Supply Chain BLOCKER) - transitive MCP enforcement:
  Transitive MCP servers from APM packages were bypassing install-time policy.
  The pipeline gate phase only sees direct apm.yml deps; transitive MCP servers
  are merged later via MCPIntegrator.collect_transitive() and written to
  runtime configs (.copilot/mcp.json, .cursor/mcp.json) with no policy check.
  This defeated #827 on the most security-critical dep category.
  Fix: second run_policy_preflight() call in commands/install.py after the
  transitive merge, before MCPIntegrator.install(). On block: abort MCP config
  writes, exit non-zero. APM packages remain installed (gate phase approved
  them). 15 new unit tests in test_transitive_mcp_policy.py.

B1 (Architect, partial) - shared chain-aware discovery:
  Extract discover_policy_with_chain() into policy/discovery.py so both
  policy_gate.py and install_preflight.py walk the same inheritance chain.
  Closes the gap where --mcp / --dry-run paths could resolve a different
  effective policy than the pipeline path. Gate-phase keeps its 9-outcome
  routing; only the discovery seam moved. 10 new tests in
  test_chain_discovery_shared.py.

D2 (DevX UX) - dry-run noise cap:
  install_preflight._DRY_RUN_PREVIEW_LIMIT = 5. Long deny lists now show
  5 lines per severity bucket + tail '[!] ... and N more would be blocked
  by policy. Run apm audit for full report.' 4 new tests.
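
The cap in D2 could look roughly like the sketch below. `cap_preview` is a hypothetical helper, not the code in install_preflight; the limit of 5 and the tail wording are taken from the description above.

```python
PREVIEW_LIMIT = 5  # mirrors the limit stated above


def cap_preview(lines: list[str], limit: int = PREVIEW_LIMIT) -> list[str]:
    """Return at most `limit` lines, plus an overflow tail when more remain."""
    if len(lines) <= limit:
        return list(lines)
    hidden = len(lines) - limit
    tail = (
        f"[!] ... and {hidden} more would be blocked by policy. "
        "Run apm audit for full report."
    )
    return lines[:limit] + [tail]
```

The same helper would be applied once per severity bucket, so block and warn lists are capped independently.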

D1 (DevX UX) - drop apm update --no-policy:
  apm update is the CLI self-updater (refreshes the apm binary), not a
  dependency refresh. The flag was accepted but unused. Removed the option
  and flipped the test to assert the flag is now rejected.

LOC budget on install.py raised 1650 -> 1675 with documented justification.

Tests: 5032 unit pass (+29 new: 15 transitive_mcp + 10 chain_discovery_shared
+ 4 dry_run_noise_cap). 1 pre-existing MCP test deselected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs+test(policy): W3 - integration matrix, docs final fill, CHANGELOG, growth (#827)

W3 phase complete. All 5 parallel workstreams landed.

Tests:
  tests/integration/test_policy_install_e2e.py - 17 e2e scenarios I1..I17
  Covers all 9 PolicyFetchResult outcomes + all 6 violation classes via
  CliRunner-driven full-pipeline flows. Mocks discover_policy_with_chain
  at both seams (policy_gate + install_preflight). Uses _build_policy()
  helper for frozen-dataclass safe construction.

Docs:
  docs/src/content/docs/enterprise/policy-reference.md
    sec 7: 8 verbatim CLI snippets (success, block, warn, --no-policy,
    APM_POLICY_DISABLE, --dry-run with overflow tail, install <pkg>
    rollback, transitive MCP block)
    sec 10: outcome table (9 fetch outcomes) + violation table (6 classes)
    Added explicit JSON/SARIF non-goal callout (C1 amendment).
  packages/apm-guide/.apm/skills/apm-usage/governance.md
    Same content, leaner skill version, links back to docs for full text.

CHANGELOG.md:
  Added: --no-policy / APM_POLICY_DISABLE escape hatch, --dry-run preview,
    install <pkg> rollback
  Changed: pipeline gains policy_gate + policy_target_check phases, shared
    chain discovery + atomic cache + MAX_STALE_TTL
  Security (headline): apm install enforces apm-policy.yml; transitive MCP
    checked before runtime config write

Follow-up issue #829 filed: policy.fetch_failure: warn|block schema knob.

Tests: 5049 pass (5032 unit + 17 integration). 1 pre-existing MCP test
deselected.

PR body drafted at session-state/files/pr-body-827.md. Growth strategy
entry + asciinema script staged in WIP (gitignored).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(policy): C3 fixes - direct MCP enforcement, malformed posture, warn-mode coverage, doc drift (#827)

C3 final panel + rubber-duck found 5 issues. All fixed.

#1 (CRITICAL) - Direct MCP deps in apm.yml bypassed enforcement:
  ctx.direct_mcp_deps now populated in pipeline.py from
  apm_package.get_mcp_dependencies() before policy_gate runs. policy_gate
  reads direct_mcp_deps (not the dead mcp_deps_to_install) and passes them
  to run_dependency_policy_checks. install.py:1496 second preflight guard
  drops 'and transitive_mcp' so direct-only MCP installs are also caught.

#2 (CRITICAL) - Malformed policy handling inconsistent + broke rollback:
  policy_gate.py replaced sys.exit(1) on malformed with fail-open warn
  (matches install_preflight + cache_miss_fetch_fail/garbage_response
  posture). sys.exit was bypassing the rollback handler in install.py for
  apm install <pkg>. CEO mandate: malformed = warn, fail-closed knob is
  follow-up #829.

#3 (IMPORTANT) - Chain inheritance is 1-level, not multi-level:
  discover_policy_with_chain only walks one parent. Toned down docs in
  policy-reference.md and governance.md with explicit caution callout.
  Filed follow-up #831 for proper recursive walk + cycle detection.

#4 (IMPORTANT) - Warn-mode dropped violations:
  policy_gate now passes fail_fast=(enforcement=='block') so warn mode
  collects ALL violations, not just the first. Also emits warnings for
  passed=True checks with non-empty details (project-wins version-pin
  mismatches were silently dropped).

#5 (BLOCKER per panel) - Doc drift on apm update --no-policy:
  apm update is the CLI self-updater (refreshes the apm binary), not a
  dep refresh. Removed all mentions from both docs. apm deps update is
  the dep-refresh surface (runs install pipeline, gate applies); --no-policy
  is NOT exposed there today.

Tests: 5059 pass (5049 baseline + 10 new: 6 unit gate + 4 integration
I18/I19/I20). New integration tests cover real direct-MCP block, real
malformed fail-open, warn-mode multi-violation. I16 class renamed to
TestI16GarbageResponsePolicy to fix mislabeling.

Follow-ups: #829 (fetch_failure schema knob), #831 (multi-level chain).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(policy): in-PR resolution of #834 (warn-mode rendering) and #831 (recursive extends chain) (#827)

Originally filed as follow-ups during C3, moved in-PR per reviewer
request so #832 ships a complete enforcement story.

#834 - Warn-mode policy violations did not render in the install
summary. Root cause: pipeline created a fresh DiagnosticCollector for
install_result.diagnostics while InstallLogger.policy_violation()
pushed warnings into logger.diagnostics. Two collectors, one rendered.
Fix: when a logger is present, reuse logger.diagnostics so policy
records flow through render_summary() (block mode unaffected - it
aborts inline before summary).

#831 - extends: chain only supported one level (parent). Inheritance
machinery (resolve_policy_chain, detect_cycle, MAX_CHAIN_DEPTH=5) was
already N-deep capable; discovery never wired it. Fix: rewrite
_resolve_and_persist_chain as iterative depth-first walk, leaf-first;
cycle detection via inheritance.detect_cycle; honor MAX_CHAIN_DEPTH=5
with explicit pre-append check; partial-chain warning when a mid-chain
ref fails to fetch ('Policy chain incomplete: <ref> unreachable, using
<N> of <M> policies'); single cache write at leaf with full chain
fingerprint.
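
The walk described for #831 can be sketched as follows. This is an illustration under assumptions, not the body of _resolve_and_persist_chain: policies are modelled as plain dicts with an optional `extends` key, and `fetch` is a hypothetical callable that returns `None` when a ref is unreachable.

```python
from typing import Callable, Optional

MAX_CHAIN_DEPTH = 5  # value taken from the commit message


def walk_extends_chain(leaf_ref: str, fetch: Callable[[str], Optional[dict]]):
    """Follow `extends` links leaf-first; return (chain, warning_or_None)."""
    chain, seen, ref = [], set(), leaf_ref
    while ref is not None:
        if ref in seen:
            raise ValueError(f"cycle detected at {ref}")
        # Depth is checked before appending, so the chain never exceeds the cap.
        if len(chain) >= MAX_CHAIN_DEPTH:
            raise ValueError(f"policy chain exceeds depth {MAX_CHAIN_DEPTH}")
        policy = fetch(ref)
        if policy is None:
            # Mid-chain fetch failure: keep what was resolved and warn.
            return chain, f"Policy chain incomplete: {ref} unreachable"
        seen.add(ref)
        chain.append(policy)
        ref = policy.get("extends")
    return chain, None
```

An iterative loop keeps the depth limit and cycle check in one place, which is simpler to audit than a recursive walk.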

Tests: +1 unit (warn-render), +5 unit (3-level full, cycle, depth
limit, partial chain, single-level regression), +1 integration
(TestI21ThreeLevelExtendsChain). 5044 unit pass.

Docs: enterprise/policy-reference.md and apm-usage/governance.md
chain-depth callouts updated.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(changelog): record in-PR resolution of #834 and #831 under #827

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(policy): address review-panel pre-merge findings (#827)

- Security F1 (HIGH): pin extends: chain to leaf policy host; disable
  HTTP redirects in _fetch_from_url and _fetch_github_contents. Closes
  cross-host credential leak vector via git credential fill fallback
  and SSRF/Referer-leak vector via 30x redirects.
  raw.githubusercontent.com is treated as distinct from github.com
  (strict pin).
- Logging C1+C2 + UX F1/F2/F4/F5/F9: extract
  InstallLogger.policy_discovery_miss() canonical helper covering all
  7 discovery outcomes; route both policy_gate and install_preflight
  through it. absent now verbose-only; no_git_remote downgraded to
  [i]; garbage_response gets distinct wording (no VPN/firewall noise);
  cached_stale and cache_miss_fetch_fail messages now state
  enforcement posture explicitly; violation messages dedupe dep_ref
  prefix; wire _policy_reason_blocked into block-severity
  policy_violation as dim secondary line.
- Docs: remove [Planned] banner from policy-reference; update
  enforcement tables (policy-reference + governance skill) to reflect
  install-time blocking; document --no-policy / APM_POLICY_DISABLE in
  cli-commands.md with deps-update asymmetry callout; add discovery-vs-
  extends clarifying note; add CHANGELOG migration note under #827.
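
The strict host pin from Security F1 amounts to an exact hostname comparison. A minimal sketch, assuming both refs are available as URLs; `same_host` is a hypothetical name and not the production check.

```python
from urllib.parse import urlparse


def same_host(leaf_url: str, extends_url: str) -> bool:
    """Strict pin: the extends target must use exactly the leaf's hostname."""
    leaf_host = urlparse(leaf_url).hostname
    extends_host = urlparse(extends_url).hostname
    return leaf_host is not None and leaf_host == extends_host
```

Comparing parsed hostnames rather than substrings is what makes raw.githubusercontent.com distinct from github.com, and it also rejects look-alike hosts that merely contain the leaf host as a prefix.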

Tests: 5053 -> 5068 (+15 logging, +9 security host-pin).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(policy): ship enterprise hardening pack on top of #827

Five enterprise hardening items shipped in-PR per CISO-arbitrated panel
verdict + CTO threat-model deep dive (PR #832 comments 4294087760 +
4294115069). Closes #829.

1. policy.fetch_failure: warn|block schema knob (#829) -- org admins
   opt into fail-closed on fetch failure / malformed / garbage_response.
   Default 'warn' preserves backwards compat.
2. apm.yml policy.fetch_failure_default: warn|block -- project-side
   complement so a project can lock down behavior even when no policy
   is reachable to read the org-side knob from.
3. apm policy status diagnostic command -- show discovery outcome,
   source, enforcement, cache age, extends chain, effective rule
   counts, and hash-pin state. --json for SIEM ingestion. Trust-but-
   verify tool that makes fail-open acceptable.
4. apm.yml policy.hash: 'sha256:...' consumer-side pin -- closes the
   garbage_response compromised-intermediary vector by verifying raw
   policy bytes against a project-pinned digest. Equivalent of pip
   --require-hashes for the policy itself. ALWAYS fail-closed on
   mismatch, regardless of fetch_failure setting (a hash mismatch is
   an explicit pin violation, not a fetch failure). sha384/sha512
   accepted; md5/sha1 rejected (collision-resistant only).
5. apm audit --ci auto-discovers org policy when --policy-source is
   not provided; --no-policy flag added to skip. Closes the
   audit/install asymmetry that left CI blind to sideloaded primitives.
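
Item 4's consumer-side pin can be illustrated with the standard library alone. This is a sketch under assumptions: `verify_policy_pin` is a hypothetical name, and the pin is assumed to use the `<algorithm>:<hex digest>` form shown above.

```python
import hashlib
import hmac

# sha256/sha384/sha512 accepted; md5/sha1 rejected, per the list above.
ALLOWED_ALGORITHMS = {"sha256", "sha384", "sha512"}


def verify_policy_pin(raw: bytes, pin: str) -> bool:
    """Check raw policy bytes against a pin like 'sha256:<hex digest>'."""
    algorithm, _, expected = pin.partition(":")
    if algorithm not in ALLOWED_ALGORITHMS or not expected:
        raise ValueError(f"unsupported or malformed pin: {pin!r}")
    actual = hashlib.new(algorithm, raw).hexdigest()
    return hmac.compare_digest(actual, expected.lower())
```

A caller following the commit's contract would treat a `False` result as a hard failure regardless of the fetch_failure setting.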

Tests: 5068 -> 5157 (+89: hash pin 31, fetch_failure knob, audit
auto-discovery, policy status command, plus updates to existing
discovery tests for the new expected_hash kwarg threading).

Docs: policy-reference §9.5 (fetch_failure), §9.6 (hash pin),
§9.7 (apm policy status), §9.8 (audit auto-discovery); governance.md
skill mirrors all of the above; cli-commands.md gets policy status +
audit --no-policy. CHANGELOG entries under [Unreleased] Added /
Added (Security).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(policy): address doc-writer review BLOCKERs (#827)

- policy-reference.md: remove stale 'planned fetch_failure knob' paragraph
  that contradicted the §9.5 entry shipped in the same PR; add Linux
  hash-compute one-liner alongside the macOS shasum example.
- cli-commands.md: add 'apm policy status' command section under a new
  'apm policy' family (synopsis, --policy-source/--no-cache/--json,
  exit-code note, examples). Add --no-policy flag to 'apm audit' options
  list. Reword --policy SOURCE description to reflect that --ci now
  auto-discovers when --policy is omitted. Update audit examples to
  match (drop the now-redundant '--policy org' from auto-discovery
  example, add explicit --no-policy variant).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(policy): address doc-writer HIGH+LOW findings (#827)

- manifest-schema.md: add policy: block to schema diagram + new
  section 3.9 documenting fetch_failure_default, hash, hash_algorithm
- policy-reference.md: add fetch_failure: warn to canonical schema
  YAML and a fetch_failure entry under Top-level fields; lift apm
  policy status and apm audit --ci auto-discovery into proper
  numbered subsections (9.7 / 9.8) so anchors match the skill mirror
- governance.md: surface install-time enforcement with link to
  policy-reference#install-time-enforcement
- ci-policy-setup.md: annotate Step 3 noting apm audit --ci
  auto-discovers and --policy org is now an explicit override
- security.md: add Compromised policy intermediary row to attack
  surface comparison, linked to policy.hash consumer-side pin
- cli-commands.md: split --no-policy into 2-line nested bullet
  separating behaviour from env-var equivalence
- apm-guide skill mirror: add fetch_failure: warn to schema overview
  to keep skill aligned with policy-reference

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(policy): address PR review panel logging+arch findings (#827)

BLOCKING:
- command_logger.policy_discovery_miss: gate no_git_remote info
  message on verbose mode; previously emitted on every install in a
  non-git directory

Architecture:
- New install/errors.py with canonical PolicyViolationError;
  PolicyBlockError kept as re-exported alias to preserve test patches
- New policy/outcome_routing.py::route_discovery_outcome
  consolidating the 9-outcome routing table; policy_gate.py and
  install_preflight.py now delegate instead of duplicating
- pipeline.py: catch PolicyViolationError before bare Exception so
  policy block messages are not double-nested in RuntimeError
- commands/install.py: isinstance(PolicyViolationError) branch in
  the legacy handler for the same reason

Logging UX:
- install_preflight: empty check.details now falls back to
  [check.name] so the block message is never blank
- _extract_dep_ref helper replaces detail.split(":")[0] with
  defensive parsing that falls back to check.name

Security:
- discovery._get_cache_dir asserts containment vs project_root
  (resolves symlinks) instead of an unguarded join
- Removed dead no_policy= kwarg from discover_policy_with_chain;
  env-var defence-in-depth retained on the call site
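
The containment assertion in the Security item can be sketched as below. `contained_cache_dir` is a hypothetical stand-in for discovery._get_cache_dir, whose real signature is not shown here.

```python
from pathlib import Path


def contained_cache_dir(project_root: Path, relative: str) -> Path:
    """Resolve the cache dir (following symlinks) and keep it under the root."""
    root = project_root.resolve()
    candidate = (root / relative).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"cache dir escapes project root: {candidate}")
    return candidate
```

Resolving both sides before comparing is what catches a symlinked or `..`-laden path that an unguarded join would accept.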

Tests: +tests/unit/policy/test_pr_832_findings.py covering all 8
  findings; install_logger split into silent/verbose cases. 5176
  unit tests pass, 0 regressions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(policy): use urllib.parse for host assertions to silence CodeQL (#827)

CodeQL's py/incomplete-url-substring-sanitization rule fired 6 times
on test_extends_host_pin.py because bare 'host' in msg substring
checks could in theory match a host appearing at an arbitrary URL
position (path, query, userinfo). The assertions are correct in
practice -- they assert on production error messages of known
format -- but the pattern is not safe in general.

Replace each substring check with a precise extractor:

- _assert_extends_host_in_message / _assert_leaf_host_in_message:
  regex-anchor on the production 'extends host: <h>' / 'leaf host:
  <h>' tokens, then exact-compare the captured group.
- _assert_redirect_target_host: regex-extract the redirect target
  URL after 'to ', then urllib.parse.urlparse(...).hostname compare.
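
The extractors described above might look like this sketch. The message formats used here are assumptions modelled on the tokens quoted in the list, and the function names are hypothetical rather than the test helpers themselves.

```python
import re
from urllib.parse import urlparse


def host_after_token(message: str, token: str):
    """Return the host following a token such as 'extends host:', or None."""
    match = re.search(re.escape(token) + r"\s*(\S+)", message)
    return match.group(1) if match else None


def redirect_target_host(message: str):
    """Parse the URL after 'to ' and return its hostname, or None."""
    match = re.search(r"\bto (https?://\S+)", message)
    return urlparse(match.group(1)).hostname if match else None
```

An exact comparison on the extracted value fails when the expected host only appears in a path or query string, which a bare substring check would let through.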

No production-code changes; all 9 host-pin tests still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(policy,audit): address PR #832 DevX UX blockers

- audit --no-policy help text rewritten to describe positive
  behaviour first ("Skip org policy discovery and enforcement"
  instead of the negative "Skip auto-discovery ... in --ci mode"),
  so apm audit --help no longer hides the primary effect behind a
  caveat. Aligns the code with the docs.

- apm policy status --check flag added: exits 1 when outcome is
  not 'found' (i.e. policy unresolvable / absent / disabled /
  fetch-failed), 0 otherwise. Default behaviour unchanged (always
  exit 0) so the diagnostic remains safe for human and SIEM use,
  while CI authors get the npm audit / pip check style contract
  via a single flag.

Updates cli-commands.md, policy-reference.md, and CHANGELOG.md to
document the new flag and exit-code table. Adds TestStatusCheckFlag
covering the found / unresolvable / discovery-exception / json
combinations.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
sergio-sisternes-epam pushed a commit that referenced this pull request May 19, 2026
* feat(drift): Phase A infra - guards + diagnostic category

- Add _ReadOnlyProjectGuard context manager (utils/guards.py): snapshots
  stat of protected paths, raises ProtectedPathMutationError on any
  mutation. Defense-in-depth above the scratch-root remap.
- Add CATEGORY_DRIFT + drift() recording method to DiagnosticCollector.
- Add drift_count property and _render_drift_group renderer that groups
  by kind (modified/unintegrated/orphaned) with stable section header
  for machine consumers.
- Tests: 7 unit tests covering happy path, mutation, creation, deletion,
  missing-tolerated, exception-not-masked, single-file protected path.

Refs #1071. Phase A of WIP/drift/06-final-plan.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(drift): Phase B+C - replay engine + audit CLI wiring

Implements the drift detection feature per WIP/drift/06-final-plan.md
(closes #1071 scope alignment with #898).

Engine (Phase B):
- src/apm_cli/install/drift.py: ReplayConfig, DriftFinding, CheckLogger,
  CacheMissError, normalization helpers (build-id strip, line endings,
  BOM), run_replay() (cache-only), diff_scratch_against_project(),
  text/json/sarif renderers, atexit scratch cleanup.
- src/apm_cli/install/services.py: scratch_root kwarg with
  ensure_path_within defense-in-depth guard for replay isolation.
- src/apm_cli/policy/ci_checks.py: _check_drift() wrapper returning
  (CheckResult, list[DriftFinding]); graceful CacheMissError handling.

CLI surface (Phase C):
- src/apm_cli/commands/audit.py: --no-drift opt-out flag with mutex
  against --strip/--file via UsageError. Drift wired into both
  _audit_ci_gate (--ci) and _audit_content_scan (bare project audit)
  paths, default-on per ADR-02. JSON/SARIF/text renderers integrated;
  --no-drift warning gated to text mode (stdout cleanliness).

Tests:
- tests/unit/install/test_drift.py: 13 unit tests (normalization,
  diff cases, renderers).
- Legacy --ci tests opt out of drift via batch --no-drift injection
  (fixture parity, not a behavior change).

7597 unit tests pass; lint clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(drift): Phase D - integration + e2e + perf coverage (43 tests)

Implements the locked test matrix for issue #1071 drift
detection. Floor of 43 tests across three new files closes the
'ULTRA HARDENING OF HELL' coverage requirement.

New files:
- tests/integration/test_drift_check.py (32 tests):
  * Section A: 9 drift cases (modified/unintegrated/orphaned + CRLF/
    BOM/Build-ID false-positive guards)
  * Section B: 4 past-PR regressions (#1067, #882, #889, source-deleted)
  * Section C: 7 edges (no/corrupt lockfile, untracked governed,
    no-write contract, idempotency)
  * Section D: 3 multi-target (copilot/claude/cursor)
  * Section E: 9 default-on / --no-drift opt-out (mutex, stderr
    routing, JSON suppression)
- tests/integration/test_drift_check_e2e.py (10 tests):
  full install->mutate->audit loop with mix_stderr=False, air-gap
  proof, JSON/SARIF stability, 30s smoke
- tests/unit/install/test_drift_perf.py (1 test):
  100 primitives replay+diff under 5s

Engine fix surfaced by tests:
- src/apm_cli/install/drift.py: run_replay now reads apm.yml's target
  field via parse_target_field and passes it to resolve_targets.
  Without this, multi-target projects (copilot+claude+cursor) replayed
  only the auto-detected primary target, falsely reporting secondary
  target deployments as orphaned. Helper _read_apm_yml_target() added.

CI wiring:
- scripts/test-integration.sh: two new blocks in run_e2e_tests()
  invoking the integration + e2e suites before the final success log.
  Both safe to run without GITHUB_APM_PAT (cache-only, mocked network).

Verification: 56 drift-domain tests pass; full repo lint clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(drift): CHANGELOG + Starlight guide + apm-usage skill + ci.yml note

- CHANGELOG.md: Added [Unreleased] entry under Added describing the
  default-on drift detection in apm audit, the three failure modes it
  catches, false-positive guards, --no-drift opt-out + mutex semantics,
  and the JSON/SARIF integration shape. Closes #1071, supersedes #898.
- docs/src/content/docs/guides/drift-detection.md (NEW, sidebar order 7):
  Full user-facing guide -- what drift means, how the cache-only replay
  works (with mermaid diagram), exit-code matrix, when to use --no-drift,
  output formats, and the CI single-line gate that replaces the legacy
  git status --porcelain script.
- packages/apm-guide/.apm/skills/apm-usage/commands.md: Extended the
  audit row with --no-drift flag and added a paragraph documenting the
  drift-by-default behavior, three failure modes, false-positive
  normalization, and JSON/SARIF integration. Aligns the skill that
  ships in apm-guide with the new CLI surface (per
  apm-keep-docs-up-to-date.instructions.md rule 4).
- .github/workflows/ci.yml: Annotated Gate B (legacy bash drift check)
  with a comment marking it redundant once apm-action ships a CLI with
  default-on drift detection (this PR's release). Kept as
  defense-in-depth fallback until then.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(drift): address panel feedback - recovery hint + doc-sync

CEO panel recommended landing two in-PR follow-ups before merge:

1. Recovery hint in drift output (cli-logging + devx-ux convergence):
   render_drift_text now appends '[i] Run apm install to re-sync
   deployed files with the lockfile.' so users see WHAT and HOW in one
   message. Honors Message Writing Rule #4 'Include the fix'.

2. Doc-sync (doc-writer + devx-ux convergence):
   - reference/cli-commands.md: add --no-drift to audit options table;
     amend --ci description to mention drift contribution.
   - integrations/ci-cd.md: replace bash 'git status --porcelain'
     workaround under 'Verify Deployed Primitives' with 'apm audit --ci'
     one-liner; update 'We dogfood this' callout text.
   - getting-started/quick-start.md: retarget stale cross-ref from the
     now-superseded ci-cd anchor to the new drift-detection guide.
   - guides/drift-detection.md: drop the self-contradictory case #2 in
     'When to use --no-drift' (strip-mode is auto-skipped, not opt-out).
   - CHANGELOG.md: compress verbose entry to one Keep-a-Changelog line
     pointing readers to the guide for detail.

Tracked as follow-up issues (CEO call):
- supply-chain: verify cache content matches lockfile resolved_commit
  before drift replay trusts it (commit-SHA pinning bypass on shared
  CI caches).
- test-coverage: inverse-normalization unit test asserting BOM/CRLF/
  Build-ID guards do NOT mask real content drift (safety invariant).

Lint clean. 45 drift tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(drift): address Copilot review - exit-code contract + types + diagnostics

Bare 'apm audit' is advisory (exit 0 on drift); 'apm audit --ci' is
the gate (exit 1). Closes the regression introduced when content-scan
escalation accidentally also escalated drift findings.

Also addresses inline review:
- A2: vacuous ASCII-encoding assertion now scopes per-line
- A4: tuple[float, int] -> tuple[int, int] in guards.py
- A5: type-annotated _check_drift signature
- A6: clarified DRIFT_ORPHANED comment
- A7: CHANGELOG references PR + closes
- A3: CacheMiss message now drift-specific (no --no-cache confusion)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(drift): link drift detection guide from README security section

Per oss-growth: surfaces drift detection alongside content security
and lockfile integrity in the conversion-critical Production-grade
section, so a reader scanning for 'why APM' sees the supply-chain
story end-to-end.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(drift): cache pin marker for stale-cache detection

apm install drops a .apm-pin JSON marker into each cached package
root recording the resolved_commit; apm audit verifies it before
running drift replay. Catches the 'teammate bumped lockfile, did
not reinstall' + 'shared CI runner reused stale apm_modules'
scenarios that would otherwise silently produce misleading drift
output.

LockfileBuilder syncs markers UNCONDITIONALLY (even when the
lockfile YAML is unchanged and even when no install happens), so
existing users self-heal on their next 'apm install'.

This is stale-cache detection, NOT cryptographic integrity --
defending against active cache tampering requires content-addressed
hashes, which is deferred.

Schema (v1): {schema_version: 1, resolved_commit: <sha>}
Marker file: <install_path>/.apm-pin
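
A minimal sketch of writing and checking the v1 marker described above. The function names `write_pin` and `pin_matches` are hypothetical; the file name and the two schema fields are taken from the commit message.

```python
import json
from pathlib import Path

MARKER_NAME = ".apm-pin"
SCHEMA_VERSION = 1


def write_pin(install_path: Path, resolved_commit: str) -> None:
    """Record the resolved commit next to the cached package contents."""
    marker = {"schema_version": SCHEMA_VERSION, "resolved_commit": resolved_commit}
    (install_path / MARKER_NAME).write_text(json.dumps(marker), encoding="utf-8")


def pin_matches(install_path: Path, expected_commit: str) -> bool:
    """True only when a well-formed v1 marker records the expected commit."""
    try:
        raw = (install_path / MARKER_NAME).read_text(encoding="utf-8")
        marker = json.loads(raw)
    except (OSError, ValueError):
        return False  # missing, unreadable, or corrupt marker
    return (
        isinstance(marker, dict)
        and marker.get("schema_version") == SCHEMA_VERSION
        and marker.get("resolved_commit") == expected_commit
    )
```

As the commit notes, this detects a stale cache; it does not defend against deliberate tampering, because the marker is not a content hash.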

Coverage:
- 14 unit tests in test_cache_pin.py (positive + every error path
  + skip rules + idempotent re-run + self-heal regression)
- 1 integration test in test_drift_check_e2e.py exercising the
  full install -> mark -> verify flow against a synthetic cache

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address panel follow-ups C1-C5 on PR #1137

C1 (supply-chain): Fail closed on unpinned remote deps
- cache_pin.find_unpinned_remote_deps() helper + stderr warning in
  sync_markers_for_lockfile
- drift._materialize_install_path raises CacheMissError for remote
  deps with resolved_commit=None (was silent fail-open)
- Replaced silent-skip test with warning assertion + new helper test

C2 (architecture): Wire _ReadOnlyProjectGuard into run_replay
- run_replay() now wraps the deps loop with _ReadOnlyProjectGuard
  on governed root dirs + apm.lock.yaml + AGENTS.md
- Regression test: monkeypatched leaky integrator triggers
  ProtectedPathMutationError

C3 (cli-logging-ux): Stderr message on swallowed CacheMissError
- audit._audit_content_scan emits '[!] drift check could not run:
  <msg>' to stderr when drift_failed and no findings (covers cache
  miss, missing lockfile, cache-pin error)
- Integration test e10 asserts stderr message in bare-audit path

C4 (docs): Baseline-check phrasing + CHANGELOG link
- governance-guide, ci-cd, cli-commands now read '7 baseline checks
  plus integration drift detection'
- CHANGELOG drift-detection link points to docs site URL

C5 (oss-growth): User-promise framing
- CHANGELOG drift entry leads with the user promise (forgotten
  installs + hand-edits) before mechanism
- drift-detection.md gains a 'Try it now' block at the top
- Before/after CI comparison promoted to its own subsection with
  explicit framing of what the bash workaround missed

Verification: ruff check + format silent; 7621 unit tests + 27 drift
integration tests green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(changelog): trim drift entry to single 'so what?' line

Collapse the two added entries (drift + cache-pin markers) into one
short line that answers the developer 'so what?' and points to the
Drift Detection guide for the full mechanism + opt-out + cache-pin
details. Per maintainer feedback: the previous entries were too long
for a CHANGELOG.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Daniel Meppiel <copilot-rework@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
danielmeppiel pushed a commit that referenced this pull request May 20, 2026
- install.py --target help: mention copilot-app + warn that repeated
  flag (-t a -t b) silently honors only the last value; use commas
  (devx-ux #1, #2)
- copilot-app.md: bump sidebar order 5 -> 6 (collision with
  github-rulesets.md), cross-link to reference/experimental/ and
  reference/targets-matrix/, rephrase WAL ownership to reflect that
  the App owns WAL and APM coexists via BEGIN IMMEDIATE + bounded
  retry, surface accepted schema range [13, 13], split lifecycle
  table cell with rationale below the table, add :::note callout
  clarifying the shape predicate, document source-deletion orphan
  case (doc-writer #1-5, devx-ux #4, #5)
- tests: add test_workflow_shape_skipped_by_copilot_prompt_integrator
  regression test asserting workflow-shape .prompt.md does NOT leak
  into .github/prompts/ when --target includes copilot
  (test-coverage #1)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
danielmeppiel added a commit that referenced this pull request May 20, 2026
… App DB (#1405)

* feat(experimental): copilot-app target deploys scheduled prompts to App DB

Dark-shipped under the new `copilot_app` experimental flag (off by default).

When enabled, `apm install --target copilot-app --global` writes prompts
that carry a `schedule:` frontmatter block as rows in the GitHub Copilot
desktop App's SQLite store at `~/.copilot/data.db`.  No new CLI surface;
`install` / `update` / `uninstall` / `list` all flow through unchanged.

Hard contracts:
- `enabled = 0` on every insert -- user opts in from the App.
- Namespaced ids (`apm--<owner>--<pkg>--<prompt>`) so uninstall never
  touches user-authored rows.
- `PRAGMA user_version` guard (13 currently); refuse to write on unknown.
- WAL-safe SQLite with retry on `database is locked`.
- Update path preserves user state (`enabled`, `last_run_at`, overrides).
- Lockfile URIs use `copilot-app-db://workflows/<id>` (cowork precedent).
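
Three of the contracts above (namespaced ids, the `PRAGMA user_version` guard, and retry on `database is locked`) can be sketched with the standard `sqlite3` module. All function names and the retry parameters are assumptions for illustration; this is not the App's table layout or the code in copilot_app_db.py.

```python
import sqlite3
import time

ACCEPTED_SCHEMA_VERSIONS = {13}  # the only version accepted per the text above


def workflow_id(owner: str, pkg: str, prompt: str) -> str:
    """Build a namespaced id so APM rows never collide with user rows."""
    return f"apm--{owner}--{pkg}--{prompt}"


def check_schema(conn: sqlite3.Connection) -> int:
    """Refuse to proceed when the database reports an unknown schema version."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    if version not in ACCEPTED_SCHEMA_VERSIONS:
        raise RuntimeError(f"unsupported schema version {version}; refusing to write")
    return version


def with_lock_retry(operation, attempts: int = 5, delay: float = 0.05):
    """Run `operation`, retrying a bounded number of times on a locked database."""
    for attempt in range(attempts):
        try:
            return operation()
        except sqlite3.OperationalError as exc:
            if "database is locked" not in str(exc) or attempt == attempts - 1:
                raise
            time.sleep(delay * (attempt + 1))
```

Bounding the retries keeps an install from hanging indefinitely while the App holds the write lock.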

Tests: 53 new (DB module, schedule parser, target gating, install E2E).
Full unit suite: 8787 passed (one pre-existing macOS shlex failure
unrelated to this change).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(copilot-app): integration page, apm-usage skill, error UX + lockfile tests

Wave 4 + Wave 6a of the copilot-app dark-ship:

- docs/src/content/docs/integrations/copilot-app.md mirrors the
  copilot-cowork page: enable flag, lifecycle, DB resolution, 'auth'
  model, schema guard, concurrency, lockfile URI scheme, out-of-scope.
- apm-usage skill: commands.md notes copilot-app under experimental;
  package-authoring.md documents the optional schedule: frontmatter
  block.
- tests/unit/integration/test_copilot_app_error_ux.py (5 tests)
  exercises CopilotAppDbMissingError, CopilotAppDbSchemaError,
  CopilotAppDbLockedError mid-deploy: each surfaces as an actionable
  per-prompt diagnostic; one failing prompt does not block the next;
  resolver returning None mid-run is defensive (no crash).
- tests/unit/install/test_services.py adds a round-trip test for
  copilot-app-db:// URI generation through _deployed_path_entry.

Full unit suite: 8794 passed (1 pre-existing unrelated macOS skip).
Lint contract green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(copilot-app): preserve URI scheme for user-scope local installs

When 'apm install <local-pkg> --target copilot-app --global' was invoked,
the lockfile stored 'workflows/apm--...' without the 'copilot-app-db://'
scheme prefix. As a result, the subsequent uninstall could not detect the
copilot-app entry and the DB row was orphaned in the Copilot App.

Root cause: _deployed_path_entry tried 'target_path.relative_to(project_root)'
first. For --global installs, project_root is the user home and the
synthetic copilot-app root (~/.copilot/workflows) sits inside it, so the
relative_to() succeeded and skipped the dynamic-root URI branch entirely.

Fix: detect dynamic-root target match (cowork, copilot-app) before
attempting the project_root-relative encoding. The cowork PathTraversalError
behavior is preserved for the legacy out-of-tree case.
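
The ordering fix can be sketched as below. This is an assumption-laden illustration, not the real `_deployed_path_entry`: the `dynamic_roots` mapping and the function signature are hypothetical, and the cowork PathTraversalError handling is omitted.

```python
from pathlib import Path


def deployed_path_entry(target_path: Path, project_root: Path, dynamic_roots: dict) -> str:
    """Encode a deployed path for the lockfile.

    dynamic_roots is a hypothetical mapping of synthetic deploy root to
    URI prefix, e.g. {~/.copilot/workflows: "copilot-app-db://workflows/"}.
    """
    # Check dynamic roots first: for --global installs the synthetic root
    # sits inside project_root (the user home), so a project-relative
    # encoding would also succeed and silently drop the URI scheme.
    for root, uri_prefix in dynamic_roots.items():
        try:
            rel = target_path.relative_to(root)
        except ValueError:
            continue  # not under this dynamic root
        return uri_prefix + rel.as_posix()
    # Fall back to the plain project-relative path.
    return target_path.relative_to(project_root).as_posix()
```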

Adds 'test_install_local_pkg_then_uninstall_deletes_db_row' end-to-end
regression covering the install -> lockfile URI -> uninstall -> DB row
deletion roundtrip. Also extends partition_managed_files dynamic-root
branch with the 'prompts_copilot-app' bucket and adds a copilot-app scan
in uninstall engine so user-scope DB-backed targets are cleaned even when
the local apm.yml does not enumerate them.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(changelog): add experimental copilot-app target entry

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(copilot-app): drop autopilot, reset enabled on content change, ship docs

Address apm-review-panel CEO synthesis for PR #1405:

Security (supply-chain-security-expert blocking):
- Remove 'autopilot' from _VALID_MODES (copilot_app_db.py) and
  _VALID_SCHEDULE_MODES (prompt_integrator.py). Earlier docstring
  claimed third-party autopilot was policy-blocked but no code
  enforced it -- this lands the actual enforcement at the writer.
- deploy_workflow UPDATE branch now compares prompt body, mode,
  interval, schedule, model, and reasoning_effort against the
  existing row; when any execution-affecting field changes the
  user's prior opt-in is revoked (enabled = 0, next_run_at = NULL).
  Display-only changes (e.g. just the name) still preserve enabled,
  last_run_at, next_run_at. Closes the silent-malicious-update
  vector the panel flagged.
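
A rough sketch of the update semantics described above, assuming rows are plain dicts. `merge_update` and the dict representation are hypothetical; the field list mirrors the one named in this commit, and keeping `last_run_at` on a content change is an assumption.

```python
# Fields whose change revokes the user's prior opt-in (per this commit).
EXECUTION_FIELDS = ("prompt", "mode", "interval", "schedule", "model", "reasoning_effort")
# Columns owned by the user / the App, never overwritten by the package.
USER_STATE = ("enabled", "last_run_at", "next_run_at")


def merge_update(existing: dict, incoming: dict) -> dict:
    """Return the row to write on the UPDATE branch."""
    merged = dict(existing)
    # Take package-provided values, but never user-state columns.
    merged.update({k: v for k, v in incoming.items() if k not in USER_STATE})
    if any(existing.get(f) != incoming.get(f) for f in EXECUTION_FIELDS):
        # Execution-affecting change: the user must opt in again.
        merged["enabled"] = 0
        merged["next_run_at"] = None
    return merged
```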

Test coverage (test-coverage-expert):
- Split the prior 'preserves enabled across updates' test into
  two scenarios that match the new semantics and add a third
  test covering schedule changes and a regression test that
  pins mode='autopilot' as rejected.

Docs (doc-writer blocking):
- Register copilot-app in the Starlight sidebar.
- Add copilot-app row to experimental flag table and update the
  targets-matrix experimental note + auto-detection callout.
- Strip false 'apm list' lifecycle row; replace the 'autopilot
  policy-blocked' paragraph with the secure-by-default rationale;
  expand the lifecycle table so the content-change reset is
  documented; fix two 'copilot_app flag' -> 'copilot-app flag'
  kebab-case drifts.

CHANGELOG (devx-ux nit):
- Replace 'apm config set experimental.copilot_app true' with
  the canonical 'apm experimental enable copilot-app'.

Tests: 62/62 copilot-app suite green; 1970/1970 integration+install
suite green; lint and format silent.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* iter-3: force enabled=0 at INSERT writer + truthful docs

- copilot_app_db.deploy_workflow INSERT now hardcodes enabled=0 in the
  SQL (was: row.enabled passthrough). Defence in depth: a future caller
  cannot bootstrap an auto-running APM-deployed row even if the row
  dataclass carries enabled=1. The user opt-in path stays the same:
  enable from the App UI after install.
- New test: test_insert_forces_enabled_zero_even_if_caller_passes_one.
- Docs (copilot-app.md): lifecycle table row 3 now lists all 7
  execution-affecting fields (prompt, schedule, mode, model, reasoning
  effort), matching deploy_workflow comparison semantics.
- Docs (copilot-app.md): error wording for locked-DB paraphrased
  instead of quoting a string the code never emits.
- Docs (package-authoring.md): YAML example drops the autopilot
  comment; rationale aligned with the integrations/copilot-app.md
  framing (intentionally not accepted via this target).
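
For illustration only, the writer-side enforcement in the first bullet could look roughly like this. The table layout is a trimmed-down hypothetical (the real App schema is not shown in this PR), and `insert_workflow` is a stand-in name rather than the actual `deploy_workflow` code.

```python
import sqlite3

# Hypothetical, reduced schema for the sketch.
HYPOTHETICAL_SCHEMA = (
    "CREATE TABLE workflows (id TEXT PRIMARY KEY, prompt TEXT, enabled INTEGER)"
)


def insert_workflow(conn: sqlite3.Connection, row: dict) -> None:
    """Insert a row with enabled hardcoded to 0 in the SQL itself.

    row["enabled"] is deliberately ignored, so a caller cannot
    bootstrap an auto-running row.
    """
    conn.execute(
        "INSERT INTO workflows (id, prompt, enabled) VALUES (?, ?, 0)",
        (row["id"], row["prompt"]),
    )
```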

Closes iter-2 panel feedback. No blocking findings from any of 8
panelists; this iteration converges the residual recommended items.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(copilot-app): allow project-scope install (lift --global gate)

A team-shared scheduled prompt declared in a project's apm.yml now
deploys to the developer's ~/.copilot/data.db on 'apm install', without
requiring '--global' user-scope install. The previous gate forced every
contributor to repeat the install at user scope to receive workflows
the team had already declared in the manifest.

Architectural change:
- Add TargetProfile.scope_invariant_resolver (default False).
- copilot-app sets scope_invariant_resolver=True because its deploy
  root (~/.copilot/data.db) is a user-machine resource that exists
  regardless of install intent.
- TargetProfile.for_scope(user_scope=False) now runs user_root_resolver
  for scope-invariant targets, populating resolved_deploy_root so the
  lockfile enrichment can map the synthetic 'workflows/<id>' path to
  the copilot-app-db://workflows/<id> URI.
- Cowork remains scope-sensitive (project-scope cowork still rejected).
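
A simplified sketch of the scope-invariant resolution described above. The real `TargetProfile` carries more fields; everything here beyond the names `scope_invariant_resolver`, `user_root_resolver`, `for_scope`, and `resolved_deploy_root` is an assumption made for illustration.

```python
from dataclasses import dataclass, replace
from pathlib import Path
from typing import Callable, Optional


@dataclass(frozen=True)
class TargetProfile:
    name: str
    user_root_resolver: Optional[Callable[[], Optional[Path]]] = None
    scope_invariant_resolver: bool = False
    resolved_deploy_root: Optional[Path] = None

    def for_scope(self, user_scope: bool) -> "TargetProfile":
        # Scope-invariant targets resolve their root even at project scope;
        # scope-sensitive targets resolve it only for user-scope installs.
        if self.user_root_resolver and (user_scope or self.scope_invariant_resolver):
            return replace(self, resolved_deploy_root=self.user_root_resolver())
        return self
```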

Security envelope: the experimental copilot_app flag remains the
single opt-in gate. Removing the --global gate folds two consent
layers (flag + user-scope) into one (flag), which matches v1's stated
'apm install just works' UX promise. The DB row is still INSERTed with
enabled=0, the namespaced 'apm--<owner>--<pkg>--<prompt>' ID is
preserved, and the lockfile URI keeps uninstall surgical.

Tests:
- 8801 unit tests pass (full sweep).
- 64 copilot-app tests pass (was 63).
- New test_install_project_scope_then_uninstall_deletes_db_row
  exercises the full roundtrip via project apm.yml + chdir; rewrites
  the prior test_project_scope_requires_global which asserted the
  inverse.
- Manual verification in /tmp: install -> DB row appears with
  enabled=0 -> uninstall -> DB row gone.

Docs:
- integrations/copilot-app.md install incantation updated.
- apm-usage skill commands.md + package-authoring.md mention both
  project and user scope.
- CHANGELOG entry rewritten.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(copilot-app): address devx-ux follow-ups on gate-lift

Two recommended findings from devx-ux-expert re-panel (Opus 4.7,
agent_id devx-on-gate-lift):

1. Install output is silent about the 'enable in Copilot App' step.
   Added one-line trailing hint after the 'N prompts integrated ->
   copilot-app/workflows/' line, only when copilot-app actually wrote
   rows in this run:

       [+] /pkg (local)
         |-- 1 prompts integrated -> copilot-app/workflows/
         |-- workflows arrive disabled; enable from the Copilot App's
             Workflows tab

   This closes the first-contributor failure mode that the gate-lift
   surfaces (someone runs plain 'apm install' on a project that
   declares copilot-app in targets, sees the integrated line, doesn't
   realise the row landed enabled=0 and needs a Copilot App toggle to
   fire).

2. targets-matrix.md docs row understated project-scope ride-along
   for the three never-auto-detected targets. Reworded to call out
   that a project apm.yml 'targets:' field lets contributors pick
   them up via plain 'apm install'.

Plus the test-coverage nit: pinned verbatim install output shape in
the new project-scope roundtrip test (asserts 'prompts integrated'
AND 'enable from the Copilot App' appear).

Verification:
- 64 copilot-app tests pass
- Full unit sweep 8800 pass (1 pre-existing flake on
  test_runtime_windows.py unrelated to gate-lift -- fails on
  fc40650 too because local 'codex' binary is installed)
- Lint+format silent
- Manual e2e:
    [+] /pkg (local)
      |-- 1 prompts integrated -> copilot-app/workflows/
      |-- workflows arrive disabled; enable from the Copilot App's Workflows tab

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(copilot-app): narrow workflow-shape predicate to 3 keys

Option B refinement: distinguish workflow-shape prompts from plain
.prompt.md unambiguously. Only {interval, schedule_hour, schedule_day}
mark a prompt as a Copilot App workflow row; `mode` and
`reasoning_effort` are valid OPTIONAL fields on a workflow but cannot
flip the shape because plain VSCode prompts use `mode: agent|ask|edit`
legitimately.

Without this narrowing, any plain prompt that set `mode:` would silently
land as a (broken) workflow when the user passed --target copilot-app,
or a workflow row could be lossy when a writer set only `mode:`.
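
The narrowed predicate amounts to something like the following sketch. The function name and the assumption that the frontmatter arrives as a flat mapping are illustrative; only the three keys come from this commit.

```python
# Only these keys mark a prompt as a Copilot App workflow row.
WORKFLOW_SHAPE_KEYS = frozenset({"interval", "schedule_hour", "schedule_day"})


def is_workflow_shape(frontmatter: dict) -> bool:
    """True when at least one shape-defining key is present.

    `mode` and `reasoning_effort` are intentionally not consulted,
    because plain prompts may set `mode:` legitimately.
    """
    return any(key in frontmatter for key in WORKFLOW_SHAPE_KEYS)
```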

Live e2e verified:
- Single-target copilot: workflow-shape SKIPPED, plain ships to
  .github/prompts/ correctly.
- Single-target copilot-app: workflow row in ~/.copilot/data.db with
  enabled=0; plain prompt warns then skips.
- Multi-target copilot,copilot-app (comma-separated): both dispatch
  paths fire; no leak between them.
- Update preserves user-side enabled=1 across re-install.
- Lockfile records copilot-app-db:// URIs cleanly; apm audit clean.

Warning text narrowed to actually-mandatory keys so the hint is
truthful and reproducible.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* address panel follow-ups: devx-ux, test-coverage, doc-writer

- install.py --target help: mention copilot-app + warn that repeated
  flag (-t a -t b) silently honors only the last value; use commas
  (devx-ux #1, #2)
- copilot-app.md: bump sidebar order 5 -> 6 (collision with
  github-rulesets.md), cross-link to reference/experimental/ and
  reference/targets-matrix/, rephrase WAL ownership to reflect that
  the App owns WAL and APM coexists via BEGIN IMMEDIATE + bounded
  retry, surface accepted schema range [13, 13], split lifecycle
  table cell with rationale below the table, add :::note callout
  clarifying the shape predicate, document source-deletion orphan
  case (doc-writer #1-5, devx-ux #4, #5)
- tests: add test_workflow_shape_skipped_by_copilot_prompt_integrator
  regression test asserting workflow-shape .prompt.md does NOT leak
  into .github/prompts/ when --target includes copilot
  (test-coverage #1)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Daniel Meppiel <copilot-rework@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
danielmeppiel added a commit that referenced this pull request May 21, 2026
Closes HIGH finding #2 from genesis-critique.md: the v1 ship deferred
the evals gate. This commit authors the minimum set the gate
requires, mirroring the pr-description-skill/evals/ shape so a
single CI lane can score both:

  evals/
    evals.json              - manifest + gates (val split = ship gate)
    triggers.json           - 10 fire + 10 no_fire, 60/40 train/val
    content/                - 2 scenario manifests + rubrics
      three-issues-mixed.json
      sweep-bug-queue.json
    fixtures/               - 4 markdown fixtures (with/without skill x 2)
    README.md
    .gitignore              - results/ (timestamped runner output)
  scripts/
    run_evals.py            - stdlib-only, deterministic matcher

The no-fire trigger set deliberately includes queries that SHOULD
route to apm-review-panel ('review my PR', 'panel-review this PR')
so DISPATCH COLLISION between the two skills would surface as
val-no-fire-rate dropping below 0.5.

Val split result on first run: trigger fire rate 1.0, no-fire rate
1.0, content delta_anchors 8 and 7 (gates require >= 1). Overall:
passed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
danielmeppiel added a commit that referenced this pull request May 21, 2026
…-shepherd primitive (#1434)

* fix(copilot-app): warn instead of fail on newer App schema versions

The Copilot App ships fast and bumps PRAGMA user_version on additive
changes that do not break APM's read/write surface. Hard-failing
every install whose user_version exceeded the highest version APM
was tested against made every Copilot App release a release window
for APM -- awful UX for users who simply updated their App.

Change:
- Bump _MAX_SUPPORTED_USER_VERSION 13 -> 15 (newly observed, working).
- Above max: warn-and-continue once per process (deduped by version)
  via _rich_warning, with exact wording supplied by the devx-ux-expert
  persona: names the version delta and points the user at a
  ready-to-file issue title for breakage reports.
- Below min: continues to hard-fail -- the workflows table may
  genuinely not exist on a pre-workflows schema.

Wording verbatim per devx-ux verdict (ship_as_proposed). No cap at
+N: warning already names the exact delta, giving signal proportional
to risk. If a future version truly breaks reads, add it to a
_KNOWN_BREAKING list and hard-fail on that specific version.
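
A sketch of the policy above, under stated assumptions: `warnings.warn` stands in for `_rich_warning`, the message text is a placeholder rather than the approved wording, and `check_user_version` is a hypothetical name.

```python
import warnings

MIN_SUPPORTED_USER_VERSION = 13
MAX_SUPPORTED_USER_VERSION = 15
_warned_versions = set()  # per-process dedup, keyed by version


def check_user_version(version: int) -> None:
    if version < MIN_SUPPORTED_USER_VERSION:
        # Pre-workflows schema: the table may not exist, so hard-fail.
        raise RuntimeError(f"schema version {version} is too old")
    if version > MAX_SUPPORTED_USER_VERSION and version not in _warned_versions:
        _warned_versions.add(version)
        delta = version - MAX_SUPPORTED_USER_VERSION
        # Placeholder wording; warn once, then continue.
        warnings.warn(f"App schema is {delta} version(s) newer than tested")
```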

Tests cover v16/v17/v50 warn-not-raise, the <13 hard-fail path, and
the per-process dedup contract for multi-row installs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(skills): add batch-bug-shepherd primitive + Copilot App workflow prompt

Ships two composed APM primitives:

1. .apm/skills/batch-bug-shepherd/ -- the working spec for driving a
   batch of suspected bugs in microsoft/apm from raw issue list to
   mergeable PR queue. Fans out one reproduction subagent per candidate
   (LEGIT / UNCLEAR / FIXED-AT-HEAD), cross-references against open
   PRs, then branches:
   * in-flight community PR -> shepherd via .apm/skills/apm-review-panel
   * no PR -> fix session with TDD and mutation-break gate
   Dispatches one completion subagent per shepherd verdict to resolve
   panel follow-ups and post a ready-to-merge confirmation. Maintains
   a single plan.md ground-truth table as canonical session state.

2. .apm/prompts/batch-bug-shepherd.prompt.md -- a Copilot App workflow
   prompt (interval: manual, mode: interactive) that loads the skill
   above and accepts a 'targets' input (either an issue list or the
   literal 'sweep-all'). Lands in ~/.copilot/data.db as a workflow row
   with enabled=0 (consent gate); user must opt in via the Copilot App
   Workflows tab before it runs.

The skill composes with the existing .apm/skills/apm-review-panel/ via
a relative sibling link -- the shepherd phase delegates panel review
to that skill rather than reinventing it.

Design rationale lives in the genesis-plan.md authored during the
genesis design pass; that artifact is intentionally not shipped (it
is design scaffolding, not a user-consumed primitive).

apm.lock.yaml regenerated by 'apm install --target copilot,copilot-app'
to include both the skill path and the synthetic
copilot-app-db://workflows/apm--local--_local--batch-bug-shepherd URI.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(skills): make batch-bug-shepherd primitive harness-agnostic

Drop hard-coded .apm/skills/ path probes from the meta-prompt, the
SKILL.md composition section, and the shepherd-prompt asset. Skills
are activated by NAME by the harness; the meta-prompt and skill body
no longer assume APM-on-disk layout. This lets the same primitive
load identically inside Copilot CLI, Copilot App workflows, Claude
Code, Codex, or any other harness that resolves skills by name.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(lockfile): regenerate after harness-agnostic primitive

Captures all 15 local skills (including batch-bug-shepherd) and the
copilot-app-db workflow entry into apm.lock.yaml so subsequent runs
(apm install --frozen, drift-check) see the canonical post-install
state.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor(packages): extract apm-review-panel as its own package

Move the apm-review-panel skill out of the microsoft/apm monorepo's
shared .apm/skills/ tree into a dedicated, publishable APM package
under packages/apm-review-panel/ using the HYBRID layout (apm.yml +
SKILL.md at the root, with co-located assets/ and evals/).

The root apm.yml now declares the package via a local-path manifest
dep, so 'apm install' at root continues to deploy the skill into
.agents/skills/apm-review-panel/ -- but the dependency is now
explicit and inspectable via 'apm deps list' instead of relying on
the includes-auto walk of the monorepo .apm/ tree.

Rule-of-three justification (see genesis-plan-v2.md step 3.5):
panel is independently useful for any single-PR review (not just
batch shepherding), so extraction unlocks consumer reuse and
prevents the consumer from being forced to install the shepherd
to get the panel.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor(packages): repackage batch-bug-shepherd as a standalone APM package

Move batch-bug-shepherd from the monorepo .apm/skills/ and .apm/prompts/
trees into a dedicated, publishable APM package under
packages/batch-bug-shepherd/ using the multi-primitive .apm/ layout
so the package can ship BOTH the skill and the workflow prompt:

  packages/batch-bug-shepherd/
    apm.yml
    .apm/
      skills/batch-bug-shepherd/
        SKILL.md
        assets/...
      prompts/
        batch-bug-shepherd.prompt.md

The package declares its dep on apm-review-panel via a local-path
manifest entry (../apm-review-panel, anchored to the package dir per
APM's local-path rule). The shepherd-prompt asset still probes the
panel by NAME at use-site as the A9 SUPERVISED EXECUTION backstop
if the harness registry is bypassed -- belt + suspenders, see
genesis-plan-v2.md step 3.5 'Declaration mechanism per external
module'.

Root apm.yml now depends on packages/batch-bug-shepherd via local
path; 'apm install' continues to deploy the skill to
.agents/skills/batch-bug-shepherd/ and the workflow row to the App
SQLite DB (with enabled=0 per the consent gate) -- but the path is
now manifest-declared and the package is independently shippable:

    apm install microsoft/apm/packages/batch-bug-shepherd

works in any consumer repo and transitively pulls apm-review-panel.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(skills): collapse hidden coupling in batch-bug-shepherd prompt

Closes HIGH finding #1 from genesis-critique.md: the workflow prompt's
'Hard rules' block restated four disciplines (ASCII-only, lint
contract, mutation-break gate, single-writer per comment) that already
live in the SKILL.md body. Two sources of truth invited drift -- if
the skill body added a new gate, the prompt would not auto-update.

Collapse the section to a single 'Delegation' paragraph naming the
skill as the authoritative source for all disciplines. The prompt
now carries only the per-trigger-surface contract (workflow
frontmatter + the input parameter + the 6-step ACTIVATE / SCOPE /
PLAN / INITIALIZE / EXECUTE / RENDER procedure that summons the
skill); every discipline travels with the skill body.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(skills): add content + trigger evals for batch-bug-shepherd

Closes HIGH finding #2 from genesis-critique.md: the v1 ship deferred
the evals gate. This commit authors the minimum set the gate
requires, mirroring the pr-description-skill/evals/ shape so a
single CI lane can score both:

  evals/
    evals.json              - manifest + gates (val split = ship gate)
    triggers.json           - 10 fire + 10 no_fire, 60/40 train/val
    content/                - 2 scenario manifests + rubrics
      three-issues-mixed.json
      sweep-bug-queue.json
    fixtures/               - 4 markdown fixtures (with/without skill x 2)
    README.md
    .gitignore              - results/ (timestamped runner output)
  scripts/
    run_evals.py            - stdlib-only, deterministic matcher

The no-fire trigger set deliberately includes queries that SHOULD
route to apm-review-panel ('review my PR', 'panel-review this PR')
so DISPATCH COLLISION between the two skills would surface as
val-no-fire-rate dropping below 0.5.

Val split result on first run: trigger fire rate 1.0, no-fire rate
1.0, content delta_anchors 8 and 7 (gates require >= 1). Overall:
passed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(skills): tighten batch-bug-shepherd dispatch description

Closes HIGH finding #3 from genesis-critique.md: the original
description was 950/1024 chars, too close to the runtime cap to
absorb future trigger additions, and it under-named the indirect
maintainer phrasings the skill actually wants to catch.

Trim ~75 chars while ADDING two indirect triggers explicitly
exercised in evals/triggers.json (fire-07 'shepherd all bug-flagged
issues this quarter', fire-08 'weekly sweep of community-reported
issues'):

  - 'reproduction subagent per candidate issue' -> 'triage subagent
    per issue' (the procedural detail belongs in the body, not the
    dispatch surface)
  - collapsed the parenthetical workflow branch into a single ->
    arrow
  - added 'shepherd all bug-flagged issues this quarter', 'run a
    weekly sweep of community-reported issues', 'work down community
    bug contributions' to the Activate list
  - kept the imperative shape, intent-first ordering, and the
    '-- even if shepherd or batch is not named' tail

Verified: val split evals still report fire 1.0 / no-fire 1.0.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(install): regenerate .agents/skills/ + apm.lock.yaml after package extraction

Mechanical `apm install --target copilot,copilot-app --force` output
after the package extraction. The previous lockfile listed the two
skills under `local_deployed_files` (root .apm/ tree); they now
appear as proper `dependencies` entries because they ship as
self-contained packages under `packages/`. Also re-materializes
`.agents/skills/{apm-review-panel,batch-bug-shepherd}/` with the
trimmed dispatch description and the moved workflow asset, since
those compiled trees are tracked in this repo.

No source changes -- just the compiled-output snapshot.

Verified:
  - ruff check + ruff format --check: silent
  - apm install --target copilot,copilot-app --force: clean
  - sqlite3 ~/.copilot/data.db -> workflow enabled=0 (consent gate ok)
  - standalone apm install from a scratch dir resolves
    apm-review-panel transitively via the batch-bug-shepherd
    manifest dep (depth: 2, resolved_by: _local/batch-bug-shepherd)
  - val split evals: trigger fire 1.0, no-fire 1.0, content passed

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(policy): exclude transitive deps from no-orphaned-packages check

A sub-package's local-path dep appears in the root lockfile with
resolved_by set. The root manifest cannot make it go away by editing
its dependencies.apm list, so flagging it as an orphan creates an
unfixable CI failure.

Surfaced by PR #1434, where batch-bug-shepherd declares ../apm-review-panel
as a manifest dep. Both that transitive entry and the depth=1 root entry
land in apm.lock.yaml; the orphan check was flagging the depth=2 one.

Restrict orphan detection to direct deps (resolved_by is None). Add a
regression test that covers the exact shape.
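
Roughly, the restricted check behaves like this sketch. The entry shape (dicts with `name` and `resolved_by`) and the function name are assumptions for illustration, not the actual `_check_no_orphans` code.

```python
def find_orphans(lock_entries: list, manifest_deps: set) -> list:
    """Return names of direct lockfile deps missing from the root manifest.

    Entries with resolved_by set are transitive: the root manifest cannot
    remove them, so they are never reported as orphans.
    """
    return [
        entry["name"]
        for entry in lock_entries
        if entry.get("resolved_by") is None and entry["name"] not in manifest_deps
    ]
```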

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(packages): drop transitive manifest dep to unblock APM <=0.14.1 CI

The previous commit fixes _check_no_orphans on the consumer side, but
the audit gate in microsoft/apm-action@v1 pins APM 0.14.0, which does
not yet have the fix. Until a release including the fix ships, the
'../apm-review-panel' transitive local-path entry would keep the gate
failing.

Drop the manifest dep from packages/batch-bug-shepherd/apm.yml; the
runtime activate-by-name probe in assets/shepherd-prompt.md remains
the working backstop, and the root manifest declares both packages
directly. Inline comment documents the rationale and the restoration
condition.

Lockfile regenerated by clean install (apm_modules wiped first); all
8 audit checks pass locally.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(copilot-app): update schema-too-new test to expect warn-not-raise

Test was added on main after this PR forked. It assumed the old
hard-fail behavior; our commit 05ea778 changed schema-too-new to
warn-and-continue. Update assertion to match.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: danielmeppiel <danielmeppiel@users.noreply.github.com>