Integrate copilot runtime by danielmeppiel · Pull Request #2 · microsoft/apm

danielmeppiel · 2025-09-25T19:58:01Z

Documentation and User Guidance Updates:

Updated all setup instructions and examples in README.md, docs/getting-started.md, docs/cli-reference.md, docs/runtime-integration.md, and related files to recommend and default to apm runtime setup copilot instead of Codex CLI. This includes new explanations, usage examples, and troubleshooting steps for Copilot CLI.

Copilot CLI Runtime Integration:

Added a new script scripts/runtime/setup-copilot.sh to automate installation, configuration, and environment setup for the GitHub Copilot CLI, including token detection, MCP config directory creation, and prerequisite checks for Node.js and npm versions.

Token Management and CLI Logic:

Updated src/apm_cli/core/token_manager.py to clarify token precedence for Copilot CLI, remove unused npm token handling, and document the new recommended token environment variables.
Adjusted the CLI (src/apm_cli/cli.py) to prioritize Copilot CLI when installing MCP dependencies, ensuring Copilot is checked and suggested before Codex or VSCode.

These changes make Copilot CLI the default and best-supported AI runtime for APM, streamline the onboarding experience, and ensure all documentation and automation scripts are consistent and up to date.

- Add src/apm_cli/adapters/client/copilot.py: MCP client adapter for Copilot CLI
- Add src/apm_cli/runtime/copilot_runtime.py: Runtime adapter for Copilot CLI execution
- Add scripts/runtime/setup-copilot.sh: Copilot CLI installation script

These are zero-risk additions as they are new files that don't modify existing code. Ready for Phase 2: Runtime infrastructure integration.

- Update runtime factory to register CopilotRuntime as first priority
- Add copilot to supported runtimes in runtime manager
- Update runtime preference order: copilot → codex → llm
- Add npm-based removal logic for copilot runtime
- Export CopilotRuntime in __init__.py

Low risk changes that integrate core Copilot files into runtime system.
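The preference order above (copilot → codex → llm) can be illustrated with a minimal sketch. This is not the runtime factory from the PR: the function name pick_runtime and the is_available hook are hypothetical, and the sketch assumes a runtime counts as available when a binary of the same name is on PATH.

```python
import shutil
from typing import Callable, Optional, Sequence

# Preference order taken from the commit message; everything else is hypothetical.
RUNTIME_PREFERENCE = ("copilot", "codex", "llm")


def _on_path(name: str) -> bool:
    """Assumption: a runtime is 'available' if a binary with that name is on PATH."""
    return shutil.which(name) is not None


def pick_runtime(
    preference: Sequence[str] = RUNTIME_PREFERENCE,
    is_available: Callable[[str], bool] = _on_path,
) -> Optional[str]:
    """Return the first available runtime in preference order, or None."""
    for name in preference:
        if is_available(name):
            return name
    return None
```

Injecting is_available keeps the ordering logic testable without any runtime actually being installed.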

- Verified 'apm install --runtime' option includes copilot first
- Confirmed 'apm runtime setup copilot' command works
- Verified runtime status shows copilot as highest priority
- Runtime detection logic already prioritizes copilot correctly
- Error messages already mention copilot CLI installation

No additional changes needed - CLI integration already complete from clean-main branch.

- Verified existing tests already support copilot runtime
- Added comprehensive test_copilot_runtime.py with 12 test cases
- Tests cover runtime detection, initialization, execution, error handling
- All existing runtime factory and detection tests pass with copilot
- Integration tests already handle copilot in multi-runtime scenarios

Low risk additions that provide comprehensive test coverage for Copilot runtime.

- Replace references from Codex to GitHub Copilot in README, CLI reference, and getting started guides.
- Modify setup scripts to install GitHub Copilot CLI with MCP configuration.
- Update token management to reflect the removal of GITHUB_NPM_PAT.
- Adjust integration tests to verify Copilot setup.
- Enhance example scripts in apm.yml for Copilot usage.


…e instantiation and enhance runtime info retrieval with mocked subprocess output.

- Use LockFile.read() instead of raw yaml.safe_load() in _collect_transitive_mcp_deps (#1)
- Guard against mcp:null in get_mcp_dependencies() (#2)
- Remove inline MCP installation pipeline, defer to follow-up PR (#3/microsoft#7)
- Remove redundant import builtins in _deduplicate_mcp_deps (microsoft#10)
- Add tests for mcp:null, mcp:[], root-over-transitive dedup order (microsoft#9)
- Remove tests for deleted inline pipeline functions
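The mcp:null guard mentioned above addresses a common YAML pitfall: a key written with no value parses to None rather than an empty list. A minimal sketch of such a guard follows. The manifest layout (a dependencies mapping with an mcp key) and the function body are assumptions for illustration, not the actual get_mcp_dependencies() from the PR.

```python
from typing import Any, Dict, List


def get_mcp_dependencies(manifest: Dict[str, Any]) -> List[Any]:
    """Return the MCP dependency list, treating a missing or null value as empty.

    Assumed layout (hypothetical): {"dependencies": {"mcp": [...]}}.
    """
    deps = manifest.get("dependencies") or {}  # 'dependencies:' may itself be null
    mcp = deps.get("mcp")
    if mcp is None:  # covers both a missing key and 'mcp: null'
        return []
    if not isinstance(mcp, list):
        raise ValueError(f"'mcp' must be a list, got {type(mcp).__name__}")
    return list(mcp)
```

Returning a fresh list means callers can extend the result without mutating the parsed manifest.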

- Narrow except Exception to except ImportError for lazy marketplace import (comment #1)
- Fix provenance key mismatch: use dep identity instead of canonical for lockfile lookup (comment #2)
- Include subdir in git-subdir source resolution with path traversal validation (comment #3)
- Include relative path in relative source resolution with traversal validation (comment #4)
- Sanitize marketplace name in cache file paths to prevent path traversal (comment #5)
- Fix docs: stale-if-error, not stale-while-revalidate (comment #6)
- Consolidate CHANGELOG entries into single line with (#503) (comment #7)
- Remove unused _SUPPORTED_SOURCE_TYPES set (comment #8)
- Let auth errors propagate in _auto_detect_path instead of swallowing (comment #9)
- Validate marketplace --name against [a-zA-Z0-9._-]+ charset (comment #10)
- Fix doc examples to use identifier-compatible names (comments #11, #12)
- Update tests to match corrected resolver behavior, add traversal tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
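The charset check from comment #10 can be sketched as follows. Only the pattern [a-zA-Z0-9._-]+ comes from the commit message; the helper name is hypothetical. The sketch uses re.fullmatch so the whole name must match, since a plain re.match would accept a valid prefix followed by arbitrary characters.

```python
import re

# Charset taken from the commit message; the helper name is hypothetical.
_MARKETPLACE_NAME_RE = re.compile(r"[a-zA-Z0-9._-]+")


def is_valid_marketplace_name(name: str) -> bool:
    """True only if every character of the name is in the allowed charset."""
    return _MARKETPLACE_NAME_RE.fullmatch(name) is not None
```

Note that names such as ".." still pass a pure charset check, which is presumably why the same commit also sanitizes the name before using it in cache file paths.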

Bug #1 - Format incompatibility with awesome-copilot marketplace:
- Parser now accepts 'source' key (Copilot CLI) as type discriminator fallback when 'type' key is absent, normalizing to 'type' for resolvers
- GitHub source resolver now accepts 'path' field (Copilot CLI) as virtual subdirectory, same as 'subdir' in git-subdir sources
- Path traversal validation applied to 'path' field
- Fixes: 8 of 62 plugins in awesome-copilot that use github source objects with 'source'+'path' keys instead of 'type'+'subdir'

Bug #2 - Lockfile provenance never written:
- Root cause: install passed raw marketplace refs (NAME@MARKETPLACE) as only_packages, but DependencyReference.parse() can't parse those, so identity filtering removed all deps -> 'already installed'
- Fix: use validated_packages (canonical owner/repo strings) instead of raw click argument for only_pkgs

Both bugs verified fixed via E2E tests against real marketplaces:
- github/awesome-copilot (62 plugins)
- anthropics/skills (3 plugins)
- microsoft/azure-skills (1 plugin)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
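The Bug #1 fix can be pictured as a small normalization step ahead of the resolvers. The sketch below is an illustration under assumptions, not the parser from the PR: the function name is hypothetical, the "repo" key in the usage example is illustrative, and copying 'path' into 'subdir' is one possible way to treat both fields the same.

```python
from typing import Any, Dict


def normalize_source(entry: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy where 'source'/'path' fall back into 'type'/'subdir'.

    Hypothetical helper: the real parser and resolver may be structured differently.
    """
    normalized = dict(entry)
    # 'source' is only a fallback discriminator; an explicit 'type' wins.
    if "type" not in normalized and "source" in normalized:
        normalized["type"] = normalized["source"]
    # Treat Copilot CLI's 'path' like 'subdir' (assumed mapping).
    if "subdir" not in normalized and "path" in normalized:
        normalized["subdir"] = normalized["path"]
    subdir = normalized.get("subdir")
    if subdir is not None:
        segments = str(subdir).replace("\\", "/").split("/")
        if ".." in segments:
            raise ValueError(f"path traversal rejected: {subdir!r}")
    return normalized
```

For example, normalize_source({"source": "github", "repo": "github/awesome-copilot", "path": "plugins/demo"}) yields a dict whose "type" is "github" and whose "subdir" is "plugins/demo", while leaving the input untouched.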

…covery + governance (#503)

* Initial plan

* Initial plan for marketplace integration

  Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71
  Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* feat: marketplace integration core implementation
  - Add marketplace/ package: models, errors, registry, client, resolver
  - Add marketplace CLI commands: add, list, browse, update, remove, search
  - Add lockfile provenance fields: discovered_via, marketplace_plugin_name
  - Add install hook for NAME@MARKETPLACE syntax pre-parse intercept
  - Wire marketplace commands in cli.py

  Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71
  Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* docs: add marketplace integration guide and CLI reference
  - Create guides/marketplaces.md covering marketplace concepts, registration, browsing, search, install syntax, provenance tracking, and cache behavior
  - Add apm marketplace and apm search command sections to cli-commands.md
  - Update apm install arguments to include NAME@MARKETPLACE syntax
  - Update plugins.md Finding Plugins section with marketplace cross-refs

  Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* docs: fix marketplace.json format and lockfile field names to match implementation
  - Use array-based plugins format matching models.py parser expectations
  - Use discovered_via and marketplace_plugin_name matching lockfile.py fields
  - Document both Copilot CLI (repository/ref) and Claude Code (source) formats

  Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* docs: fix git-subdir and relative source descriptions to match resolver
  - git-subdir uses separate repo and subdir fields
  - Relative string sources resolve to marketplace repo subdirectory

  Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* feat: add marketplace unit tests and docs
  - 114 unit tests across 8 test files covering all marketplace modules
  - New marketplace guide at docs/src/content/docs/guides/marketplaces.md
  - Updated CLI reference with marketplace and search commands
  - Updated plugins guide with marketplace integration section
  - CHANGELOG entry for marketplace feature

  Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71
  Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* refactor: address code review feedback
  - Use List[MarketplacePlugin] from typing instead of lowercase generic
  - Eliminate duplicated condition in install.py marketplace intercept
  - Restructure control flow for clarity

  Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71
  Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* fix: address all 12 PR review comments on marketplace integration
  - Narrow except Exception to except ImportError for lazy marketplace import (comment #1)
  - Fix provenance key mismatch: use dep identity instead of canonical for lockfile lookup (comment #2)
  - Include subdir in git-subdir source resolution with path traversal validation (comment #3)
  - Include relative path in relative source resolution with traversal validation (comment #4)
  - Sanitize marketplace name in cache file paths to prevent path traversal (comment #5)
  - Fix docs: stale-if-error, not stale-while-revalidate (comment #6)
  - Consolidate CHANGELOG entries into single line with (#503) (comment #7)
  - Remove unused _SUPPORTED_SOURCE_TYPES set (comment #8)
  - Let auth errors propagate in _auto_detect_path instead of swallowing (comment #9)
  - Validate marketplace --name against [a-zA-Z0-9._-]+ charset (comment #10)
  - Fix doc examples to use identifier-compatible names (comments #11, #12)
  - Update tests to match corrected resolver behavior, add traversal tests

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Copilot CLI format compatibility and marketplace provenance bugs

  Bug #1 - Format incompatibility with awesome-copilot marketplace:
  - Parser now accepts 'source' key (Copilot CLI) as type discriminator fallback when 'type' key is absent, normalizing to 'type' for resolvers
  - GitHub source resolver now accepts 'path' field (Copilot CLI) as virtual subdirectory, same as 'subdir' in git-subdir sources
  - Path traversal validation applied to 'path' field
  - Fixes: 8 of 62 plugins in awesome-copilot that use github source objects with 'source'+'path' keys instead of 'type'+'subdir'

  Bug #2 - Lockfile provenance never written:
  - Root cause: install passed raw marketplace refs (NAME@MARKETPLACE) as only_packages, but DependencyReference.parse() can't parse those, so identity filtering removed all deps -> 'already installed'
  - Fix: use validated_packages (canonical owner/repo strings) instead of raw click argument for only_pkgs

  Both bugs verified fixed via E2E tests against real marketplaces:
  - github/awesome-copilot (62 plugins)
  - anthropics/skills (3 plugins)
  - microsoft/azure-skills (1 plugin)

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: scope marketplace search to QUERY@MARKETPLACE format

  Search now requires QUERY@MARKETPLACE (e.g. apm search security@skills) to eliminate name collisions across marketplaces. Added search_marketplace() client function for single-marketplace search.
  - Rejects bare queries without @: clear error with usage example
  - Validates marketplace exists before searching
  - Updated docs/guides/marketplaces.md with new syntax
  - 7 test cases: format validation, unknown marketplace, results, no results

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: update CLI reference and plugins guide for scoped search syntax

  Align all documentation with QUERY@MARKETPLACE search format.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: use centralized path_security for marketplace traversal checks

  Replace 3 ad-hoc '..' in x.split('/') checks in marketplace/resolver.py with validate_path_segments() from utils/path_security.py. Add defense-in-depth validate_path_segments() call to _sanitize_cache_name() in client.py. This ensures marketplace code uses the same cross-platform path safety utilities (backslash normalization, single-dot rejection) as the rest of APM.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add path safety rule to copilot-instructions.md

  Directs contributors to use validate_path_segments() and ensure_path_within() from utils/path_security.py instead of ad-hoc traversal checks.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>
Co-authored-by: danielmeppiel <dmeppiel@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
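The difference between an ad-hoc '..' in x.split('/') check and a centralized segment validator can be sketched as below. This is not the validate_path_segments() from utils/path_security.py, whose signature is not shown in this PR; the function name check_path_segments is hypothetical, and rejecting empty segments (leading or doubled slashes) is an added assumption beyond the backslash normalization and single-dot rejection named in the commit message.

```python
def check_path_segments(path: str) -> str:
    """Normalize separators and reject unsafe segments; return the normalized path.

    Hypothetical stand-in for a centralized path-safety helper.
    """
    # Backslash normalization: 'a\\..\\b' must not slip past a '/'-only split.
    normalized = path.replace("\\", "/")
    for segment in normalized.split("/"):
        # '' catches absolute paths and doubled slashes (assumption),
        # '.' and '..' are the single-dot and traversal cases.
        if segment in ("", ".", ".."):
            raise ValueError(f"unsafe path segment {segment!r} in {path!r}")
    return normalized
```

An ad-hoc check that splits only on '/' would accept 'plugins\\..\\secrets' on Windows-style input, which is the kind of gap a shared helper closes.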

* feat(policy): W1 foundations for install-time policy enforcement (#827) Wave 1 of issue #827 implementation. Lays the foundations the install pipeline gate (W2) will plug into. No behaviour change yet — install still does NOT enforce policy until W2 wires the gate phase. What's in: - policy_checks: new public seam run_dependency_policy_checks(deps, lockfile=, policy=, mcp_deps=, effective_target=) accepting a resolved dep set; old run_policy_checks(project_root, policy) is now a thin wrapper. Honours require_resolution: project-wins for version-pin mismatches only. Latent isinstance(allow, list) bug fixed for schema's Tuple[str, ...]. - policy/discovery: cache stores merged effective policy with chain metadata + fingerprint. Atomic writes via temp + os.replace, with pid+thread_id suffix to prevent concurrent-writer collision. MAX_STALE_TTL=7d ceiling on cache reuse. PolicyFetchResult expanded to express 9 outcomes (found, absent, cached_stale, cache_miss_fetch_fail, malformed, disabled, garbage_response, no_git_remote, empty). - diagnostics: CATEGORY_POLICY constant + per-category renderer wired into render_summary(). - command_logger: InstallLogger.policy_resolved/violation/disabled with per-class actionable error wording (auth/unreachable/malformed/ blocked). - tests/fixtures/policy/: 14 policy fixtures + 7 project fixtures (denied-direct, denied-transitive, required-missing, required-version-mismatch, mcp-denied, target-mismatch, unpacked-bundle) covering W4 live matrix scenarios L2/L4/L13 and rubber-duck findings I5/I6/I7/N14/C2. - docs: 12-section Install-time enforcement guide skeleton in both enterprise/policy-reference.md and packages/apm-guide skill mirror. 10 sections filled; sections 7 (snippets) and 10 (error table) stubbed for W3-docs-final once W2 lands and W4 captures live output. Tests: - tests/unit: 4878 passed (1 pre-existing unrelated MCP failure deselected). Includes 41 logger + 29 policy-seam + 38 cache + 21 fixture-load new tests. 
Refs: #827 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(install): W2A policy enforcement at install time (#827) Wave 2A wires the three install-time enforcement sites planned for #827: 1. **Pipeline gate phase** (src/apm_cli/install/phases/policy_gate.py): New phase running between resolve and targets. Discovers org policy, resolves the inheritance chain via resolve_policy_chain, persists the merged effective policy + chain refs to cache (chain_refs threading per C1 amendment), then calls run_dependency_policy_checks against the resolved deps. Routes 9 discovery outcomes (found, absent, cached_stale, cache_miss_fetch_fail, malformed, disabled, garbage_response, no_git_remote, empty). Block-mode violations raise PolicyViolationError to halt the pipeline cleanly. 2. **--mcp branch preflight** (src/apm_cli/policy/install_preflight.py + commands/install.py:1091-1125): apm install --mcp does NOT enter the install pipeline. New shared helper run_policy_preflight() runs discovery + dep checks for any non-pipeline command site. Wired into --mcp BEFORE _run_mcp_install so denied servers never reach the integrator. Also exports PolicyBlockError for callers. 3. **install <pkg> snapshot+rollback** (commands/install.py): apm install <pkg> mutates apm.yml BEFORE the pipeline runs. We now snapshot apm.yml as raw bytes (not parsed YAML, to avoid round-trip drift on whitespace / key-order / comments), and on ANY pipeline failure (policy block, download error, etc.) restore byte-for-byte via tempfile + os.replace atomic write. Logs '[i] apm.yml restored to its previous state.' and exits non-zero. InstallContext gains policy_fetch, policy_enforcement_active, no_policy. Tests: +68 new tests, 4946 unit tests pass total. 
- test_policy_gate_phase.py: 27 (covers all 9 outcomes) - test_mcp_preflight_policy.py: 22 (escape hatches, allow/deny, transport, self-defined, trust_transitive, discovery outcomes, return shape) - test_install_pkg_policy_rollback.py: 19 (byte-equal restore, comments preserved, --no-policy bypass, download error rollback, snapshot unit tests) W2B (dry-run, target-aware, escape-hatch CLI flag) and C2 panel review follow. Refs: #827 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(policy): W2B install enforcement - escape hatch, dry-run preview, target-aware check (#827) W2B completes the enforcement surface: * policy_target_check.py - new pipeline phase after targets that re-runs target/compilation checks with the resolved effective_target. Filters to TARGET_CHECK_IDS only to avoid double-emitting dep violations from the gate phase. Honors CLI --target override (I6 fix scenario). * --no-policy escape hatch on apm install / install <pkg> / install --mcp / update. APM_POLICY_DISABLE=1 env var equivalent. Both route through ctx.no_policy and emit always-visible warnings via InstallLogger.policy_disabled() noting that apm audit --ci still fails. * --dry-run policy preview. run_policy_preflight gains dry_run=True kwarg. Emits '[!] Would be blocked by policy: <dep> -- <reason>' (block) or '[!] Policy warning: <dep> -- <reason>' (warn) before the would-install table. Never raises, never mutates. Direct manifest deps only (resolver doesn't run in dry-run; documented limitation). InstallRequest, InstallService, InstallContext threaded with no_policy. LOC budget on install.py raised 1625 -> 1650 with documented rationale. Tests: 5003 unit pass (+57 W2B: 17 target_check + 24 no_policy_flag + 16 dry_run_policy). Full suite green vs main baseline. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(policy): C2 panel fixes - transitive MCP enforcement, shared chain discovery, dry-run cap, drop apm update --no-policy (#827) C2 panel checkpoint surfaced 4 fixes (S1+B1+D2 BLOCKER/PASS-WITH-CONCERN, D1 DevX). All landed; full suite 5032 pass. S1 (Supply Chain BLOCKER) - transitive MCP enforcement: Transitive MCP servers from APM packages were bypassing install-time policy. The pipeline gate phase only sees direct apm.yml deps; transitive MCP servers are merged later via MCPIntegrator.collect_transitive() and written to runtime configs (.copilot/mcp.json, .cursor/mcp.json) with no policy check. This defeated #827 on the most security-critical dep category. Fix: second run_policy_preflight() call in commands/install.py after the transitive merge, before MCPIntegrator.install(). On block: abort MCP config writes, exit non-zero. APM packages remain installed (gate phase approved them). 15 new unit tests in test_transitive_mcp_policy.py. B1 (Architect, partial) - shared chain-aware discovery: Extract discover_policy_with_chain() into policy/discovery.py so both policy_gate.py and install_preflight.py walk the same inheritance chain. Closes the gap where --mcp / --dry-run paths could resolve a different effective policy than the pipeline path. Gate-phase keeps its 9-outcome routing; only the discovery seam moved. 10 new tests in test_chain_discovery_shared.py. D2 (DevX UX) - dry-run noise cap: install_preflight._DRY_RUN_PREVIEW_LIMIT = 5. Long deny lists now show 5 lines per severity bucket + tail '[!] ... and N more would be blocked by policy. Run apm audit for full report.' 4 new tests. D1 (DevX UX) - drop apm update --no-policy: apm update is the CLI self-updater (refreshes the apm binary), not a dependency refresh. The flag was accepted but unused. Removed the option and flipped the test to assert the flag is now rejected. 
LOC budget on install.py raised 1650 -> 1675 with documented justification. Tests: 5032 unit pass (+29 new: 15 transitive_mcp + 10 chain_discovery_shared + 4 dry_run_noise_cap). 1 pre-existing MCP test deselected. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs+test(policy): W3 - integration matrix, docs final fill, CHANGELOG, growth (#827) W3 phase complete. All 5 parallel workstreams landed. Tests: tests/integration/test_policy_install_e2e.py - 17 e2e scenarios I1..I17 Covers all 9 PolicyFetchResult outcomes + all 6 violation classes via CliRunner-driven full-pipeline flows. Mocks discover_policy_with_chain at both seams (policy_gate + install_preflight). Uses _build_policy() helper for frozen-dataclass safe construction. Docs: docs/src/content/docs/enterprise/policy-reference.md sec 7: 8 verbatim CLI snippets (success, block, warn, --no-policy, APM_POLICY_DISABLE, --dry-run with overflow tail, install <pkg> rollback, transitive MCP block) sec 10: outcome table (9 fetch outcomes) + violation table (6 classes) Added explicit JSON/SARIF non-goal callout (C1 amendment). packages/apm-guide/.apm/skills/apm-usage/governance.md Same content, leaner skill version, links back to docs for full text. CHANGELOG.md: Added: --no-policy / APM_POLICY_DISABLE escape hatch, --dry-run preview, install <pkg> rollback Changed: pipeline gains policy_gate + policy_target_check phases, shared chain discovery + atomic cache + MAX_STALE_TTL Security (headline): apm install enforces apm-policy.yml; transitive MCP checked before runtime config write Follow-up issue #829 filed: policy.fetch_failure: warn|block schema knob. Tests: 5049 pass (5032 unit + 17 integration). 1 pre-existing MCP test deselected. PR body drafted at session-state/files/pr-body-827.md. Growth strategy entry + asciinema script staged in WIP (gitignored). 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(policy): C3 fixes - direct MCP enforcement, malformed posture, warn-mode coverage, doc drift (#827) C3 final panel + rubber-duck found 5 issues. All fixed. #1 (CRITICAL) - Direct MCP deps in apm.yml bypassed enforcement: ctx.direct_mcp_deps now populated in pipeline.py from apm_package.get_mcp_dependencies() before policy_gate runs. policy_gate reads direct_mcp_deps (not the dead mcp_deps_to_install) and passes them to run_dependency_policy_checks. install.py:1496 second preflight guard drops 'and transitive_mcp' so direct-only MCP installs are also caught. #2 (CRITICAL) - Malformed policy handling inconsistent + broke rollback: policy_gate.py replaced sys.exit(1) on malformed with fail-open warn (matches install_preflight + cache_miss_fetch_fail/garbage_response posture). sys.exit was bypassing the rollback handler in install.py for apm install <pkg>. CEO mandate: malformed = warn, fail-closed knob is follow-up #829. #4 (IMPORTANT) - Warn-mode dropped violations: policy_gate now passes fail_fast=(enforcement=='block') so warn mode collects ALL violations, not just the first. Also emits warnings for passed=True checks with non-empty details (project-wins version-pin mismatches were silently dropped). #3 (IMPORTANT) - Chain inheritance is 1-level, not multi-level: discover_policy_with_chain only walks one parent. Toned down docs in policy-reference.md and governance.md with explicit caution callout. Filed follow-up #831 for proper recursive walk + cycle detection. #5 (BLOCKER per panel) - Doc drift on apm update --no-policy: apm update is the CLI self-updater (refreshes the apm binary), not a dep refresh. Removed all mentions from both docs. apm deps update is the dep-refresh surface (runs install pipeline, gate applies); --no-policy is NOT exposed there today. Tests: 5059 pass (5049 baseline + 10 new: 6 unit gate + 4 integration I18/I19/I20). 
New integration tests cover real direct-MCP block, real malformed fail-open, warn-mode multi-violation. I16 class renamed to TestI16GarbageResponsePolicy to fix mislabeling. Follow-ups: #829 (fetch_failure schema knob), #831 (multi-level chain). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(policy): in-PR resolution of #834 (warn-mode rendering) and #831 (recursive extends chain) (#827) Originally filed as follow-ups during C3, moved in-PR per reviewer request so #832 ships a complete enforcement story. #834 - Warn-mode policy violations did not render in the install summary. Root cause: pipeline created a fresh DiagnosticCollector for install_result.diagnostics while InstallLogger.policy_violation() pushed warnings into logger.diagnostics. Two collectors, one rendered. Fix: when a logger is present, reuse logger.diagnostics so policy records flow through render_summary() (block mode unaffected - it aborts inline before summary). #831 - extends: chain only supported one level (parent). Inheritance machinery (resolve_policy_chain, detect_cycle, MAX_CHAIN_DEPTH=5) was already N-deep capable; discovery never wired it. Fix: rewrite _resolve_and_persist_chain as iterative depth-first walk, leaf-first; cycle detection via inheritance.detect_cycle; honor MAX_CHAIN_DEPTH=5 with explicit pre-append check; partial-chain warning when a mid-chain ref fails to fetch ('Policy chain incomplete: <ref> unreachable, using <N> of <M> policies'); single cache write at leaf with full chain fingerprint. Tests: +1 unit (warn-render), +5 unit (3-level full, cycle, depth limit, partial chain, single-level regression), +1 integration (TestI21ThreeLevelExtendsChain). 5044 unit pass. Docs: enterprise/policy-reference.md and apm-usage/governance.md chain-depth callouts updated. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(changelog): record in-PR resolution of #834 and #831 under #827 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(policy): address review-panel pre-merge findings (#827) - Security F1 (HIGH): pin extends: chain to leaf policy host; disable HTTP redirects in _fetch_from_url and _fetch_github_contents. Closes cross-host credential leak vector via git credential fill fallback and SSRF/Referer-leak vector via 30x redirects. raw.githubusercontent .com is treated as distinct from github.com (strict pin). - Logging C1+C2 + UX F1/F2/F4/F5/F9: extract InstallLogger.policy_ discovery_miss() canonical helper covering all 7 discovery outcomes; route both policy_gate and install_preflight through it. absent now verbose-only; no_git_remote downgraded to [i]; garbage_response gets distinct wording (no VPN/firewall noise); cached_stale and cache_ miss_fetch_fail messages now state enforcement posture explicitly; violation messages dedupe dep_ref prefix; wire _policy_reason_blocked into block-severity policy_violation as dim secondary line. - Docs: remove [Planned] banner from policy-reference; update enforcement tables (policy-reference + governance skill) to reflect install-time blocking; document --no-policy / APM_POLICY_DISABLE in cli-commands.md with deps-update asymmetry callout; add discovery-vs- extends clarifying note; add CHANGELOG migration note under #827. Tests: 5053 -> 5068 (+15 logging, +9 security host-pin). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(policy): ship enterprise hardening pack on top of #827 Four enterprise hardening items shipped in-PR per CISO-arbitrated panel verdict + CTO threat-model deep dive (PR #832 comments 4294087760 + 4294115069). Closes #829. 1. policy.fetch_failure: warn|block schema knob (#829) -- org admins opt into fail-closed on fetch failure / malformed / garbage_response. 
Default 'warn' preserves backwards compat. 2. apm.yml policy.fetch_failure_default: warn|block -- project-side complement so a project can lock down behavior even when no policy is reachable to read the org-side knob from. 3. apm policy status diagnostic command -- show discovery outcome, source, enforcement, cache age, extends chain, effective rule counts, and hash-pin state. --json for SIEM ingestion. Trust-but- verify tool that makes fail-open acceptable. 4. apm.yml policy.hash: 'sha256:...' consumer-side pin -- closes the garbage_response compromised-intermediary vector by verifying raw policy bytes against a project-pinned digest. Equivalent of pip --require-hashes for the policy itself. ALWAYS fail-closed on mismatch, regardless of fetch_failure setting (a hash mismatch is an explicit pin violation, not a fetch failure). sha384/sha512 accepted; md5/sha1 rejected (collision-resistant only). 5. apm audit --ci auto-discovers org policy when --policy-source is not provided; --no-policy flag added to skip. Closes the audit/install asymmetry that left CI blind to sideloaded primitives. Tests: 5068 -> 5157 (+89: hash pin 31, fetch_failure knob, audit auto-discovery, policy status command, plus updates to existing discovery tests for the new expected_hash kwarg threading). Docs: policy-reference §9.5 (fetch_failure), §9.6 (hash pin), §9.7 (apm policy status), §9.8 (audit auto-discovery); governance.md skill mirrors all of the above; cli-commands.md gets policy status + audit --no-policy. CHANGELOG entries under [Unreleased] Added / Added (Security). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(policy): address doc-writer review BLOCKERs (#827) - policy-reference.md: remove stale 'planned fetch_failure knob' paragraph that contradicted the §9.5 entry shipped in the same PR; add Linux hash-compute one-liner alongside the macOS shasum example. 
- cli-commands.md: add 'apm policy status' command section under a new 'apm policy' family (synopsis, --policy-source/--no-cache/--json, exit-code note, examples). Add --no-policy flag to 'apm audit' options list. Reword --policy SOURCE description to reflect that --ci now auto-discovers when --policy is omitted. Update audit examples to match (drop the now-redundant '--policy org' from auto-discovery example, add explicit --no-policy variant). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(policy): address doc-writer HIGH+LOW findings (#827) - manifest-schema.md: add policy: block to schema diagram + new section 3.9 documenting fetch_failure_default, hash, hash_algorithm - policy-reference.md: add fetch_failure: warn to canonical schema YAML and a fetch_failure entry under Top-level fields; lift apm policy status and apm audit --ci auto-discovery into proper numbered subsections (9.7 / 9.8) so anchors match the skill mirror - governance.md: surface install-time enforcement with link to policy-reference#install-time-enforcement - ci-policy-setup.md: annotate Step 3 noting apm audit --ci auto-discovers and --policy org is now an explicit override - security.md: add Compromised policy intermediary row to attack surface comparison, linked to policy.hash consumer-side pin - cli-commands.md: split --no-policy into 2-line nested bullet separating behaviour from env-var equivalence - apm-guide skill mirror: add fetch_failure: warn to schema overview to keep skill aligned with policy-reference Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(policy): address PR review panel logging+arch findings (#827) BLOCKING: - command_logger.policy_discovery_miss: gate no_git_remote info message on verbose mode; previously emitted on every install in a non-git directory Architecture: - New install/errors.py with canonical PolicyViolationError; PolicyBlockError kept as re-exported alias to preserve test patches - New 
policy/outcome_routing.py::route_discovery_outcome consolidating the 9-outcome routing table; policy_gate.py and install_preflight.py now delegate instead of duplicating - pipeline.py: catch PolicyViolationError before bare Exception so policy block messages are not double-nested in RuntimeError - commands/install.py: isinstance(PolicyViolationError) branch in the legacy handler for the same reason Logging UX: - install_preflight: empty check.details now falls back to [check.name] so the block message is never blank - _extract_dep_ref helper replaces detail.split(":")[0] with defensive parsing that falls back to check.name Security: - discovery._get_cache_dir asserts containment vs project_root (resolves symlinks) instead of an unguarded join - Removed dead no_policy= kwarg from discover_policy_with_chain; env-var defence-in-depth retained on the call site Tests: +tests/unit/policy/test_pr_832_findings.py covering all 8 findings; install_logger split into silent/verbose cases. 5176 unit tests pass, 0 regressions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test(policy): use urllib.parse for host assertions to silence CodeQL (#827) CodeQL's py/incomplete-url-substring-sanitization rule fired 6 times on test_extends_host_pin.py because bare 'host' in msg substring checks could in theory match a host appearing at an arbitrary URL position (path, query, userinfo). The assertions are correct in practice -- they assert on production error messages of known format -- but the pattern is not safe in general. Replace each substring check with a precise extractor: - _assert_extends_host_in_message / _assert_leaf_host_in_message: regex-anchor on the production 'extends host: <h>' / 'leaf host: <h>' tokens, then exact-compare the captured group. - _assert_redirect_target_host: regex-extract the redirect target URL after 'to ', then urllib.parse.urlparse(...).hostname compare. No production-code changes; all 9 host-pin tests still pass. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(policy,audit): address PR #832 DevX UX blockers - audit --no-policy help text rewritten to describe positive behaviour first ("Skip org policy discovery and enforcement" instead of the negative "Skip auto-discovery ... in --ci mode"), so apm audit --help no longer hides the primary effect behind a caveat. Aligns the code with the docs. - apm policy status --check flag added: exits 1 when outcome is not 'found' (i.e. policy unresolvable / absent / disabled / fetch-failed), 0 otherwise. Default behaviour unchanged (always exit 0) so the diagnostic remains safe for human and SIEM use, while CI authors get the npm audit / pip check style contract via a single flag. Updates cli-commands.md, policy-reference.md, and CHANGELOG.md to document the new flag and exit-code table. Adds TestStatusCheckFlag covering the found / unresolvable / discovery-exception / json combinations. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
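The host-assertion change described above (anchor on a known token, then compare the captured host exactly, and parse redirect targets with urllib.parse) can be sketched roughly as follows. This is a minimal illustration, not the test suite's actual helpers: the function names and the sample message strings are assumptions, and only the 'extends host: <h>' token and the 'to <url>' shape come from the commit message.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def extract_extends_host(msg: str) -> str:
    """Return the host that follows the 'extends host: ' token (hypothetical helper)."""
    match = re.search(r"extends host: (\S+)", msg)
    if match is None:
        raise AssertionError(f"no 'extends host:' token in {msg!r}")
    return match.group(1)


def redirect_target_host(msg: str) -> Optional[str]:
    """Parse the URL after 'to ' and return only its hostname (hypothetical helper)."""
    match = re.search(r"\bto (https?://\S+)", msg)
    if match is None:
        raise AssertionError(f"no redirect target URL in {msg!r}")
    return urlparse(match.group(1)).hostname


# A substring check would accept this message, because "github.com" appears in the path.
sample = "refusing redirect from https://github.com/a to https://evil.example/github.com/x"
assert "github.com" in sample
# The parsed hostname shows where the redirect really points.
assert redirect_target_host(sample) == "evil.example"
```

The exact comparison is what removes the ambiguity: a host that only appears in a path, query, or userinfo position can no longer satisfy the assertion.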


* feat(drift): Phase A infra - guards + diagnostic category - Add _ReadOnlyProjectGuard context manager (utils/guards.py): snapshots stat of protected paths, raises ProtectedPathMutationError on any mutation. Defense-in-depth above the scratch-root remap. - Add CATEGORY_DRIFT + drift() recording method to DiagnosticCollector. - Add drift_count property and _render_drift_group renderer that groups by kind (modified/unintegrated/orphaned) with stable section header for machine consumers. - Tests: 7 unit tests covering happy path, mutation, creation, deletion, missing-tolerated, exception-not-masked, single-file protected path. Refs #1071. Phase A of WIP/drift/06-final-plan.md. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(drift): Phase B+C - replay engine + audit CLI wiring Implements the drift detection feature per WIP/drift/06-final-plan.md (closes #1071 scope alignment with #898). Engine (Phase B): - src/apm_cli/install/drift.py: ReplayConfig, DriftFinding, CheckLogger, CacheMissError, normalization helpers (build-id strip, line endings, BOM), run_replay() (cache-only), diff_scratch_against_project(), text/json/sarif renderers, atexit scratch cleanup. - src/apm_cli/install/services.py: scratch_root kwarg with ensure_path_within defense-in-depth guard for replay isolation. - src/apm_cli/policy/ci_checks.py: _check_drift() wrapper returning (CheckResult, list[DriftFinding]); graceful CacheMissError handling. CLI surface (Phase C): - src/apm_cli/commands/audit.py: --no-drift opt-out flag with mutex against --strip/--file via UsageError. Drift wired into both _audit_ci_gate (--ci) and _audit_content_scan (bare project audit) paths, default-on per ADR-02. JSON/SARIF/text renderers integrated; --no-drift warning gated to text mode (stdout cleanliness). Tests: - tests/unit/install/test_drift.py: 13 unit tests (normalization, diff cases, renderers). 
- Legacy --ci tests opt out of drift via batch --no-drift injection (fixture parity, not a behavior change). 7597 unit tests pass; lint clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test(drift): Phase D - integration + e2e + perf coverage (43 tests) Implements the locked test matrix for issue #1071 drift detection. Floor of 43 tests across three new files closes the 'ULTRA HARDENING OF HELL' coverage requirement. New files: - tests/integration/test_drift_check.py (32 tests): * Section A: 9 drift cases (modified/unintegrated/orphaned + CRLF/ BOM/Build-ID false-positive guards) * Section B: 4 past-PR regressions (#1067, #882, #889, source-deleted) * Section C: 7 edges (no/corrupt lockfile, untracked governed, no-write contract, idempotency) * Section D: 3 multi-target (copilot/claude/cursor) * Section E: 9 default-on / --no-drift opt-out (mutex, stderr routing, JSON suppression) - tests/integration/test_drift_check_e2e.py (10 tests): full install->mutate->audit loop with mix_stderr=False, air-gap proof, JSON/SARIF stability, 30s smoke - tests/unit/install/test_drift_perf.py (1 test): 100 primitives replay+diff under 5s Engine fix surfaced by tests: - src/apm_cli/install/drift.py: run_replay now reads apm.yml's target field via parse_target_field and passes it to resolve_targets. Without this, multi-target projects (copilot+claude+cursor) replayed only the auto-detected primary target, falsely reporting secondary target deployments as orphaned. Helper _read_apm_yml_target() added. CI wiring: - scripts/test-integration.sh: two new blocks in run_e2e_tests() invoking the integration + e2e suites before the final success log. Both safe to run without GITHUB_APM_PAT (cache-only, mocked network). Verification: 56 drift-domain tests pass; full repo lint clean. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(drift): CHANGELOG + Starlight guide + apm-usage skill + ci.yml note - CHANGELOG.md: Added [Unreleased] entry under Added describing the default-on drift detection in apm audit, the three failure modes it catches, false-positive guards, --no-drift opt-out + mutex semantics, and the JSON/SARIF integration shape. Closes #1071, supersedes #898. - docs/src/content/docs/guides/drift-detection.md (NEW, sidebar order 7): Full user-facing guide -- what drift means, how the cache-only replay works (with mermaid diagram), exit-code matrix, when to use --no-drift, output formats, and the CI single-line gate that replaces the legacy git status --porcelain script. - packages/apm-guide/.apm/skills/apm-usage/commands.md: Extended the audit row with --no-drift flag and added a paragraph documenting the drift-by-default behavior, three failure modes, false-positive normalization, and JSON/SARIF integration. Aligns the skill that ships in apm-guide with the new CLI surface (per apm-keep-docs-up-to-date.instructions.md rule 4). - .github/workflows/ci.yml: Annotated Gate B (legacy bash drift check) with a comment marking it redundant once apm-action ships a CLI with default-on drift detection (this PR's release). Kept as defense-in-depth fallback until then. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(drift): address panel feedback - recovery hint + doc-sync CEO panel recommended landing two in-PR follow-ups before merge: 1. Recovery hint in drift output (cli-logging + devx-ux convergence): render_drift_text now appends '[i] Run apm install to re-sync deployed files with the lockfile.' so users see WHAT and HOW in one message. Honors Message Writing Rule #4 'Include the fix'. 2. Doc-sync (doc-writer + devx-ux convergence): - reference/cli-commands.md: add --no-drift to audit options table; amend --ci description to mention drift contribution. 
- integrations/ci-cd.md: replace bash 'git status --porcelain' workaround under 'Verify Deployed Primitives' with 'apm audit --ci' one-liner; update 'We dogfood this' callout text. - getting-started/quick-start.md: retarget stale cross-ref from the now-superseded ci-cd anchor to the new drift-detection guide. - guides/drift-detection.md: drop the self-contradictory case #2 in 'When to use --no-drift' (strip-mode is auto-skipped, not opt-out). - CHANGELOG.md: compress verbose entry to one Keep-a-Changelog line pointing readers to the guide for detail. Tracked as follow-up issues (CEO call): - supply-chain: verify cache content matches lockfile resolved_commit before drift replay trusts it (commit-SHA pinning bypass on shared CI caches). - test-coverage: inverse-normalization unit test asserting BOM/CRLF/ Build-ID guards do NOT mask real content drift (safety invariant). Lint clean. 45 drift tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(drift): address Copilot review - exit-code contract + types + diagnostics Bare 'apm audit' is advisory (exit 0 on drift); 'apm audit --ci' is the gate (exit 1). Closes the regression introduced when content-scan escalation accidentally also escalated drift findings. Also addresses inline review: - A2: vacuous ASCII-encoding assertion now scopes per-line - A4: tuple[float, int] -> tuple[int, int] in guards.py - A5: type-annotated _check_drift signature - A6: clarified DRIFT_ORPHANED comment - A7: CHANGELOG references PR + closes - A3: CacheMiss message now drift-specific (no --no-cache confusion) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(drift): link drift detection guide from README security section Per oss-growth: surfaces drift detection alongside content security and lockfile integrity in the conversion-critical Production-grade section, so a reader scanning for 'why APM' sees the supply-chain story end-to-end. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(drift): cache pin marker for stale-cache detection apm install drops a .apm-pin JSON marker into each cached package root recording the resolved_commit; apm audit verifies it before running drift replay. Catches the 'teammate bumped lockfile, did not reinstall' + 'shared CI runner reused stale apm_modules' scenarios that would otherwise silently produce misleading drift output. LockfileBuilder syncs markers UNCONDITIONALLY (even when the lockfile YAML is unchanged and even when no install happens), so existing users self-heal on their next 'apm install'. This is stale-cache detection, NOT cryptographic integrity -- defending against active cache tampering requires content-addressed hashes, which is deferred. Schema (v1): {schema_version: 1, resolved_commit: <sha>} Marker file: <install_path>/.apm-pin Coverage: - 14 unit tests in test_cache_pin.py (positive + every error path + skip rules + idempotent re-run + self-heal regression) - 1 integration test in test_drift_check_e2e.py exercising the full install -> mark -> verify flow against a synthetic cache Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address panel follow-ups C1-C5 on PR #1137 C1 (supply-chain): Fail closed on unpinned remote deps - cache_pin.find_unpinned_remote_deps() helper + stderr warning in sync_markers_for_lockfile - drift._materialize_install_path raises CacheMissError for remote deps with resolved_commit=None (was silent fail-open) - Replaced silent-skip test with warning assertion + new helper test C2 (architecture): Wire _ReadOnlyProjectGuard into run_replay - run_replay() now wraps the deps loop with _ReadOnlyProjectGuard on governed root dirs + apm.lock.yaml + AGENTS.md - Regression test: monkeypatched leaky integrator triggers ProtectedPathMutationError C3 (cli-logging-ux): Stderr message on swallowed CacheMissError - audit._audit_content_scan emits '[!] 
drift check could not run: <msg>' to stderr when drift_failed and no findings (covers cache miss, missing lockfile, cache-pin error) - Integration test e10 asserts stderr message in bare-audit path C4 (docs): Baseline-check phrasing + CHANGELOG link - governance-guide, ci-cd, cli-commands now read '7 baseline checks plus integration drift detection' - CHANGELOG drift-detection link points to docs site URL C5 (oss-growth): User-promise framing - CHANGELOG drift entry leads with the user promise (forgotten installs + hand-edits) before mechanism - drift-detection.md gains a 'Try it now' block at the top - Before/after CI comparison promoted to its own subsection with explicit framing of what the bash workaround missed Verification: ruff check + format silent; 7621 unit tests + 27 drift integration tests green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(changelog): trim drift entry to single 'so what?' line Collapse the two added entries (drift + cache-pin markers) into one short line that answers the developer 'so what?' and points to the Drift Detection guide for the full mechanism + opt-out + cache-pin details. Per maintainer feedback: the previous entries were too long for a CHANGELOG. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Daniel Meppiel <copilot-rework@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
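The cache pin marker described above (a .apm-pin JSON file in each cached package root, schema v1 with schema_version and resolved_commit) could look roughly like the sketch below. The function names and the StalePinError class are hypothetical; only the file name and the two schema fields are taken from the commit message. As the message itself stresses, this detects a stale cache and is not a cryptographic integrity check.

```python
import json
from pathlib import Path

PIN_FILE = ".apm-pin"   # marker file name from the commit message
SCHEMA_VERSION = 1      # schema v1: {schema_version, resolved_commit}


class StalePinError(Exception):
    """Hypothetical error: the cached package does not match the lockfile."""


def write_pin(install_path: Path, resolved_commit: str) -> None:
    """Record which commit the cached package was resolved to."""
    marker = {"schema_version": SCHEMA_VERSION, "resolved_commit": resolved_commit}
    (install_path / PIN_FILE).write_text(json.dumps(marker), encoding="utf-8")


def verify_pin(install_path: Path, expected_commit: str) -> None:
    """Raise StalePinError unless the marker exists, parses, and matches."""
    pin_path = install_path / PIN_FILE
    try:
        marker = json.loads(pin_path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:  # missing, unreadable, or invalid JSON
        raise StalePinError(f"cannot read {pin_path}: {exc}") from exc
    if not isinstance(marker, dict) or marker.get("schema_version") != SCHEMA_VERSION:
        raise StalePinError(f"unsupported pin marker in {pin_path}")
    if marker.get("resolved_commit") != expected_commit:
        raise StalePinError(
            f"cache holds {marker.get('resolved_commit')!r}, "
            f"lockfile expects {expected_commit!r}"
        )
```

In this sketch every failure mode raises the same error type, so a caller can treat "marker missing" and "marker mismatched" alike and ask the user to re-run apm install.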

CI:
- ruff format on test_agents_compiler_coverage.py and test_claude_formatter.py (Lint job was failing on format --check)
- Fix broken docs anchor #claude-code-deduplication in producer/compile.md (Deploy Docs / build was failing on starlight-links-validator)

PR microsoft#1146 review-panel follow-ups:
- Doc Writer #1: add explicit <a id> anchor before the :::note[]::: callout so the table link resolves (Starlight does not auto-generate anchors from callout directives)
- Doc Writer #2: rewrite the dedup note to attribute .claude/rules/ population to BOTH apm install AND apm compile (a producer who only runs compile hits dedup on the second run and the docs previously gave them no explanation)
- CLI Logging #4 / DevX UX: dry-run preview now appends a '(instructions section skipped: .claude/rules/ already populated...)' line whenever skip_instructions fires, so scripted consumers and novice users no longer see a bare 'Would generate 0 files' with no why
- Test Coverage #3: add tests/integration/test_install_compile_claude_dedup_e2e.py -- exercises the install->compile pipeline through the real CLI to lock in the cross-module dedup contract; second test pins the compile-alone twice path

Verified locally:
- ruff check + ruff format --check both silent
- Full unit suite: 8311 passed
- npm run build inside docs/: 'All internal links are valid'

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Integrate copilot runtime

- Use LockFile.read() instead of raw yaml.safe_load() in _collect_transitive_mcp_deps (#1)
- Guard against mcp:null in get_mcp_dependencies() (#2)
- Remove inline MCP installation pipeline, defer to follow-up PR (#3/#7)
- Remove redundant import builtins in _deduplicate_mcp_deps (#10)
- Add tests for mcp:null, mcp:[], root-over-transitive dedup order (#9)
- Remove tests for deleted inline pipeline functions
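
The mcp:null guard and the root-over-transitive dedup order listed above can be illustrated with a small sketch. It assumes a manifest already parsed into a dict with MCP entries under dependencies.mcp, and it assumes each entry is either a plain string or a dict with a name key; both shapes and both function bodies are illustrative, not the code in apm_cli.

```python
def get_mcp_dependencies(manifest: dict) -> list:
    """Return MCP deps, treating a missing key, mcp: null, and mcp: [] alike."""
    dependencies = manifest.get("dependencies") or {}
    mcp = dependencies.get("mcp")  # None when the YAML says 'mcp:' with no value
    return list(mcp) if mcp else []


def deduplicate_mcp_deps(root: list, transitive: list) -> list:
    """Keep the first occurrence of each name; root entries are scanned first."""
    seen = set()
    result = []
    for dep in [*root, *transitive]:
        name = dep["name"] if isinstance(dep, dict) else dep
        if name not in seen:
            seen.add(name)
            result.append(dep)
    return result


print(get_mcp_dependencies({"dependencies": {"mcp": None}}))
print(deduplicate_mcp_deps(["a", "b"], ["b", "c"]))
```

Scanning root entries before transitive ones is what makes a root declaration win when both declare the same server.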

…covery + governance (#503) * Initial plan * Initial plan for marketplace integration Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71 Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com> * feat: marketplace integration core implementation - Add marketplace/ package: models, errors, registry, client, resolver - Add marketplace CLI commands: add, list, browse, update, remove, search - Add lockfile provenance fields: discovered_via, marketplace_plugin_name - Add install hook for NAME@MARKETPLACE syntax pre-parse intercept - Wire marketplace commands in cli.py Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71 Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com> * docs: add marketplace integration guide and CLI reference - Create guides/marketplaces.md covering marketplace concepts, registration, browsing, search, install syntax, provenance tracking, and cache behavior - Add apm marketplace and apm search command sections to cli-commands.md - Update apm install arguments to include NAME@MARKETPLACE syntax - Update plugins.md Finding Plugins section with marketplace cross-refs Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com> * docs: fix marketplace.json format and lockfile field names to match implementation - Use array-based plugins format matching models.py parser expectations - Use discovered_via and marketplace_plugin_name matching lockfile.py fields - Document both Copilot CLI (repository/ref) and Claude Code (source) formats Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com> * docs: fix git-subdir and relative source descriptions to match resolver - git-subdir uses separate repo and subdir fields - Relative string sources resolve to marketplace repo subdirectory Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com> * feat: add 
marketplace unit tests and docs - 114 unit tests across 8 test files covering all marketplace modules - New marketplace guide at docs/src/content/docs/guides/marketplaces.md - Updated CLI reference with marketplace and search commands - Updated plugins guide with marketplace integration section - CHANGELOG entry for marketplace feature Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71 Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com> * refactor: address code review feedback - Use List[MarketplacePlugin] from typing instead of lowercase generic - Eliminate duplicated condition in install.py marketplace intercept - Restructure control flow for clarity Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71 Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com> * fix: address all 12 PR review comments on marketplace integration - Narrow except Exception to except ImportError for lazy marketplace import (comment #1) - Fix provenance key mismatch: use dep identity instead of canonical for lockfile lookup (comment #2) - Include subdir in git-subdir source resolution with path traversal validation (comment #3) - Include relative path in relative source resolution with traversal validation (comment #4) - Sanitize marketplace name in cache file paths to prevent path traversal (comment #5) - Fix docs: stale-if-error, not stale-while-revalidate (comment #6) - Consolidate CHANGELOG entries into single line with (#503) (comment #7) - Remove unused _SUPPORTED_SOURCE_TYPES set (comment #8) - Let auth errors propagate in _auto_detect_path instead of swallowing (comment #9) - Validate marketplace --name against [a-zA-Z0-9._-]+ charset (comment #10) - Fix doc examples to use identifier-compatible names (comments #11, #12) - Update tests to match corrected resolver behavior, add traversal tests Co-authored-by: Copilot 
<223556219+Copilot@users.noreply.github.com> * fix: Copilot CLI format compatibility and marketplace provenance bugs Bug #1 - Format incompatibility with awesome-copilot marketplace: - Parser now accepts 'source' key (Copilot CLI) as type discriminator fallback when 'type' key is absent, normalizing to 'type' for resolvers - GitHub source resolver now accepts 'path' field (Copilot CLI) as virtual subdirectory, same as 'subdir' in git-subdir sources - Path traversal validation applied to 'path' field - Fixes: 8 of 62 plugins in awesome-copilot that use github source objects with 'source'+'path' keys instead of 'type'+'subdir' Bug #2 - Lockfile provenance never written: - Root cause: install passed raw marketplace refs (NAME@MARKETPLACE) as only_packages, but DependencyReference.parse() can't parse those, so identity filtering removed all deps -> 'already installed' - Fix: use validated_packages (canonical owner/repo strings) instead of raw click argument for only_pkgs Both bugs verified fixed via E2E tests against real marketplaces: - github/awesome-copilot (62 plugins) - anthropics/skills (3 plugins) - microsoft/azure-skills (1 plugin) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat: scope marketplace search to QUERY@MARKETPLACE format Search now requires QUERY@MARKETPLACE (e.g. apm search security@skills) to eliminate name collisions across marketplaces. Added search_marketplace() client function for single-marketplace search. - Rejects bare queries without @ — clear error with usage example - Validates marketplace exists before searching - Updated docs/guides/marketplaces.md with new syntax - 7 test cases: format validation, unknown marketplace, results, no results Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: update CLI reference and plugins guide for scoped search syntax Align all documentation with QUERY@MARKETPLACE search format. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * refactor: use centralized path_security for marketplace traversal checks Replace 3 ad-hoc '..' in x.split('/') checks in marketplace/resolver.py with validate_path_segments() from utils/path_security.py. Add defense-in-depth validate_path_segments() call to _sanitize_cache_name() in client.py. This ensures marketplace code uses the same cross-platform path safety utilities (backslash normalization, single-dot rejection) as the rest of APM. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: add path safety rule to copilot-instructions.md Directs contributors to use validate_path_segments() and ensure_path_within() from utils/path_security.py instead of ad-hoc traversal checks. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com> Co-authored-by: danielmeppiel <dmeppiel@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
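Three of the marketplace rules above lend themselves to a short sketch: the QUERY@MARKETPLACE search format, the [a-zA-Z0-9._-]+ charset for marketplace names, and segment-based path traversal checks with backslash normalization and single-dot rejection. The functions below are hypothetical stand-ins written for illustration; they are not the validate_path_segments() implementation in utils/path_security.py, whose exact behaviour is not shown in this page.

```python
import re
from typing import Tuple

# Charset taken from the review fix: [a-zA-Z0-9._-]+
_NAME_RE = re.compile(r"[a-zA-Z0-9._-]+")


def check_marketplace_name(name: str) -> str:
    """Reject names that could escape a cache directory or confuse parsing."""
    if not _NAME_RE.fullmatch(name):
        raise ValueError(f"invalid marketplace name: {name!r}")
    return name


def parse_scoped_query(text: str) -> Tuple[str, str]:
    """Split 'QUERY@MARKETPLACE'; bare queries without '@' are rejected."""
    query, sep, marketplace = text.rpartition("@")
    if not sep or not query:
        raise ValueError("expected QUERY@MARKETPLACE, e.g. security@skills")
    return query, check_marketplace_name(marketplace)


def check_path_segments(path: str) -> str:
    """Normalize backslashes, then reject '.' and '..' segments."""
    normalized = path.replace("\\", "/")
    for segment in normalized.split("/"):
        if segment in (".", ".."):
            raise ValueError(f"unsafe path segment in {path!r}")
    return normalized


print(parse_scoped_query("security@skills"))
print(check_path_segments("plugins\\azure"))
```

Checking whole segments after normalizing separators avoids the gap left by ad-hoc '..' checks that only split on forward slashes.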

* feat(policy): W1 foundations for install-time policy enforcement (#827) Wave 1 of issue #827 implementation. Lays the foundations the install pipeline gate (W2) will plug into. No behaviour change yet — install still does NOT enforce policy until W2 wires the gate phase. What's in: - policy_checks: new public seam run_dependency_policy_checks(deps, lockfile=, policy=, mcp_deps=, effective_target=) accepting a resolved dep set; old run_policy_checks(project_root, policy) is now a thin wrapper. Honours require_resolution: project-wins for version-pin mismatches only. Latent isinstance(allow, list) bug fixed for schema's Tuple[str, ...]. - policy/discovery: cache stores merged effective policy with chain metadata + fingerprint. Atomic writes via temp + os.replace, with pid+thread_id suffix to prevent concurrent-writer collision. MAX_STALE_TTL=7d ceiling on cache reuse. PolicyFetchResult expanded to express 9 outcomes (found, absent, cached_stale, cache_miss_fetch_fail, malformed, disabled, garbage_response, no_git_remote, empty). - diagnostics: CATEGORY_POLICY constant + per-category renderer wired into render_summary(). - command_logger: InstallLogger.policy_resolved/violation/disabled with per-class actionable error wording (auth/unreachable/malformed/ blocked). - tests/fixtures/policy/: 14 policy fixtures + 7 project fixtures (denied-direct, denied-transitive, required-missing, required-version-mismatch, mcp-denied, target-mismatch, unpacked-bundle) covering W4 live matrix scenarios L2/L4/L13 and rubber-duck findings I5/I6/I7/N14/C2. - docs: 12-section Install-time enforcement guide skeleton in both enterprise/policy-reference.md and packages/apm-guide skill mirror. 10 sections filled; sections 7 (snippets) and 10 (error table) stubbed for W3-docs-final once W2 lands and W4 captures live output. Tests: - tests/unit: 4878 passed (1 pre-existing unrelated MCP failure deselected). Includes 41 logger + 29 policy-seam + 38 cache + 21 fixture-load new tests. 
Refs: #827 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(install): W2A policy enforcement at install time (#827) Wave 2A wires the three install-time enforcement sites planned for #827: 1. **Pipeline gate phase** (src/apm_cli/install/phases/policy_gate.py): New phase running between resolve and targets. Discovers org policy, resolves the inheritance chain via resolve_policy_chain, persists the merged effective policy + chain refs to cache (chain_refs threading per C1 amendment), then calls run_dependency_policy_checks against the resolved deps. Routes 9 discovery outcomes (found, absent, cached_stale, cache_miss_fetch_fail, malformed, disabled, garbage_response, no_git_remote, empty). Block-mode violations raise PolicyViolationError to halt the pipeline cleanly. 2. **--mcp branch preflight** (src/apm_cli/policy/install_preflight.py + commands/install.py:1091-1125): apm install --mcp does NOT enter the install pipeline. New shared helper run_policy_preflight() runs discovery + dep checks for any non-pipeline command site. Wired into --mcp BEFORE _run_mcp_install so denied servers never reach the integrator. Also exports PolicyBlockError for callers. 3. **install <pkg> snapshot+rollback** (commands/install.py): apm install <pkg> mutates apm.yml BEFORE the pipeline runs. We now snapshot apm.yml as raw bytes (not parsed YAML, to avoid round-trip drift on whitespace / key-order / comments), and on ANY pipeline failure (policy block, download error, etc.) restore byte-for-byte via tempfile + os.replace atomic write. Logs '[i] apm.yml restored to its previous state.' and exits non-zero. InstallContext gains policy_fetch, policy_enforcement_active, no_policy. Tests: +68 new tests, 4946 unit tests pass total. 
- test_policy_gate_phase.py: 27 (covers all 9 outcomes) - test_mcp_preflight_policy.py: 22 (escape hatches, allow/deny, transport, self-defined, trust_transitive, discovery outcomes, return shape) - test_install_pkg_policy_rollback.py: 19 (byte-equal restore, comments preserved, --no-policy bypass, download error rollback, snapshot unit tests) W2B (dry-run, target-aware, escape-hatch CLI flag) and C2 panel review follow. Refs: #827 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(policy): W2B install enforcement - escape hatch, dry-run preview, target-aware check (#827) W2B completes the enforcement surface: * policy_target_check.py - new pipeline phase after targets that re-runs target/compilation checks with the resolved effective_target. Filters to TARGET_CHECK_IDS only to avoid double-emitting dep violations from the gate phase. Honors CLI --target override (I6 fix scenario). * --no-policy escape hatch on apm install / install <pkg> / install --mcp / update. APM_POLICY_DISABLE=1 env var equivalent. Both route through ctx.no_policy and emit always-visible warnings via InstallLogger.policy_disabled() noting that apm audit --ci still fails. * --dry-run policy preview. run_policy_preflight gains dry_run=True kwarg. Emits '[!] Would be blocked by policy: <dep> -- <reason>' (block) or '[!] Policy warning: <dep> -- <reason>' (warn) before the would-install table. Never raises, never mutates. Direct manifest deps only (resolver doesn't run in dry-run; documented limitation). InstallRequest, InstallService, InstallContext threaded with no_policy. LOC budget on install.py raised 1625 -> 1650 with documented rationale. Tests: 5003 unit pass (+57 W2B: 17 target_check + 24 no_policy_flag + 16 dry_run_policy). Full suite green vs main baseline. 
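The snapshot-and-rollback behaviour described for apm install <pkg> (snapshot apm.yml as raw bytes, restore byte-for-byte through a temp file and os.replace on any pipeline failure) can be sketched as below. The function names and the mutate/pipeline callables are assumptions made for the example, not the code in commands/install.py.

```python
import contextlib
import os
import tempfile
from pathlib import Path
from typing import Callable


def restore_bytes(path: Path, snapshot: bytes) -> None:
    """Write the snapshot to a sibling temp file, then atomically swap it in."""
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), prefix=path.name + ".")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(snapshot)
        os.replace(tmp_name, path)  # atomic when both paths share a filesystem
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise


def run_with_rollback(
    manifest: Path,
    mutate: Callable[[Path], None],
    pipeline: Callable[[], object],
) -> object:
    """Snapshot raw bytes, mutate the manifest, and restore it if the pipeline fails."""
    snapshot = manifest.read_bytes()  # bytes, not parsed YAML: comments survive
    mutate(manifest)
    try:
        return pipeline()
    except Exception:
        restore_bytes(manifest, snapshot)
        raise
```

Keeping the snapshot as bytes means whitespace, key order, and comments come back exactly as they were, which a parse-and-dump round trip would not guarantee.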
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(policy): C2 panel fixes - transitive MCP enforcement, shared chain discovery, dry-run cap, drop apm update --no-policy (#827) C2 panel checkpoint surfaced 4 fixes (S1+B1+D2 BLOCKER/PASS-WITH-CONCERN, D1 DevX). All landed; full suite 5032 pass. S1 (Supply Chain BLOCKER) - transitive MCP enforcement: Transitive MCP servers from APM packages were bypassing install-time policy. The pipeline gate phase only sees direct apm.yml deps; transitive MCP servers are merged later via MCPIntegrator.collect_transitive() and written to runtime configs (.copilot/mcp.json, .cursor/mcp.json) with no policy check. This defeated #827 on the most security-critical dep category. Fix: second run_policy_preflight() call in commands/install.py after the transitive merge, before MCPIntegrator.install(). On block: abort MCP config writes, exit non-zero. APM packages remain installed (gate phase approved them). 15 new unit tests in test_transitive_mcp_policy.py. B1 (Architect, partial) - shared chain-aware discovery: Extract discover_policy_with_chain() into policy/discovery.py so both policy_gate.py and install_preflight.py walk the same inheritance chain. Closes the gap where --mcp / --dry-run paths could resolve a different effective policy than the pipeline path. Gate-phase keeps its 9-outcome routing; only the discovery seam moved. 10 new tests in test_chain_discovery_shared.py. D2 (DevX UX) - dry-run noise cap: install_preflight._DRY_RUN_PREVIEW_LIMIT = 5. Long deny lists now show 5 lines per severity bucket + tail '[!] ... and N more would be blocked by policy. Run apm audit for full report.' 4 new tests. D1 (DevX UX) - drop apm update --no-policy: apm update is the CLI self-updater (refreshes the apm binary), not a dependency refresh. The flag was accepted but unused. Removed the option and flipped the test to assert the flag is now rejected. 
LOC budget on install.py raised 1650 -> 1675 with documented justification. Tests: 5032 unit pass (+29 new: 15 transitive_mcp + 10 chain_discovery_shared + 4 dry_run_noise_cap). 1 pre-existing MCP test deselected. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs+test(policy): W3 - integration matrix, docs final fill, CHANGELOG, growth (#827) W3 phase complete. All 5 parallel workstreams landed. Tests: tests/integration/test_policy_install_e2e.py - 17 e2e scenarios I1..I17 Covers all 9 PolicyFetchResult outcomes + all 6 violation classes via CliRunner-driven full-pipeline flows. Mocks discover_policy_with_chain at both seams (policy_gate + install_preflight). Uses _build_policy() helper for frozen-dataclass safe construction. Docs: docs/src/content/docs/enterprise/policy-reference.md sec 7: 8 verbatim CLI snippets (success, block, warn, --no-policy, APM_POLICY_DISABLE, --dry-run with overflow tail, install <pkg> rollback, transitive MCP block) sec 10: outcome table (9 fetch outcomes) + violation table (6 classes) Added explicit JSON/SARIF non-goal callout (C1 amendment). packages/apm-guide/.apm/skills/apm-usage/governance.md Same content, leaner skill version, links back to docs for full text. CHANGELOG.md: Added: --no-policy / APM_POLICY_DISABLE escape hatch, --dry-run preview, install <pkg> rollback Changed: pipeline gains policy_gate + policy_target_check phases, shared chain discovery + atomic cache + MAX_STALE_TTL Security (headline): apm install enforces apm-policy.yml; transitive MCP checked before runtime config write Follow-up issue #829 filed: policy.fetch_failure: warn|block schema knob. Tests: 5049 pass (5032 unit + 17 integration). 1 pre-existing MCP test deselected. PR body drafted at session-state/files/pr-body-827.md. Growth strategy entry + asciinema script staged in WIP (gitignored). 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(policy): C3 fixes - direct MCP enforcement, malformed posture, warn-mode coverage, doc drift (#827) C3 final panel + rubber-duck found 5 issues. All fixed. #1 (CRITICAL) - Direct MCP deps in apm.yml bypassed enforcement: ctx.direct_mcp_deps now populated in pipeline.py from apm_package.get_mcp_dependencies() before policy_gate runs. policy_gate reads direct_mcp_deps (not the dead mcp_deps_to_install) and passes them to run_dependency_policy_checks. install.py:1496 second preflight guard drops 'and transitive_mcp' so direct-only MCP installs are also caught. #2 (CRITICAL) - Malformed policy handling inconsistent + broke rollback: policy_gate.py replaced sys.exit(1) on malformed with fail-open warn (matches install_preflight + cache_miss_fetch_fail/garbage_response posture). sys.exit was bypassing the rollback handler in install.py for apm install <pkg>. CEO mandate: malformed = warn, fail-closed knob is follow-up #829. #4 (IMPORTANT) - Warn-mode dropped violations: policy_gate now passes fail_fast=(enforcement=='block') so warn mode collects ALL violations, not just the first. Also emits warnings for passed=True checks with non-empty details (project-wins version-pin mismatches were silently dropped). #3 (IMPORTANT) - Chain inheritance is 1-level, not multi-level: discover_policy_with_chain only walks one parent. Toned down docs in policy-reference.md and governance.md with explicit caution callout. Filed follow-up #831 for proper recursive walk + cycle detection. #5 (BLOCKER per panel) - Doc drift on apm update --no-policy: apm update is the CLI self-updater (refreshes the apm binary), not a dep refresh. Removed all mentions from both docs. apm deps update is the dep-refresh surface (runs install pipeline, gate applies); --no-policy is NOT exposed there today. Tests: 5059 pass (5049 baseline + 10 new: 6 unit gate + 4 integration I18/I19/I20). 
New integration tests cover real direct-MCP block, real malformed fail-open, warn-mode multi-violation. I16 class renamed to TestI16GarbageResponsePolicy to fix mislabeling. Follow-ups: #829 (fetch_failure schema knob), #831 (multi-level chain). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(policy): in-PR resolution of #834 (warn-mode rendering) and #831 (recursive extends chain) (#827) Originally filed as follow-ups during C3, moved in-PR per reviewer request so #832 ships a complete enforcement story. #834 - Warn-mode policy violations did not render in the install summary. Root cause: pipeline created a fresh DiagnosticCollector for install_result.diagnostics while InstallLogger.policy_violation() pushed warnings into logger.diagnostics. Two collectors, one rendered. Fix: when a logger is present, reuse logger.diagnostics so policy records flow through render_summary() (block mode unaffected - it aborts inline before summary). #831 - extends: chain only supported one level (parent). Inheritance machinery (resolve_policy_chain, detect_cycle, MAX_CHAIN_DEPTH=5) was already N-deep capable; discovery never wired it. Fix: rewrite _resolve_and_persist_chain as iterative depth-first walk, leaf-first; cycle detection via inheritance.detect_cycle; honor MAX_CHAIN_DEPTH=5 with explicit pre-append check; partial-chain warning when a mid-chain ref fails to fetch ('Policy chain incomplete: <ref> unreachable, using <N> of <M> policies'); single cache write at leaf with full chain fingerprint. Tests: +1 unit (warn-render), +5 unit (3-level full, cycle, depth limit, partial chain, single-level regression), +1 integration (TestI21ThreeLevelExtendsChain). 5044 unit pass. Docs: enterprise/policy-reference.md and apm-usage/governance.md chain-depth callouts updated. 
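The iterative leaf-first walk described for #831 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in _resolve_and_persist_chain: `walk_extends_chain` and the `fetch` callable are hypothetical names, raising on cycle or depth overflow is an assumed behaviour, and the partial-chain warning is simplified because the sketch cannot know the total chain length.

```python
from typing import Callable, List, Optional, Tuple

MAX_CHAIN_DEPTH = 5  # depth limit named in the commit message


def walk_extends_chain(
    leaf_ref: str,
    fetch: Callable[[str], Optional[dict]],
) -> Tuple[List[str], List[str]]:
    """Walk `extends:` links iteratively, leaf first.

    `fetch` is a hypothetical callable returning a parsed policy dict,
    or None when the ref cannot be fetched. Returns (chain, warnings).
    """
    chain: List[str] = []
    warnings: List[str] = []
    ref: Optional[str] = leaf_ref
    while ref is not None:
        if ref in chain:  # cycle detection
            raise ValueError(f"Policy chain cycle detected at {ref}")
        if len(chain) >= MAX_CHAIN_DEPTH:  # explicit check before appending
            raise ValueError(f"Policy chain exceeds depth {MAX_CHAIN_DEPTH}")
        policy = fetch(ref)
        if policy is None:
            # Simplified form of the partial-chain warning.
            warnings.append(
                f"Policy chain incomplete: {ref} unreachable, "
                f"using {len(chain)} policies fetched so far"
            )
            break
        chain.append(ref)
        ref = policy.get("extends")
    return chain, warnings
```

With a three-level chain (leaf -> mid -> root) the walk returns the refs in leaf-first order; a mid-chain fetch failure keeps the policies already collected and records one warning.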
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(changelog): record in-PR resolution of #834 and #831 under #827 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(policy): address review-panel pre-merge findings (#827) - Security F1 (HIGH): pin extends: chain to leaf policy host; disable HTTP redirects in _fetch_from_url and _fetch_github_contents. Closes cross-host credential leak vector via git credential fill fallback and SSRF/Referer-leak vector via 30x redirects. raw.githubusercontent.com is treated as distinct from github.com (strict pin). - Logging C1+C2 + UX F1/F2/F4/F5/F9: extract InstallLogger.policy_discovery_miss() canonical helper covering all 7 discovery outcomes; route both policy_gate and install_preflight through it. absent now verbose-only; no_git_remote downgraded to [i]; garbage_response gets distinct wording (no VPN/firewall noise); cached_stale and cache_miss_fetch_fail messages now state enforcement posture explicitly; violation messages dedupe dep_ref prefix; wire _policy_reason_blocked into block-severity policy_violation as dim secondary line. - Docs: remove [Planned] banner from policy-reference; update enforcement tables (policy-reference + governance skill) to reflect install-time blocking; document --no-policy / APM_POLICY_DISABLE in cli-commands.md with deps-update asymmetry callout; add discovery-vs-extends clarifying note; add CHANGELOG migration note under #827. Tests: 5053 -> 5068 (+15 logging, +9 security host-pin). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(policy): ship enterprise hardening pack on top of #827 Four enterprise hardening items shipped in-PR per CISO-arbitrated panel verdict + CTO threat-model deep dive (PR #832 comments 4294087760 + 4294115069). Closes #829. 1. policy.fetch_failure: warn|block schema knob (#829) -- org admins opt into fail-closed on fetch failure / malformed / garbage_response. 
Default 'warn' preserves backwards compat. 2. apm.yml policy.fetch_failure_default: warn|block -- project-side complement so a project can lock down behavior even when no policy is reachable to read the org-side knob from. 3. apm policy status diagnostic command -- show discovery outcome, source, enforcement, cache age, extends chain, effective rule counts, and hash-pin state. --json for SIEM ingestion. Trust-but-verify tool that makes fail-open acceptable. 4. apm.yml policy.hash: 'sha256:...' consumer-side pin -- closes the garbage_response compromised-intermediary vector by verifying raw policy bytes against a project-pinned digest. Equivalent of pip --require-hashes for the policy itself. ALWAYS fail-closed on mismatch, regardless of fetch_failure setting (a hash mismatch is an explicit pin violation, not a fetch failure). sha384/sha512 accepted; md5/sha1 rejected (collision-resistant only). 5. apm audit --ci auto-discovers org policy when --policy-source is not provided; --no-policy flag added to skip. Closes the audit/install asymmetry that left CI blind to sideloaded primitives. Tests: 5068 -> 5157 (+89: hash pin 31, fetch_failure knob, audit auto-discovery, policy status command, plus updates to existing discovery tests for the new expected_hash kwarg threading). Docs: policy-reference §9.5 (fetch_failure), §9.6 (hash pin), §9.7 (apm policy status), §9.8 (audit auto-discovery); governance.md skill mirrors all of the above; cli-commands.md gets policy status + audit --no-policy. CHANGELOG entries under [Unreleased] Added / Added (Security). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(policy): address doc-writer review BLOCKERs (#827) - policy-reference.md: remove stale 'planned fetch_failure knob' paragraph that contradicted the §9.5 entry shipped in the same PR; add Linux hash-compute one-liner alongside the macOS shasum example. 
- cli-commands.md: add 'apm policy status' command section under a new 'apm policy' family (synopsis, --policy-source/--no-cache/--json, exit-code note, examples). Add --no-policy flag to 'apm audit' options list. Reword --policy SOURCE description to reflect that --ci now auto-discovers when --policy is omitted. Update audit examples to match (drop the now-redundant '--policy org' from auto-discovery example, add explicit --no-policy variant). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(policy): address doc-writer HIGH+LOW findings (#827) - manifest-schema.md: add policy: block to schema diagram + new section 3.9 documenting fetch_failure_default, hash, hash_algorithm - policy-reference.md: add fetch_failure: warn to canonical schema YAML and a fetch_failure entry under Top-level fields; lift apm policy status and apm audit --ci auto-discovery into proper numbered subsections (9.7 / 9.8) so anchors match the skill mirror - governance.md: surface install-time enforcement with link to policy-reference#install-time-enforcement - ci-policy-setup.md: annotate Step 3 noting apm audit --ci auto-discovers and --policy org is now an explicit override - security.md: add Compromised policy intermediary row to attack surface comparison, linked to policy.hash consumer-side pin - cli-commands.md: split --no-policy into 2-line nested bullet separating behaviour from env-var equivalence - apm-guide skill mirror: add fetch_failure: warn to schema overview to keep skill aligned with policy-reference Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(policy): address PR review panel logging+arch findings (#827) BLOCKING: - command_logger.policy_discovery_miss: gate no_git_remote info message on verbose mode; previously emitted on every install in a non-git directory Architecture: - New install/errors.py with canonical PolicyViolationError; PolicyBlockError kept as re-exported alias to preserve test patches - New 
policy/outcome_routing.py::route_discovery_outcome consolidating the 9-outcome routing table; policy_gate.py and install_preflight.py now delegate instead of duplicating - pipeline.py: catch PolicyViolationError before bare Exception so policy block messages are not double-nested in RuntimeError - commands/install.py: isinstance(PolicyViolationError) branch in the legacy handler for the same reason Logging UX: - install_preflight: empty check.details now falls back to [check.name] so the block message is never blank - _extract_dep_ref helper replaces detail.split(":")[0] with defensive parsing that falls back to check.name Security: - discovery._get_cache_dir asserts containment vs project_root (resolves symlinks) instead of an unguarded join - Removed dead no_policy= kwarg from discover_policy_with_chain; env-var defence-in-depth retained on the call site Tests: +tests/unit/policy/test_pr_832_findings.py covering all 8 findings; install_logger split into silent/verbose cases. 5176 unit tests pass, 0 regressions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test(policy): use urllib.parse for host assertions to silence CodeQL (#827) CodeQL's py/incomplete-url-substring-sanitization rule fired 6 times on test_extends_host_pin.py because bare 'host' in msg substring checks could in theory match a host appearing at an arbitrary URL position (path, query, userinfo). The assertions are correct in practice -- they assert on production error messages of known format -- but the pattern is not safe in general. Replace each substring check with a precise extractor: - _assert_extends_host_in_message / _assert_leaf_host_in_message: regex-anchor on the production 'extends host: <h>' / 'leaf host: <h>' tokens, then exact-compare the captured group. - _assert_redirect_target_host: regex-extract the redirect target URL after 'to ', then urllib.parse.urlparse(...).hostname compare. No production-code changes; all 9 host-pin tests still pass. 
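To illustrate why the substring checks were replaced, here is a minimal sketch of the extract-then-compare pattern using urllib.parse. The helper name `redirect_target_host` and the sample message are hypothetical; only the general approach (regex-extract the URL after 'to ', then compare the parsed hostname) comes from the commit message.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def redirect_target_host(message: str) -> Optional[str]:
    """Return the hostname of the URL that follows 'to ' in a message."""
    match = re.search(r"\bto (https?://\S+)", message)
    if match is None:
        return None
    return urlparse(match.group(1)).hostname


# Hypothetical error message: the trusted host appears only in the *path*
# of the redirect target, so a bare substring check would be misleading.
msg = (
    "refusing redirect from https://github.com/org/policy "
    "to https://evil.example/github.com/x"
)
substring_check = "github.com" in msg      # true, but proves nothing
parsed_host = redirect_target_host(msg)    # the host actually targeted
```

Comparing `parsed_host` for equality against the expected host avoids matching a host name that merely appears in a path, query, or userinfo component.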
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(policy,audit): address PR #832 DevX UX blockers - audit --no-policy help text rewritten to describe positive behaviour first ("Skip org policy discovery and enforcement" instead of the negative "Skip auto-discovery ... in --ci mode"), so apm audit --help no longer hides the primary effect behind a caveat. Aligns the code with the docs. - apm policy status --check flag added: exits 1 when outcome is not 'found' (i.e. policy unresolvable / absent / disabled / fetch-failed), 0 otherwise. Default behaviour unchanged (always exit 0) so the diagnostic remains safe for human and SIEM use, while CI authors get the npm audit / pip check style contract via a single flag. Updates cli-commands.md, policy-reference.md, and CHANGELOG.md to document the new flag and exit-code table. Adds TestStatusCheckFlag covering the found / unresolvable / discovery-exception / json combinations. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
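The consumer-side policy.hash pin from the hardening pack above can be pictured with a short sketch. It assumes the pin has the form '<algorithm>:<hex digest>' as in the 'sha256:...' example; `verify_policy_pin` is a hypothetical name, not the function shipped in the PR.

```python
import hashlib
import hmac

# Collision-resistant algorithms only; md5 and sha1 are rejected.
ALLOWED_ALGORITHMS = {"sha256", "sha384", "sha512"}


def verify_policy_pin(raw_policy: bytes, pin: str) -> bool:
    """Check raw policy bytes against a pin such as 'sha256:<hex>'."""
    algorithm, sep, expected = pin.partition(":")
    if not sep or algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError(f"unsupported or malformed hash pin: {pin!r}")
    actual = hashlib.new(algorithm, raw_policy).hexdigest()
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(actual, expected.lower())
```

A caller following the commit's contract would treat a False result as fail-closed regardless of the fetch_failure setting, since a mismatch is a pin violation and not a fetch failure.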

* feat(drift): Phase A infra - guards + diagnostic category - Add _ReadOnlyProjectGuard context manager (utils/guards.py): snapshots stat of protected paths, raises ProtectedPathMutationError on any mutation. Defense-in-depth above the scratch-root remap. - Add CATEGORY_DRIFT + drift() recording method to DiagnosticCollector. - Add drift_count property and _render_drift_group renderer that groups by kind (modified/unintegrated/orphaned) with stable section header for machine consumers. - Tests: 7 unit tests covering happy path, mutation, creation, deletion, missing-tolerated, exception-not-masked, single-file protected path. Refs #1071. Phase A of WIP/drift/06-final-plan.md. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(drift): Phase B+C - replay engine + audit CLI wiring Implements the drift detection feature per WIP/drift/06-final-plan.md (closes #1071 scope alignment with #898). Engine (Phase B): - src/apm_cli/install/drift.py: ReplayConfig, DriftFinding, CheckLogger, CacheMissError, normalization helpers (build-id strip, line endings, BOM), run_replay() (cache-only), diff_scratch_against_project(), text/json/sarif renderers, atexit scratch cleanup. - src/apm_cli/install/services.py: scratch_root kwarg with ensure_path_within defense-in-depth guard for replay isolation. - src/apm_cli/policy/ci_checks.py: _check_drift() wrapper returning (CheckResult, list[DriftFinding]); graceful CacheMissError handling. CLI surface (Phase C): - src/apm_cli/commands/audit.py: --no-drift opt-out flag with mutex against --strip/--file via UsageError. Drift wired into both _audit_ci_gate (--ci) and _audit_content_scan (bare project audit) paths, default-on per ADR-02. JSON/SARIF/text renderers integrated; --no-drift warning gated to text mode (stdout cleanliness). Tests: - tests/unit/install/test_drift.py: 13 unit tests (normalization, diff cases, renderers). 
- Legacy --ci tests opt out of drift via batch --no-drift injection (fixture parity, not a behavior change). 7597 unit tests pass; lint clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test(drift): Phase D - integration + e2e + perf coverage (43 tests) Implements the locked test matrix for issue #1071 drift detection. Floor of 43 tests across three new files closes the 'ULTRA HARDENING OF HELL' coverage requirement. New files: - tests/integration/test_drift_check.py (32 tests): * Section A: 9 drift cases (modified/unintegrated/orphaned + CRLF/ BOM/Build-ID false-positive guards) * Section B: 4 past-PR regressions (#1067, #882, #889, source-deleted) * Section C: 7 edges (no/corrupt lockfile, untracked governed, no-write contract, idempotency) * Section D: 3 multi-target (copilot/claude/cursor) * Section E: 9 default-on / --no-drift opt-out (mutex, stderr routing, JSON suppression) - tests/integration/test_drift_check_e2e.py (10 tests): full install->mutate->audit loop with mix_stderr=False, air-gap proof, JSON/SARIF stability, 30s smoke - tests/unit/install/test_drift_perf.py (1 test): 100 primitives replay+diff under 5s Engine fix surfaced by tests: - src/apm_cli/install/drift.py: run_replay now reads apm.yml's target field via parse_target_field and passes it to resolve_targets. Without this, multi-target projects (copilot+claude+cursor) replayed only the auto-detected primary target, falsely reporting secondary target deployments as orphaned. Helper _read_apm_yml_target() added. CI wiring: - scripts/test-integration.sh: two new blocks in run_e2e_tests() invoking the integration + e2e suites before the final success log. Both safe to run without GITHUB_APM_PAT (cache-only, mocked network). Verification: 56 drift-domain tests pass; full repo lint clean. 
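The false-positive guards (CRLF, BOM, Build-ID) amount to normalizing both sides before comparing. A minimal sketch, assuming a hypothetical build-id marker line since the real marker syntax is not shown here; `normalize` and `is_modified` are illustrative names rather than the helpers in drift.py.

```python
import re

# Hypothetical marker format; the real build-id syntax is an assumption.
BUILD_ID_LINE = re.compile(r"^<!-- Build ID: .* -->\n?", re.MULTILINE)
UTF8_BOM = "\ufeff"


def normalize(text: str) -> str:
    """Strip a leading BOM, unify line endings, and drop build-id lines."""
    if text.startswith(UTF8_BOM):
        text = text[len(UTF8_BOM):]
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return BUILD_ID_LINE.sub("", text)


def is_modified(deployed: str, replayed: str) -> bool:
    """Report drift only when content differs after normalization."""
    return normalize(deployed) != normalize(replayed)
```

Normalization must stay narrow: a real content change has to survive it, which is the safety invariant the follow-up test-coverage item asks to assert.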
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(drift): CHANGELOG + Starlight guide + apm-usage skill + ci.yml note - CHANGELOG.md: Added [Unreleased] entry under Added describing the default-on drift detection in apm audit, the three failure modes it catches, false-positive guards, --no-drift opt-out + mutex semantics, and the JSON/SARIF integration shape. Closes #1071, supersedes #898. - docs/src/content/docs/guides/drift-detection.md (NEW, sidebar order 7): Full user-facing guide -- what drift means, how the cache-only replay works (with mermaid diagram), exit-code matrix, when to use --no-drift, output formats, and the CI single-line gate that replaces the legacy git status --porcelain script. - packages/apm-guide/.apm/skills/apm-usage/commands.md: Extended the audit row with --no-drift flag and added a paragraph documenting the drift-by-default behavior, three failure modes, false-positive normalization, and JSON/SARIF integration. Aligns the skill that ships in apm-guide with the new CLI surface (per apm-keep-docs-up-to-date.instructions.md rule 4). - .github/workflows/ci.yml: Annotated Gate B (legacy bash drift check) with a comment marking it redundant once apm-action ships a CLI with default-on drift detection (this PR's release). Kept as defense-in-depth fallback until then. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(drift): address panel feedback - recovery hint + doc-sync CEO panel recommended landing two in-PR follow-ups before merge: 1. Recovery hint in drift output (cli-logging + devx-ux convergence): render_drift_text now appends '[i] Run apm install to re-sync deployed files with the lockfile.' so users see WHAT and HOW in one message. Honors Message Writing Rule #4 'Include the fix'. 2. Doc-sync (doc-writer + devx-ux convergence): - reference/cli-commands.md: add --no-drift to audit options table; amend --ci description to mention drift contribution. 
- integrations/ci-cd.md: replace bash 'git status --porcelain' workaround under 'Verify Deployed Primitives' with 'apm audit --ci' one-liner; update 'We dogfood this' callout text. - getting-started/quick-start.md: retarget stale cross-ref from the now-superseded ci-cd anchor to the new drift-detection guide. - guides/drift-detection.md: drop the self-contradictory case #2 in 'When to use --no-drift' (strip-mode is auto-skipped, not opt-out). - CHANGELOG.md: compress verbose entry to one Keep-a-Changelog line pointing readers to the guide for detail. Tracked as follow-up issues (CEO call): - supply-chain: verify cache content matches lockfile resolved_commit before drift replay trusts it (commit-SHA pinning bypass on shared CI caches). - test-coverage: inverse-normalization unit test asserting BOM/CRLF/ Build-ID guards do NOT mask real content drift (safety invariant). Lint clean. 45 drift tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(drift): address Copilot review - exit-code contract + types + diagnostics Bare 'apm audit' is advisory (exit 0 on drift); 'apm audit --ci' is the gate (exit 1). Closes the regression introduced when content-scan escalation accidentally also escalated drift findings. Also addresses inline review: - A2: vacuous ASCII-encoding assertion now scopes per-line - A4: tuple[float, int] -> tuple[int, int] in guards.py - A5: type-annotated _check_drift signature - A6: clarified DRIFT_ORPHANED comment - A7: CHANGELOG references PR + closes - A3: CacheMiss message now drift-specific (no --no-cache confusion) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(drift): link drift detection guide from README security section Per oss-growth: surfaces drift detection alongside content security and lockfile integrity in the conversion-critical Production-grade section, so a reader scanning for 'why APM' sees the supply-chain story end-to-end. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(drift): cache pin marker for stale-cache detection apm install drops a .apm-pin JSON marker into each cached package root recording the resolved_commit; apm audit verifies it before running drift replay. Catches the 'teammate bumped lockfile, did not reinstall' + 'shared CI runner reused stale apm_modules' scenarios that would otherwise silently produce misleading drift output. LockfileBuilder syncs markers UNCONDITIONALLY (even when the lockfile YAML is unchanged and even when no install happens), so existing users self-heal on their next 'apm install'. This is stale-cache detection, NOT cryptographic integrity -- defending against active cache tampering requires content-addressed hashes, which is deferred. Schema (v1): {schema_version: 1, resolved_commit: <sha>} Marker file: <install_path>/.apm-pin Coverage: - 14 unit tests in test_cache_pin.py (positive + every error path + skip rules + idempotent re-run + self-heal regression) - 1 integration test in test_drift_check_e2e.py exercising the full install -> mark -> verify flow against a synthetic cache Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address panel follow-ups C1-C5 on PR #1137 C1 (supply-chain): Fail closed on unpinned remote deps - cache_pin.find_unpinned_remote_deps() helper + stderr warning in sync_markers_for_lockfile - drift._materialize_install_path raises CacheMissError for remote deps with resolved_commit=None (was silent fail-open) - Replaced silent-skip test with warning assertion + new helper test C2 (architecture): Wire _ReadOnlyProjectGuard into run_replay - run_replay() now wraps the deps loop with _ReadOnlyProjectGuard on governed root dirs + apm.lock.yaml + AGENTS.md - Regression test: monkeypatched leaky integrator triggers ProtectedPathMutationError C3 (cli-logging-ux): Stderr message on swallowed CacheMissError - audit._audit_content_scan emits '[!] 
drift check could not run: <msg>' to stderr when drift_failed and no findings (covers cache miss, missing lockfile, cache-pin error) - Integration test e10 asserts stderr message in bare-audit path C4 (docs): Baseline-check phrasing + CHANGELOG link - governance-guide, ci-cd, cli-commands now read '7 baseline checks plus integration drift detection' - CHANGELOG drift-detection link points to docs site URL C5 (oss-growth): User-promise framing - CHANGELOG drift entry leads with the user promise (forgotten installs + hand-edits) before mechanism - drift-detection.md gains a 'Try it now' block at the top - Before/after CI comparison promoted to its own subsection with explicit framing of what the bash workaround missed Verification: ruff check + format silent; 7621 unit tests + 27 drift integration tests green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(changelog): trim drift entry to single 'so what?' line Collapse the two added entries (drift + cache-pin markers) into one short line that answers the developer 'so what?' and points to the Drift Detection guide for the full mechanism + opt-out + cache-pin details. Per maintainer feedback: the previous entries were too long for a CHANGELOG. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Daniel Meppiel <copilot-rework@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
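The cache pin marker described above is small enough to sketch. The file name and v1 schema come from the commit message; `write_pin` and `pin_matches` are hypothetical helper names, and treating an unreadable or malformed marker as a mismatch is an assumption.

```python
import json
from pathlib import Path

PIN_FILENAME = ".apm-pin"
SCHEMA_VERSION = 1


def write_pin(install_path: Path, resolved_commit: str) -> None:
    """Drop the marker into a cached package root."""
    marker = {"schema_version": SCHEMA_VERSION, "resolved_commit": resolved_commit}
    (install_path / PIN_FILENAME).write_text(json.dumps(marker), encoding="utf-8")


def pin_matches(install_path: Path, locked_commit: str) -> bool:
    """Return True only if the marker exists and records the locked commit."""
    try:
        raw = (install_path / PIN_FILENAME).read_text(encoding="utf-8")
        marker = json.loads(raw)
    except (OSError, ValueError):  # missing file or invalid JSON
        return False
    return (
        isinstance(marker, dict)
        and marker.get("schema_version") == SCHEMA_VERSION
        and marker.get("resolved_commit") == locked_commit
    )
```

As the commit notes, this detects a stale cache but is not cryptographic integrity: anyone able to tamper with the cache can also rewrite the marker.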


… App DB (#1405) * feat(experimental): copilot-app target deploys scheduled prompts to App DB Dark-shipped under the new `copilot_app` experimental flag (off by default). When enabled, `apm install --target copilot-app --global` writes prompts that carry a `schedule:` frontmatter block as rows in the GitHub Copilot desktop App's SQLite store at `~/.copilot/data.db`. No new CLI surface; `install` / `update` / `uninstall` / `list` all flow through unchanged. Hard contracts: - `enabled = 0` on every insert -- user opts in from the App. - Namespaced ids (`apm--<owner>--<pkg>--<prompt>`) so uninstall never touches user-authored rows. - `PRAGMA user_version` guard (13 currently); refuse to write on unknown. - WAL-safe SQLite with retry on `database is locked`. - Update path preserves user state (`enabled`, `last_run_at`, overrides). - Lockfile URIs use `copilot-app-db://workflows/<id>` (cowork precedent). Tests: 53 new (DB module, schedule parser, target gating, install E2E). Full unit suite: 8787 passed (one pre-existing macOS shlex failure unrelated to this change). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(copilot-app): integration page, apm-usage skill, error UX + lockfile tests Wave 4 + Wave 6a of the copilot-app dark-ship: - docs/src/content/docs/integrations/copilot-app.md mirrors the copilot-cowork page: enable flag, lifecycle, DB resolution, 'auth' model, schema guard, concurrency, lockfile URI scheme, out-of-scope. - apm-usage skill: commands.md notes copilot-app under experimental; package-authoring.md documents the optional schedule: frontmatter block. - tests/unit/integration/test_copilot_app_error_ux.py (5 tests) exercises CopilotAppDbMissingError, CopilotAppDbSchemaError, CopilotAppDbLockedError mid-deploy: each surfaces as an actionable per-prompt diagnostic; one failing prompt does not block the next; resolver returning None mid-run is defensive (no crash). 
- tests/unit/install/test_services.py adds a round-trip test for copilot-app-db:// URI generation through _deployed_path_entry. Full unit suite: 8794 passed (1 pre-existing unrelated macOS skip). Lint contract green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(copilot-app): preserve URI scheme for user-scope local installs When 'apm install <local-pkg> --target copilot-app --global' was invoked, the lockfile stored 'workflows/apm--...' without the 'copilot-app-db://' scheme prefix. As a result, the subsequent uninstall could not detect the copilot-app entry and the DB row was orphaned in the Copilot App. Root cause: _deployed_path_entry tried 'target_path.relative_to(project_root)' first. For --global installs, project_root is the user home and the synthetic copilot-app root (~/.copilot/workflows) sits inside it, so the relative_to() succeeded and skipped the dynamic-root URI branch entirely. Fix: detect dynamic-root target match (cowork, copilot-app) before attempting the project_root-relative encoding. The cowork PathTraversalError behavior is preserved for the legacy out-of-tree case. Adds 'test_install_local_pkg_then_uninstall_deletes_db_row' end-to-end regression covering the install -> lockfile URI -> uninstall -> DB row deletion roundtrip. Also extends partition_managed_files dynamic-root branch with the 'prompts_copilot-app' bucket and adds a copilot-app scan in uninstall engine so user-scope DB-backed targets are cleaned even when the local apm.yml does not enumerate them. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(changelog): add experimental copilot-app target entry Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(copilot-app): drop autopilot, reset enabled on content change, ship docs Address apm-review-panel CEO synthesis for PR #1405: Security (supply-chain-security-expert blocking): - Remove 'autopilot' from _VALID_MODES (copilot_app_db.py) and _VALID_SCHEDULE_MODES (prompt_integrator.py). Earlier docstring claimed third-party autopilot was policy-blocked but no code enforced it -- this lands the actual enforcement at the writer. - deploy_workflow UPDATE branch now compares prompt body, mode, interval, schedule, model, and reasoning_effort against the existing row; when any execution-affecting field changes the user's prior opt-in is revoked (enabled = 0, next_run_at = NULL). Display-only changes (e.g. just the name) still preserve enabled, last_run_at, next_run_at. Closes the silent-malicious-update vector the panel flagged. Test coverage (test-coverage-expert): - Split the prior 'preserves enabled across updates' test into two scenarios that match the new semantics and add a third test covering schedule changes and a regression test that pins mode='autopilot' as rejected. Docs (doc-writer blocking): - Register copilot-app in the Starlight sidebar. - Add copilot-app row to experimental flag table and update the targets-matrix experimental note + auto-detection callout. - Strip false 'apm list' lifecycle row; replace the 'autopilot policy-blocked' paragraph with the secure-by-default rationale; expand the lifecycle table so the content-change reset is documented; fix two 'copilot_app flag' -> 'copilot-app flag' kebab-case drifts. CHANGELOG (devx-ux nit): - Replace 'apm config set experimental.copilot_app true' with the canonical 'apm experimental enable copilot-app'. Tests: 62/62 copilot-app suite green; 1970/1970 integration+install suite green; lint and format silent. 
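The content-change reset in the deploy_workflow UPDATE branch can be sketched as a pure merge over dictionaries. The field list follows the commit message above; `merge_update`, the dict-based rows, and the sample mode value are hypothetical stand-ins for the real SQLite row handling.

```python
# Fields whose change revokes the user's prior opt-in (per the commit message).
EXECUTION_FIELDS = ("prompt", "mode", "interval", "schedule", "model", "reasoning_effort")
# User-owned state that an update must never overwrite directly.
USER_STATE = ("enabled", "last_run_at", "next_run_at")


def merge_update(existing: dict, incoming: dict) -> dict:
    """Merge an updated workflow into an existing row.

    Display-only changes keep user state; any execution-affecting change
    resets enabled to 0 and clears next_run_at.
    """
    merged = dict(existing)
    merged.update({k: v for k, v in incoming.items() if k not in USER_STATE})
    if any(existing.get(f) != incoming.get(f) for f in EXECUTION_FIELDS):
        merged["enabled"] = 0
        merged["next_run_at"] = None
    return merged
```

A rename alone leaves `enabled` untouched, while a changed prompt body forces the user to opt in again from the App.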
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * iter-3: force enabled=0 at INSERT writer + truthful docs - copilot_app_db.deploy_workflow INSERT now hardcodes enabled=0 in the SQL (was: row.enabled passthrough). Defence in depth: a future caller cannot bootstrap an auto-running APM-deployed row even if the row dataclass carries enabled=1. The user opt-in path stays the same: enable from the App UI after install. - New test: test_insert_forces_enabled_zero_even_if_caller_passes_one. - Docs (copilot-app.md): lifecycle table row 3 now lists all 7 execution-affecting fields (prompt, schedule, mode, model, reasoning effort), matching deploy_workflow comparison semantics. - Docs (copilot-app.md): error wording for locked-DB paraphrased instead of quoting a string the code never emits. - Docs (package-authoring.md): YAML example drops the autopilot comment; rationale aligned with the integrations/copilot-app.md framing (intentionally not accepted via this target). Closes iter-2 panel feedback. No blocking findings from any of 8 panelists; this iteration converges the residual recommended items. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(copilot-app): allow project-scope install (lift --global gate) A team-shared scheduled prompt declared in a project's apm.yml now deploys to the developer's ~/.copilot/data.db on 'apm install', without requiring '--global' user-scope install. The previous gate forced every contributor to repeat the install at user scope to receive workflows the team had already declared in the manifest. Architectural change: - Add TargetProfile.scope_invariant_resolver (default False). - copilot-app sets scope_invariant_resolver=True because its deploy root (~/.copilot/data.db) is a user-machine resource that exists regardless of install intent. 
- TargetProfile.for_scope(user_scope=False) now runs user_root_resolver for scope-invariant targets, populating resolved_deploy_root so the lockfile enrichment can map the synthetic 'workflows/<id>' path to the copilot-app-db://workflows/<id> URI. - Cowork remains scope-sensitive (project-scope cowork still rejected). Security envelope: the experimental copilot_app flag remains the single opt-in gate. Removing the --global gate folds two consent layers (flag + user-scope) into one (flag), which matches v1's stated 'apm install just works' UX promise. The DB row is still INSERTed with enabled=0, the namespaced 'apm--<owner>--<pkg>--<prompt>' ID is preserved, and the lockfile URI keeps uninstall surgical. Tests: - 8801 unit tests pass (full sweep). - 64 copilot-app tests pass (was 63). - New test_install_project_scope_then_uninstall_deletes_db_row exercises the full roundtrip via project apm.yml + chdir; rewrites the prior test_project_scope_requires_global which asserted the inverse. - Manual verification in /tmp: install -> DB row appears with enabled=0 -> uninstall -> DB row gone. Docs: - integrations/copilot-app.md install incantation updated. - apm-usage skill commands.md + package-authoring.md mention both project and user scope. - CHANGELOG entry rewritten. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(copilot-app): address devx-ux follow-ups on gate-lift Two recommended findings from devx-ux-expert re-panel (Opus 4.7, agent_id devx-on-gate-lift): 1. Install output is silent about the 'enable in Copilot App' step. 
Added one-line trailing hint after the 'N prompts integrated -> copilot-app/workflows/' line, only when copilot-app actually wrote rows in this run: [+] /pkg (local) |-- 1 prompts integrated -> copilot-app/workflows/ |-- workflows arrive disabled; enable from the Copilot App's Workflows tab This closes the first-contributor failure mode that the gate-lift surfaces (someone runs plain 'apm install' on a project that declares copilot-app in targets, sees the integrated line, doesn't realise the row landed enabled=0 and needs a Copilot App toggle to fire). 2. targets-matrix.md docs row understated project-scope ride-along for the three never-auto-detected targets. Reworded to call out that a project apm.yml 'targets:' field lets contributors pick them up via plain 'apm install'. Plus the test-coverage nit: pinned verbatim install output shape in the new project-scope roundtrip test (asserts 'prompts integrated' AND 'enable from the Copilot App' appear). Verification: - 64 copilot-app tests pass - Full unit sweep 8800 pass (1 pre-existing flake on test_runtime_windows.py unrelated to gate-lift -- fails on fc40650 too because local 'codex' binary is installed) - Lint+format silent - Manual e2e: [+] /pkg (local) |-- 1 prompts integrated -> copilot-app/workflows/ |-- workflows arrive disabled; enable from the Copilot App's Workflows tab Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(copilot-app): narrow workflow-shape predicate to 3 keys Option B refinement: distinguish workflow-shape prompts from plain .prompt.md unambiguously. Only {interval, schedule_hour, schedule_day} mark a prompt as a Copilot App workflow row; `mode` and `reasoning_effort` are valid OPTIONAL fields on a workflow but cannot flip the shape because plain VSCode prompts use `mode: agent|ask|edit` legitimately. 
Without this narrow, any plain prompt that set `mode:` would silently land as a (broken) workflow when the user passed --target copilot-app, or a workflow row could be lossy when a writer set only `mode:`.

Live e2e verified:
- Single-target copilot: workflow-shape SKIPPED, plain ships to .github/prompts/ correctly.
- Single-target copilot-app: workflow row in ~/.copilot/data.db with enabled=0; plain prompt warns then skips.
- Multi-target copilot,copilot-app (comma-separated): both dispatch paths fire; no leak between them.
- Update preserves user-side enabled=1 across re-install.
- Lockfile records copilot-app-db:// URIs cleanly; apm audit clean.

Warning text narrowed to actually-mandatory keys so the hint is truthful and reproducible.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* address panel follow-ups: devx-ux, test-coverage, doc-writer

- install.py --target help: mention copilot-app + warn that repeated flag (-t a -t b) silently honors only the last value; use commas (devx-ux #1, #2)
- copilot-app.md: bump sidebar order 5 -> 6 (collision with github-rulesets.md), cross-link to reference/experimental/ and reference/targets-matrix/, rephrase WAL ownership to reflect that the App owns WAL and APM coexists via BEGIN IMMEDIATE + bounded retry, surface accepted schema range [13, 13], split lifecycle table cell with rationale below the table, add :::note callout clarifying the shape predicate, document source-deletion orphan case (doc-writer #1-5, devx-ux #4, #5)
- tests: add test_workflow_shape_skipped_by_copilot_prompt_integrator regression test asserting workflow-shape .prompt.md does NOT leak into .github/prompts/ when --target includes copilot (test-coverage #1)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Daniel Meppiel <copilot-rework@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
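
The narrowed shape predicate described in the commits above can be sketched in a few lines. This is an illustration only, not the code in the APM repository: the function name `is_workflow_shape`, the constant name, and the assumption that frontmatter arrives as an already-parsed dict are all hypothetical. Only the three key names and the rule that `mode` and `reasoning_effort` must not flip the shape come from the commit message. The sketch also assumes that any one of the three keys is enough, which matches the `interval: manual, mode: interactive` workflow prompt mentioned elsewhere in this log.

```python
# Hypothetical sketch of the 3-key workflow-shape predicate.
# Only the key names come from the commit message; the rest is assumed.
from typing import Any, Mapping

# Keys that mark a .prompt.md as a Copilot App workflow row.
WORKFLOW_SHAPE_KEYS = frozenset({"interval", "schedule_hour", "schedule_day"})


def is_workflow_shape(frontmatter: Mapping[str, Any]) -> bool:
    """Return True if any shape-defining key is present in the frontmatter.

    `mode` and `reasoning_effort` are deliberately ignored: they are valid
    optional workflow fields, but plain prompts also set `mode:`.
    """
    return any(key in frontmatter for key in WORKFLOW_SHAPE_KEYS)


plain_prompt = {"mode": "agent", "description": "Explain this file"}
workflow_prompt = {"interval": "manual", "mode": "interactive"}

print(is_workflow_shape(plain_prompt))     # False
print(is_workflow_shape(workflow_prompt))  # True
```

Keeping the predicate to a membership test against a fixed key set makes the warning text easy to keep truthful: the same set can be used to name the keys a plain prompt is missing.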

…-shepherd primitive (#1434)

* fix(copilot-app): warn instead of fail on newer App schema versions

The Copilot App ships fast and bumps PRAGMA user_version on additive changes that do not break APM's read/write surface. Hard-failing every install whose user_version exceeded the highest version APM was tested against made every Copilot App release a release window for APM -- awful UX for users who simply updated their App.

Change:
- Bump _MAX_SUPPORTED_USER_VERSION 13 -> 15 (newly observed, working).
- Above max: warn-and-continue once per process (deduped by version) via _rich_warning, with exact wording supplied by the devx-ux-expert persona: names the version delta and points the user at a ready-to-file issue title for breakage reports.
- Below min: continues to hard-fail -- the workflows table may genuinely not exist on a pre-workflows schema.

Wording verbatim per devx-ux verdict (ship_as_proposed). No cap at +N: warning already names the exact delta, giving signal proportional to risk. If a future version truly breaks reads, add it to a _KNOWN_BREAKING list and hard-fail on that specific version.

Tests cover v16/v17/v50 warn-not-raise, the <13 hard-fail path, and the per-process dedup contract for multi-row installs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(skills): add batch-bug-shepherd primitive + Copilot App workflow prompt

Ships two composed APM primitives:

1. .apm/skills/batch-bug-shepherd/ -- the working spec for driving a batch of suspected bugs in microsoft/apm from raw issue list to mergeable PR queue. Fans out one reproduction subagent per candidate (LEGIT / UNCLEAR / FIXED-AT-HEAD), cross-references against open PRs, then branches:
   * in-flight community PR -> shepherd via .apm/skills/apm-review-panel
   * no PR -> fix session with TDD and mutation-break gate
   Dispatches one completion subagent per shepherd verdict to resolve panel follow-ups and post a ready-to-merge confirmation.
Maintains a single plan.md ground-truth table as canonical session state.

2. .apm/prompts/batch-bug-shepherd.prompt.md -- a Copilot App workflow prompt (interval: manual, mode: interactive) that loads the skill above and accepts a 'targets' input (either an issue list or the literal 'sweep-all'). Lands in ~/.copilot/data.db as a workflow row with enabled=0 (consent gate); user must opt in via the Copilot App Workflows tab before it runs.

The skill composes with the existing .apm/skills/apm-review-panel/ via a relative sibling link -- the shepherd phase delegates panel review to that skill rather than reinventing it.

Design rationale lives in the genesis-plan.md authored during the genesis design pass; that artifact is intentionally not shipped (it is design scaffolding, not a user-consumed primitive).

apm.lock.yaml regenerated by 'apm install --target copilot,copilot-app' to include both the skill path and the synthetic copilot-app-db://workflows/apm--local--_local--batch-bug-shepherd URI.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(skills): make batch-bug-shepherd primitive harness-agnostic

Drop hard-coded .apm/skills/ path probes from the meta-prompt, the SKILL.md composition section, and the shepherd-prompt asset. Skills are activated by NAME by the harness; the meta-prompt and skill body no longer assume APM-on-disk layout. This lets the same primitive load identically inside Copilot CLI, Copilot App workflows, Claude Code, Codex, or any other harness that resolves skills by name.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(lockfile): regenerate after harness-agnostic primitive

Captures all 15 local skills (including batch-bug-shepherd) and the copilot-app-db workflow entry into apm.lock.yaml so subsequent runs (apm install --frozen, drift-check) see the canonical post-install state.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor(packages): extract apm-review-panel as its own package

Move the apm-review-panel skill out of the microsoft/apm monorepo's shared .apm/skills/ tree into a dedicated, publishable APM package under packages/apm-review-panel/ using the HYBRID layout (apm.yml + SKILL.md at the root, with co-located assets/ and evals/).

The root apm.yml now declares the package via a local-path manifest dep, so 'apm install' at root continues to deploy the skill into .agents/skills/apm-review-panel/ -- but the dependency is now explicit and inspectable via 'apm deps list' instead of relying on the includes-auto walk of the monorepo .apm/ tree.

Rule-of-three justification (see genesis-plan-v2.md step 3.5): panel is independently useful for any single-PR review (not just batch shepherding), so extraction unlocks consumer reuse and prevents the consumer from being forced to install the shepherd to get the panel.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor(packages): repackage batch-bug-shepherd as a standalone APM package

Move batch-bug-shepherd from the monorepo .apm/skills/ and .apm/prompts/ trees into a dedicated, publishable APM package under packages/batch-bug-shepherd/ using the multi-primitive .apm/ layout so the package can ship BOTH the skill and the workflow prompt:

    packages/batch-bug-shepherd/
      apm.yml
      .apm/
        skills/batch-bug-shepherd/
          SKILL.md
          assets/...
        prompts/
          batch-bug-shepherd.prompt.md

The package declares its dep on apm-review-panel via a local-path manifest entry (../apm-review-panel, anchored to the package dir per APM's local-path rule). The shepherd-prompt asset still probes the panel by NAME at use-site as the A9 SUPERVISED EXECUTION backstop if the harness registry is bypassed -- belt + suspenders, see genesis-plan-v2.md step 3.5 'Declaration mechanism per external module'.
Root apm.yml now depends on packages/batch-bug-shepherd via local path; 'apm install' continues to deploy the skill to .agents/skills/batch-bug-shepherd/ and the workflow row to the App SQLite DB (with enabled=0 per the consent gate) -- but the path is now manifest-declared and the package is independently shippable:

    apm install microsoft/apm/packages/batch-bug-shepherd

works in any consumer repo and transitively pulls apm-review-panel.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(skills): collapse hidden coupling in batch-bug-shepherd prompt

Closes HIGH finding #1 from genesis-critique.md: the workflow prompt's 'Hard rules' block restated four disciplines (ASCII-only, lint contract, mutation-break gate, single-writer per comment) that already live in the SKILL.md body. Two sources of truth invited drift -- if the skill body added a new gate, the prompt would not auto-update.

Collapse the section to a single 'Delegation' paragraph naming the skill as the authoritative source for all disciplines. The prompt now carries only the per-trigger-surface contract (workflow frontmatter + the input parameter + the 6-step ACTIVATE / SCOPE / PLAN / INITIALIZE / EXECUTE / RENDER procedure that summons the skill); every discipline travels with the skill body.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(skills): add content + trigger evals for batch-bug-shepherd

Closes HIGH finding #2 from genesis-critique.md: the v1 ship deferred the evals gate.
This commit authors the minimum set the gate requires, mirroring the pr-description-skill/evals/ shape so a single CI lane can score both:

    evals/
      evals.json      - manifest + gates (val split = ship gate)
      triggers.json   - 10 fire + 10 no_fire, 60/40 train/val
      content/        - 2 scenario manifests + rubrics
        three-issues-mixed.json
        sweep-bug-queue.json
      fixtures/       - 4 markdown fixtures (with/without skill x 2)
      README.md
      .gitignore      - results/ (timestamped runner output)
    scripts/
      run_evals.py    - stdlib-only, deterministic matcher

The no-fire trigger set deliberately includes queries that SHOULD route to apm-review-panel ('review my PR', 'panel-review this PR') so DISPATCH COLLISION between the two skills would surface as val-no-fire-rate dropping below 0.5.

Val split result on first run: trigger fire rate 1.0, no-fire rate 1.0, content delta_anchors 8 and 7 (gates require >= 1). Overall: passed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(skills): tighten batch-bug-shepherd dispatch description

Closes HIGH finding #3 from genesis-critique.md: the original description was 950/1024 chars, too close to the runtime cap to absorb future trigger additions, and it under-named the indirect maintainer phrasings the skill actually wants to catch.
Trim ~75 chars while ADDING two indirect triggers explicitly exercised in evals/triggers.json (fire-07 'shepherd all bug-flagged issues this quarter', fire-08 'weekly sweep of community-reported issues'):
- 'reproduction subagent per candidate issue' -> 'triage subagent per issue' (the procedural detail belongs in the body, not the dispatch surface)
- collapsed the parenthetical workflow branch into a single -> arrow
- added 'shepherd all bug-flagged issues this quarter', 'run a weekly sweep of community-reported issues', 'work down community bug contributions' to the Activate list
- kept the imperative shape, intent-first ordering, and the '-- even if shepherd or batch is not named' tail

Verified: val split evals still report fire 1.0 / no-fire 1.0.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(install): regenerate .agents/skills/ + apm.lock.yaml after package extraction

Mechanical `apm install --target copilot,copilot-app --force` output after the package extraction. The previous lockfile listed the two skills under `local_deployed_files` (root .apm/ tree); they now appear as proper `dependencies` entries because they ship as self-contained packages under `packages/`.

Also re-materializes `.agents/skills/{apm-review-panel,batch-bug-shepherd}/` with the trimmed dispatch description and the moved workflow asset, since those compiled trees are tracked in this repo. No source changes -- just the compiled-output snapshot.
Verified:
- ruff check + ruff format --check: silent
- apm install --target copilot,copilot-app --force: clean
- sqlite3 ~/.copilot/data.db -> workflow enabled=0 (consent gate ok)
- standalone apm install from a scratch dir resolves apm-review-panel transitively via the batch-bug-shepherd manifest dep (depth: 2, resolved_by: _local/batch-bug-shepherd)
- val split evals: trigger fire 1.0, no-fire 1.0, content passed

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(policy): exclude transitive deps from no-orphaned-packages check

A sub-package's local-path dep appears in the root lockfile with resolved_by set. The root manifest cannot make it go away by editing its dependencies.apm list, so flagging it as an orphan creates an unfixable CI failure.

Surfaced by PR #1434, where batch-bug-shepherd declares ../apm-review-panel as a manifest dep. Both that transitive entry and the depth=1 root entry land in apm.lock.yaml; the orphan check was flagging the depth=2 one.

Restrict orphan detection to direct deps (resolved_by is None). Add a regression test that covers the exact shape.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(packages): drop transitive manifest dep to unblock APM <=0.14.1 CI

The previous commit fixes _check_no_orphans on the consumer side, but the audit gate in microsoft/apm-action@v1 pins APM 0.14.0, which does not yet have the fix. Until a release including the fix ships, the '../apm-review-panel' transitive local-path entry would keep the gate failing.

Drop the manifest dep from packages/batch-bug-shepherd/apm.yml; the runtime activate-by-name probe in assets/shepherd-prompt.md remains the working backstop, and the root manifest declares both packages directly. Inline comment documents the rationale and the restoration condition.

Lockfile regenerated by clean install (apm_modules wiped first); all 8 audit checks pass locally.
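
The orphan rule from the fix(policy) commit above (only entries whose resolved_by is None count as direct deps) can be illustrated with a small sketch. The data shapes and the names `LockedDep` and `find_orphans` are assumptions made for illustration and are not APM's actual `_check_no_orphans` implementation; only the `resolved_by` field, the direct-versus-transitive rule, and the two real package names come from the commit messages.

```python
# Hypothetical sketch: restrict orphan detection to direct dependencies.
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class LockedDep:
    """Simplified, assumed view of one lockfile dependency entry."""
    name: str
    resolved_by: Optional[str] = None  # None means a direct (root-declared) dep


def find_orphans(locked: Iterable[LockedDep], manifest_deps: set[str]) -> list[str]:
    """Return names of direct lockfile entries missing from the root manifest.

    Transitive entries (resolved_by set) are skipped, because the root
    manifest cannot remove them by editing its own dependency list.
    """
    return [
        dep.name
        for dep in locked
        if dep.resolved_by is None and dep.name not in manifest_deps
    ]


lockfile = [
    LockedDep("_local/batch-bug-shepherd"),
    # Transitive entry pulled in by the shepherd package: never an orphan.
    LockedDep("_local/apm-review-panel", resolved_by="_local/batch-bug-shepherd"),
    # Made-up stale direct entry, included only to show a positive hit.
    LockedDep("_local/stale-package"),
]
manifest = {"_local/batch-bug-shepherd"}

print(find_orphans(lockfile, manifest))  # ['_local/stale-package']
```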
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(copilot-app): update schema-too-new test to expect warn-not-raise

Test was added on main after this PR forked. It assumed the old hard-fail behavior; our commit 05ea778 changed schema-too-new to warn-and-continue. Update assertion to match.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: danielmeppiel <danielmeppiel@users.noreply.github.com>

danielmeppiel added 9 commits September 24, 2025 14:19

register Copilot client in ClientFactory

1d09680

Update Copilot CLI commands to use --allow-all-tools instead of --full-auto

e06fd91

Bump version to 0.4.2 and update changelog for Copilot CLI support

0a6502a

Refactor tests for CopilotRuntime to ensure availability checks before instantiation and enhance runtime info retrieval with mocked subprocess output.

dcab2d2

danielmeppiel merged commit 55839cb into main Sep 25, 2025
15 checks passed

danielmeppiel deleted the integrate-copilot-runtime branch February 27, 2026 09:42

danielmeppiel mentioned this pull request Mar 4, 2026

feat: Generic git URL support (GitLab, Bitbucket, any host) #150

Merged

7 tasks

This was referenced Mar 5, 2026

refactor: unified deployed_files manifest for safe integration lifecycle #163

Merged

fix: clean stale MCP servers on install/update/uninstall and prevent .claude folder creation #201

Merged

Copilot AI mentioned this pull request Mar 14, 2026

fix: CLI consistency improvements (target aliases, deps clean flags, verbose shorthand, help text) #303

Merged

7 tasks

danielmeppiel mentioned this pull request Mar 31, 2026

feat: Marketplace integration -- read marketplace.json for plugin discovery + governance #503

Merged

7 tasks

danielmeppiel mentioned this pull request Apr 3, 2026

Architecture: Unify integration dispatch, result types, and hook dedup #561

Closed

sergio-sisternes-epam mentioned this pull request Apr 9, 2026

fix(init): remove triple confirm prompt on Windows CP950 terminals (#602) #647

Merged

This was referenced Apr 13, 2026

Marketplace add source parity with the Anthropic spec #676

Open

[FEATURE] DRAFT NuGet v3 feed as marketplace backend Vicente-Pastor/apm#4

Open

This was referenced Apr 15, 2026

apm marketplace add fails to authenticate when fetching marketplace.json from private repositories #669

Closed

fix: marketplace add authenticates for private repos #688

Closed

This was referenced Apr 21, 2026

[FOLLOW-UP #788] Normalise default-scheme ports (443/80/22) on DependencyReference / HostInfo #797

Closed

fix(install): warn when --allow-protocol-fallback reuses a custom port across schemes (#786) #789

Merged

danielmeppiel mentioned this pull request Apr 22, 2026

feat(policy): enforce apm-policy.yml at install time #832

Merged

9 tasks

tanbro mentioned this pull request Apr 24, 2026

Multi-host dependency support: lockfile identity, token resolution consistency, error clarity #773

Open

github-actions Bot mentioned this pull request Apr 30, 2026

fix(install): align validation auth chain with install #941

Merged

6 tasks

danielmeppiel mentioned this pull request May 10, 2026

refactor(tests): retire script enumeration; pytest discovers tests/integration/ #1247

Merged

7 tasks

danielmeppiel mentioned this pull request May 19, 2026

[BUG] APM self-update fails on Windows due to security policies #1389

Closed

sergio-sisternes-epam pushed a commit that referenced this pull request May 19, 2026

Merge pull request #2 from danielmeppiel/integrate-copilot-runtime

f73dfbb

Integrate copilot runtime

danielmeppiel mentioned this pull request May 19, 2026

feat(experimental): copilot-app target deploys scheduled prompts to App DB #1405

Merged

2 tasks

github-actions Bot mentioned this pull request May 21, 2026

fix(hooks): stabilize root .apm hook source-ids across renames/worktrees (supersedes #1330, closes #1329) #1392

Merged

danielmeppiel merged 9 commits into main from integrate-copilot-runtime
