Skip to content

perf(#1433): sparse-cone consumer materialization for subdir deps#1436

Merged
danielmeppiel merged 8 commits into
mainfrom
danielmeppiel/perf-sparse-cone-materialize-1433
May 21, 2026
Merged

perf(#1433): sparse-cone consumer materialization for subdir deps#1436
danielmeppiel merged 8 commits into
mainfrom
danielmeppiel/perf-sparse-cone-materialize-1433

Conversation

@danielmeppiel
Copy link
Copy Markdown
Collaborator

@danielmeppiel danielmeppiel commented May 21, 2026

TL;DR

Cold install of apm install against a subdir dep from a large monorepo (github.com/dotnet/skills, 1.7 GB upstream) now completes in ~3.8s instead of the original 172-284s observed on main -- a ~45-75x cold-install speedup, validated end-to-end against the live repo by the harness in scripts/perf/measure_dotnet_skills.py.

Closes #1433.

What changed (architecture)

The fix is layered at the persistent cache, so every command that resolves git deps (apm install, apm update, apm run, and any future command) benefits automatically. No command-specific code paths.

Three composable techniques, all at the cache layer

  1. Sparse-cone consumer materialization (GitCache._create_checkout, bare_cache.materialize_from_bare). When sparse_paths is set, only the requested top-level directories are written to the consumer's working tree. Shard layout: checkouts_v1/<shard>/<sha>/<variant>/ where <variant> is full or sparse-<hash16>. Full and sparse variants of the same SHA coexist.

  2. Partial bare clone (--filter=blob:none) + promisor consumer (GitCache._ensure_bare_repo, _create_checkout). The bare cache for sparse callers lives at db_v1/<shard>__p/ and is cloned with --filter=blob:none, so it carries commits + trees only (~3% of the upstream size on dotnet/skills). The consumer then has its remote.origin configured as a promisor pointing at the real upstream URL with partialclonefilter=blob:none. When git checkout runs against the sparse cone, only the blobs reachable from the cone are lazy-fetched. Falls back transparently to a full bare clone if the server rejects filter v2.

  3. Tiered ref resolution threaded into the cache (github_downloader.download_subdirectory_package). Subdir deps now resolve the SHA via the attached TieredRefResolver (L0 in-memory → L1 forge API → L2 bare rev-parse → L3 ls-remote) BEFORE calling GitCache.get_checkout, and pass locked_sha=<resolved>. The cache's internal ls-remote is skipped on every call where a tier hit succeeds. Saves 1 network round-trip per subdir dep per call.

Validation (live, against github.com/dotnet/skills)

Captured by scripts/perf/measure_dotnet_skills.py -- self-contained harness that wipes ALL APM caches, installs once cold, retains caches, installs again warm, reports a comparison table.

Metric Baseline (origin/main) After sparse-cone alone After partial-clone + tiered SHA (this PR) Total delta vs main
Cold wall time 172.81-283.99s 62.59s 3.74-3.83s ~45-75x faster
Warm wall time 4.24s 2.06s 1.12-1.43s ~3-4x faster
Bare cache on disk ~1.7 GB ~1.7 GB 4.6 MB ~370x smaller
Checkouts shard 12.9 MB 0.44 MB 0.47 MB ~27x smaller
Delivered apm_modules/.../<dep>/ 42.5 KB 42.5 KB 42.5 KB Byte-identical

The cold/warm wall-time numbers carry network-jitter noise (sample size 2-3 per phase). The deterministic byte counts are the headline -- bare ~370x smaller, checkouts shard ~27x smaller, consumer payload byte-identical. The cold wall-time win is real and reproducible across runs because the new floor is local-FS + small network rather than 1.7 GB-wire-bound.

Verbose-mode log output (cold run)

[>] Resolving skills-analyzing-dotnet-performance...
    [perf] subdir github.com/dotnet/skills/plugins/dotnet-diag/skills/analyzing-dotnet-performance#main cache=persistent
sparse=plugins/dotnet-diag/skills/analyzing-dotnet-performance
    [perf] materialize sparse=yes size=0.47MB
    [perf] ref-resolver tiers: commits_api=1

All [perf] lines are verbose-only and routed through InstallLogger; default install output is unchanged.

Why this was slow before

The pre-fix cold-install path for a subdir dep:

  1. Forge API resolves main -> SHA (~0.5s, fine).
  2. GitCache._ensure_bare_repo does git clone --bare <url> -- 1.7 GB transferred over the wire even though the consumer wanted a 42 KB subdirectory. Dominant cost: 50-260s depending on network.
  3. GitCache._create_checkout does git clone --local --shared <bare> <consumer> and git checkout <sha> -- materializes the full 78 MB working tree for a consumer that needs 42 KB.
  4. download_subdirectory_package copytrees only the requested subdir out of the 78 MB tree into apm_modules/.

Two pathologies:

  • The cache downloaded 40,000x more bytes than the consumer needs (1.7 GB wire vs 42 KB delivered).
  • The cache then wrote 1,800x more bytes to local disk than needed (78 MB shard vs 42 KB delivered).

How the techniques close each gap (mapped to the cost model)

A package manager's cold wall-time decomposes into four phases. Each technique attacks the dominant phase for this workload.

Phase Pre-fix cost (dotnet/skills) Bottleneck Technique applied Post-fix cost
Resolve (ref -> SHA) ~0.5-2s, 1-2 RTT Cache did its own ls-remote ignoring the already-resolved tiered SHA Tiered SHA threading (technique 3) ~0.5s, 1 RTT (forge API)
Fetch (bytes from network) ~50-260s, ~1.7 GB Full bare clone with no filter Partial bare clone + promisor (technique 2) ~3s, ~5 MB
Materialize (write to disk) ~1-3s, ~78 MB written Full working-tree checkout Sparse-cone variant cache (technique 1) <1s, ~0.5 MB written
Verify (integrity) <0.1s Hash check on warm hit; unchanged n/a <0.1s

The fetch phase was >80% of cold wall-time. Cutting it from 1.7 GB to ~5 MB is the headline win and explains the ~45-75x ratio.

Architectural consistency / pervasiveness

classDiagram
    class apm_install
    class apm_update
    class apm_run_resolve

    class InstallPipeline {
        +run()
    }
    class ResolvePhase {
        +run()
    }
    class GitHubPackageDownloader {
        +download_subdirectory_package()
        +download_full_repo_package()
        -resolve_git_reference()
    }
    class TieredRefResolver {
        +resolve()
        -L0 PerRunRefCache
        -L1 commits API
        -L2 bare rev-parse
        -L3 ls-remote
    }
    class GitCache {
        +get_checkout(url, ref, locked_sha, sparse_paths)
        -_ensure_bare_repo(partial)
        -_create_checkout(sparse_paths, promisor_url)
        -_variant_key()
    }
    class BareCache {
        +materialize_from_bare(sparse_paths)
    }

    apm_install --> InstallPipeline
    apm_update --> InstallPipeline
    apm_run_resolve --> InstallPipeline
    InstallPipeline --> ResolvePhase
    ResolvePhase --> GitHubPackageDownloader : attaches install_logger + tiered_resolver
    GitHubPackageDownloader --> TieredRefResolver : resolve_git_reference()
    GitHubPackageDownloader --> GitCache : get_checkout(locked_sha, sparse_paths)
    GitHubPackageDownloader --> BareCache : materialize_from_bare(sparse_paths) [shared-bare path]
    GitCache --> BareCache : independent persistent cache
Loading

The three techniques land at GitCache (techniques 1 + 2) and TieredRefResolver (technique 3 -- usage threading). Every command that resolves a dep -- install, update, run (when it triggers resolve), and any future command -- shares this exact code path. There is NO command-specific perf logic. Attestation: I verified this by:

  • install calls InstallPipeline.run() -> ResolvePhase -> GitHubPackageDownloader.download_subdirectory_package() -> GitCache.get_checkout(sparse_paths=[subdir]).
  • update reuses the same InstallPipeline (see src/apm_cli/cli.py update command -> install_workflow).
  • run triggers resolve through the same pipeline when deps are stale.

All three commands benefit from the partial bare + tiered SHA + sparse cone without any per-command code change. The cache is shared across projects and across users on the same machine (per-user ~/Library/Caches/apm/git/ or ~/.cache/apm/git/).

Implementation shape: A (cache-layer hook)

I evaluated three shapes:

  • A. Add optional kwargs to GitCache.get_checkout and materialize_from_bare. Smallest diff, lowest blast radius. Picked.
  • B. New sibling functions for the sparse/partial path. Doubles the API surface.
  • C. Helper extraction. Pulls sparse setup into both paths but obscures the invariant that the cache layer owns the on-disk shape.

A wins because: (1) every existing caller still works with sparse_paths=None; (2) the cache key variant taxonomy honestly encodes the new dimension (sparse_paths and partial-bare flavor are both first-class shard keys, not hidden state); (3) the bare-shared invariant is preserved per flavor -- sparse consumers share the __p bare; full consumers share the legacy bare; bare URL is never keyed by (url, subdir).

Trade-offs

  • Partial clone fallback. When the server rejects --filter=blob:none (old Gerrit, pre-2.20 GHE), _ensure_bare_repo retries once without the filter. The resulting bare is full; future sparse callers find all blobs locally and skip lazy fetch -- behavior degrades to today's baseline, no behavior change for the user. Logged at WARNING with a sanitized URL for operator visibility.
  • Two bare flavors per URL. Full and partial bares for the same upstream don't share storage -- there are two bare dirs on disk. This is intentional: a full caller cannot rely on a partial bare (lazy fetch may have skipped blobs they need), and a sparse caller cannot rely on a full bare cheaply (no win from filter that already shipped). Cost: roughly 2x bare storage for the worst case of one repo used by both full and sparse callers. For the actual APM workload (overwhelmingly subdir deps) the partial flavor dominates and the full flavor is rarely populated.
  • Promisor lazy-fetch at checkout time. A sparse checkout against a partial bare triggers an on-demand blob fetch from upstream during git checkout. On a slow network this adds 0.5-2s. The harness confirms total cold wall-time is still 3.8s on a typical link, so the lazy fetch is bounded by the size of the sparse cone, not the full repo.
  • No archive fast-path (codeload tarball). Filed as follow-up. With the partial-clone result at 3.8s, the marginal value of switching to tarball-extract is small (~1-2s more savings, breaks correctness invariants around git object verification).
  • No hardlinks for apm_modules delivery. Today: copytree from the cached checkout into apm_modules/. At 42 KB this is ~5ms; not worth the cross-volume / Windows compatibility cost in this PR. Filed as follow-up.

What stayed off-limits

  • No wire-protocol change beyond --filter=blob:none on bare clone (clean, vanilla git clone --bare semantics).
  • No auth change. _scrub_bare_remote_url is untouched. The partial-bare path in git_cache.py uses a canonical no-token URL throughout (the call site at github_downloader.py:1122 builds https://<host>/<owner>/<repo> without credentials; auth lives in env). Bare retains a clean URL for promisor use.
  • No lockfile schema change. apm.lock.yaml writers and readers are unchanged.
  • _try_sparse_checkout (dead-code reference function in github_downloader.py) kept as-is per issue [PERF] materialize_from_bare does full checkout for subdir deps -- enable sparse-cone (78 MB -> <1 MB, ~10x download) #1433 spec.

How to test

# Wipes ~/Library/Caches/apm/git (or ~/.cache/apm/git on Linux), runs cold + warm
uv run python scripts/perf/measure_dotnet_skills.py

# Unit tests
uv run --extra dev pytest tests/unit/cache/test_git_cache_sparse.py \
                          tests/unit/deps/test_bare_cache_sparse.py -v

# Full lint chain (mirrors CI)
uv run --extra dev ruff check src/ tests/ \
  && uv run --extra dev ruff format --check src/ tests/ \
  && uv run --extra dev python -m pylint --disable=all --enable=R0801 \
       --min-similarity-lines=10 --fail-on=R0801 src/apm_cli/ \
  && bash scripts/lint-auth-signals.sh

Follow-ups (separate issues)

  • Codeload tarball fast-path for SHA-pinned subdir deps (marginal additional win now that partial clone is in).
  • Hardlinks/reflinks for apm_modules delivery.
  • Eviction policy for the legacy non-partial bare flavor for repos that only have sparse callers in practice.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

danielmeppiel and others added 2 commits May 21, 2026 17:20
Add sparse-cone to both consumer-materialization paths:
- bare_cache.materialize_from_bare() gains sparse_paths kwarg
- GitCache.get_checkout/_create_checkout gain sparse_paths kwarg;
  on-disk layout becomes checkouts_v1/<shard>/<sha>/<variant>/ where
  variant is 'full' or 'sparse-<hash16>'.

Wire subdir_path into both call sites in download_subdirectory_package
so subdir deps only materialize the requested top-level dir(s) in the
consumer working tree.

Add verbose-only InstallLogger methods for perf diagnostics:
subdir_download_start, bare_clone_strategy, materialize_result,
tier_summary. Emit at appropriate flow points.

Add perf harness at scripts/perf/measure_dotnet_skills.py.

Add unit tests for variant-key derivation, sparse layout, full/sparse
coexistence, and per-call sparse paths.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Use contextlib.suppress in resolve.py for tier_summary fallback (ruff SIM105).
- Format new test files (ruff format pass).
- Drop unused json import in test_git_cache_sparse.py; drop unused pytest
  import in test_bare_cache_sparse.py.
- Harness _wipe_cache_root: re-grant 0o700 perms before rmtree so stale
  read-only entries from earlier APM runs don't fail the wipe.
- Update test_git_cache_branch_coverage + test_git_cache_phase3w4 to
  pre-create checkout_dir at <sha>/full/ for the new layout.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 21, 2026 16:18
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves apm install performance for subdirectory dependencies from large monorepos by adding sparse-cone checkout support to both consumer materialization paths (persistent GitCache and bare_cache.materialize_from_bare). It also introduces verbose-only perf diagnostics, updates cache checkout layout to include per-variant subdirectories, and adds unit tests plus a perf measurement harness.

Changes:

  • Add sparse_paths plumbing and sparse-checkout setup to both GitCache checkout creation and materialize_from_bare.
  • Change on-disk checkout layout to checkouts_v1/<shard>/<sha>/<variant>/ to allow full and sparse variants to coexist.
  • Add verbose perf logging hooks and new unit tests covering sparse behavior and variant-key determinism.
Show a summary per file
File Description
tests/unit/test_git_cache_phase3w4.py Updates expected checkout path to include the full/ variant directory.
tests/unit/test_git_cache_branch_coverage.py Same checkout layout expectation update for branch coverage tests.
tests/unit/deps/test_bare_cache_sparse.py New tests validating sparse-cone behavior in materialize_from_bare.
tests/unit/cache/test_proxy_compat.py Updates fixture checkout path to include full/.
tests/unit/cache/test_git_cache.py Updates multiple tests for new .../<sha>/full/ layout.
tests/unit/cache/test_git_cache_sparse.py New tests for sparse variants, layout, and _variant_key determinism.
src/apm_cli/install/phases/resolve.py Wires an install logger into the downloader and emits resolver tier stats in verbose mode.
src/apm_cli/deps/github_downloader.py Threads sparse_paths through persistent/shared-bare paths and adds verbose perf metrics helpers.
src/apm_cli/deps/bare_cache.py Adds sparse_paths support to consumer materialization (sparse-checkout before checkout).
src/apm_cli/core/command_logger.py Adds verbose-only perf logging methods for subdir downloads and resolver tier summaries.
src/apm_cli/cache/git_cache.py Implements variant-keyed checkout directories and sparse-cone materialization support.
scripts/perf/measure_dotnet_skills.py Adds a perf harness script to measure cold/warm install improvements and cache footprint.

Copilot's findings

  • Files reviewed: 12/12 changed files
  • Comments generated: 6

Comment thread src/apm_cli/core/command_logger.py Outdated
dep_name: str,
cache_state: str,
sha_short: str = "",
sparse_paths: list | None = None,
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 324e08c: tightened to list[str] | None.

Comment thread src/apm_cli/core/command_logger.py Outdated
Comment on lines +393 to +396
Names the dep, the bare-cache state (``cold`` / ``warm`` /
``persistent``), the resolved SHA (short), and the sparse paths
being requested. Surfaces enough state to diagnose a perf
regression from one log line.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 324e08c: expanded docstring to include shared-bare (chose to widen the contract rather than narrow the call sites since the new vocab is informative).

Comment thread src/apm_cli/core/command_logger.py Outdated
Comment on lines +440 to +455
def tier_summary(self, stats: dict):
"""Log the tiered ref resolver hit counts (verbose only).

Emitted at the end of the resolve phase so the reader can see
how many ref->SHA lookups hit each tier (L0 per-run cache,
L1 commits API, L2 bare rev-parse, L3 legacy clone) without
wiring a debugger. A run dominated by L3 is the canonical
signal that ref-resolution is paying full clone cost.
"""
if not self.verbose or not stats:
return
non_zero = {k: v for k, v in stats.items() if v}
if not non_zero:
return
parts = " ".join(f"{k}={v}" for k, v in non_zero.items())
_rich_echo(f" [perf] ref-resolver tiers: {parts}", color="dim")
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 324e08c: tightened to dict[str, int].

ref = dep_ref.reference # None if not specified
subdir_path = dep_ref.virtual_path
_perf_logger = getattr(self, "install_logger", None)
_dep_display = str(dep_ref)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rejected with reasoning. DependencyReference does have a custom __str__ (see src/apm_cli/models/dependency/reference.py:1593) that yields the canonical host/owner/repo/path#ref form. The actual verbose log output captured in the PR body shows this is what users see: [perf] subdir github.com/dotnet/skills/plugins/dotnet-diag/skills/analyzing-dotnet-performance#main cache=persistent. No raw dataclass repr leaks into the log. Leaving as is.

Comment thread src/apm_cli/cache/git_cache.py Outdated
Comment on lines +54 to +63
the first 16 hex chars of sha256(json.dumps(sorted(paths))).
Two consumers requesting the same set of paths share a shard;
different sets get separate shards. We do NOT promote a full
checkout to also satisfy a sparse subset -- that complicates
eviction for negligible benefit (each sparse shard is ~subdir
size, so duplication cost is small).
"""
if not sparse_paths:
return "full"
payload = json.dumps(sorted(sparse_paths), separators=(",", ":"))
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 324e08c: now hashes sorted(set(sparse_paths)). Added an inline comment about the set semantics.

Comment on lines +96 to +114
def _dir_size_bytes(path: Path) -> int:
"""Return the total on-disk size of a directory tree in bytes.

Best-effort: silently skips files that disappear or cannot be
stat-ed mid-walk (e.g. transient .git lock files). Used only for
verbose-mode perf diagnostics (#1433) -- never gates behavior.
"""
total = 0
try:
for dirpath, _dirnames, filenames in os.walk(str(path)):
for fname in filenames:
fpath = os.path.join(dirpath, fname)
try:
total += os.path.getsize(fpath)
except OSError:
continue
except OSError:
return 0
return total
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 324e08c: switched to os.lstat(fpath).st_size and noted in the docstring the alignment with cache.git_cache._dir_size.

danielmeppiel and others added 3 commits May 21, 2026 18:49
Wire bare cache as a promisor (--filter=blob:none) for sparse callers,
and resolve subdir-dep refs through the tiered resolver before calling
GitCache.get_checkout (skips redundant ls-remote).

Combined with the existing sparse-cone consumer materialization:

| metric                 | main           | this commit     | delta       |
| ---------------------- | -------------- | --------------- | ----------- |
| cold install wall time | 172-284s       | 3.74-3.83s      | ~45-75x     |
| bare cache on disk     | ~1.7 GB        | 4.6 MB          | ~370x       |
| checkouts shard        | 12.9 MB        | 0.47 MB         | ~27x        |
| apm_modules delivered  | 42.5 KB        | 42.5 KB         | identical   |

Cache layer (GitCache, BareCache) is the single hook point, so apm
install, update, and run all benefit transparently. New performance-expert
APM persona under .apm/agents/ encodes the package-manager perf
playbook for future review activations.

Refs #1433.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- command_logger.subdir_download_start: tighten sparse_paths type to
  list[str] | None, expand docstring cache_state vocabulary to include
  shared-bare (matches call sites).
- command_logger.tier_summary: tighten stats type to dict[str, int]
  to match TieredRefResolver.stats.
- git_cache._variant_key: dedupe sparse_paths with set() before sort
  so [a,a] and [a] collapse to the same shard (the documented 'set of
  paths' semantics).
- github_downloader._dir_size_bytes: use os.lstat instead of
  os.path.getsize so symlinks do not pull bytes from outside the
  checkout and match cache.git_cache._dir_size convention.

Rejected with reasoning: the comment that DependencyReference is a
dataclass without __str__ is incorrect -- it has a custom __str__ at
models/dependency/reference.py:1593 that yields the canonical
'host/owner/repo/path#ref' form already visible in the verbose log
output captured in the PR body.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds the performance-expert persona to the advisory review panel as a
third conditional activation alongside auth-expert and doc-writer.

Activation: PR touches cache/, deps/, install/{phases,pipeline,resolve},
scripts/perf/, perf-instrumentation logs, or PR description claims a
performance win (speedup, latency, bytes-on-disk). Fallback self-check
covers hot-path changes the file-globs miss.

Roster invariant honored: frontmatter description, roster table,
conditional rules section, execution checklist fan-out list, and
panelist-return-schema enum all agree on the new persona slug.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

APM Review Panel: ship_with_followups

Sparse-cone consumer materialization delivers ~370x smaller bare cache and ~45-75x cold install speedup for subdir deps, with byte-identical consumer payload.

cc @danielmeppiel @sergio-sisternes-epam -- a fresh advisory pass is ready for your review.

Nine panelists converge cleanly: the PR is architecturally sound, security-neutral, and delivers a headline-grade performance improvement that targets a real churn signal (cold installs from large monorepos, per #1433). The three composable techniques -- sparse-cone variant cache, partial bare clone with promisor materialization, and tiered SHA resolution -- are layered at GitCache so install/update/run all benefit without call-site changes. Unit-tier coverage is strong (18/18 pass with real git fixtures), and the deterministic byte-count claims (370x bare reduction, byte-identical 42.5 KB payload) are unchallenged by any panelist.

The panel's recommended follow-ups cluster into three themes: (1) documentation/communication -- both oss-growth-hacker and doc-writer independently flag the missing CHANGELOG entry, making this the highest-convergence follow-up; (2) observability -- cli-logging-expert, devx-ux-expert, and performance-expert all note the partial-clone fallback warning is invisible at default verbosity and the [perf] line doesn't fire on the persistent-cache cold path; (3) wire-cost polish -- performance-expert identifies --no-tags as a zero-risk one-liner that avoids fetching tag refs on large historical repos. None of these block the correctness or safety of what ships today.

Supply-chain and auth findings are nit-level defense-in-depth observations (ASCII assertion on sparse_paths, promisor trust documentation) with no exploitable surface identified. The test-coverage-expert's gap on the filter-rejection fallback branch is real but the path is a graceful degradation to full clone -- failure there means slower, not broken.

Aligned with: pragmatic as npm -- Cold install latency is the headline UX promise of a package manager. Reducing 172-284s to 3.7-3.8s for a real-world subdir dep puts APM in npm-install-parity territory for the monorepo case that was previously a churn driver.; portable by manifest -- Lockfile schema unchanged; variant-keyed cache is an internal optimization invisible to the manifest contract. Byte-identical consumer payload (42.5 KB) proves determinism is preserved across the new code paths.

Growth signal. This PR is the strongest adoption-recovery story since launch, targeting the 'gave up on cold install from large monorepo' cohort -- a real churn signal surfaced in #1433. The oss-growth-hacker recommends leading all downstream framing with the deterministic byte ratio (370x smaller bare cache, 42.5 KB payload) over the jitter-bound wall-time range, and filing a repostable one-liner for release notes and social copy. The missing CHANGELOG entry is the only thing standing between this win and its narrative amplification.

Panel summary

Persona B R N Takeaway
Python Architect 0 1 2 Cache-layer sparse-cone is well-layered; variant-key and partial-bare taxonomy compose cleanly. One recommended extraction for duplicated sparse-checkout subprocess logic across two modules.
Cli Logging Expert 0 0 5 Verbose perf lines use unregistered prefix and inconsistent formatting; partial-clone fallback warning hidden from default users.
Devx Ux Expert 0 0 4 Install/update/run share the cache path; consumer payload byte-identical; verbose gating clean. Failure-mode wording could be more actionable for end users.
Supply Chain Security Expert 0 0 3 Wire protocol unchanged on non-partial path; auth env propagation consistent; sparse_paths validated upstream. Three nit-level defense-in-depth observations.
Oss Growth Hacker 0 3 2 Headline-worthy perf win (~45-75x cold install, ~370x smaller bare cache) but no CHANGELOG / release-note artifact ships with it; the deterministic byte ratio should lead all downstream framing.
Auth Expert 0 0 1 Auth env propagation through new partial-clone + promisor path is consistent. One nit on cache-hit re-fetch trust contract.
Doc Writer 0 2 1 Persona prose and panel roster are internally consistent and on-voice; two user-facing doc surfaces drift on this PR (CHANGELOG omits the ~45-75x install speedup; cache.md layout block predates the new variant subdir + __p partial-bare flavor).
Test Coverage Expert 0 2 0 Core sparse-cone determinism and coexistence well-tested (18/18 pass at unit tier with real git fixtures). Two recommended gaps: partial-clone fallback branch has no regression trap; tiered-SHA-before-get_checkout threading lacks an assertion that locked_sha + sparse_paths propagate to the cache.
Performance Expert 0 4 2 Three composable techniques (sparse-cone variant cache, partial-bare promisor, tiered SHA threading) are correctly layered at GitCache so every command benefits. Cache key is honestly variant-keyed (sorted+deduped+canonical-JSON sha256[:16]); per-variant locking avoids unnecessary serialization. Headline byte-count claims (~370x smaller bare, ~27x smaller checkouts shard, byte-identical payload) are deterministic; ~45-75x wall-time claim leans on N=1 harness. One cheap wire-cost win still on the table (--no-tags); one observability gap (install_logger not threaded into GitCache cold-clone path).

B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.

Top 5 follow-ups

  1. [Oss Growth Hacker + Doc Writer] Add CHANGELOG.md [Unreleased] entry for the ~45-75x cold-install speedup closing [PERF] materialize_from_bare does full checkout for subdir deps -- enable sparse-cone (78 MB -> <1 MB, ~10x download) #1433 -- Convergent finding from two independent panelists. A headline-grade perf win without a changelog entry is invisible to users reading release notes and blocks the growth narrative.
  2. [Performance Expert] Add --no-tags to partial bare clone invocation -- One-liner with zero downside on APM's workload (consumer never needs tags). Avoids fetching tag refs + tagged objects that blob:none does not filter on large historical repos.
  3. [Performance Expert + Cli Logging Expert] Thread install_logger into GitCache cold-clone path and surface bare_clone_strategy on persistent-cache cold-miss -- The dominant code path (persistent cache cold miss) currently fires no [perf] line and the partial-clone fallback warning is invisible at default verbosity. Three panelists independently flagged this observability gap.
  4. [Test Coverage Expert] Add regression-trap test for filter-rejection fallback (partial -> full clone retry) -- The fallback branch has no automated assertion. While failure means slower-not-broken, the performance-expert independently noted __p name mis-classification after fallback -- a test would catch both issues.
  5. [Doc Writer] Update docs/reference/cli/cache.md layout block to document subdir and __p partial-bare flavor -- New cache layout artifacts are user-observable (visible in ~/.apm/cache/); the reference docs predate this PR's structural additions.

Architecture

classDiagram
    direction LR
    class GitCache {
      <<Facade>>
      +get_checkout(url, ref, locked_sha, env, sparse_paths) Path
      -_resolve_sha(url, ref, locked_sha, env) str
      -_ensure_bare_repo(url, shard_key, sha, env, partial) Path
      -_create_checkout(url, shard_key, sha, env, sparse_paths, promisor_url) Path
      -_bare_has_sha(bare_dir, sha, env) bool
      -_fetch_into_bare_locked(bare_dir, url, sha, env) None
    }
    class BareCache {
      <<Module>>
      +materialize_from_bare(bare_path, consumer_dir, ref, env, known_sha, sparse_paths) str
    }
    class GitHubPackageDownloader {
      <<Facade>>
      +download_subdirectory_package(dep_ref, ...)
      +download_full_repo_package(dep_ref, ...)
      +install_logger InstallLogger
      +persistent_git_cache GitCache
    }
    class TieredRefResolver {
      <<Chain of Responsibility>>
      +resolve(dep_ref) ResolvedRef
      +stats dict
    }
    class InstallLogger {
      +subdir_download_start(dep_name, cache_state, sha_short, sparse_paths)
      +bare_clone_strategy(strategy, elapsed_ms)
      +materialize_result(sparse_applied, consumer_size_bytes)
      +tier_summary(stats)
    }
    class ResolvePhase {
      +run(ctx) None
    }
    GitHubPackageDownloader *-- GitCache : persistent_git_cache
    GitHubPackageDownloader *-- TieredRefResolver : _tiered_resolver
    GitHubPackageDownloader ..> BareCache : _materialize_from_bare delegates
    GitHubPackageDownloader o-- InstallLogger : install_logger
    ResolvePhase ..> GitHubPackageDownloader : wires cache + logger + resolver
    GitCache ..> BareCache : parallel path
    class GitCache:::touched
    class BareCache:::touched
    class GitHubPackageDownloader:::touched
    class InstallLogger:::touched
    class ResolvePhase:::touched
    classDef touched fill:#fff3b0,stroke:#d47600
Loading
flowchart TD
    A[apm install] --> B[InstallPipeline.run]
    B --> C[ResolvePhase.run]
    C -->|attach| D[downloader.install_logger]
    C -->|attach| E[downloader._tiered_resolver]
    C --> F[download_subdirectory_package]
    F --> G{persistent_git_cache?}
    G -->|yes| H[resolve_git_reference: L0 to L3]
    H --> I[GitCache.get_checkout sparse_paths locked_sha]
    I --> J{checkout hit?}
    J -->|hit| K[return cached Path]
    J -->|miss| L[_ensure_bare_repo partial=True]
    L --> M{server accepts filter?}
    M -->|no| N[fallback: full bare clone]
    M -->|yes| O[atomic_land db_v1 shard__p]
    N --> O
    O --> P[_create_checkout local shared no-checkout]
    P --> Q[git config promisor + url]
    Q --> R[sparse-checkout init cone + set]
    R --> S[git checkout sha lazy blob fetch]
    S --> T[atomic_land checkouts_v1 shard sha variant]
    T --> K
    K --> V[copytree to apm_modules]
Loading

Recommendation

Land this PR -- the architecture is clean, unit coverage is strong, and the perf win is real and deterministic at the byte level. Track the CHANGELOG entry, --no-tags one-liner, and observability threading as immediate follow-ups in #1433's wake; the harness hardening and cache.md drift can ride the next polish pass.


Full per-persona findings

Python Architect

  • [recommended] Duplicated sparse-checkout subprocess logic in git_cache.py and bare_cache.py should be extracted to a shared helper at src/apm_cli/cache/git_cache.py:535
    Identical 2-step subprocess pattern (sparse-checkout init --cone + set ) is copy-pasted in git_cache.py:_create_checkout (~535-557) and bare_cache.py:materialize_from_bare (~650-667). Two call sites with identical subprocess orchestration crosses the extraction threshold; a utils/git_sparse.py helper gives a single place to evolve sparse semantics.
    Suggested: Extract apply_sparse_cone(git_exe, repo_dir, paths, env, timeout=30) -> None to src/apm_cli/utils/git_sparse.py.

  • [nit] install_logger attached via duck-typed assignment rather than typed constructor param at src/apm_cli/install/phases/resolve.py:134
    downloader.install_logger = ctx.logger sets an undeclared field. Consistent with existing _tiered_resolver pattern so not blocking, but both fields would benefit from Optional[InstallLogger] typing for IDE/mypy discoverability.

  • [nit] _dir_size_bytes duplicated between github_downloader.py and scripts/perf/measure_dotnet_skills.py at src/apm_cli/deps/github_downloader.py:96
    Acceptable today (scripts are standalone) but flag for consolidation into utils/file_ops.py if a third call site emerges.

Cli Logging Expert

  • [nit] [perf] prefix is not registered in STATUS_SYMBOLS at src/apm_cli/core/command_logger.py:403
    STATUS_SYMBOLS is the canonical registry. Ad-hoc [perf] bypasses _rich_echo symbol validation. Use [i] info with 'perf:' qualifier or register the key.
    Suggested: Register STATUS_SYMBOLS['perf'] or rewrite as _rich_echo('[i] perf: ...', color='dim').

  • [nit] Partial-clone fallback warning only visible via Python logger (verbose), but user-actionable on cold subdir installs at src/apm_cli/cache/git_cache.py:352
    Fallback turns cold install from ~3s to 50-260s. At default verbosity user sees no explanation. _rich_warning at default level would name the cause and suggest checking server version.
    Suggested: Add _rich_warning('Partial clone unsupported by server -- falling back to full clone (slower)') before retry.

  • [nit] cache=persistent / cache=shared-bare is internal jargon at src/apm_cli/core/command_logger.py:403
    These describe implementation, not outcome. 'hit(persistent)' or 'miss(clone)' tells the reader whether work was avoided.

  • [nit] size=0.47MB lacks space before unit; inconsistent with took=3.2 ms elsewhere at src/apm_cli/core/command_logger.py:436
    Project consistency favors space-before-unit. Pick one and apply uniformly.

  • [nit] tier_summary call guarded by broad contextlib.suppress hides signature drift bugs at src/apm_cli/install/phases/resolve.py:531
    suppress(AttributeError, TypeError) swallows signature-drift errors silently. hasattr check is explicit and won't mask real bugs inside tier_summary.
    Suggested: Replace contextlib.suppress with if hasattr(ctx.logger, 'tier_summary'):.

Devx Ux Expert

  • [nit] Sparse-checkout failures propagate as raw subprocess.CalledProcessError without a user-facing wrapper at src/apm_cli/cache/git_cache.py
    When sparse-checkout init or set fails, the user sees git's stderr without context. A wrapper like 'Sparse checkout failed for -- ensure the path exists in the repository at ' aligns with the failure-mode-is-the-product principle.

  • [nit] Same raw-error issue in bare_cache.materialize_from_bare path at src/apm_cli/deps/bare_cache.py
    Mirror the wrapper there too; both paths share the failure-mode contract.

  • [nit] Partial-clone fallback warning uses _log.warning (Python logger), invisible at default CLI verbosity at src/apm_cli/cache/git_cache.py
    Users on old servers hit this path and experience the slowdown with no CLI-level signal. (Overlaps with cli-logging-expert finding 2.)

  • [nit] 'Failed to clone (partial fallback also failed)' message is operator-precise but raw for end users at src/apm_cli/cache/git_cache.py
    Suggest: 'Could not clone . Tried partial clone and full clone; both failed. Check network connectivity and repository access.'

Supply Chain Security Expert

  • [nit] Add explicit ASCII-printable/no-whitespace assertion on sparse_paths at the _create_checkout boundary at src/apm_cli/cache/git_cache.py:550
    validate_path_segments blocks .. at DependencyReference construction, and subprocess list-mode prevents shell injection. A defense-in-depth assertion on each sparse_paths element at the cache boundary would catch a future caller that bypasses construction (e.g. an adversarial lockfile entry with NUL bytes).

  • [nit] Promisor URL persists in consumer .git/config; trust model around later git operations should be documented at src/apm_cli/cache/git_cache.py:518
    remote.origin.url is credential-free (correct), and auth comes from inherited subprocess env. But future APM operations on the cached checkout that DO NOT pass env would use ambient credentials. Document the trust boundary or add a .git/config annotation.

  • [nit] _fetch_into_bare_locked detects partial-flavor by directory name suffix rather than git config probe at src/apm_cli/cache/git_cache.py:639
    endswith(_PARTIAL_BARE_SUFFIX) is fragile to future directory renames. A git config --get extensions.partialClone probe would be more robust, though the current convention is documented.

Oss Growth Hacker

  • [recommended] No CHANGELOG entry for a headline-grade performance improvement at CHANGELOG.md
    A 45-75x cold install speedup and 370x smaller bare cache is the strongest conversion story APM has shipped in months -- directly addresses the 'gave up on cold install' cohort. Without an [Unreleased] entry, it will not surface in release notes or social copy. CHANGELOG is the bridge between merged code and external storytelling.
    Suggested: Under [Unreleased] > ### Performance: 'Subdir dependencies from large monorepos (e.g. dotnet/skills) now install ~45-75x faster cold and use ~370x less cache disk via sparse-cone materialization + partial bare clones. (perf(#1433): sparse-cone consumer materialization for subdir deps #1436, closes [PERF] materialize_from_bare does full checkout for subdir deps -- enable sparse-cone (78 MB -> <1 MB, ~10x download) #1433)'

  • [recommended] Surface the repostable one-liner ('42 KB from 1.7 GB in 3.8s') in release-bound copy
    The headline test 'what can a reader repost?' has a perfect answer here that does not appear verbatim anywhere a non-contributor will see it. Include in CHANGELOG/release notes so every downstream artifact can quote it directly.
    Suggested: Add to release copy: 'apm install now delivers 42 KB from a 1.7 GB monorepo in 3.8s cold (was 172-284s).'

  • [recommended] Lead with the deterministic byte ratio (370x) over the jitter-bound wall-time range (45-75x)
    The wide wall-time range invites 'I only got 20x' replies that erode trust. The byte ratio is unjitterable proof. Frame: '370x smaller cache footprint; 45-75x faster cold install depending on network'.
    Suggested: Reorder headline to byte ratio first, time ratio second.

  • [nit] Wider applicability beyond dotnet/skills not mentioned
    Same fix benefits any subdir dep from a large repo (vscode extensions, anthropics/skills, community plugin aggregators). Naming 1-2 others expands perceived audience from '.NET devs' to 'anyone with monorepo deps'.

  • [nit] No docs or blog hook filed
    A short 'How APM handles large monorepo deps' docs section or blog post would compound the story beyond release notes. Not blocking.

Auth Expert

  • [nit] Promisor URL in cached consumer .git/config is credential-free, but trust contract for hypothetical future re-fetch is not asserted by a test at src/apm_cli/cache/git_cache.py:508
    remote.origin.url is set to the bare upstream URL (no token), and auth comes from inherited subprocess env. Today the cache-hit path returns before any git operation runs, so no exposure exists. But a regression-trap test asserting the cached promisor URL never contains credentials would protect that invariant.

Doc Writer

  • [recommended] CHANGELOG.md [Unreleased] has no entry for the ~45-75x cold-install speedup (closes [PERF] materialize_from_bare does full checkout for subdir deps -- enable sparse-cone (78 MB -> <1 MB, ~10x download) #1433) at CHANGELOG.md
    Repo follows Keep a Changelog; [Unreleased] already records smaller items. A measured 172-284s -> 3.8s cold install on a real subdir dep is the largest user-visible install behavior change in the unreleased window and reduces bare-cache disk ~370x. Omitting means next release notes silently ship the win.
    Suggested: Add under [Unreleased] > ### Changed (or ### Performance): 'Cold apm install for subdirectory git dependencies is ~45-75x faster on large monorepos. GitCache now uses partial bare clones (--filter=blob:none) with promisor remotes plus sparse-cone consumer materialization, and threads the tiered-resolver SHA through to skip a redundant ls-remote. Bare-cache disk usage drops ~370x on the validated workload (dotnet/skills). No lockfile schema or CLI surface change. (perf(#1433): sparse-cone consumer materialization for subdir deps #1436, closes [PERF] materialize_from_bare does full checkout for subdir deps -- enable sparse-cone (78 MB -> <1 MB, ~10x download) #1433)'

  • [recommended] docs/reference/cli/cache.md cache-layout block is stale: new subdir and __p partial-bare flavor are user-observable but undocumented at docs/src/content/docs/reference/cli/cache.md:117
    Users running ls ~/Library/Caches/apm/git/ after this PR see (a) __p/ sibling under db_v1/ and (b) new / level under checkouts_v1/// containing full or sparse-. PR body names both layouts as load-bearing for the win. Stale layout docs are a factual error users will hit when debugging cache size.
    Suggested: Update layout block to show __p (partial bare) under db_v1/ and / (full | sparse-) under checkouts_v1///. Keep addition under ~10 lines net.

  • [nit] performance-expert.agent.md voice, structure, ASCII discipline match existing reviewer-persona pattern; no rewrites needed at .apm/agents/performance-expert.agent.md
    Frontmatter, section order, lane-policing, ASCII-only constraint, and citation pattern all mirror auth-expert.agent.md and supply-chain-security-expert.agent.md. Optional polish: cite function name (verify_checkout_sha) rather than git_cache.py:126 line number to make the persona durable across refactors.

Test Coverage Expert

  • [recommended] Partial-clone fallback branch (filter rejection -> retry without filter) has no test at tests/unit/cache/test_git_cache_sparse.py
    git_cache.py:342-389 implements a fallback: when --filter=blob:none fails, retry without filter. User-observable degradation path (cold install jumps from ~3s to baseline). grep on tests/ for 'partial.*fallback', 'filter.*reject', '_ensure_bare_repo.*partial=True' returns no match. Future refactor breaking the retry would manifest as hard crash instead of graceful degradation.
    Suggested: Add TestPartialBareFlavor::test_partial_clone_fallback_to_full_on_server_rejection: patch subprocess.run side_effect to raise CalledProcessError on first clone, assert retry without --filter succeeds, bare_dir populated.
    Proof (missing at): tests/unit/cache/test_git_cache_sparse.py::TestPartialBareFlavor::test_partial_clone_fallback_to_full_on_server_rejection -- proves: When a git server rejects --filter=blob:none, apm install still completes (degraded to full bare) instead of crashing.
    assert (bare_root / partial_shard).is_dir() and not partial_clone_used_on_retry

  • [recommended] Tiered SHA threading into persistent cache (locked_sha + sparse_paths kwargs) not asserted at the call site at tests/integration/test_github_downloader_phase3w4.py:404
    Diff adds resolve_git_reference() BEFORE get_checkout() and passes result as locked_sha + sparse_paths=[subdir_path]. Existing test_persistent_cache_hit_used uses MagicMock and calls .assert_called() but does NOT inspect .call_args. If a future refactor drops locked_sha, cache falls back to internal ls-remote (one extra RTT per dep) with no test failing.
    Suggested: Extend test_persistent_cache_hit_used: assert call_kwargs.get('locked_sha') is not None and call_kwargs.get('sparse_paths') == ['packages/my-pkg'].
    Proof (missing at): tests/integration/test_github_downloader_phase3w4.py::TestDownloadSubdirectoryPersistentCache::test_persistent_cache_receives_locked_sha_and_sparse_paths -- proves: Tiered SHA resolution feeds the resolved commit into the cache so it skips redundant ls-remote on every subdir dep install.
    assert call_kwargs['locked_sha'] is not None; assert call_kwargs['sparse_paths'] == ['packages/my-pkg']

Performance Expert

  • [recommended] Perf harness records N=1 cold + N=1 warm samples; cannot sustain the 172-284s / 3.74-3.83s ranges from a single invocation at scripts/perf/measure_dotnet_skills.py:171
    scripts/perf/measure_dotnet_skills.py:160-180 invokes _measure exactly once per phase. The cited 1.7x range on cold and tighter range post-fix must come from manual repeated runs the harness does not enforce. At N=1 the wall-time delta is dominated by network jitter. Deterministic byte counts already carry the headline; wall-time deserves the same rigor.
    Suggested: Run N>=3 cold + N>=3 warm samples; report median + min + max + stdev. Add round-trip count via GIT_TRACE_PACKET=1 grep and bytes-on-wire via GIT_TRACE_CURL=1 -- both lower-noise than wall time.

  • [recommended] Verbose [perf] observability gap: bare_clone_strategy never fires on the persistent-cache path (the dominant code path) at src/apm_cli/cache/git_cache.py:268
    github_downloader.py:1170-1184 (persistent-cache hit path) emits only subdir_download_start + materialize_result; the shared-bare path at :1225-1238 emits bare_clone_strategy. When persistent cache is active and a cold miss occurs, GitCache._ensure_bare_repo(partial=True) performs the most expensive op in the pipeline (network clone with --filter=blob:none) and emits ZERO verbose output. User sees cache=persistent identically for a 4ms warm hit and a 3.5s cold partial clone. This is exactly the dimension tier_summary is meant to provide.
    Suggested: Thread install_logger into GitCache.get_checkout (or pass a perf-callback) so _ensure_bare_repo emits 'bare clone strategy=--filter=blob:none took=Xms' on cold miss, and subdir_download_start distinguishes persistent-warm vs persistent-cold.

  • [recommended] Partial bare clone is missing --no-tags; tag refs + tagged objects are not filtered by blob:none on large historical repos at src/apm_cli/cache/git_cache.py:320
    git_cache.py:320-327 builds clone_args without --no-tags. --filter=blob:none filters BLOBS but git still downloads every tag tip + reachable commit+tree from refs/tags/*. On monorepos with many tagged releases (dotnet/skills harness target), tag objects can be a meaningful fraction of partial-clone wire cost. APM never uses tag refs after the SHA is pinned (lockfile holds resolved SHA; TieredRefResolver resolves via commits API). Zero behavioral downside.
    Suggested: Add --no-tags to clone_args and to the partial-flavor fetch_args at ~:635. One-liner; additional wire savings most pronounced on the exact long-tail monorepos this PR exists to optimize.

  • [recommended] Partial-bare flavor keeps __p name after server-rejection fallback turns it into a full clone; future eviction logic will mis-classify at src/apm_cli/cache/git_cache.py:340
    git_cache.py:340-388: when --filter=blob:none is rejected, code retries with plain --bare but keeps bare_dir at __p. _fetch_into_bare_locked at ~:638 detects partial-ness purely by the __p suffix. Subsequent fetches against a de-facto-full bare keep passing --filter=blob:none. Listed PR follow-up (eviction policy) should also detect this case (a __p bare whose object DB exceeds a sanity threshold relative to its commits+trees).

  • [nit] Three sequential git config subprocesses for promisor setup; fork+exec overhead is trivially avoidable at src/apm_cli/cache/git_cache.py:517
    git_cache.py:517-528 issues three separate subprocess.run(...) for remote.origin.url, .promisor, .partialclonefilter. Each is fork+exec (~10-30ms macOS, more on Windows). Cheaper shapes: append a single [remote 'origin'] block to .git/config atomically. ~1-3% of cold subdir wall time.

  • [nit] materialize_result.consumer_size_bytes includes .git/ overhead; metric name implies delivered payload at src/apm_cli/deps/github_downloader.py:1184
    github_downloader.py:1184 reports _dir_size_bytes(_persistent_checkout). The persistent_checkout is the full shard (working tree + .git/, including partial pack index). User sees 'size=0.47MB' attributed to materialization but a meaningful fraction is git metadata. For sparse_applied=True the proper signal is size of cone excluding .git/, or rename to checkout_dir_size_mb.

This panel is advisory. It does not block merge. Re-apply the panel-review label after addressing feedback to re-run.

Address apm-review-panel recommended_followups on PR #1436 in three
disciplined buckets so the panel's substantive feedback lands in the
same PR rather than as orphan follow-ups.

Python architect -- extract apply_sparse_cone() helper:
- New src/apm_cli/utils/git_sparse.py: single helper for
  'sparse-checkout init --cone' + 'sparse-checkout set <paths>'.
- Both call sites (cache/git_cache.py persistent path; deps/bare_cache.py
  shared-bare path) now delegate. Future sparse-checkout evolution
  (timeouts, --no-sparse-index) lands in one place; the R0801 pylint
  guard stays clean and won't drift on the next near-duplicate addition.

CLI logging UX -- address all five logging nits:
- Replace literal '[perf]' prefix (not in STATUS_SYMBOLS) with
  '[i] perf:' so the four new verbose lines use the existing info
  glyph instead of inventing prefix vocabulary. Keeps the perf:
  semantic grouping cue without growing the symbol registry.
- Add missing space in size literal: 'size=0.47MB' -> 'size=0.47 MB'.
- Partial-clone fallback warning now uses _rich_warning(...) instead
  of the silent _log.warning(...) channel, so the user sees the cause
  of any pre-2.20 GHE / old Gerrit slowdown at default verbosity.
- Replace contextlib.suppress(AttributeError, TypeError) on the
  tier_summary call with an explicit hasattr() check (drops the
  contextlib import). Intent is now legible to the next reader.
- Relabel persistent-cache state as 'persistent-hit' so the hit
  semantics are explicit at the call site.

Test coverage -- add regression-trap tests:
- New test_partial_clone_fallback_to_full_on_server_rejection: patches
  subprocess.run to reject --filter=blob:none on the first call and
  asserts the retry without --filter succeeds, defending the documented
  fallback (CHANGELOG line) against silent regression.
- Extend test_persistent_cache_hit_used with call_args.kwargs
  assertions on sparse_paths and locked_sha so the variant-keyed
  shard plumbing from #1433 is locked down at the integration boundary.

Doc drift -- add CHANGELOG [Unreleased] > Performance entry summarizing
the speedup, the layered techniques (partial bare + promisor + sparse-
cone + threaded SHA), and the documented filter fallback.

Lint chain silent: ruff check, ruff format --check, pylint R0801
(no new duplication), lint-auth-signals.sh. Targeted pytest:
tests/unit/cache + tests/unit/deps + tests/unit/install + the touched
integration file all pass (44 + 1109 + 1388 collected).

Refs #1436 #1433

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
danielmeppiel and others added 2 commits May 21, 2026 22:24
Two remaining recommendations from the apm-review-panel folded into
this PR so the changeset matches the panel's documented expectations.

--no-tags on partial bare clone (cache/git_cache.py):
  Skip fetching tag objects on the perf-fixed path. Release tags can
  sum to MBs on monorepos (e.g. dotnet/skills) and the cache resolves
  via SHAs, never via tag refs. Applied to both the initial partial
  clone and the no-filter fallback retry so the saving is preserved
  when an old server rejects --filter=blob:none. Scoped to git_cache.py
  only: bare_cache.py's --branch <ref> path can legitimately receive a
  tag pin and is left untouched.

docs/reference/cli/cache.md layout drift:
  Document the variant-keyed checkout layout (full/, sparse-<hash>/)
  and the partial bare suffix (<shard>__p) so the page no longer lies
  about a pre-#1433 directory shape. Explains why the variant suffix
  exists (two consumers of the same SHA can want different subdirs).

Lint silent (ruff check + format check). Targeted pytest: 44 passes on
tests/unit/cache + the touched integration file.

Refs #1436 #1433

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel danielmeppiel merged commit 53c13da into main May 21, 2026
14 checks passed
@danielmeppiel danielmeppiel deleted the danielmeppiel/perf-sparse-cone-materialize-1433 branch May 21, 2026 20:41
sergio-sisternes-epam pushed a commit that referenced this pull request May 22, 2026
…o CI integration gate

Add two new scaling guards for the sparse-cone materialisation code
paths introduced in #1436:

- TestVariantKeyScaling: guards _variant_key() stays O(n log n) in
  sparse path count (threshold < 15, measured ~9x), with
  canonicalisation correctness assertions (order-insensitive,
  duplicate-insensitive, deterministic).

- TestVariantLookupScaling: guards that checkout variant resolution
  (is_dir check) stays O(1) regardless of sibling variant count
  (threshold < 5, measured ~1x).

Wire scaling guards into ci-integration.yml as a dedicated job that
runs in parallel with the build (no binary, secrets, or network
needed). The fan-in 'Integration Tests (Linux)' gate now requires
both integration shards and scaling guards to pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[PERF] materialize_from_bare does full checkout for subdir deps -- enable sparse-cone (78 MB -> <1 MB, ~10x download)

2 participants