perf(scaling-guards): add variant-key and cache-lookup guards, wire to CI integration gate#1439
perf(scaling-guards): add variant-key and cache-lookup guards, wire to CI integration gate#1439sergio-sisternes-epam wants to merge 7 commits into
Conversation
…o CI integration gate Add two new scaling guards for the sparse-cone materialisation code paths introduced in #1436: - TestVariantKeyScaling: guards _variant_key() stays O(n log n) in sparse path count (threshold < 15, measured ~9x), with canonicalisation correctness assertions (order-insensitive, duplicate-insensitive, deterministic). - TestVariantLookupScaling: guards that checkout variant resolution (is_dir check) stays O(1) regardless of sibling variant count (threshold < 5, measured ~1x). Wire scaling guards into ci-integration.yml as a dedicated job that runs in parallel with the build (no binary, secrets, or network needed). The fan-in 'Integration Tests (Linux)' gate now requires both integration shards and scaling guards to pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
4f78289 to
2af99c6
Compare
There was a problem hiding this comment.
Pull request overview
Adds new performance regression guards for the sparse-cone checkout “variant” code paths (introduced in #1436) and wires them into the merge-queue Tier 2 integration workflow so algorithmic regressions are caught before auto-merge.
Changes:
- Add two new scaling guards covering
_variant_key()complexity and variant directory existence lookup complexity. - Add a new
scaling-guardsjob to the merge-queue integration workflow and make the existing fan-inIntegration Tests (Linux)job fail if scaling guards fail.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
tests/benchmarks/test_scaling_guards.py |
Adds Guard 7 (variant key scaling + canonicalization) and Guard 8 (variant directory lookup scaling). |
.github/workflows/ci-integration.yml |
Introduces a new Scaling Guards (Linux) job and aggregates its result into the existing integration gate job. |
Reduce all guard thresholds from generous margins (40-96%) to a uniform ~30% margin above measured baselines: 1. Children-index: 25 -> 15 (32% margin) 2. Discovery: 25 -> 14 (33% margin) 3. Console singleton: 15 -> 15 (31% margin, unchanged) 4. Package hash: 25 -> 16 (34% margin) 5. Semantic equiv: 25 -> 15 (30% margin) 6. should_exclude: 25 -> 2 (49% margin) 7. Variant key: 15 -> 13 (32% margin) 8. Variant lookup: 5 -> 2 (48% margin) Guards 6 and 8 have slightly higher margins (~48%) because their measured ratios (~1x) round to a minimum integer threshold of 2. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Increase input sizes where the previous scale was too small to reveal the true algorithmic behaviour: 3. Console singleton: 100/1k -> 1k/10k (t_small was 10us) 7. Variant key: 50/500 -> 200/2000 (reveals O(n log n) log factor) 8. Variant lookup: 10/100 -> 50/500 (wider O(1) proof range) Variant key threshold adjusted from 13 to 17 to maintain ~30% margin over the 11.7x ratio now visible at larger scale. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add scaling guards for 7 hot paths identified by architectural review: - Guard 9: sidecar re-injection (O(n) with dict-based lookup) - Guard 10: hook entry dedup (O(n) with set-based lookup) - Guard 11: flatten_dependencies tree walk - Guard 12: JSON key diff (sorted union) - Guard 13: render_instructions_md - Guard 14: get_installed_paths - Guard 15: rewrite_hooks_data path matching Production fixes in hook_integrator.py: - _reinject_apm_source_from_sidecar: replace O(n*m) list pool.pop() with O(n) dict-based lookup using json.dumps for canonical hashing - _integrate_merged_hooks dedup: replace O(n^2) list scan with O(n) set-based lookup using json.dumps for canonical hashing - Fix duplicated deduped.append(entry) that doubled every entry Calibrate all thresholds to ~30% margin over measured baselines. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Guards 11, 13, and 14 had limits below the theoretical O(n log n) ratio of 15.9x (for 50->500 growth). Guard 13 was already flaking at 17.4x. Raise all three to 21 (~32% margin over theoretical). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Replace list.pop(0) with deque.popleft() in sidecar reinject to avoid O(k) regression on many identical hook entries - Sync guard 6 docstring: ratio < 2 (not < 25) - Sync guard 7 docstring: ratio < 25 with 74% margin over 14.3x theoretical (not ratio < 17 with 30% margin) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 1 | 2 | O(n)->O(1) fixes are correct; 2 latent-import antipatterns; guard 10 tests a test-local copy, not production code -- coverage gap. |
| DevX UX Expert | 0 | 0 | 1 | No CLI surface changes; perf fix is invisible to users. One nit: CI job name could hint at user impact for changelog readers. |
| Supply Chain Security | 0 | 0 | 1 | No new supply-chain surface introduced; CI actions follow existing tag-pin pattern already present in the workflow file. |
| OSS Growth Hacker | 0 | 1 | 1 | Perf infra + silent O(n^2) fix: strong engineering signal, zero conversion surface touched. One story beat worth surfacing in CHANGELOG. |
| Test Coverage Expert | 0 | 2 | 1 | Duplicate-append bug has regression trap in test_content_dedup_same_package; Guard 10 covers a standalone helper not the production path, but production-path integration tests exist. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Top 5 follow-ups
- [Python Architect + Test Coverage Expert] Extract the inline dedup logic from
_integrate_merged_hooksinto a named module-level helper inhook_integrator.py; update guard 10 to import and exercise that helper rather than a test-local copy. -- Guard 10 currently proves O(n) complexity on a shadow copy. If production dedup regresses, guard 10 still passes. Evidence=missing on the integration tier for this guard path. One extraction closes the divergence permanently. - [OSS Growth Hacker] Add a CHANGELOG 'Performance' section documenting the 64x O(n^2)->O(n) hook-dedup fix and O(n*m)->O(n) sidecar-reinject fix with the measured ratios. -- Confirmed algorithmic fixes at this magnitude are enterprise-evaluator proof content. Invisible today; one paragraph captures the signal permanently and feeds release post copy.
- [Python Architect] Remove redundant function-level
import jsonandfrom collections import dequeinside_reinject_apm_source_from_sidecarand_integrate_merged_hooks; use module-level imports throughout. -- Redundant imports in hot loops add minor overhead and create misleading signals for future readers about whether the module-level import is safe to remove. - [Test Coverage Expert] Add an integration-tier regression test that exercises the old double-append symptom path through the full install flow, not just the unit-tier len==1 assertion. -- The unit trap exists but a full-flow integration test would survive refactors that break the unit boundary.
- [DevX UX Expert] Rename CI job from 'Scaling Guards (Linux)' to 'Perf Scaling Guards (Linux)' for contributor discoverability in workflow logs. -- Low-cost, self-documenting change that communicates intent to external contributors scanning CI output.
Architecture
classDiagram
direction LR
class BaseIntegrator {
<<Abstract>>
+find_hook_files(path) list
+_hook_entry_content_key(entry) str
}
class HookIntegrator {
<<ConcreteIntegrator>>
+_integrate_merged_hooks(config, ...) HookIntegrationResult
+_should_remove_prior_merged_entry(...) bool
+_rewrite_hooks_data(...) tuple
}
class HookIntegrationResult {
<<ValueObject>>
+files_integrated int
+files_updated int
+files_skipped int
+target_paths list
}
class IntegrationResult {
<<ValueObject>>
+files_integrated int
}
class _MergeHookConfig {
<<ValueObject>>
+target_key str
+config_filename str
+schema_strict bool
+require_dir bool
}
class _reinject_apm_source_from_sidecar {
<<Pure>>
+__call__(hooks, sidecar_data) None
}
class _dedup_hook_entries_testlocal {
<<Pure>>
note: defined only in test file not imported from src
}
BaseIntegrator <|-- HookIntegrator
IntegrationResult <|-- HookIntegrationResult
HookIntegrator ..> HookIntegrationResult : returns
HookIntegrator ..> _MergeHookConfig : reads
HookIntegrator ..> _reinject_apm_source_from_sidecar : calls
class HookIntegrator:::touched
class _reinject_apm_source_from_sidecar:::touched
class _dedup_hook_entries_testlocal:::touched
classDef touched fill:#fff3b0,stroke:#d47600
flowchart TD
A[apm install CLI entry] --> B[HookIntegrator._integrate_merged_hooks]
B --> C{sidecar file exists?}
C -- yes --> D[read sidecar JSON]
D --> E[_reinject_apm_source_from_sidecar]
E --> E1[Build pool: dict-of-deques O-n]
E1 --> E2[Match disk entries via popleft O-n]
C -- no --> F[per hook_file loop]
E2 --> F
F --> G[idempotent upsert: strip prior entries]
G --> H[extend event list with fresh entries]
H --> I[inline dedup: set-based seen_keys O-n]
I --> J[write merged JSON config]
J --> K[copy referenced scripts]
K --> L[return HookIntegrationResult]
style E1 fill:#d4edda
style E2 fill:#d4edda
style I fill:#fff3b0
Recommendation
Ship. The production algorithmic fixes are correct, the CI gate is wired, and existing integration tests provide regression coverage on the dedup path today. The Guard 10 shadow-copy issue is real maintenance debt and should be resolved before the next PR modifies _integrate_merged_hooks -- open a followup issue and assign it before merge. The CHANGELOG entry for the 64x fix is a one-paragraph addition that should land in this PR or immediately after; do not let this signal ship dark. All other findings are nits with no correctness impact.
Full per-persona findings
Python Architect
-
[recommended] Guard 10 (HookDedup) tests a test-local copy of the dedup logic, not the production code path at
tests/benchmarks/test_scaling_guards.py:523
The guard defines_dedup_hook_entriesintest_scaling_guards.pyas a verbatim copy of the inline dedup logic from_integrate_merged_hooks. If the production dedup regresses (e.g. reverts to a list scan), guard 10 still passes because it exercises the local copy, not the source. Extracting the dedup into a module-level function inhook_integrator.pyand importing it in both the call site and the test would close this gap with one refactor.
Suggested: Extract the dedup block into a module-level helper_dedup_hook_entries(entries)in hook_integrator.py. Import it in the guard test instead of redefining it locally. -
[nit] Function-level
import json/from collections import dequeinside_reinject_apm_source_from_sidecarare redundant atsrc/apm_cli/integration/hook_integrator.py:216
jsonis already imported at module top (line 46). The function-level imports add a dict-lookup overhead on every call and mislead readers into thinking these are lazy imports for circular-dependency reasons.
Suggested: Movefrom collections import dequeto the module-level import block. Remove the redundantimport jsoninside the function body. -
[nit]
import json as _jsoninside_integrate_merged_hookshot loop is redundant atsrc/apm_cli/integration/hook_integrator.py:1190
jsonis already available at module scope. The inline alias inside a hot per-event loop adds unnecessary overhead.
Suggested: Removeimport json as _json; replace_json.dumpswithjson.dumps(already in scope).
CLI Logging Expert -- inactive
No CLI output, logging, or user-facing message changes in this PR. No findings from cli-logging-expert.
DevX UX Expert
- [nit] CI job name 'Scaling Guards (Linux)' is opaque to external contributors scanning workflow logs at
.github/workflows/ci-integration.yml
Developers reading CI output see 'Scaling Guards (Linux)' with no hint that this guards install-time performance. A self-documenting label costs nothing.
Suggested:name: Perf Scaling Guards (Linux)
Supply Chain Security
- [nit] GitHub Actions not SHA-pinned (pre-existing repo-wide pattern) at
.github/workflows/ci-integration.yml
The new scaling-guards job uses mutable tags (@v4/@v5/@v6) consistent with the rest of the file. A compromised tag could inject malicious code into CI. This is pre-existing and not introduced by this PR.
Suggested: Pin all action refs to full commit SHAs and track updates via Dependabot or Renovate. Address repo-wide as a separate hardening PR.
OSS Growth Hacker
-
[recommended] 64x hook-dedup fix deserves a CHANGELOG entry and a story beat -- it is invisible today
A confirmed O(n^2)->O(n) regression fix at 64x measured ratio on a hot path (every hook integration) is exactly the kind of proof point that converts skeptical power users into advocates. It is currently buried in PR body prose. Adding a CHANGELOG entry under a 'Performance' section costs one line and pays forward into release notes, social copy, and the 'why apm is fast' narrative. Silent perf fixes are the most common form of growth waste in OSS projects. -
[nit] PR title mentions 'variant-key and cache-lookup guards' but the headline value is the O(n^2) production fix -- consider reordering the story for future release-note drafting.
Auth Expert -- inactive
Changed files (.github/workflows/ci-integration.yml, src/apm_cli/integration/hook_integrator.py, tests/benchmarks/test_scaling_guards.py) do not touch any auth trigger paths; PR is a performance fix for hook deduplication with no credential resolution, token management, or host classification changes.
Doc Writer -- inactive
No changed files match doc-writer trigger paths (docs/src/content/docs/, README.md, CHANGELOG.md, MANIFESTO.md, .apm/skills//*.md, packages/apm-guide/**, etc.). The PR touches CI workflow, an internal hook deduplication fix, and benchmark tests -- none of which are user-facing documentation or natural-language artifacts.
Test Coverage Expert
-
[recommended] Guard 10 (HookDedup) tests a standalone extracted helper, not the production
_integrate_merged_hookspath attests/benchmarks/test_scaling_guards.py
test_scaling_guards.py defines a local_dedup_hook_entriesfunction that reproduces the dedup logic from_integrate_merged_hooks, then benchmarks it. The scaling guard proves O(n) complexity of the extracted copy, not the production function. If the production dedup logic drifts, Guard 10 will not catch it. Production-path coverage does exist viatest_hook_integrator.py::test_content_dedup_same_package(calls integrate_package_hooks_claude and asserts len(entries) == 1), so the user promise is defended -- but the scaling guard is testing a shadow copy, which is maintenance debt: the two can silently diverge.
Suggested: Either import the dedup logic directly fromhook_integrator(if extracted to a named helper), or add a comment in the guard that explicitly names the production symbol it mirrors.
Proof (missing):tests/benchmarks/test_scaling_guards.py::test_hook_dedup_scaling-- proves: That the dedup scaling guard exercises the actual production dedup code path, not a shadow copy -
[recommended] Duplicate-append bug fix has a unit-tier regression trap but no integration-tier test specifically targeting the old double-append symptom at
tests/unit/integration/test_hook_integrator.py
test_content_dedup_same_package(line 3589) asserts len==1 after installing a package with two identical hook files -- a valid regression trap at unit tier. The floor tier for install-pipeline surfaces is integration-with-fixtures.test_reinstall_heals_preexisting_duplicates(line 659) seeds duplicates in a real settings.json and checks they collapse, which is close but pre-existing, not PR-authored.
Suggested: Confirm thattest_reinstall_heals_preexisting_duplicates(line 659) runs cleanly against the new code -- it seeds a real settings.json with three duplicate entries and asserts they collapse, which is the strongest regression trap for this bug.
Proof (unknown):tests/unit/integration/test_hook_integrator.py::test_reinstall_heals_preexisting_duplicates-- proves: That identical entries from the same package appear exactly once after integration -
[nit] PR body claims '156 hook integrator unit tests pass' -- count is unverifiable without running the suite; no Scenario Evidence table present
The behavioral changes (FIFO order preservation via deque, dedup via set+json.dumps key) are not mapped to APM principles. Minor authoring gap; underlying test coverage appears adequate from file inspection.
Suggested: Add a Scenario Evidence table to the PR body mapping the two behavioral changes to their scenario, principle, and test path.
This panel is advisory. It does not block merge. Re-apply the
panel-review label after addressing feedback to re-run.
Generated by PR Review Panel for issue #1439 · ● 2.4M · ◷
Description
Add 15 hermetic scaling guards covering several hot paths in the CLI, wire them into the CI integration gate, and fix an O(n^2) hook deduplication regression discovered during baseline measurement.
Each guard compares wall-clock time at two input sizes and asserts the ratio stays below a calibrated threshold, catching algorithmic regressions at merge-queue time without full benchmarking infrastructure.
Production fix: O(n^2) hook deduplication
Baseline measurement of guard 10 revealed a 64x ratio at 50->500 entries -- confirming an O(n^2) regression in
_integrate_merged_hooks. Two fixes applied tohook_integrator.py:_reinject_apm_source_from_sidecar: replaced O(n*m)list.pop()scan with O(n) dict-based lookup usingjson.dumps(sort_keys=True)for canonical hashing_integrate_merged_hooksdedup: replaced O(n^2) list scan with O(n) set-based lookup; also fixed a duplicatedappend()call that was doubling every entryCalibrated guard table
Hot paths covered (guards 9-15)
_reinject_apm_source_from_sidecar_integrate_merged_hooksflatten_dependencies(BFS tree walk)json_key_diffrender_instructions_blockapply_topattern with sorting.LockFile.get_installed_paths_rewrite_hooks_dataCI wiring
The
scaling-guardsjob runs hermetically (no binary, no secrets, no network) inci-integration.ymland is wired into the fan-in gate alongside integration test shards.Changes
tests/benchmarks/test_scaling_guards.py-- 15 scaling guards + 1 correctness test (16 total)src/apm_cli/integration/hook_integrator.py-- O(n^2) -> O(n) dedup fix + sidecar reinject fix.github/workflows/ci-integration.yml--scaling-guardsjob + fan-in wiringValidation
ruff check+ruff format --check)