Skip to content

Improve Warden synchronize performance without weakening full-PR review #378

Description

@dcramer

Context

Warden currently treats a pull request as the review unit. On pull_request.synchronize, it rebuilds the full PR file/hunk set and re-runs matching skills across the full PR diff.

That is thorough, but it can be very slow on large PRs when follow-up commits only touch a small part of the change. In one observed run, a small follow-up push still caused code-review to scan the full 22-file PR diff, including large hunks that took 10-15 minutes each.

We do not want to simply switch synchronize to latest-push-only scanning. A PR is an entire unit, and a later commit can change the meaning of earlier hunks. Examples:

  • a finding in one file is fixed by a later commit in another file
  • a new shared guard/schema/auth path invalidates an earlier finding
  • a later wiring change makes an earlier hunk risky
  • cross-file invariants change across route registration, tool definitions, handlers, schemas, and tests

Proposal

Explore a performance model that preserves full-PR semantics while reducing repeated work.

Suggested shape: cached full-PR review with an auxiliary change-impact planner.

Every run would still build the full current PR hunk set. New or changed hunks are always analyzed normally. Previously reviewed, unchanged hunks may be reused only after an auxiliary planner evaluates whether the latest push changes their meaning.

The planner would not produce findings. It would decide which previously reviewed hunks need fresh analysis.

Inputs could include:

  • latest push diff
  • current full PR file/hunk inventory
  • previous Warden run summary for the PR
  • prior hunk review records, including file, range, hunk summary, and finding/no-finding outcome
  • PR title/body and changed file list
  • skill name and short skill purpose

Example structured output:

{
  "reanalyze": [
    {
      "path": "packages/mcp-core/src/server.ts",
      "hunkId": "abc123",
      "reason": "Latest push changes the tool result shape consumed by this dispatch path."
    }
  ],
  "reuse": [
    {
      "path": "openspec/changes/issue-linking/proposal.md",
      "hunkId": "def456",
      "reason": "Latest push only refactors implementation helpers and does not affect this design-doc hunk."
    }
  ],
  "fullRescan": false
}

Safety requirements

This optimization must be conservative:

  • full PR remains the logical review scope
  • new or changed hunks are always analyzed
  • existing unresolved findings are still evaluated against current head
  • if the planner is uncertain, malformed, incomplete, low confidence, or times out, fall back to full rescan
  • if the latest push is broad or hard to reason about, fall back to full rescan
  • logs must clearly show fresh, reused, invalidated, and full-rescan decisions

The goal is to reduce repeated work without introducing a deterministic path allowlist or making users maintain invalidation rules.

Open questions

  • What compact hunk-review record should Warden persist?
  • Should this live in GitHub Actions artifacts, Warden service state, or both?
  • How should we summarize prior clean hunks so the planner has enough context without re-reading everything?
  • What threshold should trigger full rescan instead of partial reuse?
  • Should this be default behavior once mature, or gated behind a config flag first?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions