Skip to content

fix: TASK-357 — TASK-322 Wave 4 lows (10 findings) + transcription tests#3603

Merged
alekspetrov merged 5 commits into
mainfrom
pilot/TASK-357-wave4-lows
Jun 15, 2026
Merged

fix: TASK-357 — TASK-322 Wave 4 lows (10 findings) + transcription tests#3603
alekspetrov merged 5 commits into
mainfrom
pilot/TASK-357-wave4-lows

Conversation

@alekspetrov

Copy link
Copy Markdown
Collaborator

Context

Closes the final tranche of the TASK-322 security/code audit — the 13 deferred low-severity findings (Wave 4). All were re-verified against main (ab15125) before fixing per the audit gate; none had been incidentally fixed by Waves 2–3, so all 10 distinct findings are addressed here.

What changed

Leak evictions (autopilot, alerts)

  • B6a controller.goremovePR now evicts recordedMerges alongside activePRs/prFailures (was never deleted → unbounded over daemon lifetime).
  • B6b ci_monitor.go — grace-expired terminal paths now delete(discoveryStart, sha); previously only the "checks found" path evicted, leaking every no-CI SHA + superseded commit.
  • E7 engine.go — added retryLastSeen + evictStaleRetryTrackers (24h TTL) on the existing checkStuckTasks ticker, mirroring fix(alerts): task_stuck floods Slack — progress never emitted, per-rule cooldown rotates, no orphan cleanup #2204 orphan eviction; abandoned/escalated sources no longer leak.

Error-handling hygiene (executor, orchestrator, discord)

  • G1 worktree.goCleanupOrphanedWorktrees returns (removedCount, freedBytes, err) with err reserved for real failures, instead of abusing a non-nil error to signal success (a trap for idiomatic callers). 4 call sites + 3 tests updated.
  • A4b runner.go — clean up the MkdirTemp hook dir on the WriteEmbeddedScripts failure branch (was leaking for the process lifetime).
  • G2 bridge.gorunPython wraps with %w not %v so callers can errors.Is(context.DeadlineExceeded) / errors.As(*exec.ExitError).
  • G3 transport.go — discord heartbeat logs the WriteJSON error and returns to trigger immediate reconnect instead of discarding it.

Bug + diagnosability (executor, github)

  • A4a parallel.go — research subagent passes --model as two argv elements ("--model", name) instead of one joined "--model haiku" string the CLI parsed as a single unknown flag. Subagents were silently running on the default model. (tagged bug · conf high)
  • board-low client.goExecuteGraphQL aggregates all gqlResp.Errors (message + type + path) instead of only Errors[0]. Partial-data tolerance stays on the TASK-319 board track per the audit doc.

Test coverage (transcription)

  • G4 internal/transcription 0 → 3 test files: whisper_api error/parse/empty-transcript paths (httptest + URL-rewrite RoundTripper), setup status, Service primary/fallback routing.

Acceptance

  • go build ./...
  • make lint ✅ 0 issues
  • Tests ✅ all changed packages pass (autopilot, alerts, executor, orchestrator, discord, github, transcription, cmd/pilot)

Re-audit + ledger tick-off and TASK-357 archival to follow on merge — that closes the entire TASK-322 audit.

B6a: delete recordedMerges entry in Controller.removePR alongside
activePRs/prFailures — recordMergeSuccess set it but nothing deleted it.

B6b: delete discoveryStart[sha] on CIMonitor grace-expired terminal paths
(no-CI and commit-status branches); previously only the checks-found path
evicted, leaking every no-CI SHA and superseded intermediate commit.

E7: add retryLastSeen TTL map + evictStaleRetryTrackers sweep (24h) to the
existing checkStuckTasks ticker, mirroring GH-2204 orphan eviction for
taskLastProgress; abandoned/escalated sources no longer leak forever.
G1: CleanupOrphanedWorktrees now returns (removedCount, freedBytes, err) with
err reserved for real failures, instead of abusing a non-nil error to signal a
successful N-orphan cleanup (a trap for idiomatic if err!=nil callers). Updated
4 call sites + 3 tests.

A4b: clean up the MkdirTemp hook dir on the WriteEmbeddedScripts failure branch
(runner.go) — it previously logged and fell through, leaking the temp dir for
the process lifetime.

G2: orchestrator/bridge.go runPython wraps with %w not %v so callers can
errors.Is(context.DeadlineExceeded) / errors.As(*exec.ExitError).

G3: discord heartbeat logs the WriteJSON error and returns to trigger immediate
reconnect, instead of silently discarding it.
…l, coverage

A4a: research subagent now passes --model as TWO argv elements ("--model",
name) instead of one joined "--model haiku" string the claude CLI parsed as a
single unknown flag — subagents were silently running on the default model.

board-low: ExecuteGraphQL aggregates ALL gqlResp.Errors (message + type + path
via GraphQLError.String) instead of surfacing only Errors[0], so multi-error
Projects V2 responses are diagnosable. Error prefix unchanged; partial-data
tolerance stays on the TASK-319 board track.

G4: add table-driven tests for internal/transcription (0 -> 3 test files):
whisper_api error/parse/empty-transcript paths via httptest + URL-rewrite
RoundTripper, setup status, and Service primary/fallback routing.
@codecov-commenter

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 26.66667% with 55 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
cmd/pilot/main.go 0.00% 14 Missing ⚠️
cmd/pilot/commands.go 0.00% 10 Missing ⚠️
internal/alerts/engine.go 41.17% 9 Missing and 1 partial ⚠️
internal/adapters/github/client.go 50.00% 5 Missing and 2 partials ⚠️
internal/executor/parallel.go 0.00% 6 Missing ⚠️
internal/adapters/discord/transport.go 0.00% 4 Missing ⚠️
internal/executor/runner.go 0.00% 2 Missing ⚠️
internal/executor/worktree.go 66.66% 1 Missing ⚠️
internal/orchestrator/bridge.go 0.00% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@alekspetrov alekspetrov merged commit cc30c4d into main Jun 15, 2026
3 checks passed
@alekspetrov alekspetrov deleted the pilot/TASK-357-wave4-lows branch June 15, 2026 17:45
alekspetrov added a commit that referenced this pull request Jun 15, 2026
- Roadmap: mark Wave 4 complete (PR #3603, squash cc30c4d) and TASK-322
  audit CLOSED; record the one board-GraphQL partial-data tolerance
  carry-over to the TASK-319 board track.
- Archive gh-3456 task doc with corrected status (merged via PR #3459 in
  v2.169.0, not 'committed on branch').
alekspetrov added a commit that referenced this pull request Jun 15, 2026
… TASK-365 (#3608)

- README Current State: lead with v2.186.9 (TASK-357 Wave 4 closes the TASK-322
  audit via #3603; flaky-test de-flake shipped by Pilot via #3606), soften the
  v2.186.8 restart note to past tense, collapse the archived v2.186.0/.1 bullets
  to keep the block lean.
- Archive TASK-365 (de-flake task doc) — shipped in v2.186.9.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants