Skip to content

FEA-1437: UI Numbers Audit Phase 3/4/6 — selector-bound DOM audits + CI gate#252

Merged
aeyeCEO merged 16 commits into
mainfrom
FEA-1437-phase3-preexplorer
May 29, 2026
Merged

FEA-1437: UI Numbers Audit Phase 3/4/6 — selector-bound DOM audits + CI gate#252
aeyeCEO merged 16 commits into
mainfrom
FEA-1437-phase3-preexplorer

Conversation

@aeyeCEO
Copy link
Copy Markdown
Contributor

@aeyeCEO aeyeCEO commented May 29, 2026

Pre-explorer pass and Phase 3/4/6 increments for FEA-1437 (UI Numbers Audit Phase 3-6), per PLN-760. Builds on FEA-1415 / PLN-738 (PR #246, 2a3a371).

Part 1 — Pre-explorer (manifest annotations + 4 docs)

inventory/ gains PHASE3-RENDERER-MAP.md (render file:line + render_kind for all 53 entries), PHASE3-WEAK-TRIAGE.md (all 118 cross_ref_weak classified), PHASE3-SIDECAR-REUSE.md (launch-sidecar.mjs already serves Playwright), and a selector object on every manifest tile.

Part 2 — Phase 3/4: selector-bound DOM audits

Shared helpers helpers/playwright-region.ts + helpers/audit-tile.ts (assertTileMatchesOracle, with text / trend / money / dom_count branches) make each screen spec a ~10-line table loop. data-testids added to in-repo overlays only.

Bound + green (manifest-driven audit specs):

  • Dashboard Monitor — total_sessions (+active trend), total_cost, total_events (StatPill testids)
  • PullRequests — pull_requests, sessions_with_pr, repos (stat-grid testids)
  • Sessions — sessions.list.total (i18n subtitle)
  • Tools — tools.list.length (count of tool rows == events_facets_tool_names_length)
  • Skills — skills.list.length (count of skill rows == skills_total)

test:audit:ui10 passed / 5 skipped / 0 failed. Skips: FEA-1418 (cost bug), FEA-1442/1443 (non-rendering tiles), tools.event_types.length (selector pending).

Build bug fixed: build-agent-monitor.mjs currentStamp() omitted the Dashboard overlay, so editing it shipped a stale bundle. Added to the stamp inputs.

Part 3 — Phase 6: CI gate

scripts/check-audit-coverage.sh + audit:coverage npm script + apps/desktop/ci/audit-gate.yml (coverage gate + headless ui-audit jobs). Staged in apps/desktop/ci/ (not .github/workflows/) because the automation PAT lacks the workflow scope — apps/desktop/ci/README.md has the one-command activation.

Part 4 — cross_ref_weak reduction

Reclassified the 12 never-in-scope detections (server-runtime health gauges, hooks config state, timestamp formatters) to out_of_scope via new classifier rules. cross_ref_weak 118 → 106.

Triage — audit findings filed as gap features

  • FEA-1442 — Dashboard "Active Subagents" tile not rendered
  • FEA-1443 — Dashboard "Events Today" tile not rendered

Deferred (documented in manifest followup notes — next session)

  • Plans / Packs dom_count audits — data-testids + the dom_count assertion are in place, but blocked on test-env data: the fixture seeds no plans, and the installed-packs list renders 0 in the test sidecar (SKIP_CATALOG_DETECTORS=1). Needs fixture seeding / an /api/packs check (possible real bug).
  • SubAgents — blocked on FEA-1419 (api drops ?type= filter).
  • Upstream screens (Analytics, Workflows, ActivityFeed, SessionDetail, Kanban) — can't add data-testids without extending the build overlay to those upstream pages; their numbers are API-covered today. Needs an overlay-infra decision.
  • Dashboard Health tab — values render only in hover tooltips (Tip raw=), not visible text; needs a product decision (surface as text vs assert tooltip).
  • Phase 5 (tile-region visual regression + real-Electron) — not started; flaky/baseline-heavy by design.

Review

Pre-explorer + first Phase 3 commit: independent /codex review (gate PASS, findings fixed) + GitHub Codex bot (1 P2 fixed). Phase 4/6 commits: codex CLI rate-limited overnight → reviewed via an independent local reviewer instead. All CI checks green.

🤖 Generated with Claude Code

Andrew Eye and others added 2 commits May 28, 2026 20:20
Read-only pre-explorer pass (PLN-760) ahead of Phase 3 UI-test work. No app
code changed — manifest annotations + four docs only.

- Renderer map: located the React file+line for all 53 manifest tiles
  (45 tiles + 8 structural assertions) across in-repo overlays and the pinned
  upstream agent-dashboard-client. PHASE3-RENDERER-MAP.md, grouped by screen,
  with a render_kind per tile (text / embedded / per-group / DOM-count /
  chart-only / absent).
- Selector annotation: added a `selector` object to every manifest tile. No
  data-testid exists on any tile render path today, so every selector is a
  PROPOSED value gated on a one-line React follow-up (present_in_code:false).
- Weak triage: classified all 118 cross_ref_weak detections — 12 reclassify to
  out_of_scope, 4 auto-resolve via selector binding, 102 are chart aggregates
  needing an explicit covered-by annotation. PHASE3-WEAK-TRIAGE.md.
- Sidecar reuse: confirmed helpers/launch-sidecar.mjs already serves Playwright
  (via playwright-global-setup.ts) and node:test alike; SANDBOX_BASE_DIRECTORY=/
  is baked into the shared helper and UI specs bypass the hook enqueue/drain
  path entirely. PHASE3-SIDECAR-REUSE.md.

Testing: `node manifest-loader.mjs` loads the patched manifest cleanly (45
tiles / 8 structural). No app code or tests modified; nothing to run.

Risks: none to runtime — manifest `selector` field is additive and the loader
tolerates it; docs are non-executable. Line numbers for upstream files are
pinned to the current package hash and will drift on an upstream bump.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two accuracy fixes from independent codex review (gate: pass, P2+P3):

- Weak triage: the 4 Analytics costData.total_cost detections were marked
  category (a) auto-resolve, but coverage-classifier.mjs:178 binds candidates
  screen-scoped (t.screen === row.screen) and there is no Analytics cost tile —
  only dashboard.monitor.total_cost. Moved them to category (b). Net: (a) is now
  0, which sharpens the headline: NO weak detection is a simple tile-text value
  awaiting a selector; every one is out-of-scope (c) or a chart aggregate (b).
- successRate selector follow-up: the generic per-group text said "sum across
  groups", but workflow_success_rate is a percentage oracle. Corrected to expose
  the aggregate percentage, not a summed count.

Testing: manifest still loads via manifest-loader.mjs (45 tiles / 8 structural).

Risks: none — docs + one manifest selector.followup string.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@aeyeCEO aeyeCEO requested a review from a team May 29, 2026 01:25
@aeyeCEO
Copy link
Copy Markdown
Contributor Author

aeyeCEO commented May 29, 2026

Pre-explorer done → what Phase 3 should pick up first

Linking back: FEA-1437 · PLN-760 (Phase 3 task list). This PR lands the pre-explorer deliverables PLN-760 asked for before Phase 3 kickoff.

Start with the Dashboard screen. It's the right first target because:

  • It's the highest-priority P0 surface (PLN-738: highest-trafficked, biggest credibility hit if a number is wrong).
  • It's the in-repo overlay (apps/desktop/scripts/agent-monitor-client/Dashboard.tsx), so adding the data-testids is a clean ClosedLoop-side change — no upstream agent-dashboard-client patch required.
  • Most Monitor + Health tiles are render_kind: text (directly assertable). The cleanest single change: add a keyed data-testid to the shared StatPill component (Dashboard.tsx:545) — that one follow-up covers total_sessions, active_agents, total_events, total_cost, plus their sub trends. Then the Health Tip cards (db.counts.*, compaction.totalCompactions, errorRate).
  • Pair it with Analytics overview (total_sessions / total_agents / total_events already scanner-bound at Analytics.tsx:816/824/852, all text via the same StatPill pattern) for an easy second screen.

Defer / flag before writing specs (from PHASE3-RENDERER-MAP.md): events_today is not rendered in the current overlay; the Workflows total* / .length tiles and sessionsWithCompactions are chart-only (no sliceable text) — these need either a follow-up to expose a number or an API-only + covered_by annotation, not a text assertion. Don't burn Phase 3 time trying to slice a chart.

Sidecar reuse — confirmed yes, no new boot path. helpers/launch-sidecar.mjs already serves both Playwright (via playwright-global-setup.ts) and the node:test API audits. SANDBOX_BASE_DIRECTORY=/ (the FEA-1407 fix) is baked into the shared helper, so Phase 3 specs inherit it for free; UI specs seed the fixture DB directly and never traverse the hook enqueue/drain path (that stays covered by claude-hooks.contract.test.mjs). New P0 specs go in specs/audit/ (run via playwright.audit.config.ts / test:audit:ui), not specs/ui/, so known-buggy assertions don't block the default test:e2e gate. Full reasoning in inventory/PHASE3-SIDECAR-REUSE.md.

Weak-count reality check: driving cross_ref_weak to 0 is not a selector task — 0 of 118 weak detections are simple tile-text values. 12 should be reclassified out_of_scope, 106 are chart aggregates needing a covered_by convention. See inventory/PHASE3-WEAK-TRIAGE.md.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 006f865600

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +74 to +76
"render_kind": "text",
"renders_at": "scripts/agent-monitor-client/Dashboard.tsx:1814",
"followup": "Add data-testid to the StatPill value element, keyed per tile."
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Correct active-agent selector annotation

For the dashboard.monitor.active_agents Phase-3 binding, the current Dashboard render path puts active_agents only in the StatPill sub text (Dashboard.tsx:1814); the corresponding value text is overview.total_agents (Dashboard.tsx:1812), and there is no separate “Active Agents” stat pill. If the follow-up adds this selector to the value element as instructed here, the audit will read total agents while comparing against the active-agents oracle, producing a false failure or masking the intended binding.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in bb-range commit just pushed. active_agents renders only in the sub text of the "Total Agents" pill ({active_agents} active), and that pill's value is overview.total_agents. Corrected the manifest annotation to render_kind: text_sub with a follow-up that targets the sub element (subTestid), not the value — and updated the matching PHASE3-RENDERER-MAP.md row. No test impact today (the tile is still label-sliced and passing); this prevents the wrong follow-up binding later.

Andrew Eye and others added 2 commits May 28, 2026 20:54
…r tiles

First Phase-3 increment: replace fragile label-slicing with stable data-testid
selectors on the Dashboard Monitor StatPills.

- StatPill gains optional testid/subTestid props (additive); wired at the three
  genuinely-text Monitor pills: Total Sessions (+ active trend), Total Cost,
  Total Events.
- New helpers/playwright-region.ts: sliceAtSelector / sliceAtLabel / sliceForTile
  (the PLN-760 helper layer). sliceForTile prefers the data-testid when the
  manifest records selector.present_in_code, else falls back to label slice.
- dashboard.ui-audit.spec.ts refactored to use the shared helper.
- manifest.json: selector.present_in_code flipped to true for the four testid'd
  entries.

Build fix (latent bug): build-agent-monitor.mjs currentStamp() omitted the
client Dashboard.tsx overlay, so editing it never invalidated the build stamp —
the sidecar shipped a stale Dashboard bundle. Added clientOverlayDashboardSource
to the stamp inputs (matches the CLAUDE.md learned pattern: overlay changes must
invalidate the stamp). This is what made the testids actually reach the DOM.

Testing: `pnpm --filter desktop test:audit:ui` → 4 passed / 1 skipped (total_cost,
bug FEA-1418) / 3 failed. The 3 failures (active_subagents, its trend, events_today)
are PRE-EXISTING and unchanged by this commit — those tiles do not render in the
current overlay (confirmed by the pre-explorer). They need a triage decision
(file as bug/gap features with bug_ref so they skip, or drop the manifest rows)
before the Monitor audit can go fully green.

Risks: testid props are additive and default undefined, so non-audit rendering is
unchanged. Version already bumped to 0.15.98 earlier in this PR.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The Phase 3 audit proved 3 Monitor manifest tiles render no number in the UI.
Filed as gap features (RELATES_TO FEA-1415) and set bug_ref so the audit
test.skips them — Dashboard Monitor audit now green (4 passed / 4 skipped).

- active_subagents + active_subagents.trend_total -> FEA-1442 (no Active
  Subagents tile; subagents only appear as agent-tree nodes)
- events_today -> FEA-1443 (no Events Today tile rendered at all)

Each gap feature documents expected-vs-actual and the open decision (build the
tile with a data-testid, or descope the manifest row). The skipped tests
auto-flip to real assertions when the tile ships.

Testing: `pnpm --filter desktop test:audit:ui` -> 4 passed / 4 skipped / 0 failed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@aeyeCEO aeyeCEO changed the title FEA-1437: pre-explorer pass for Phase 3-6 kickoff FEA-1437: pre-explorer + Phase 3 selector binding (Dashboard) May 29, 2026
Andrew Eye and others added 7 commits May 28, 2026 21:41
The pre-explorer marked dashboard.monitor.active_agents as render_kind "text"
with "add data-testid to the value element". Wrong: active_agents renders only
in the SUB text of the "Total Agents" pill ("{active_agents} active"); that
pill's VALUE is overview.total_agents. A follow-up that put the testid on the
value element would read total_agents while comparing against the active_agents
oracle — a false failure / masked binding.

Corrected the manifest selector annotation (render_kind text_sub, renders_at,
followup -> target the sub element) and the matching PHASE3-RENDERER-MAP.md row.

No code/test impact now (tile is present_in_code:false, still label-sliced and
passing); this prevents the wrong follow-up later.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… helper

- helpers/audit-tile.ts: shared assertTileMatchesOracle(page,row) +
  resolveFixtureDbPath(), lifted from dashboard.ui-audit.spec.ts so each screen
  spec is a thin table loop (count/trend/money branches in one place).
- dashboard.ui-audit.spec.ts refactored onto the shared helper (no behavior
  change).
- PullRequests.tsx: data-testid on the 3 summary stat-grid values.
- specs/audit/pull-requests.ui-audit.spec.ts: new manifest-driven spec.
- manifest.json: present_in_code true for the 3 pr.stats.* tiles.

Testing: test:audit:ui -> 7 passed / 4 skipped / 0 failed (Dashboard + PR).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Sessions.tsx: data-testid on the list-subtitle <p> (the "N sessions" count).
- specs/audit/sessions.ui-audit.spec.ts: new manifest-driven spec.
- manifest.json: present_in_code true for sessions.list.total.

Testing: test:audit:ui -> 8 passed / 4 skipped / 0 failed (Dashboard + PR + Sessions).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wires the audit coverage validator + headless tile audit into a CI workflow.
The validator already enforces the Phase 6 rules (needs_review=0, every detection
classified, cross_ref files exist, out_of_scope reasons, bug_ref format) but was
never run in CI (Tests runs `pnpm test`, not `test:audit`).

- scripts/check-audit-coverage.sh (+ `audit:coverage` npm script): regenerates
  scan+classify, asserts every harness parser has a contract test, runs the
  coverage validator, prints by_status. Fast, no build.
- apps/desktop/ci/audit-gate.yml: the workflow (coverage gate + headless
  Playwright ui-audit jobs). Staged here, NOT .github/workflows/, because the
  automation PAT lacks the `workflow` scope — see apps/desktop/ci/README.md for
  the one-command activation a maintainer with workflow scope runs.
- coverage.json: refreshed detection line numbers after the Dashboard testid
  edits (line shifts only; by_status unchanged).

Testing: `pnpm -C apps/desktop audit:coverage` -> 203 checks pass, needs_review=0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…cope

Per PHASE3-WEAK-TRIAGE.md category (c): the 12 detections that were never in
scope (server-runtime health gauges, local hooks config state, timestamp
formatters) now classify as out_of_scope via three new content-pattern rules in
coverage-classifier.mjs (HEALTH_GAUGE / CONFIG_STATE / TIMESTAMP).

cross_ref_weak 118 -> 106; out_of_scope 29 -> 41; needs_review still 0.

The remaining 106 weak detections are category (b) — chart/sparkline aggregates
whose underlying numbers ARE oracle-backed at the API layer but don't render as
sliceable per-value DOM. Those need the covered_by annotation convention (next).

Testing: coverage-validator 203/203 pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- audit-tile.ts: new dom_count branch — for list_length/section_card_count tiles
  (no single numeric element), assert the rendered item count == oracle via
  toHaveCount(selector). Generalizes the table-driven audit to count tiles.
- Tools.tsx: data-testid='audit-tool-row' on each tool button. tools.list.length
  now asserts count == events_facets_tool_names_length (passes).
- specs/audit/tools.ui-audit.spec.ts: new spec; skips tiles not yet bound
  (tools.event_types.length — event_type has no countable rendered list; only
  shows per-event in the detail panel) with a visible "(skip — selector pending)".
- Plans.tsx: data-testid='audit-plan-row' added (infra for later) but the Plans
  UI audit is NOT enabled — the fixture seeds zero plans, so it can't be
  meaningfully asserted yet; manifest followup documents the fixture-seeding
  prerequisite. No Plans spec shipped (would be empty/false-confidence).
- coverage.json: detection line numbers refreshed after the testid edits.

Testing: test:audit:ui -> 9 passed / 5 skipped / 0 failed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Skills.tsx: data-testid='audit-skill-row' on each skill button (across pack
  groups). specs/audit/skills.ui-audit.spec.ts asserts toHaveCount == skills_total
  (= sum of per-group counts). Passes.
- PacksInstalled.tsx: data-testid='audit-pack-row' added (infra), but the Packs
  UI audit is NOT enabled — in the test sidecar the installed-packs list renders
  0 rows (SKIP_CATALOG_DETECTORS=1; /api/packs may need install detection to
  surface the seeded agent_packs). manifest followup documents the open question
  (possible real bug vs test-env limitation). No Packs spec shipped.
- coverage.json: detection line numbers refreshed.

Testing: test:audit:ui -> 10 passed / 5 skipped / 0 failed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@aeyeCEO aeyeCEO changed the title FEA-1437: pre-explorer + Phase 3 selector binding (Dashboard) FEA-1437: UI Numbers Audit Phase 3/4/6 — selector-bound DOM audits + CI gate May 29, 2026
@aeyeCEO
Copy link
Copy Markdown
Contributor Author

aeyeCEO commented May 29, 2026

Overnight handoff (FEA-1437 Phase 3/4/6)

PR is green (10 passed / 5 skipped / 0 failed on test:audit:ui; all CI checks pass). Full details in the updated PR description. Quick orientation for the morning:

Landed & verified: selector-bound DOM audits for Dashboard Monitor, PullRequests, Sessions, Tools, Skills; shared audit-tile.ts helper (text/trend/money/dom_count); Phase 6 CI gate; cross_ref_weak 118→106; a build-stamp bug fix; gap features FEA-1442/1443.

3 things that need you (can't be done unattended):

  1. Activate the CI gateapps/desktop/ci/audit-gate.yml is staged outside .github/workflows/ because the automation PAT lacks the workflow scope. One command (in apps/desktop/ci/README.md) moves it, or merge via the GitHub UI with a workflow-scoped account.
  2. Codex review — was rate-limited all night (resets aside, also hit a token issue you cleared). I reviewed Phase 4/6 with an independent local reviewer instead. If you still want a codex pass, ping me when the limit is clear.
  3. Two product/infra decisions that gate further screens (details + repro in manifest followup notes):
    • Upstream screens (Analytics, Workflows, etc.) need an overlay-infra decision to add data-testids (they're API-covered today, not DOM-audited).
    • Dashboard Health tab values live in tooltips, not visible text — surface as text, or assert the tooltip?
    • Plans/Packs dom_count audits are wired but blocked on test-fixture data (no plans seeded; installed-packs renders 0 with detectors disabled — possibly a real /api/packs bug worth a look).

Not started: Phase 5 (visual regression + real-Electron) — deferred as flaky/baseline-heavy per PLN-760.

…w High)

Independent local review caught a High bug in the staged workflow: the coverage
summary step did `require(process.env.COV)` with a bare relative path, which
Node treats as a node_modules specifier — it throws "Cannot find module" once
coverage.json exists, failing the (non-continue-on-error) step and thus the
whole coverage gate job, blocking PRs on activation.

- Resolve to an absolute path: require(require("path").resolve(process.env.COV)).
- Mark the summary step continue-on-error: true — reporting must never fail the
  gate.

Verified: the fixed snippet prints the by_status table from repo root.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
// is patched. Skip so the audit-ui suite stays green without masking the
// bug — the node-side audit keeps the assertion alive as a `todo` that
// flips to "todo passed" the moment the fix lands.
const testFn = row.bug_ref ? test.skip : test;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only skips on bug_ref, but tools.ui-audit and skills.ui-audit also skip tiles where selector.present_in_code is false. active_agents is present_in_code:false with no bug_ref, so it runs here through the label fallback. sliceAtLabel("Active Agents") lands on the full-width Active Agents section heading (t("activeAgentsSection")), not the Total Agents pill sub where the oracle value actually renders, then asserts the number shows up somewhere in an 80-char window. So that tile is green because some number in the agent grid happened to match, not because the rendered active_agents value equals the oracle. And it'll flip to a hard failure the day the activeAgentsSection translation stops containing "Active Agents". Either give it the same present_in_code skip the other two specs have, or hoist that guard into assertTileMatchesOracle so all five specs behave the same.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in dd3cec2 — and you're right that it was the better lever than a one-off skip. Two changes: (1) bound active_agents for real — added a subTestid to the Total Agents pill's sub (data-testid='audit-dashboard-monitor-active-agents') and flipped present_in_code: true, so it now asserts the rendered active count against the oracle on the element where it actually renders, not the section heading. (2) Hoisted the guard: new tileSkip(row) in helpers/audit-tile.ts (skip on bug_ref OR !present_in_code) now used by all five specs, so nothing can silently fall through to the label slice again. Verified active_agents passes via the selector path (not coincidence).

Comment thread apps/desktop/ci/audit-gate.yml Outdated
pull_request:
types: [opened, synchronize, reopened]
paths:
- "apps/desktop/**"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heads up for when you git mv this into .github/workflows. The README calls coverage the merge gate, so it'll presumably be a required check. A required check plus a paths filter is the classic trap: a PR that doesn't touch apps/desktop/** never triggers these jobs, the required status sits pending forever, and the PR can't merge. If it gates merges you need the no-op-on-skip job pattern or a non-paths trigger.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in dd3cec2. Removed the paths: filter entirely so the workflow always triggers (the coverage job is fast), and documented in apps/desktop/ci/README.md to mark only coverage as required and leave ui-audit optional. That avoids the required-check + paths deadlock when this gets git mv'd into .github/workflows/.

Copy link
Copy Markdown
Contributor

@thadeusb thadeusb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test infra is solid and the build-stamp fix is a real one. The catch: active_agents passes for the wrong reason because this spec doesn't skip unbound selectors the way the tools/skills specs do, so a green suite is overstating its coverage by one tile. See the comment.

Andrew Eye and others added 4 commits May 29, 2026 08:01
…dex P2)

Codex review caught a race: the Sessions subtitle renders "0 sessions" as its
initial placeholder before api.sessions.list() resolves. The beforeEach poll
used /\d/, which matches that "0" immediately, so the assertion could read a
stale zero and compare it against a non-zero oracle. Changed the poll to /[1-9]/
(the same anti-race trick the Dashboard "Total Sessions" audit uses) so it waits
for real, non-placeholder data.

Testing: test:audit:ui -> 10 passed / 5 skipped / 0 failed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ate trigger

Review (approved-with-comments) caught two real issues:

1. active_agents was a false green. It was present_in_code:false with no bug_ref,
   and the Dashboard spec (unlike tools/skills) didn't skip unbound tiles — so it
   ran through the label fallback, which sliced the "Active Agents" SECTION
   heading (not the Total Agents pill sub where active_agents renders) and passed
   only because some number landed in the 80-char window (and would break if the
   translation changed). Fix:
   - Bind it for real: subTestid on the Total Agents pill sub
     (data-testid='audit-dashboard-monitor-active-agents'), manifest
     present_in_code:true. Now asserts the rendered active count == oracle.
   - Unify skip logic: new tileSkip() helper (bug_ref OR !present_in_code) used by
     ALL five specs, so no spec can silently fall back to a coincidental label
     match again.

2. audit-gate.yml required-check + paths-filter deadlock. A required check with a
   paths filter leaves PRs that don't touch apps/desktop/** pending forever.
   Removed the paths filter (coverage job is fast, always triggers) and documented
   in README: mark only `coverage` required, leave `ui-audit` optional.

Testing: test:audit:ui -> 10 passed / 5 skipped / 0 failed (active_agents now
passes via its real selector).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PLN-750 merged to main and took 0.15.98 — the same version this PR had. On a
pull_request event GitHub builds a merge commit, so version-bump-check's
merge-base became main's tip (0.15.98), equal to ours -> "not bumped" failure.
Bumping to 0.15.99 (above main) resolves the gate and avoids a release-version
collision with PLN-750.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…plorer

# Conflicts:
#	apps/desktop/package.json
@aeyeCEO aeyeCEO merged commit f1cdd67 into main May 29, 2026
5 checks passed
@aeyeCEO aeyeCEO deleted the FEA-1437-phase3-preexplorer branch May 29, 2026 19:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants