
feat(cockpit): aimock harness library + per-example e2e (Phase 2)#356

Merged: blove merged 15 commits into main from claude/aimock-harness-lib on May 16, 2026

blove commented May 16, 2026

Summary

Phase 2 of the cockpit aimock e2e plan. Restructures the harness so each cockpit example owns its own e2e dir next to its Angular app, backed by a shared internal library at `libs/internal/aimock-harness`.

  • New library `@ngaf-internal/aimock-harness` (internal-only, not published) exporting `createGlobalSetup`, `sendPromptAndWait`, `startAimock`, and a default global-teardown.
  • Migrated the Phase 1 streaming spec from `apps/cockpit/e2e/` to `cockpit/langgraph/streaming/angular/e2e/`.
  • Added `c-tool-calls` as the first new-pattern example, with a full multi-turn fixture (parent tool_call → tool result → continuation). The spec asserts that the chat-tool-calls UI primitive activates and that the continuation surfaces flight data.
  • Deleted `apps/cockpit/e2e/` entirely; dropped its e2e target from `apps/cockpit/project.json`.
  • CI `Cockpit — e2e` job now runs `nx run-many --target=e2e --projects=cockpit-*-angular --parallel=1`.

Sits on Phase 1 (#349) + the c-* aviation refactor (#347 + #350).

Helper improvement (multi-turn safety)

Mid-implementation discovery: `sendPromptAndWait` originally waited for a per-message `data-streaming="false"` attribute, which races on tool-call flows: the first finalized assistant bubble (the tool-call chip) satisfies the wait before the continuation bubble streams in. The fix is to wait for the send button to flip back from "Stop generating" to "Send", an agent-level idle signal that survives multi-turn flows. The streaming spec (a single LLM call) still passes; the tool-calls spec is now stable across 3/3 runs.
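As an illustration of the agent-level idle wait described above, here is a minimal sketch. It is not the helper's actual code: `waitForAgentIdle`, `ReadLabel`, and the option names are hypothetical, and how the label is read from the page (for example through a Playwright locator) is left to the caller.

```typescript
// Hypothetical reader: returns the send button's current label.
// In a Playwright spec this would wrap whatever call reads the button's label.
type ReadLabel = () => Promise<string | null>;

interface IdleWaitOptions {
  idleLabel?: string; // label shown when the agent is idle (assumed "Send")
  timeoutMs?: number; // overall budget for the wait
  pollMs?: number; // delay between reads
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Polls the button label until it shows the idle label.
// Unlike a per-message check, this only resolves once the whole
// multi-turn exchange has finished and the button has flipped back.
async function waitForAgentIdle(
  readLabel: ReadLabel,
  { idleLabel = 'Send', timeoutMs = 30_000, pollMs = 100 }: IdleWaitOptions = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const label = await readLabel();
    if (label === idleLabel) return;
    if (Date.now() >= deadline) {
      throw new Error(`agent still busy after ${timeoutMs}ms (last label: ${label})`);
    }
    await sleep(pollMs);
  }
}
```

A real helper may also want to confirm that the busy label was observed at least once after sending, so that a read taken before the UI reacts is not mistaken for completion.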

Path-alias note for reviewers

During implementation, TypeScript path aliases turned out to resolve under vitest and `tsc` but not at Playwright config-load time in this repo. The `@ngaf-internal/aimock-harness` alias is kept (it helps `tsc` typechecking), but per-example Playwright configs use relative imports (`../../../../../libs/internal/aimock-harness/src`). Investigating a Playwright-side fix (e.g., `tsconfig-paths/register`) is out of scope for this PR; relative imports work today.

Test plan

  • Library vitest suite green (5 tests: runner + helpers)
  • streaming spec passes after migration (1/1)
  • c-tool-calls spec passes 3/3 stability runs
  • `nx run-many --target=e2e --projects=cockpit-*-angular --parallel=1` runs both green locally
  • No production code touched (only harness lib, per-example e2e dirs, project.json e2e targets, CI workflow)
  • CI green on this PR

Out-of-scope follow-ups noted during implementation

Spec: `docs/superpowers/specs/2026-05-15-cockpit-aimock-harness-lib-design.md`
Plan: `docs/superpowers/plans/2026-05-15-cockpit-aimock-harness-lib.md`

blove added 13 commits May 15, 2026 20:44
…le layout

Phase 1's single-globalSetup pattern doesn't scale to 15+ cockpit
examples. Phase 2 introduces libs/internal/aimock-harness with a
createGlobalSetup factory and migrates per-example e2e dirs to live
next to each example's Angular app. Streaming gets migrated; c-tool-calls
lands as the first new-pattern example. Future phases each add one
example as a small additive PR.

11 tasks. Task 0 de-risks path-alias resolution at Playwright runtime
(falls back to relative imports if aliases don't work). Tasks 1-5
scaffold + implement the library. Task 6 wires the alias (or skips if
relative imports needed). Tasks 7-8 migrate streaming + add c-tool-calls.
Tasks 9-10 delete the old layout + update CI. Task 11 verifies + ships.

…r sequential CI runs

CI's sequential per-example e2e loop hit `OSError: Port 8123 is already in
use` on the second run. Three compounding causes:

1. The factory spawned langgraph and Angular non-detached, so SIGTERM to
   the `uv`/`npx` parent didn't propagate to the actual server children
   (`python langgraph dev`, `node nx serve`). Process tree survived
   teardown.
2. The teardown's "wait for port free" did a TCP connect-refused check,
   not a real bind() check. langgraph's _is_port_available does a real
   bind, which fails on TIME_WAIT sockets that a connect-refused check
   doesn't surface.
3. Even with both fixed, TIME_WAIT sockets on 8123 from the first run's
   client connections (Playwright + Angular both opened many) blocked
   langgraph's bind() on the second run for far longer than the 5s sleep
   between targets.
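The difference between the two probes in cause 2 can be sketched with Node's `net` module. This is an illustrative sketch, not the harness's code: `connectRefused` and `canBind` are hypothetical names, and the signature given to `waitForPortFree` here (port, timeout, poll interval) is an assumption.

```typescript
import * as net from 'node:net';

// Connect probe: reports "free" when nothing accepts a connection.
// Any connection error is treated as "no listener".
function connectRefused(port: number, host = '127.0.0.1'): Promise<boolean> {
  return new Promise((resolve) => {
    const socket = net.connect({ port, host });
    socket.once('connect', () => {
      socket.destroy();
      resolve(false); // something is listening
    });
    socket.once('error', () => resolve(true));
  });
}

// Bind probe: reports "free" only when this process can actually
// bind and listen on the port, then releases it again.
function canBind(port: number, host = '127.0.0.1'): Promise<boolean> {
  return new Promise((resolve) => {
    const server = net.createServer();
    server.once('error', () => resolve(false)); // e.g. EADDRINUSE
    server.listen({ port, host }, () => server.close(() => resolve(true)));
  });
}

// Polls the bind probe until the port is usable or the budget runs out.
async function waitForPortFree(port: number, timeoutMs = 10_000, pollMs = 200): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (!(await canBind(port))) {
    if (Date.now() >= deadline) {
      throw new Error(`port ${port} still not bindable after ${timeoutMs}ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
}
```

One caveat worth verifying: Node (via libuv) typically sets SO_REUSEADDR on POSIX listening sockets, so this probe may succeed on a port where a bind without that option would still fail on TIME_WAIT. That is consistent with cause 3, where a better probe alone was not enough.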

Fixes:
- spawn(detached: true) + process.kill(-pid, 'SIGKILL') in teardown to
  kill the whole process group.
- waitForPortFree now does a real bind() check (mirrors langgraph's check).
- Each example pins its own langgraph port: streaming keeps 8123,
  tool-calls offsets to 8124. Angular proxy.conf.json target updated to
  match. Future examples pick the next unused port. Decouples examples
  from each other — TIME_WAIT on one example's port no longer blocks the
  next example.
- CI loop replaced with explicit shell loop (was nx run-many --parallel=1)
  for clearer per-example failure attribution and a 5s settle between
  targets.
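The process-group fix in the first bullet can be sketched as follows. This is a minimal POSIX-only illustration, not the factory's actual code; `spawnInOwnGroup` and `killProcessGroup` are hypothetical names.

```typescript
import { spawn, ChildProcess } from 'node:child_process';

// detached: true makes the child the leader of a new process group on POSIX,
// so anything it spawns (the real server process) shares that group id.
function spawnInOwnGroup(command: string, args: string[]): ChildProcess {
  return spawn(command, args, { detached: true, stdio: 'ignore' });
}

// A negative pid targets the whole process group rather than one process.
// Returns false when there was nothing left to signal.
function killProcessGroup(child: ChildProcess): boolean {
  if (child.pid === undefined) return false; // spawn failed, nothing to kill
  try {
    process.kill(-child.pid, 'SIGKILL');
    return true;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === 'ESRCH') return false; // group already gone
    throw err;
  }
}
```

Signalling only the wrapper's pid leaves grandchildren running, which is the failure described in cause 1; signalling the group reaches them as well. Negative pids are not supported on Windows, so a cross-platform harness would need a different strategy there.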

Verified locally: 2-run sequential loop passes consistently.
blove merged commit 382cdb2 into main on May 16, 2026
16 checks passed