test(windows): stabilize advisory loop and bootstrap cases #579

Merged
Astro-Han merged 1 commit into dev from slock/windows-advisory-fix on May 12, 2026

Astro-Han (Owner) commented May 12, 2026

Summary

  • switch the two Windows-sensitive session loop tests to the existing slowIOTimeout budget
  • fix the bootstrap status-hydration error test to wait for the state it actually asserts instead of the unrelated parallel provider_ready branch
  • keep both fixes scoped to test code only; no product abort, retry, or bootstrap behavior changes
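
As a rough illustration of the first bullet, the sketch below shows one way a platform-aware budget like slowIOTimeout could be derived. The review walkthrough on this PR gives the values as 10_000 on Windows and 3_000 otherwise; the helper name slowIOTimeoutFor and its parameter are hypothetical and not taken from the repository.

```typescript
// Hypothetical helper (not from the repository): picks a test timeout budget
// from a platform string such as Node's process.platform.
function slowIOTimeoutFor(platform: string): number {
  // Values follow the PR walkthrough: 10_000 ms on Windows, 3_000 ms otherwise.
  return platform === "win32" ? 10_000 : 3_000
}

// In a test file this would typically be computed once, for example
//   const slowIOTimeout = slowIOTimeoutFor(process.platform)
// and passed as the third (timeout) argument of the test registration.
const examples = ["win32", "darwin", "linux"].map(
  (platform) => [platform, slowIOTimeoutFor(platform)] as const,
)
console.log(examples)
```

Keeping the platform as a parameter makes the choice easy to check on any machine, which is why the sketch does not read the platform directly.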

Why

Two chronic Windows advisory failures were open at the same time: #555 (the two session loop tests) and #546 (the bootstrap status-hydration error test).

These are different failure modes, but both close cleanly in the test layer.

Related Issue

Closes #555.
Closes #546.

Human Review Status

Pending. A human should make the final merge decision after reviewing the final diff and verification evidence.

Review Focus

  • packages/opencode/test/session/prompt-effect.test.ts: only the two named advisory loop tests should move from 3000 to slowIOTimeout
  • packages/app/src/context/global-sync/bootstrap.test.ts: the error-path test should now wait for session_status_state === "error", matching the behavior it actually verifies
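
The bootstrap point can be pictured with a small simulation: when two branches settle independently, waiting on one and asserting on the other is a race. This is a minimal sketch, not the actual test; the waitFor helper, the Store shape, and the timings are all hypothetical stand-ins.

```typescript
// Hypothetical polling helper; the real test's waitFor may behave differently.
async function waitFor(check: () => boolean, timeoutMs = 1_000, intervalMs = 5): Promise<void> {
  const deadline = Date.now() + timeoutMs
  while (!check()) {
    if (Date.now() > deadline) throw new Error("waitFor timed out")
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs))
  }
}

// Hypothetical store shape standing in for the bootstrap store.
type Store = { provider_ready: boolean; session_status_state: "idle" | "error" }

// Two independent branches settle at different times, as parallel bootstrap work might.
function startFakeBootstrap(store: Store): void {
  setTimeout(() => { store.provider_ready = true }, 10)
  setTimeout(() => { store.session_status_state = "error" }, 40)
}

async function demo(): Promise<{ racy: string; stable: string }> {
  // Racy: waits on the unrelated branch, then reads the state under test.
  const a: Store = { provider_ready: false, session_status_state: "idle" }
  startFakeBootstrap(a)
  await waitFor(() => a.provider_ready)
  const racy = a.session_status_state // may still be "idle" at this point

  // Stable: waits for exactly the state the assertion checks.
  const b: Store = { provider_ready: false, session_status_state: "idle" }
  startFakeBootstrap(b)
  await waitFor(() => b.session_status_state === "error")
  const stable = b.session_status_state

  return { racy, stable }
}

demo().then((result) => console.log(result))
```

The second wait cannot return before the asserted state is reached, so the assertion that follows it no longer depends on how the two branches are scheduled.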

Risk Notes

Low. Test-only changes. No runtime or product behavior changes.

How To Verify

Session loop advisory slice: bun --cwd packages/opencode test test/session/prompt-effect.test.ts -t 'cancel interrupts loop and resolves with an assistant message|concurrent loop callers all receive same error result' -> 2 passed
Bootstrap advisory slice: bun --cwd packages/app test src/context/global-sync/bootstrap.test.ts --preload ./happydom.ts -t 'marks session status as error when status hydration fails' -> 1 passed
Opencode typecheck: bun --cwd packages/opencode typecheck -> pass
App typecheck: bun --cwd packages/app typecheck -> pass
Diff check: git diff --check -> clean

Screenshots or Recordings

Not needed. No visible UI changes.

Checklist

  • Human review status is stated above as pending, approved, or not required
  • I linked the related issue, or stated why there is no issue
  • This PR has type, primary area, and priority labels, or I requested maintainer labeling
  • I described the review focus and any meaningful risks
  • I listed the relevant verification steps and the key result for each
  • I did not introduce unrelated refactors, dependencies, generated files, or file changes beyond the stated scope
  • I manually checked visible UI or copy changes when needed, with screenshots or recordings
  • I considered macOS and Windows impact for platform, packaging, updater, signing, paths, shell, or permissions changes
  • I called out docs, release notes, dependencies, permissions, credentials, deletion behavior, generated content, or local file changes when relevant
  • I reviewed the final diff for unrelated changes and suspicious dependency changes
  • I am targeting dev, and my PR title and commit messages use Conventional Commits in English

Summary by CodeRabbit

  • Tests
    • Improved session error handling test reliability
    • Enhanced cross-platform test timeout handling

Review Change Stack

github-actions Bot added the app (Application behavior and product flows) and harness (Model harness, prompts, tool descriptions, and session mechanics) labels May 12, 2026

@github-actions github-actions Bot left a comment

Suggested priority: P2 (includes user-path files (packages/app/src/context/global-sync/bootstrap.test.ts)).

P1/P0 are reserved for maintainer confirmation. Please relabel manually if this is a release blocker, security issue, data-loss risk, or updater/runtime failure.

coderabbitai Bot (Contributor) commented May 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 39737898-282d-4274-afee-b9bfa93bdfb3

📥 Commits

Reviewing files that changed from the base of the PR and between 7d53dd7 and 662cb40.

📒 Files selected for processing (2)
  • packages/app/src/context/global-sync/bootstrap.test.ts
  • packages/opencode/test/session/prompt-effect.test.ts

📝 Walkthrough


This PR addresses Windows test flakiness with two changes: session status error synchronization that waits for the correct state assertion, and platform-aware timeout constants for session loop tests that replace hardcoded millisecond values.

Changes

Windows Test Stability Fixes

| Layer / File(s) | Summary |
| --- | --- |
| Session status error synchronization: packages/app/src/context/global-sync/bootstrap.test.ts | Test synchronization waits for store.session_status_state === "error" instead of provider readiness, aligning the wait condition with the specific assertion that follows. |
| Platform-aware timeout constants: packages/opencode/test/session/prompt-effect.test.ts | Two test timeouts replace the hardcoded 3_000 with the slowIOTimeout variable (10_000 on Windows, 3_000 otherwise), following the established pattern from PR #543. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

  • Astro-Han/pawwork#543: Fixed earlier Windows advisory test timeout with the same slowIOTimeout pattern applied here to prompt-effect.test.ts.
  • Astro-Han/pawwork#575: Also modifies packages/opencode/test/session/prompt-effect.test.ts for test improvements.

Suggested labels

windows, flaky-test, P2, harness

Poem

🐰 Windows runners can be slow and wise,
Timeouts hardcoded cause surprise!
With slowIOTimeout now in place,
Tests run steady at their own pace.
Sync points aligned, no more waits astray—
Stability wins the testing day! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | Title clearly and concisely summarizes the main changes: fixing Windows flakiness in test cases for advisory loop and bootstrap scenarios. |
| Description check | ✅ Passed | Description fully covers required sections: summary of changes, problem context, linked issues, human review status, review focus, risk assessment, verification steps, and completed checklist. |
| Linked Issues check | ✅ Passed | All code changes directly address the two linked issues: #555 updates two session loop tests to use slowIOTimeout, and #546 fixes bootstrap test to wait for the actual asserted error state. |
| Out of Scope Changes check | ✅ Passed | All changes are strictly scoped to test files only; no product code, runtime behavior, or unrelated refactors are present. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates test logic to wait for an explicit error state in bootstrap.test.ts and replaces hardcoded timeouts with slowIOTimeout in prompt-effect.test.ts to improve cross-platform stability. Feedback suggests further replacing the default timeout in the waitFor call with slowIOTimeout to avoid potential flakiness on Windows CI and identifies several other locations in prompt-effect.test.ts where hardcoded timeouts should be updated for consistency.

Comment thread packages/app/src/context/global-sync/bootstrap.test.ts
Comment thread packages/opencode/test/session/prompt-effect.test.ts
Astro-Han added the P2 (Medium priority) label May 12, 2026
Astro-Han merged commit c2fe9a9 into dev May 12, 2026
25 checks passed
Astro-Han added a commit that referenced this pull request May 15, 2026
…656)

Why
windows-advisory has been flaking at ~55% on dev push (27 failures /
50 runs in the latest sample). bun's default 3s it.live timeout is
consistently tight on the Windows runner for Effect-fiber + SQLite +
tmpdir-server tests in packages/opencode/test/session/prompt-effect.test.ts.
Four prior PRs (#543, a3b8e54, cf6d1cd, #579) each bumped one or two
tests at a time to a slowIOTimeout constant without converging — today's
failure on "cancel records MessageAbortedError on interrupted process"
was the fifth instance of the same root cause.

What
Wrap testEffect's live runner via withDefaultLiveTimeout so every
it.live in prompt-effect.test.ts picks up a Windows-aware default
(10s Windows / 3s elsewhere). Remove 12 third-arg timeout literals
(5 x 3_000, 5 x slowIOTimeout, 2 x shellQueueTimeout) and the two
duplicated `const ... Timeout` definitions. The wrapper also covers
.only and .skip for symmetry. Explicit non-default timeouts
(5_000, 10_000, 30_000) still override.
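
A minimal sketch of the wrapping idea described above, assuming a registrar that takes (name, fn, timeout?) and exposes .only and .skip. The Register and Live types and the recording fake are hypothetical, and the real withDefaultLiveTimeout in the repository may be shaped differently.

```typescript
// Hypothetical shape of a test registrar with an optional third-arg timeout.
type Register = (name: string, fn: () => unknown, timeout?: number) => void
type Live = Register & { only: Register; skip: Register }

// Sketch of the wrapper: fill in the default only when no timeout is given,
// so explicit third-arg timeouts still override it.
function withDefaultLiveTimeout(live: Live, defaultTimeout: number): Live {
  const wrap = (register: Register): Register =>
    (name, fn, timeout) => register(name, fn, timeout ?? defaultTimeout)
  // Cover .only and .skip as well, mirroring the symmetry noted above.
  return Object.assign(wrap(live), { only: wrap(live.only), skip: wrap(live.skip) })
}

// Recording fake standing in for the real live runner.
const calls: Array<{ name: string; timeout: number | undefined }> = []
const record: Register = (name, _fn, timeout) => {
  calls.push({ name, timeout })
}
const base: Register = (name, fn, timeout) => record(name, fn, timeout)
const fakeLive: Live = Object.assign(base, { only: record, skip: record })

// 10_000 is used here as the Windows default from the text above.
const live = withDefaultLiveTimeout(fakeLive, 10_000)
live("uses the default", () => {})
live.only("explicit timeout wins", () => {}, 30_000)
live.skip("skip also gets the default", () => {})
console.log(calls)
```

Applying the default at the registrar means individual tests no longer carry their own timeout literal, which is the convergence the earlier per-test bumps did not reach.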

Out of scope
bun 1.3.13 watcher.node segfault on Windows process exit, and transient
actions/cache failures, are upstream / infra and intentionally left to
fail "normally" per d6fa1e6. The advisory workflow is not in branch
protection and existing if: always() artifact and summary uploads
preserve the diagnostic signal.

Verification
- tsgo --noEmit: clean
- bun test test/session/prompt-effect.test.ts --timeout 30000:
  54 pass / 0 fail / 28.69s (macOS)
- bun test test/github/ci-workflow.test.ts --timeout 30000:
  9 pass / 0 fail
- PR CI on 4d4792c: all green (typecheck, lint, unit-app, unit-desktop,
  unit-opencode, e2e-artifacts, smoke-macos-arm64, analyze-js-ts, CodeQL)

Review follow-ups
Gemini suggested also wrapping .only / .skip; applied and thread
resolved. An external review flagged a P1 about .only not being
wrapped; audited against HEAD and the claim referenced the pre-amend
version (8ad2c1a), not the current 4d4792c — no change required.

Risk
Detection of a true hang on Windows live tests now takes up to 10s
rather than 3s. Acceptable for an advisory signal. No production code
paths touched. windows-advisory will still show occasional red runs
from upstream / infra causes by design.

Labels

app (Application behavior and product flows), harness (Model harness, prompts, tool descriptions, and session mechanics), P2 (Medium priority)
