Fix test-quarantine unquarantine matching and re-quarantine detection by wtgodbe · Pull Request #67356 · dotnet/aspnetcore

wtgodbe · 2026-06-21T13:48:58Z

Problem

The test-quarantine workflow run 27903471243 produced two classes of incorrect output, each confirmed against live AzDO/Helix pipeline data.

1. Bad unquarantines — matched by leaf method name

The agent matched tests by bare leaf method name instead of fully-qualified name, so it borrowed an unrelated test's pass statistics:

[test-quarantine] Unquarantine HelloWorldTest.HelloWorld #67355 unquarantined HelloWorldTests.HelloWorld using pass-stats from the unrelated StartupTests.HelloWorld (the target test had zero run evidence in the cited pipeline).
[test-quarantine] Unquarantine NavigationLock and CanNavigateToQueryStringPageWithNoQuery #67354 unquarantined the subclass override ServerExecutionTests.ServerRoutingTest.NavigationLock_... using the passing base class Tests.RoutingTest.NavigationLock_..., even though the actual quarantined subclass was failing on main (builds 1473858, 1473100).

Both PRs were closed.

2. Missed re-quarantine — `LongPollingWorks` treated as new

TestServerTests.LongPollingWorks was re-quarantined as a brand-new Case A quarantine (new issue #67348 via #67349) instead of a Case B re-quarantine reusing the original tracking issue #65702, because:

Case B only triggered within a 14-day window, but this test was unquarantined 62 days earlier ([test-quarantine] Unquarantine TestServerTests.LongPollingWorks #66367), so it fell through to Case A; and
even in Case B, the agent searched for a "Quarantine ..."-titled issue, but [Known Build Error] TestServerTests.LongPollingWorks - ObjectDisposedException in OnDisconnectedAsync during shutdown #65702 is titled/labeled Known Build Error, so a title search missed it.

Fixes (all in `test-quarantine.md`)

Unquarantine matching:

Group AzDO rows by the exact, full automatedTestName.
Compute pass rate and require real passing-run evidence only from rows whose fully-qualified identity (namespace + concrete declaring class + method) matches the [QuarantinedTest] attribute in source; fail closed on zero rows.
Explicit handling for subclass overrides, inherited base methods, parameterized/theory rows, and the IIS multi-assembly prefix exception.

Re-quarantine (Case B):

Detect previously-unquarantined tests per-test, not per-file — walk the file's QuarantinedTest history newest→oldest and stop at the most recent commit changing this exact test's attribute (a single file holds many tests).
Removed the 14-day window (no time limit).
Evaluate Case B before Case A, with an explicit "does not match Case B" gate on Case A.
Derive the original tracking issue from the issue URL removed by the unquarantine commit — reusable regardless of label (test-failure or Known Build Error), open or closed.
Fall back to Case A when no valid numeric issue URL is present.

The diff was reviewed by two independent models to convergence (both APPROVE).

Recompiled test-quarantine.lock.yml with gh aw v0.79.8.

The test-quarantine workflow (run 27903471243) produced two classes of incorrect output, confirmed against live AzDO/Helix pipeline data: 1. Bad unquarantines (PRs dotnet#67354, dotnet#67355). The agent matched tests by bare leaf method name instead of fully-qualified name, so it borrowed another test's pass statistics: - dotnet#67355 unquarantined HelloWorldTests.HelloWorld using pass-stats from the unrelated StartupTests.HelloWorld. - dotnet#67354 unquarantined the subclass override ServerExecutionTests.ServerRoutingTest.NavigationLock_... using the passing base class Tests.RoutingTest.NavigationLock_..., even though the actual quarantined subclass was failing on main. 2. Missed re-quarantine (PR dotnet#67349 / issue dotnet#67348). TestServerTests. LongPollingWorks was re-quarantined as a brand-new Case A quarantine (new issue) instead of a Case B re-quarantine reusing original issue dotnet#65702, because (a) Case B only triggered within a 14-day window and this test was unquarantined 62 days earlier, and (b) the original issue is titled/labeled "Known Build Error", which a title search for a "Quarantine ..."-prefixed issue would miss. Fixes in test-quarantine.md: - Unquarantine matching: group AzDO rows by exact full automatedTestName; compute pass rate and require real passing run evidence only from rows whose fully-qualified identity (namespace + concrete declaring class + method) matches the attribute in source, with explicit handling for subclass overrides, inherited base methods, parameterized/theory rows, and the IIS multi-assembly prefix exception. Fail closed on zero rows. - Re-quarantine (Case B): detect previously-unquarantined tests per-test (not per-file) by walking the file's QuarantinedTest history newest to oldest and stopping at the most recent commit that changes this exact test's attribute; removed the 14-day window; evaluate Case B before Case A (with an explicit "does not match Case B" gate on Case A); derive the original tracking issue from the issue URL removed by the unquarantine commit (reusable regardless of label, open or closed); fall back to Case A when no valid numeric issue URL is present. Recompiled test-quarantine.lock.yml with gh aw v0.79.8. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

This PR updates the Daily Test Quarantine Management workflow’s source-of-truth documentation to prevent incorrect unquarantine matching (by leaf method name) and to correctly detect/handle re-quarantines (Case B) by reusing the original tracking issue regardless of how long ago the unquarantine happened.

Changes:

Tighten unquarantine guidance to match/aggregate using fully-qualified identities (and to fail closed on “no run evidence”) instead of leaf method names.
Rework Case B guidance to be per-test (not per-file), remove the 14-day window, and derive the original tracking issue from the unquarantine commit diff.
Regenerate test-quarantine.lock.yml via gh aw compile to reflect the updated markdown body.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
`.github/workflows/test-quarantine.md`	Updates the workflow instructions for correct fully-qualified test matching and robust Case B re-quarantine detection/issue reuse.
`.github/workflows/test-quarantine.lock.yml`	Regenerated compiled workflow (metadata/body hash update) to reflect the updated `.md`.

…ping, set-based run evidence Responds to the automated review on dotnet#67356: - Case B history walk and the 60-day attribute-age check now pass `--follow` so renamed/moved test files keep their full quarantine history. The walk uses `git log --follow -p` to read each commit's patch inline (with the file's historical path), instead of a separate `git show <sha> -- <current path>` that returns an empty diff for any commit predating a rename. The original-issue lookup reads the same `-p` patch. - Step 2.1 aggregation groups by the exact, full `automatedTestName` (the verbatim AzDO string, including theory/parameter arguments), deferring `{DeclaringClass}.{Method}` normalization to Step 2.3 — this removes the description inconsistency between the two steps. - The unquarantine "real run evidence" gate now evaluates the set of rows that resolve to the test's source identity (matching {DeclaringClass}.{Method}, including allowed IIS multi-assembly variants and inherited concrete-subclass rows) rather than a single exact name. A 0-pass/0-fail variant (e.g. a [ConditionalFact]-skipped IIS variant) no longer disqualifies a candidate as long as at least one row in the set passed and none failed; fail closed only when the whole resolved set is 0 pass / 0 fail. Step 2.3's "must pass" wording is softened to "zero failures" to stay consistent. Recompiled test-quarantine.lock.yml. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

Copilot AI review requested due to automatic review settings June 21, 2026 13:48

wtgodbe requested a review from a team as a code owner June 21, 2026 13:48

Copilot started reviewing on behalf of wtgodbe June 21, 2026 13:49 View session

Copilot AI reviewed Jun 21, 2026

View reviewed changes

Comment thread .github/workflows/test-quarantine.md

Comment thread .github/workflows/test-quarantine.md Outdated

Comment thread .github/workflows/test-quarantine.md Outdated

wtgodbe requested a review from Copilot June 21, 2026 14:17

Copilot started reviewing on behalf of wtgodbe June 21, 2026 14:17 View session

Copilot AI reviewed Jun 21, 2026

View reviewed changes

wtgodbe merged commit 1ca52e4 into dotnet:main Jun 21, 2026
10 checks passed

dotnet-maestro Bot mentioned this pull request Jun 23, 2026

[main] Source code updates from dotnet/aspnetcore dotnet/dotnet#7330

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix test-quarantine unquarantine matching and re-quarantine detection#67356

Fix test-quarantine unquarantine matching and re-quarantine detection#67356
wtgodbe merged 2 commits into
dotnet:mainfrom
wtgodbe:wtgodbe/fix-quarantine-unquarantine-matching

wtgodbe commented Jun 21, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

wtgodbe commented Jun 21, 2026

Problem

1. Bad unquarantines — matched by leaf method name

2. Missed re-quarantine — LongPollingWorks treated as new

Fixes (all in test-quarantine.md)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

2. Missed re-quarantine — `LongPollingWorks` treated as new

Fixes (all in `test-quarantine.md`)