feat!: Detect retries and flaky tests by gca3020 · Pull Request #28 · ctrf-io/go-ctrf-json-reporter

gca3020 · 2026-04-03T20:49:15Z

Detect Retries and Flaky Tests

Note

This could be considered a breaking change, since a test with RetryAttempts will be reported differently than it was before. This will only occur when using a test harness that supports retries (see below), and since this project is still <1.0, I figured this would be acceptable, but wanted to call it out regardless. The new ctrf results file continues to conform to the specification documented at https://ctrf.io/docs/specification/

What's in this PR?

This PR refactors some of the detection logic to support accurate reporting of test retries and flaky tests. This closes a gap in the current implementation; the functionality is requested in #12.

Additionally, this PR fixes some of the Start/Stop/Duration logic, which is required to accurately grab the test output of only a single retry.

But I thought Go didn't do retries?

It doesn't, yet. There is currently an open (and accepted) proposal to add this to the core go test functionality. See golang/go#62244 for more details.

In the meantime, the gotestsum package supports re-running failed tests using the --rerun-fails=n and --rerun-fails-max-failures=n flags. With these flags, once the full suite is complete, the subset of failed tests is re-run as a new suite. This repeats as many times as specified, with each new run containing just the tests that failed in the previous run.

So how do we detect them?

Essentially, as we parse the JSON "events" coming from the go test output, we look back through the list of already-captured TestResults to see if we already have an instance of this test in our report. If not, the behavior continues as it does today: we append the new TestResult to the list and update the summary counters as normal.

However, if we already have a "matching" test, then we instead update that test object, populating the list of RetryAttempts as per the spec in https://ctrf.io/docs/specification/test#retry-attempt-object.
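
As a rough illustration of that lookup-and-merge step (a sketch only, not the code in this PR): the types below are simplified, hypothetical stand-ins for the reporter's TestResult and RetryAttempt, and record is a made-up helper name. It assumes a test is identified by its suite and name, and it stores only the earlier outcomes as retry attempts; how the spec expects the final attempt to be represented should be checked against the linked page.

```go
package main

import "fmt"

// RetryAttempt is a simplified, hypothetical stand-in for a CTRF retry attempt.
type RetryAttempt struct {
	Attempt  int
	Status   string
	Duration int64 // milliseconds
}

// TestResult is a simplified, hypothetical stand-in for a CTRF test result.
type TestResult struct {
	Suite, Name   string
	Status        string
	Duration      int64 // milliseconds
	Retries       int
	Flaky         bool
	RetryAttempts []RetryAttempt
}

// record adds one finished run to results. If the same suite/name was seen
// before, the previous outcome is archived as a retry attempt and the
// existing entry is updated instead of appending a second TestResult.
func record(results []*TestResult, suite, name, status string, durationMs int64) []*TestResult {
	for _, tr := range results {
		if tr.Suite != suite || tr.Name != name {
			continue
		}
		tr.RetryAttempts = append(tr.RetryAttempts, RetryAttempt{
			Attempt:  len(tr.RetryAttempts) + 1,
			Status:   tr.Status,
			Duration: tr.Duration,
		})
		tr.Retries++
		tr.Status, tr.Duration = status, durationMs

		// Flaky here means: the latest run passed after an earlier failure.
		failedBefore := false
		for _, a := range tr.RetryAttempts {
			if a.Status == "failed" {
				failedBefore = true
			}
		}
		tr.Flaky = status == "passed" && failedBefore
		return results
	}
	// First time we see this test: append as before.
	return append(results, &TestResult{Suite: suite, Name: name, Status: status, Duration: durationMs})
}

func main() {
	var results []*TestResult
	results = record(results, "example", "TestFlaky", "failed", 12)
	results = record(results, "example", "TestStable", "passed", 3)
	results = record(results, "example", "TestFlaky", "passed", 9) // the rerun
	for _, tr := range results {
		fmt.Printf("%s: status=%s retries=%d flaky=%t\n", tr.Name, tr.Status, tr.Retries, tr.Flaky)
	}
}
```

A linear scan keeps the sketch short; a map keyed by suite and name would avoid rescanning the list for large reports.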

What about the Start/Stop time changes?

Given the nature of Go's test2json output, we know a test's start timestamp from the moment its run event appears in the JSON output. We capture this time in a map as the "start" time of that specific test. When we eventually see a pass, fail, or skip event for the same test, we use that event's time as the test's "stop" time.
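
A minimal sketch of that bookkeeping, assuming one JSON event per line with the Time, Action, Package, and Test fields that test2json emits. The timing type, the collectTimings name, and the map key format are hypothetical and chosen for illustration; they are not taken from this PR.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

// event holds the subset of test2json event fields used in this sketch.
type event struct {
	Time    time.Time
	Action  string
	Package string
	Test    string
}

// timing is a hypothetical record of one completed run of a test.
type timing struct {
	Key         string
	Action      string
	Start, Stop time.Time
}

// collectTimings remembers the time of each "run" event in a map and pairs
// it with the next pass/fail/skip event for the same package and test.
func collectTimings(input string) ([]timing, error) {
	starts := map[string]time.Time{}
	var out []timing
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		if len(sc.Bytes()) == 0 {
			continue // tolerate blank lines
		}
		var ev event
		if err := json.Unmarshal(sc.Bytes(), &ev); err != nil {
			return nil, err
		}
		if ev.Test == "" {
			continue // package-level event, not tied to a single test
		}
		key := ev.Package + "/" + ev.Test
		switch ev.Action {
		case "run":
			starts[key] = ev.Time
		case "pass", "fail", "skip":
			out = append(out, timing{Key: key, Action: ev.Action, Start: starts[key], Stop: ev.Time})
			delete(starts, key) // a rerun will record a fresh start time
		}
	}
	return out, sc.Err()
}

func main() {
	input := strings.Join([]string{
		`{"Time":"2026-04-03T14:08:00Z","Action":"run","Package":"example","Test":"TestFlaky"}`,
		`{"Time":"2026-04-03T14:08:02Z","Action":"fail","Package":"example","Test":"TestFlaky"}`,
		`{"Time":"2026-04-03T14:08:05Z","Action":"run","Package":"example","Test":"TestFlaky"}`,
		`{"Time":"2026-04-03T14:08:06Z","Action":"pass","Package":"example","Test":"TestFlaky"}`,
	}, "\n")
	timings, err := collectTimings(input)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range timings {
		fmt.Println(t.Key, t.Action, t.Stop.Sub(t.Start))
	}
}
```

Deleting the map entry once a run finishes means a later rerun of the same test always pairs with its own run event rather than a stale one.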

Because we are guaranteed that a test won't retry before it has actually failed, we can use these start/stop times to gate how far we look back when building the "Messages" for the failed test. We only need to look backwards to the "startTime" of the current in-progress test, which prevents us from grabbing logs from previous runs.
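
The gated look-back could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: outputEvent, messagesSince, and the at helper are hypothetical names, and the captured events are assumed to be stored in chronological order.

```go
package main

import (
	"fmt"
	"time"
)

// outputEvent is a hypothetical record of one captured line of test output.
type outputEvent struct {
	Time   time.Time
	Test   string
	Output string
}

// at is a small helper for building timestamps in the example.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

// messagesSince walks the captured events backwards and collects output for
// the given test, stopping at the first event older than start. Output from
// earlier runs of the same test is therefore never included.
func messagesSince(events []outputEvent, test string, start time.Time) []string {
	var msgs []string
	for i := len(events) - 1; i >= 0; i-- {
		ev := events[i]
		if ev.Time.Before(start) {
			break // everything further back belongs to an earlier run
		}
		if ev.Test == test {
			msgs = append(msgs, ev.Output)
		}
	}
	// The walk was backwards, so restore chronological order.
	for i, j := 0, len(msgs)-1; i < j; i, j = i+1, j-1 {
		msgs[i], msgs[j] = msgs[j], msgs[i]
	}
	return msgs
}

func main() {
	events := []outputEvent{
		{at(1), "TestFlaky", "attempt 1: connection refused\n"},
		{at(2), "TestOther", "unrelated output\n"},
		{at(5), "TestFlaky", "attempt 2: timed out\n"},
	}
	// The second run of TestFlaky started at t=4, so only its own log is kept.
	for _, m := range messagesSince(events, "TestFlaky", at(4)) {
		fmt.Print(m)
	}
}
```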

Since I needed these start/stop times anyway, I figured I would go ahead and instrument the optional Start/Stop fields in each TestResult and RetryAttempt, where ctrf can use them to compute better timing information.

AI Usage Disclosure

Just trivial autocomplete suggestions from Copilot, which ended up being mostly useless anyway.

Ma11hewThomas · 2026-04-04T20:45:17Z

Thank you @gca3020, I really appreciate your contribution, and thanks also for highlighting the future direction of Go regarding flaky tests.

I've reviewed the code and tested end-to-end using gotestsum --rerun-fails and it works nicely, a great addition.

On the breaking change point, thanks for flagging it; I agree it's the correct approach. The current behaviour of emitting a separate TestResult for each retry attempt was never the intended design, and collapsing retries into a single result with nested retryAttempts is the right model per the spec.

That being said, I'll release this as v0.1.0 to give users a clear upgrade signal.

I'll merge and release soon! Thanks again!

gca3020 · 2026-04-04T22:00:30Z

Thanks! One other thing I noticed while implementing this and fighting the linter is that gotestsum provides a package (testjson) for parsing the Go test JSON format. It might be worth updating the reporter to use that in the future, as it's maintained, well tested, and supports go test functionality like benchmarks and race detection.

Regardless, thanks for creating this project in the first place; it's really great to be able to have good annotations in test runs!

gca3020 added 3 commits April 3, 2026 14:08

chore: Add example flaky tests

3a050a7

Using gotestsum with --rerun-fails to run these tests will generate output that simulates passing, skipped, failing, and flaky tests. The output from this can be used to generate test data for the flaky test detection test.

feat: Add support for detecting flaky tests

f0a95da

- Detect a test being run multiple times
- Capture these additional runs as test retries
- Detect a fail->pass transition as a flaky test
- Additionally, fix start/stop time detection of tests

fix package name for flaky example

e7d058d

gca3020 mentioned this pull request Apr 3, 2026

Support for flaky tests #12

Open

gca3020 force-pushed the flaky-test-support branch from f5495cd to caff748 · April 4, 2026 14:13

chore: Fix some linter errors

962b295

gca3020 force-pushed the flaky-test-support branch from caff748 to 962b295 · April 4, 2026 18:42

gca3020 wants to merge 4 commits into ctrf-io:main from gca3020:flaky-test-support
