Fix non-deterministic ordering in ParallelAnalyzer #445

Merged
dundee merged 3 commits into dundee:master from ShivamB25:fix-parallel-analyzer-ordering
Nov 22, 2025
Conversation

@ShivamB25

Contributor

Fixes #444

Problem

The ParallelAnalyzer produced non-deterministic results because it processed subdirectories in parallel and added them to the parent directory in completion order rather than in filesystem order.

Solution

  • Modified processDir to track the original index of all items (both files and directories)
  • Both files and subdirectories are now collected with their indices and added to the parent in the original order from os.ReadDir
  • Subdirectories are still processed in parallel for performance, but results are reordered before adding

Changes

  • Added indexedItem struct to pair items with their original position
  • Used a buffered channel to collect all items without blocking
  • Collected items by index and added them in the original order

Testing

  • Added TestParallelAnalyzerDeterminism: verifies parallel analyzer produces identical results across multiple runs
  • Added TestParallelVsSequentialConsistency: verifies parallel and sequential analyzers produce identical results
  • Added TestFileDirectoryInterleaving: specifically tests preservation of file/directory ordering
  • All existing tests pass
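
The kind of check that a determinism test performs might look like the sketch below. The tests added in this PR are not reproduced here; `sameAcrossRuns`, `equal`, and the `scan` callback are hypothetical names, and a real test would call the analyzer and compare the resulting file names instead.

```go
package main

import "fmt"

// equal reports whether two string slices hold the same items in the same order.
func equal(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// sameAcrossRuns calls scan once to get a reference result, then calls it
// again until it has run `runs` times in total, and reports whether every
// later run matched the first.
func sameAcrossRuns(runs int, scan func() []string) bool {
	first := scan()
	for i := 1; i < runs; i++ {
		if !equal(first, scan()) {
			return false
		}
	}
	return true
}

func main() {
	// A scan with a fixed order stands in for the analyzer under test.
	stable := func() []string { return []string{"a.txt", "subdir", "b.txt"} }
	fmt.Println(sameAcrossRuns(10, stable))
}
```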

…ering/determinism tests

Use indexed items and a buffered channel to collect files and subdirs with their original indices,
then append them in-order to preserve the original file/directory interleaving when running the
parallel analyzer. This makes output deterministic and consistent with the sequential analyzer.

Also add tests and helper:
- TestParallelAnalyzerDeterminism
- TestParallelVsSequentialConsistency
- TestFileDirectoryInterleaving
- getFileNames helper
@ShivamB25

Contributor Author

@rafl, please verify this by building my fork yourself.

@codecov

codecov Bot commented Nov 4, 2025


Codecov Report

❌ Patch coverage is 77.16535% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.72%. Comparing base (4212583) to head (0d05b34).
⚠️ Report is 1 commit behind head on master.

Files with missing lines | Patch % | Lines
pkg/analyze/parallel_stable.go | 77.16% | 25 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #445      +/-   ##
==========================================
- Coverage   83.88%   83.72%   -0.17%     
==========================================
  Files          46       47       +1     
  Lines        3792     3919     +127     
==========================================
+ Hits         3181     3281     +100     
- Misses        537      561      +24     
- Partials       74       77       +3     

@dundee

dundee commented Nov 19, 2025

Owner

We need to do proper performance benchmarks for this PR before merging.

@dundee

dundee commented Nov 22, 2025

Owner

I will introduce this as a new analyzer, because there's a performance penalty (approx 1-5%) for keeping the stable sorting.

@ShivamB25

Contributor Author

@dundee I will try to improve the performance.

@dundee dundee merged commit 2840ec6 into dundee:master Nov 22, 2025
10 checks passed
@ShivamB25 ShivamB25 deleted the fix-parallel-analyzer-ordering branch November 24, 2025 08:44
Development

Successfully merging this pull request may close these issues.

Nondeterministic behaviour when scanning in parallel