
Improve source indexer deduplication to prefer real implementation over *.notsupported.cs files #192

Merged
ericstj merged 18 commits into main from copilot/fix-188 on Aug 15, 2025

Conversation

Copilot AI commented Jul 22, 2025

Contributor

This PR addresses the issue where https://source.dot.net/ shows platform-agnostic source code containing mostly empty classes that throw PlatformNotSupportedException instead of the actual platform-specific implementation files.

Problem

The current deduplication logic in BinLogToSln/Program.cs was too simplistic, using only file path-based deduplication. When multiple builds existed for the same assembly (e.g., generic + Linux-specific), it would select whichever was processed first, often resulting in .notsupported.cs stub files being chosen over real implementation files.

Solution

Implemented improved deduplication logic that:

  1. Groups by Assembly Name: All compiler invocations are now grouped by AssemblyName rather than deduplicated by file path alone
  2. Scores Each Build: Uses a 5-tier scoring system to select the "best" build based on:
    • UseForSourceIndex property: int.MaxValue score for builds explicitly marked for indexing
    • Platform support status: -10,000 penalty for IsPlatformNotSupportedAssembly builds
    • Framework version: Higher scores for newer frameworks (major * 1000 + minor * 100)
    • Platform specificity: +500 bonus for platform-specific frameworks, with additional bonuses for Linux (+100) and Unix (+50)
    • Source file count: +1 per source file as a tiebreaker
  3. Selects Best Build: For each assembly, only the highest-scored build is included in the solution
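
As a rough illustration of the scoring tiers listed above, here is a minimal Python sketch. It is not the C# code from BinLogToSln/Program.cs, and a later comment in this conversation notes that the priorities were reordered and platform-specific precedence was removed, so treat it as a picture of the description only. The `Build` type and its field names are hypothetical, and the sketch assumes the Linux and Unix bonuses do not stack, which the description does not state.

```python
from dataclasses import dataclass
from typing import Optional

INT_MAX = 2**31 - 1  # stands in for C#'s int.MaxValue


@dataclass
class Build:
    """Hypothetical stand-in for one compiler invocation of an assembly."""
    assembly_name: str
    framework_major: int
    framework_minor: int
    platform: Optional[str] = None          # e.g. "linux", "unix"; None means platform-agnostic
    use_for_source_index: bool = False      # mirrors the UseForSourceIndex property
    is_platform_not_supported: bool = False  # mirrors IsPlatformNotSupportedAssembly
    source_file_count: int = 0


def score(build: Build) -> int:
    # Tier 1: an explicit opt-in wins outright.
    if build.use_for_source_index:
        return INT_MAX
    total = 0
    # Tier 2: heavy penalty for PlatformNotSupported stub builds.
    if build.is_platform_not_supported:
        total -= 10_000
    # Tier 3: newer frameworks score higher.
    total += build.framework_major * 1000 + build.framework_minor * 100
    # Tier 4: platform-specific bonus, plus an extra for Linux or Unix
    # (assumed mutually exclusive here).
    if build.platform is not None:
        total += 500
        if build.platform == "linux":
            total += 100
        elif build.platform == "unix":
            total += 50
    # Tier 5: one point per source file as a tiebreaker.
    total += build.source_file_count
    return total
```

With a hypothetical 8.0 framework and file counts of 127 (Linux) and 45 (generic stub), this arithmetic gives 8,727 and -1,955, matching the scores in the example in the PR description. The description does not state the framework version or file counts, so those inputs are guesses that happen to fit.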

Documentation and Testing

  • Documentation: Added comprehensive documentation in docs/source-selection-algorithm.md explaining the scoring priorities, examples with concrete scores, and configuration options
  • Unit Tests: Added BinLogToSln.Tests project with 12 test methods covering all scoring scenarios, edge cases, and priority ordering to protect against regressions

Example

Before this change, if there were two builds for System.Net.NameResolution:

  • Generic build: System.Net.NameResolution.notsupported.cs (score: -1,955)
  • Linux build: actual implementation files (score: 8,727)

The tool would pick whichever was processed first. Now it consistently selects the Linux build with real implementation.
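
To make the order-independence point concrete, here is a small Python sketch of "group by assembly name, keep the highest score". It is an illustration under assumptions, not the tool's code: `select_best` and the tuple layout are hypothetical, and the scores are simply the two values quoted in the example above.

```python
from collections import defaultdict


def select_best(builds):
    """builds: iterable of (assembly_name, label, score) tuples.

    Returns a dict mapping each assembly name to the label of its
    highest-scored build, regardless of input order.
    """
    groups = defaultdict(list)
    for assembly_name, label, build_score in builds:
        groups[assembly_name].append((label, build_score))
    return {
        name: max(candidates, key=lambda c: c[1])[0]
        for name, candidates in groups.items()
    }


builds = [
    ("System.Net.NameResolution", "generic (notsupported stub)", -1955),
    ("System.Net.NameResolution", "linux", 8727),
]

# The same build wins whichever order the binlogs are processed in.
forward = select_best(builds)
backward = select_best(list(reversed(builds)))
```

A first-seen-wins approach keyed on file path would return different results for `builds` and its reverse. Selecting the maximum within each assembly group removes that dependence on processing order.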

The changes maintain backward compatibility while significantly improving source quality on https://source.dot.net/.

Fixes #188.



Copilot AI and others added 3 commits July 22, 2025 16:39
Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
…er *.notsupported.cs

Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
…utput

Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Missing actual source code from https://source.dot.net/" to "Improve source indexer deduplication to prefer real implementation over *.notsupported.cs files" Jul 22, 2025
Copilot AI requested a review from ericstj July 22, 2025 16:47

@ericstj ericstj left a comment

Member

@copilot please try plumbing project properties from the binlog as suggested in the feedback.

Comment thread src/SourceBrowser/src/BinLogToSln/Program.cs Outdated
Comment thread src/SourceBrowser/src/BinLogToSln/Program.cs Outdated
Comment thread src/SourceBrowser/src/BinLogToSln/Program.cs Outdated
Comment thread src/SourceBrowser/src/BinLogToSln/Program.cs Outdated
Comment thread src/SourceBrowser/src/BinLogToSln/Program.cs
Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>

Copilot AI commented Jul 22, 2025

Contributor Author

@copilot please try plumbing project properties from the binlog as suggested in the feedback.

Implemented project properties plumbing as requested. The BinLogReader now captures project properties from the binlog and stores them in a ProjectProperties dictionary on CompilerInvocation. The scoring logic uses these properties instead of file/command-line conventions. Commit c59f2a1
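
For readers unfamiliar with the idea, a minimal Python sketch of scoring from captured project properties rather than from file names might look like the following. It is an assumption-laden illustration, not the BinLogReader or Program.cs code: the `CompilerInvocation` shape, `is_true`, and `classify` are hypothetical, and only the property names UseForSourceIndex and IsPlatformNotSupportedAssembly come from this PR.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CompilerInvocation:
    """Hypothetical shape: an invocation plus the properties read from the binlog."""
    assembly_name: str
    project_properties: Dict[str, str] = field(default_factory=dict)


def is_true(props: Dict[str, str], name: str) -> bool:
    # MSBuild boolean properties are strings; compare case-insensitively.
    return props.get(name, "").strip().lower() == "true"


def classify(invocation: CompilerInvocation) -> str:
    """Classify a build from its properties instead of guessing from file names."""
    props = invocation.project_properties
    if is_true(props, "UseForSourceIndex"):
        return "preferred"
    if is_true(props, "IsPlatformNotSupportedAssembly"):
        return "stub"
    return "normal"
```

Reading an explicit property avoids depending on conventions such as a `.notsupported.cs` suffix, which a project is free not to follow.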

Copilot AI requested a review from ericstj July 22, 2025 17:30

@ericstj ericstj left a comment

Member

@copilot give this a try.

Comment thread src/SourceBrowser/src/BinLogParser/BinLogReader.cs Outdated
Comment thread src/SourceBrowser/src/BinLogParser/BinLogReader.cs Outdated
Comment thread src/SourceBrowser/src/BinLogToSln/Program.cs Outdated
Comment thread src/SourceBrowser/src/BinLogToSln/Program.cs Outdated
…rder scoring priorities

Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>

Copilot AI commented Jul 23, 2025

Contributor Author

@copilot give this a try.

Implemented all the requested changes! The BinLogReader now optimizes project access, the scoring system follows the proper priority hierarchy, and platform-specific precedence has been removed. All builds are passing. Commit 56384ee

Copilot AI requested a review from ericstj July 23, 2025 22:49
Comment thread NuGet.config Outdated
Comment thread global.json Outdated
Comment thread src/SourceBrowser/src/BinLogParser/BinLogParser.csproj Outdated
Comment thread src/SourceBrowser/src/BinLogParser/BinLogReader.cs Outdated
Comment thread src/SourceBrowser/src/BinLogToSln/BinLogToSln.csproj Outdated
Comment thread src/SourceBrowser/src/Common/Common.csproj Outdated
Comment thread src/SourceBrowser/src/SourceIndexServer.Tests/SourceIndexServer.Tests.csproj Outdated
Comment thread src/SourceBrowser/src/SourceIndexServer/SourceIndexServer.csproj Outdated
…BinLogReader

Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>

Copilot AI commented Jul 24, 2025

Contributor Author

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/repos/KirillOsenkov/MSBuildStructuredLog/commits/00926a0debc2facd65611f68f91adafe0d56ab14
    • Triggering command: curl -s REDACTED (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI requested a review from ericstj July 24, 2025 02:52
Comment thread src/SourceBrowser/src/BinLogToSln/Program.cs
Comment thread src/SourceBrowser/src/BinLogToSln/Program.cs Outdated

@joperezr joperezr left a comment

Member

LGTM barring those two last comments I left. Thanks @ericstj

@ericstj

ericstj commented Aug 15, 2025

Member

@copilot can you address @joperezr's feedback?

Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>

Copilot AI commented Aug 15, 2025

Contributor Author

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • dnceng.pkgs.visualstudio.com
    • Triggering command: dotnet build src/SourceBrowser/src/BinLogToSln/BinLogToSln.csproj (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Comment thread src/SourceBrowser/src/BinLogToSln.Tests/InvocationScoringTests.cs Outdated
…t score values

Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>

Copilot AI commented Aug 15, 2025

Contributor Author

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • dnceng.pkgs.visualstudio.com
    • Triggering command: dotnet test src/SourceBrowser/src/BinLogToSln.Tests/BinLogToSln.Tests.csproj --logger console;verbosity=normal (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI requested a review from ericstj August 15, 2025 16:21
Comment thread src/SourceBrowser/src/BinLogToSln.Tests/InvocationScoringTests.cs Outdated
Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>

Copilot AI commented Aug 15, 2025

Contributor Author

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • dnceng.pkgs.visualstudio.com
    • Triggering command: dotnet build src/SourceBrowser/src/BinLogParser/BinLogParser.csproj (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI requested a review from ericstj August 15, 2025 17:17

@joperezr joperezr left a comment

Member

Great, this looks ready to go! Thanks @ericstj! :shipit:

@ericstj

ericstj commented Aug 15, 2025

Member

Almost, we need to enable tests to run. We should add a test step to https://github.com/dotnet/source-indexer/blob/main/.github/workflows/dotnet.yml

@ericstj ericstj merged commit 07dd0cc into main Aug 15, 2025
2 checks passed
joperezr added a commit to joperezr/source-indexer that referenced this pull request May 15, 2026
The fork has diverged ~21 commits since the last upstream sync (PR dotnet#184,
2025-05-12). Blindly re-running update-source-browser.ps1 risks silently
dropping local features (dotnet#183 signing key, dotnet#192 dedup, dotnet#193 source-generated
files, dotnet#255 net10 retarget, dotnet#257 Linux binlog fix, plus Dependabot bumps).

- 02: prepend warning block to the 'Updating the vendored SourceBrowser'
  section listing the divergent PRs and recommending cherry-picks over a
  full re-sync.
- 00: cross-reference the warning from the overview bullet.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Missing actual source code from https://source.dot.net/

4 participants