Memoize copy-to-output-directory item filtering in graph predictor to improve perf #143

Open
derekantrican wants to merge 1 commit into microsoft:main from derekantrican:perf/graph-predictor-memoization
@derekantrican
Contributor

Cache the filtered CopyToOutputDirectory items per dependency ProjectInstance in GetCopyToOutputDirectoryItemsGraphPredictor. When multiple root nodes share transitive dependencies (common in large graphs), the same dependency's items were being filtered repeatedly across 6 item types.

The cache uses ConcurrentDictionary<ProjectInstance, Lazy<CopyItemResult[]>>:

  • ConcurrentDictionary for thread-safe access from parallel graph node processing
  • Lazy<> ensures the filtering work runs exactly once per dependency under contention
  • Results stored as immutable arrays of (InputPath, TargetPath) structs
  • Output paths (which depend on the caller's OutDir) are computed at report time
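
The caching pattern described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the cache is keyed by a project path string rather than a `ProjectInstance` so the example stays self-contained, and `CopyItemCacheSketch`, `FilterItems`, `GetOutputPaths`, the `FilterRuns` counter, and the `appsettings.json` item are all hypothetical stand-ins.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Linq;
using System.Threading;

// Hypothetical stand-in for the PR's CopyItemResult: only caller-independent paths.
public readonly struct CopyItemResult
{
    public CopyItemResult(string inputPath, string targetPath)
    {
        InputPath = inputPath;
        TargetPath = targetPath;
    }

    public string InputPath { get; }
    public string TargetPath { get; }
}

public static class CopyItemCacheSketch
{
    // Counts how often the filtering work actually runs (illustration only).
    public static int FilterRuns;

    // Assumption: keyed by project path here; the PR keys on ProjectInstance.
    private static readonly ConcurrentDictionary<string, Lazy<CopyItemResult[]>> Cache =
        new ConcurrentDictionary<string, Lazy<CopyItemResult[]>>();

    public static CopyItemResult[] GetCopyItems(string projectPath)
    {
        // GetOrAdd may call the factory more than once under contention, but only
        // one Lazy is stored, and Lazy's default thread-safety mode
        // (ExecutionAndPublication) runs its delegate once.
        return Cache.GetOrAdd(
            projectPath,
            key => new Lazy<CopyItemResult[]>(() => FilterItems(key))).Value;
    }

    // Placeholder for the real filtering over the dependency's items.
    private static CopyItemResult[] FilterItems(string projectPath)
    {
        Interlocked.Increment(ref FilterRuns);
        string dir = Path.GetDirectoryName(projectPath) ?? string.Empty;
        return new[]
        {
            new CopyItemResult(Path.Combine(dir, "appsettings.json"), "appsettings.json"),
        };
    }

    // Output paths depend on the caller's OutDir, so they are derived per call
    // from the cached results instead of being cached themselves.
    public static string[] GetOutputPaths(string projectPath, string outDir)
    {
        return GetCopyItems(projectPath)
            .Select(item => Path.Combine(outDir, item.TargetPath))
            .ToArray();
    }
}
```

Wrapping the value in `Lazy<>` matters because `ConcurrentDictionary.GetOrAdd` does not guarantee the value factory runs only once; constructing a `Lazy<>` is cheap, so a discarded duplicate costs little, while the expensive filtering runs only when the stored instance's `Value` is first read.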

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dfederm
Member

dfederm commented May 6, 2026

Do you have any benchmarks to share?

@derekantrican
Contributor Author

@dfederm - I don't right now (been swamped with a lot of other things), but if that's absolutely necessary then I can work on generating some. I would have to figure out what needs to be done to build the library and put it in the right place for the Office build system

internal const string MSBuildCopyContentTransitivelyPropertyName = "MSBuildCopyContentTransitively";
internal const string HasRuntimeOutputPropertyName = "HasRuntimeOutput";

private static readonly string[] CopyItemNames = new[]
Member

Why change this to an array?

Contributor Author

With the memoization refactor, BuildCopyItemsCore needs to iterate over each item type to build the cached results. Extracting the item names into a shared array (as opposed to the 6 separate ReportCopyToOutputDirectoryItemsAsInputs calls that existed before) avoids duplicating the list in two places and makes it easier to keep things in sync if a new item type is added later.
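
A rough sketch of the shared-array approach described in this reply, under stated assumptions: the two item names below are an illustrative subset (the PR's actual array has six entries that are not shown here), and `CopyItemNamesSketch`, `CollectCopiedItems`, the dictionary-shaped input, and the `Always`/`PreserveNewest` check are hypothetical stand-ins for the predictor's real item access and filter.

```csharp
using System;
using System.Collections.Generic;

public static class CopyItemNamesSketch
{
    // Illustrative subset only; the PR's real array lists six item types.
    private static readonly string[] CopyItemNames = { "Content", "None" };

    // Hypothetical input shape: item type -> list of
    // (include path, CopyToOutputDirectory metadata value) pairs.
    public static List<string> CollectCopiedItems(
        IReadOnlyDictionary<string, List<KeyValuePair<string, string>>> itemsByType)
    {
        var results = new List<string>();

        // One loop over the shared array replaces one hand-written call per item type.
        foreach (string itemName in CopyItemNames)
        {
            if (!itemsByType.TryGetValue(itemName, out var items))
            {
                continue;
            }

            foreach (var item in items)
            {
                // Assumed filter: keep items whose metadata requests a copy.
                bool copies =
                    string.Equals(item.Value, "Always", StringComparison.OrdinalIgnoreCase) ||
                    string.Equals(item.Value, "PreserveNewest", StringComparison.OrdinalIgnoreCase);

                if (copies)
                {
                    results.Add(item.Key);
                }
            }
        }

        return results;
    }
}
```

With this shape, supporting a new item type means adding one string to the array, and any code that walks the array picks it up.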
