Add distributed remote grain directory compatibility #10050

Merged
ReubenBond merged 12 commits into dotnet:main from ReubenBond:feature/distributed-remote-grain-directory on May 8, 2026

ReubenBond (Member) commented Apr 29, 2026

Summary

Adds the distributed remote grain directory compatibility path on top of #10047 so distributed-directory silos can serve legacy IRemoteGrainDirectory requests during rolling upgrades from the local grain directory.

Changes

  • Add DistributedRemoteGrainDirectory backed by DistributedGrainDirectory.
  • Register compatibility system targets for the directory service and cache validator types.
  • Route remote register, lookup, and unregister calls through the distributed directory implementation.
  • Add mixed local/distributed rolling-upgrade compatibility coverage and resilience fixes.
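
The routing described in the list above is essentially an adapter: a type that exposes the legacy remote-directory surface and forwards each call to the new directory. A minimal sketch of that shape follows. It does not use the Orleans types; `ILegacyRemoteDirectory`, `NewDirectory`, and `LegacyDirectoryAdapter` are hypothetical stand-ins, and the real `IRemoteGrainDirectory` and `DistributedGrainDirectory` signatures will differ.

```csharp
#nullable enable
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for the legacy remote-directory contract (not the Orleans interface).
public interface ILegacyRemoteDirectory
{
    Task<string> RegisterAsync(string grainId, string siloAddress);
    Task<string?> LookupAsync(string grainId);
    Task UnregisterAsync(string grainId, string siloAddress);
}

// Hypothetical stand-in for the new directory implementation.
public sealed class NewDirectory
{
    private readonly ConcurrentDictionary<string, string> _entries = new();

    // First registration wins; later callers receive the existing owner.
    public Task<string> Register(string grainId, string silo) =>
        Task.FromResult(_entries.GetOrAdd(grainId, silo));

    public Task<string?> Lookup(string grainId) =>
        Task.FromResult<string?>(_entries.TryGetValue(grainId, out var silo) ? silo : null);

    // Removes the entry only if it still points at the given silo.
    public Task Unregister(string grainId, string silo)
    {
        _entries.TryRemove(new KeyValuePair<string, string>(grainId, silo));
        return Task.CompletedTask;
    }
}

// The adapter: legacy callers see the old contract, every call lands in the new directory.
public sealed class LegacyDirectoryAdapter : ILegacyRemoteDirectory
{
    private readonly NewDirectory _inner;

    public LegacyDirectoryAdapter(NewDirectory inner) => _inner = inner;

    public Task<string> RegisterAsync(string grainId, string siloAddress) =>
        _inner.Register(grainId, siloAddress);

    public Task<string?> LookupAsync(string grainId) => _inner.Lookup(grainId);

    public Task UnregisterAsync(string grainId, string siloAddress) =>
        _inner.Unregister(grainId, siloAddress);
}
```

In this sketch a caller whose `RegisterAsync` returns a different silo than the one it passed in has lost the registration race and would be expected to discard its own activation.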

Stack notes

This branch is rebased on top of #10047. The focused review range is:

fix/directory-snapshot-transfer-ranges..feature/distributed-remote-grain-directory

Follow-on PRs:

Validation

Focused validation run locally:

  • dotnet test test\Orleans.GrainDirectory.Tests\Orleans.GrainDirectory.Tests.csproj --framework net10.0 --filter "FullyQualifiedName~GrainDirectoryRollingUpgradeTests|FullyQualifiedName~GrainDirectoryPartitionBatchingTests|FullyQualifiedName~GrainDirectoryResilienceTests.JoiningSilo_DoesNotLeaveStaleEntriesOnPreviousOwner" -- -parallel none -noshadow

GrainDirectoryResilienceTests.ElasticChaos was also attempted in a broader run and timed out; the deterministic non-chaos coverage above passed.

ReubenBond force-pushed the feature/distributed-remote-grain-directory branch from 4f0f87c to 0a5f4a4 Compare April 29, 2026 18:23
ReubenBond changed the base branch from main to fix/directory-stable-ownership-view April 29, 2026 18:23
ReubenBond force-pushed the fix/directory-stable-ownership-view branch from 7f1c458 to 223d182 Compare April 29, 2026 18:26
ReubenBond changed the base branch from fix/directory-stable-ownership-view to main April 29, 2026 18:30
ReubenBond force-pushed the feature/distributed-remote-grain-directory branch 2 times, most recently from 8c1324b to c78df19 Compare April 29, 2026 19:46
ReubenBond force-pushed the feature/distributed-remote-grain-directory branch from c78df19 to 69a37ea Compare April 29, 2026 21:01
ReubenBond (Member, Author) commented

Rebuilt on top of #10053. The base compatibility pieces needed for earlier CI have moved into #10047; this PR now carries the follow-up distributed remote directory resilience and mixed-upgrade fixes.

ReubenBond force-pushed the feature/distributed-remote-grain-directory branch from 69a37ea to 03dbb92 Compare April 29, 2026 23:39
ReubenBond force-pushed the feature/distributed-remote-grain-directory branch 2 times, most recently from e1e53f4 to 4b26b7b Compare May 5, 2026 16:20
ReubenBond requested a review from Copilot May 5, 2026 16:25

Copilot AI (Contributor) left a comment

Pull request overview

This PR adds a compatibility layer so that silos running the new DistributedGrainDirectory can continue to serve legacy IRemoteGrainDirectory requests during rolling upgrades from LocalGrainDirectory. It also updates the testing host and test suite to validate mixed-mode upgrade scenarios.

Changes:

  • Add DistributedRemoteGrainDirectory system targets backed by DistributedGrainDirectory to handle legacy remote directory calls during rolling upgrades.
  • Add local-directory compatibility system targets (IGrainDirectoryClient/IGrainDirectoryPartition) to enable mixed local/distributed interop.
  • Extend testing infrastructure and add/adjust tests to cover rolling upgrade and resilience behavior.

| File | Description |
| --- | --- |
| test/Orleans.GrainDirectory.Tests/GrainDirectory/GrainDirectoryRollingUpgradeTests.cs | New rolling-upgrade test exercising mixed Local→Distributed directory transitions with diagnostic log capture. |
| test/Orleans.GrainDirectory.Tests/GrainDirectory/GrainDirectoryResilienceTests.cs | Use runtime PartitionsPerSilo from DirectoryMembershipService for integrity checks. |
| src/Orleans.TestingHost/TestCluster.cs | Initialize client gateways from active silo gateway endpoints (fallback to configured ports). |
| src/Orleans.TestingHost/InProcTestClusterOptions.cs | Add UseTestClusterGrainDirectory option (internal) to allow opting out of the test grain directory. |
| src/Orleans.TestingHost/InProcTestClusterBuilder.cs | Default UseTestClusterGrainDirectory to true for backwards-compatible behavior. |
| src/Orleans.TestingHost/InProcTestCluster.cs | Conditionally register the test grain directory based on UseTestClusterGrainDirectory. |
| src/Orleans.Runtime/Hosting/CoreHostingExtensions.cs | Update distributed directory registration to construct DirectoryMembershipService with explicit partitioning/boundaries configuration. |
| src/Orleans.Runtime/GrainDirectory/RemoteGrainDirectory.cs | Add optional registerAsSystemTarget to avoid system target ID conflicts when distributed directory is active. |
| src/Orleans.Runtime/GrainDirectory/LocalGrainDirectoryCompatibility.cs | New compatibility system targets enabling distributed-directory silos to interact with local-directory silos. |
| src/Orleans.Runtime/GrainDirectory/LocalGrainDirectory.cs | Avoid registering local IRemoteGrainDirectory targets when distributed directory is active; register compatibility targets for local-mode silos. |
| src/Orleans.Runtime/GrainDirectory/GrainDirectoryHandoffManager.cs | Add additional rolling-upgrade diagnostic logging around split-partition handoff and removal. |
| src/Orleans.Runtime/GrainDirectory/DistributedRemoteGrainDirectory.cs | New distributed-backed IRemoteGrainDirectory implementation with batching, retries, and duplicate-activation cleanup. |
| src/Orleans.Runtime/GrainDirectory/DistributedGrainDirectory.cs | Use configured partitions-per-silo and register distributed-backed IRemoteGrainDirectory targets for upgrade compatibility. |
| src/Orleans.Runtime/GrainDirectory/DirectoryMembershipSnapshot.cs | Add default ring-boundary function matching LocalGrainDirectory and use it consistently (including for Default). |
| src/Orleans.Runtime/GrainDirectory/DirectoryMembershipService.cs | Parameterize membership snapshot generation by partitions-per-silo and boundary function. |
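
The last two entries concern parameterizing the membership snapshot by partitions-per-silo and a ring-boundary function. The sketch below illustrates why both kinds of silo need to agree on those two inputs: given the same silo list, partition count, and boundary function, every silo computes the same owner for a key. Everything here is hypothetical (`RingSketch`, the FNV-1a-style hash, and the "first boundary at or after the hash" ownership convention are assumptions for illustration) and is not taken from DirectoryMembershipSnapshot or LocalGrainDirectory.

```csharp
using System;
using System.Collections.Generic;

public static class RingSketch
{
    // FNV-1a-style 32-bit hash over UTF-16 code units: stable across processes,
    // unlike string.GetHashCode(). Chosen for illustration only.
    public static uint StableHash(string text)
    {
        unchecked
        {
            uint hash = 2166136261u;
            foreach (char c in text)
            {
                hash ^= c;
                hash *= 16777619u;
            }
            return hash;
        }
    }

    // Each silo contributes partitionsPerSilo boundary points, produced by the
    // supplied boundary function, and the points are sorted around the ring.
    public static List<(uint Boundary, string Silo)> BuildRing(
        IEnumerable<string> silos,
        int partitionsPerSilo,
        Func<string, int, uint> boundary)
    {
        var ring = new List<(uint Boundary, string Silo)>();
        foreach (var silo in silos)
        {
            for (int i = 0; i < partitionsPerSilo; i++)
            {
                ring.Add((boundary(silo, i), silo));
            }
        }

        ring.Sort((a, b) => a.Boundary.CompareTo(b.Boundary));
        return ring;
    }

    // Assumed convention: a key is owned by the first boundary at or after its
    // hash, wrapping to the first point. The ring must not be empty.
    public static string OwnerOf(List<(uint Boundary, string Silo)> ring, uint keyHash)
    {
        foreach (var point in ring)
        {
            if (point.Boundary >= keyHash)
            {
                return point.Silo;
            }
        }

        return ring[0].Silo;
    }
}
```

A caller might pass `(silo, i) => RingSketch.StableHash($"{silo}/{i}")` as the boundary function. If two silos used different partition counts or boundary functions, they could disagree about ownership of the same key, which is the kind of mismatch a shared default would avoid.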

Copilot's findings

  • Files reviewed: 15/15 changed files
  • Comments generated: 2

ReubenBond force-pushed the feature/distributed-remote-grain-directory branch from e1618b4 to b39d8b8 Compare May 5, 2026 18:30
ReubenBond enabled auto-merge May 6, 2026 21:34
ReubenBond added this pull request to the merge queue May 6, 2026
ReubenBond removed this pull request from the merge queue due to a manual request May 6, 2026
ReubenBond added this pull request to the merge queue May 7, 2026
github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 7, 2026
ReubenBond added this pull request to the merge queue May 7, 2026
github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 7, 2026
ReubenBond added this pull request to the merge queue May 7, 2026
github-merge-queue Bot removed this pull request from the merge queue due to a conflict with the base branch May 7, 2026
ReubenBond force-pushed the feature/distributed-remote-grain-directory branch from 0780eca to 9fcee27 Compare May 7, 2026 21:31
ReubenBond and others added 12 commits May 7, 2026 16:31
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add CancellationToken support and initialization guard to prevent
DistributedRemoteGrainDirectory from blocking indefinitely when
directory requests arrive before the first membership update.

- Extract internal LookupAsync, RegisterAsync, UnregisterAsync methods
  with CancellationToken support on DistributedGrainDirectory
- Add EnsureDirectoryInitializedAsync to refresh membership view before
  processing requests from legacy silos
- Add 30-second timeout to all IRemoteGrainDirectory operations
- Pass DirectoryMembershipService to DistributedRemoteGrainDirectory
- Simplify rolling upgrade test assertions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
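
The initialization guard and 30-second timeout described in this commit message can be sketched as follows. This is a simplified illustration, not the code in the PR: `GuardedDirectoryFacade`, `OnFirstMembershipUpdate`, and `RunAsync` are hypothetical names, and the sketch waits for a readiness signal where the commit describes EnsureDirectoryInitializedAsync refreshing the membership view.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical facade: requests wait for the first membership update, and every
// operation (including that wait) is bounded by a timeout.
public sealed class GuardedDirectoryFacade
{
    private readonly TaskCompletionSource _initialized =
        new(TaskCreationOptions.RunContinuationsAsynchronously);
    private readonly TimeSpan _timeout;

    // Defaults to 30 seconds, matching the timeout named in the commit message.
    public GuardedDirectoryFacade(TimeSpan? timeout = null) =>
        _timeout = timeout ?? TimeSpan.FromSeconds(30);

    // Called once the first membership view is available.
    public void OnFirstMembershipUpdate() => _initialized.TrySetResult();

    public async Task<T> RunAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        CancellationToken cancellationToken = default)
    {
        // Link the caller's token with the timeout so either one cancels the work.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        cts.CancelAfter(_timeout);

        // Initialization guard: do not run until the directory is ready, but stop
        // waiting when the timeout or the caller's token fires.
        await _initialized.Task.WaitAsync(cts.Token);

        return await operation(cts.Token);
    }
}
```

With this shape, a request that arrives before the first membership update fails with an OperationCanceledException after the timeout instead of blocking indefinitely, and the inner operation receives the same token so it can stop early as well.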
Restore queued split-partition handoff semantics for local-to-distributed upgrades, add compatibility targets on local silos for distributed recovery APIs, refresh test-cluster gateways from active silos, and harden the rolling-upgrade regression test diagnostics and client refresh path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ReubenBond force-pushed the feature/distributed-remote-grain-directory branch from 9fcee27 to e09534f Compare May 7, 2026 23:31
ReubenBond enabled auto-merge May 7, 2026 23:31
ReubenBond added this pull request to the merge queue May 8, 2026
Merged via the queue into dotnet:main with commit f38dd83 May 8, 2026
117 of 119 checks passed
ReubenBond deleted the feature/distributed-remote-grain-directory branch May 8, 2026 04:12
github-actions Bot locked and limited conversation to collaborators Jun 7, 2026