Split gc.cpp to multiple files based on the functionality - part 1#125703
Split gc.cpp to multiple files based on the functionality - part 1#125703janvorli wants to merge 44 commits intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @agocke, @dotnet/gc |
There was a problem hiding this comment.
Pull request overview
This pull request appears to refactor CoreCLR GC implementation by moving substantial logic into new .cpp compilation units (which are then included from gc.cpp), separating concerns like sweeping, regions, no-GC mode, memory commit/decommit accounting, collection, initialization, and finalization.
Changes:
- Adds new GC implementation files for sweeping, regions (allocator/free list), no-GC region behavior, memory commit/decommit, collection, initialization, and finalization.
- Updates GC internals by removing an unused
VERIFY_HEAPmethod declaration fromgcpriv.h. - (Implied by file structure) Consolidates GC implementation via
#include-based composition of these new.cppunits.
Reviewed changes
Copilot reviewed 15 out of 21 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/gc/sweep.cpp | Adds sweep/free-list building and related helpers (incl. FEATURE_BASICFREEZE RO sweep). |
| src/coreclr/gc/region_free_list.cpp | Adds region free-list management for USE_REGIONS. |
| src/coreclr/gc/region_allocator.cpp | Adds region allocation/deallocation logic for USE_REGIONS. |
| src/coreclr/gc/no_gc.cpp | Adds no-GC region support logic, including callback scheduling and region-extension helpers. |
| src/coreclr/gc/memory.cpp | Adds commit/decommit accounting and region decommit logic. |
| src/coreclr/gc/init.cpp | Adds GC initialization logic (including region range reservation/init under USE_REGIONS). |
| src/coreclr/gc/finalization.cpp | Adds finalization queue implementation and finalizer work scheduling. |
| src/coreclr/gc/collect.cpp | Adds GC collection logic (including region ephemeral/gc range computation under USE_REGIONS). |
| src/coreclr/gc/gcpriv.h | Removes set_batch_mark_array_bits declaration under VERIFY_HEAP. |
You can also share your feedback on Copilot code review. Take the survey.
|
If there is interest in keeping A repo admin might be necessary to actually merge that because a squash would likely undo it. |
That sounds very interesting. I'll give that a try. |
|
If necessary for perf reasons (ie lost optimizations that LTCG doesn’t figure out when you try to compile separately in the next PR), you can use the UNITY_BUILD feature in CMake to force all of the files to be included in one C++ file in Release builds to maintain the current experience while providing a better dev story. |
|
@MichalStrehovsky it seems that git blame still works without any special handling. You just need to pass it the |
|
Hmm, I take it back, the git -C -C doesn't work. |
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR appears to split previously monolithic GC implementation code into multiple focused .cpp files under src/coreclr/gc/ (e.g., sweep/regions/no-GC/memory/init/finalization/collect) and updates .git-blame-ignore-revs to keep git blame attribution useful after the mechanical refactor.
Changes:
- Extract GC implementation areas into new compilation-unit fragments (included from
gc.cpp), such as sweeping, region allocation/free-list management, no-GC region logic, memory commit/decommit accounting, GC init, finalization, and collection logic. - Add region allocator/free-list implementations behind
USE_REGIONS. - Add blame-ignore entries for the mechanical “split gc.cpp” trimming commits.
Reviewed changes
Copilot reviewed 15 out of 21 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/gc/sweep.cpp | New split-out sweeping/free-list threading logic (plus RO segment sweeping under BASICFREEZE). |
| src/coreclr/gc/region_free_list.cpp | New region free-list bookkeeping, ordering, and sorting utilities (USE_REGIONS). |
| src/coreclr/gc/region_allocator.cpp | New region allocator implementation (USE_REGIONS). Contains correctness issues noted in PR comments. |
| src/coreclr/gc/no_gc.cpp | New split-out no-GC region and callback logic. |
| src/coreclr/gc/memory.cpp | New split-out commit/decommit accounting and decommit stepping logic (incl. regions). |
| src/coreclr/gc/init.cpp | New split-out GC initialization logic (incl. regions initial reservation). |
| src/coreclr/gc/finalization.cpp | New split-out finalization queue logic and finalizer work scheduling. |
| src/coreclr/gc/collect.cpp | New split-out GC collection driver logic and region ephemeral-range computation. |
| .git-blame-ignore-revs | Adds mechanical split/trim commits to blame-ignore list to preserve attribution. |
You can also share your feedback on Copilot code review. Take the survey.
|
Copilot cli has reworked the PR based on the article mentioned. It also added a To make local git blame work the best, |
|
It requires git version >= 2.40 (older git doesn't seem to use the algorithm) |
|
And this PR needs to be merged without squashing. Otherwise the tracking would get lost |
|
Let me know when you are ready to merge it. I can elevate to merge it without squashing. |
Co-authored-by: Austin Wise <AustinWise@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR reorganizes CoreCLR GC implementation by splitting the previously monolithic gc.cpp into multiple functionality-focused .cpp files, which are still #included from gc.cpp to preserve the single translation unit build model.
Changes:
- Split GC implementation into 19 category-specific
.cppfiles (allocation, marking, planning, sweeping, background GC, regions, etc.). - Updated
gc.cppto include the new files at the end, keeping shared infrastructure ingc.cpp. - Added
.git-blame-ignore-revsentries to makegit blameignore the mechanical “trim” commits for the split files.
Reviewed changes
Copilot reviewed 15 out of 21 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
.git-blame-ignore-revs |
Adds ignore-revs entries for the mechanical trimming commits associated with the split. |
src/coreclr/gc/gc.cpp |
Retains shared GC infrastructure and #includes the newly split implementation files. |
src/coreclr/gc/allocation.cpp |
Allocation helpers and SOH/LOH/POH allocation paths. |
src/coreclr/gc/background.cpp |
Background GC (concurrent mark/sweep, write watch support). |
src/coreclr/gc/card_table.cpp |
Card/brick table management and related helpers. |
src/coreclr/gc/collect.cpp |
Collection entry points and post-GC processing. |
src/coreclr/gc/diagnostics.cpp |
Heap verification and diagnostics/ETW walking support. |
src/coreclr/gc/dynamic_heap_count.cpp |
Dynamic heap count (server GC heap scaling). |
src/coreclr/gc/dynamic_tuning.cpp |
GC tuning heuristics, budgets, smoothing. |
src/coreclr/gc/finalization.cpp |
Finalization queue management and related helpers. |
src/coreclr/gc/init.cpp |
GC initialization logic (gc_heap::initialize_gc, related init helpers). |
src/coreclr/gc/interface.cpp |
GCHeap interface implementation (public API surface). |
src/coreclr/gc/mark_phase.cpp |
Mark phase implementation (mark stack, pinned handling). |
src/coreclr/gc/memory.cpp |
Virtual memory commit/decommit and address space management. |
src/coreclr/gc/no_gc.cpp |
No-GC region support. |
src/coreclr/gc/plan_phase.cpp |
Plan phase implementation (plug/gap processing, generation planning). |
src/coreclr/gc/region_allocator.cpp |
Region allocator implementation (regions mode). |
src/coreclr/gc/region_free_list.cpp |
Region free list management. |
src/coreclr/gc/regions_segments.cpp |
Segment/region lifecycle and mapping table management. |
src/coreclr/gc/relocate_compact.cpp |
Relocate/compact phases. |
src/coreclr/gc/sweep.cpp |
Sweep phase and free list building helpers. |
You can also share your feedback on Copilot code review. Take the survey.
|
@jkotas I think it is ready for merging. The CI failures are either known issues or Mono related ones. |
|
@jkotas please wait a bit with the merging, I have tried to test clrgc.dll from this build and there seems to be some issue, I need to verify it first. |
This PR splits the monolithic gc.cpp (~57,000 lines) into 19 smaller files organized by functionality, reducing gc.cpp to ~8,300 lines of shared infrastructure. The new files are #included at the end of gc.cpp, preserving the
existing single-compilation-unit build model for this first step - no CMakeLists.txt changes were made.
New files
allocation.cppallocate_small/large, SOH/LOH/POH allocationbackground.cppcard_table.cppcollect.cppgarbage_collect,do_post_gcdiagnostics.cppverify_heap, ETW/diag walkingdynamic_heap_count.cppdynamic_tuning.cppfinalization.cppinit.cppgc_heap::initialize_gc)interface.cppGCHeapinterface implementation (public API surface)mark_phase.cppmemory.cppno_gc.cppplan_phase.cppregion_allocator.cppregion_free_list.cppregions_segments.cpprelocate_compact.cppsweep.cppDesign decisions
- Remaining in gc.cpp: Shared globals, macros, general-purpose helpers, and methods that don't clearly belong to one functional category stay in gc.cpp.
What's NOT changed
These changes were made by copilot cli with my supervision and reviewing. The next step will be to actually compile each of the files separately.