[release/10.0] Fix NativeAOT GC hole issue#129711
Conversation
There is a GC hole when: * an exception is rethrown from a funclet * the exception escapes that funclet * a finally is executed for this secondary exception * GC runs while the call chain of this finally is being executed * A reference in non-volatile register is pushed in a prolog of one of the functions in the finally call chain * the nonvolatile register holds a live reference up somewhere up in the call chain of the parent of the catch handler that catches the secondary exception * the nonvolatile register is not pushed anywhere between the parent of the catch and the frame where the nonvolatile register holds a live GC reference In this case, if GC relocates that reference, it is updated in the stack frame of the finally call chain, but not in the location referenced by the REGDISPLAY in the ExInfo of the secondary exception. So when we resume after catch, the stale reference is placed in the nonvolatile register and then it bubbles up the call chain until it reaches the frame where the register is supposed to hold live GC reference. The fix is to save the nonvolatile registers after returning from a finally funclet back to the location referenced by the REGDISPLAY passed to the RhpCallFinallyFunclet. Close #129010
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
|
Tagging subscribers to this area: @agocke, @dotnet/ilc-contrib |
JulieLeeMSFT
left a comment
There was a problem hiding this comment.
LGTM. Please get code review and check test failures when CI is done.
|
Hi, the code complete date for 10.0.10 (the July 2026 release) is 24 June. Make sure to merge this PR on that date at the latest (or explicitly let me know that I should merge it), or it won't make it into that release. As a reminder, if this is a product change, you also need Tactics approval before merging this PR (test-only or infra-only changes don't require Tactics approval). |
|
@svick, @JulieLeeMSFT I assume that since it has "servicing approved" label, it was already approved by the tactics. So it can be merged any time. The failure in the tests is a build timeout unrelated to this change. Since I am not authorized to push changes to the backporting branch, it would be great if one of you could do that. |
|
/ba-g only timeout and known errors |
|
@janvorli Done. |
Backport of #129598 to release/10.0
/cc @janvorli
Customer Impact
There is a GC hole in NativeAOT. As any other GC hole, it could lead to intermittent failures of applications due to unexpected NullReferenceException, AccessViolationException or just unexpected behavior. This GC hole occurs in some cases when GC scans stack with active exception handling when an exception thrown from a call chain of a funclet escapes the funclet and GC occurs when a finally handler of that secondary exception is being executed.
Regression
Introduced by #115284 in .NET 10.0
Testing
Libraries test that exposed the issue, directed regression test, CI coreclr and libraries tests.
Risk
Low. The change just ensures that a modified non-volatile register value is saved in the stack frame iterator of the pending exception handling, keeping that value up to date in case GC moves it.