Add runtime async support for saving and reusing continuation instances #125556
jakobbotsch merged 25 commits into dotnet:main
This reverts commit 8e54df1.
Pull request overview
This PR introduces an optimization for CoreCLR “runtime async” methods by enabling a single shared continuation layout per async method and reusing the same continuation instance across multiple suspension points, reducing allocations and GC pressure. It updates the continuation flags contract to encode per-suspension-point field offsets (notably for return storage) in Continuation.Flags.
Changes:
- JIT: Build shared continuation layouts across all suspension points and optionally reuse continuation instances (`JitAsyncReuseContinuations`).
- ABI/contract: Redefine continuation flags to encode exception/context/result slot indices via bitfields.
- BCL: Update `AsyncHelpers` continuation field accessors to decode indices from flags.
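As an illustration of the index encoding listed above, here is a minimal sketch of packing small slot indices into a flags word. The bit positions, field width, and names (`kIndexBits`, `kExceptionShift`, `EncodeIndex`, and so on) are hypothetical and are not the actual `CorInfoContinuationFlags` values. Using the all-ones value as a "no slot" sentinel is also an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: three index fields packed into one 32-bit flags word.
// Positions and widths are illustrative, not the real CorInfoContinuationFlags.
constexpr uint32_t kIndexBits      = 4;                      // width of each index field
constexpr uint32_t kIndexMask      = (1u << kIndexBits) - 1; // all ones doubles as "no slot"
constexpr uint32_t kExceptionShift = 8;
constexpr uint32_t kContextShift   = 12;
constexpr uint32_t kResultShift    = 16;

// Store `index` in the field starting at `shift`, replacing any previous value.
inline uint32_t EncodeIndex(uint32_t flags, uint32_t shift, uint32_t index)
{
    assert(index <= kIndexMask);
    return (flags & ~(kIndexMask << shift)) | (index << shift);
}

// Read back the index stored in the field starting at `shift`.
inline uint32_t DecodeIndex(uint32_t flags, uint32_t shift)
{
    return (flags >> shift) & kIndexMask;
}

// A field holds a real slot unless it contains the sentinel.
inline bool HasSlot(uint32_t flags, uint32_t shift)
{
    return DecodeIndex(flags, shift) != kIndexMask;
}
```

With a scheme like this, a consumer checks `HasSlot` before computing a storage address, which takes over the role that the old presence bits played.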
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/coreclr/vm/object.h | Adjusts continuation object helpers used by the runtime/interpreter to locate data/exception storage. |
| src/coreclr/vm/interpexec.cpp | Updates interpreter suspend/resume handling to use new continuation flag semantics. |
| src/coreclr/jit/jitstd/vector.h | Adds data() const overload needed by new JIT code. |
| src/coreclr/jit/jitconfigvalues.h | Adds JitAsyncReuseContinuations config switch. |
| src/coreclr/jit/async.h | Refactors async transformation to build per-state layouts and shared layout support types. |
| src/coreclr/jit/async.cpp | Implements shared layout creation, continuation reuse logic, and index encoding into flags. |
| src/coreclr/interpreter/compiler.cpp | Updates interpreter continuation layout/flags creation for new encoding scheme. |
| src/coreclr/inc/corinfo.h | Redefines CorInfoContinuationFlags to include index bitfield definitions. |
| src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.CoreCLR.cs | Updates managed Continuation to decode exception/context/result locations from flags. |
Comments suppressed due to low confidence (1)
src/coreclr/vm/interpexec.cpp:4430
- Exception handling on interpreter resumption is gated on a placeholder flag literal (567), and GetExceptionObjectStorage currently returns the wrong slot. This will skip exception rethrow or read garbage. Update to the new index-based encoding (compare decoded exception index against the sentinel mask) and use the decoded offset when loading the exception object.
if (pAsyncSuspendData->flags & /*CORINFO_CONTINUATION_HAS_EXCEPTION */ 567)
{
// Throw exception if needed
OBJECTREF exception = *continuation->GetExceptionObjectStorage();
if (exception != NULL)
{
Pull request overview
This PR updates the runtime async continuation model to support a single shared continuation layout across suspension points and (optionally) reuse a single continuation instance to reduce allocations/GC pressure.
Changes:
- Reworks continuation `Flags` to encode slot indices (exception/context/result) instead of simple presence bits, and updates VM + BCL consumers accordingly.
- Introduces JIT infrastructure to build per-suspension sub-layouts, optionally merge them into a shared layout, and enable continuation reuse via a new config knob.
- Updates interpreter suspension metadata generation to use the new index-encoding scheme.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/vm/object.h | Computes result/exception storage addresses using index fields encoded in Flags. |
| src/coreclr/vm/interpexec.cpp | Decodes continuation-context index from flags and uses new exception-storage accessor. |
| src/coreclr/jit/jitstd/vector.h | Adds data() const to support const access patterns. |
| src/coreclr/jit/jitconfigvalues.h | Adds JitAsyncReuseContinuations config (default enabled). |
| src/coreclr/jit/async.h | Adds layout builder/state tracking types and shared-layout support plumbing. |
| src/coreclr/jit/async.cpp | Implements shared-layout creation, per-state suspend/resume emission, and continuation reuse logic. |
| src/coreclr/interpreter/compiler.cpp | Encodes exception/context/result indices into interpreter async suspend flags. |
| src/coreclr/inc/corinfo.h | Redefines CorInfoContinuationFlags to include index bitfields for slots. |
| src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.CoreCLR.cs | Updates managed continuation flag decoding and storage offset computation to match new encoding. |
Pull request overview
This PR adds an optimization for runtime async methods that enables using a single shared continuation layout across all suspension points, allowing the runtime to reuse a single continuation instance (reducing allocations/GC pressure). It also changes how continuation field locations (exception/context/result) are encoded and discovered across the JIT, VM, interpreter, and CoreLib.
Changes:
- Introduces shared continuation layout generation and continuation instance reuse for runtime async (gated by `JitAsyncReuseContinuations`).
- Updates continuation `Flags` to encode slot indices for exception/continuation-context/result, rather than “HasX” presence bits.
- Adjusts VM/interpreter/CoreLib code to compute storage addresses using the new index encoding.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/vm/object.h | Computes continuation field addresses from encoded indices in Flags. |
| src/coreclr/vm/interpexec.cpp | Uses new VM helpers to locate exception/context storage slots during suspend/resume. |
| src/coreclr/jit/jitconfigvalues.h | Adds JitAsyncReuseContinuations config switch (enabled by default). |
| src/coreclr/jit/async.h | Updates async transformation APIs to support shared layouts/reuse and updated return lookup. |
| src/coreclr/jit/async.cpp | Implements shared layout building, reuse control flow, and index encoding into flags. |
| src/coreclr/interpreter/compiler.cpp | Encodes continuation slot indices into flags for interpreter async continuations. |
| src/coreclr/inc/corinfo.h | Redefines CorInfoContinuationFlags to include bit positions/sizes for encoded indices. |
| src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.CoreCLR.cs | Updates managed Continuation helpers to interpret Flags as index-encoded offsets. |
cc @dotnet/jit-contrib PTAL @dhartglassMSFT @EgorBo

With this PR continuation reuse is enabled by default. That means all suspension points use the same continuation layout (leaving the space for fields that are not live at particular suspension points untouched). This also makes the continuation live in the entire function, so it will result in some prolog code (if storing on the stack) or use of a register. On suspension the new codegen first checks if we have a continuation, and if so, reuses the existing continuation instead of calling

For 2/3 we will need some way to make a function-local call. I am not totally sure how to represent it. One possibility is that we pass it the state number and then use the state number in a switch to return back.
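The reuse check on suspension described in this comment can be sketched roughly as follows. This is a simplified model, not the JIT's generated code: `Continuation`, `AsyncFrame`, `Suspend`, and the `allocations` counter are hypothetical names, and the sketch keeps the continuation in an ordinary struct rather than modelling how it is passed back in on resumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for a continuation object that uses the shared layout.
struct Continuation
{
    int                  state = -1; // suspension point to resume at
    std::vector<uint8_t> data;       // storage sized for the shared layout
};

// Hypothetical per-invocation state; the continuation stays live for the whole function.
struct AsyncFrame
{
    std::unique_ptr<Continuation> continuation; // null until the first suspension
    int                           allocations = 0; // counted for illustration only
};

// Runs at every suspension point: allocate only when no instance exists yet.
inline Continuation* Suspend(AsyncFrame& frame, int state, size_t sharedLayoutSize)
{
    if (frame.continuation == nullptr)
    {
        frame.continuation.reset(new Continuation());
        frame.continuation->data.resize(sharedLayoutSize);
        frame.allocations++;
    }
    // Reuse is only valid because every suspension point shares one layout size.
    assert(frame.continuation->data.size() == sharedLayoutSize);
    frame.continuation->state = state;
    return frame.continuation.get();
}
```

Calling `Suspend` twice on the same frame returns the same instance and performs one allocation, which is the effect the comment describes.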
Pull request overview
This PR adds “continuation reuse” for runtime-async methods by generating a single shared continuation layout per method (covering all suspension points) and reusing the same continuation instance across suspensions to reduce allocations and GC pressure. It also updates the continuation flags format to encode per-suspension-point storage indices (exception/context/result) needed with shared layouts.
Changes:
- Switch continuation flags from “Has* presence bits” to encoded slot indices, updating CoreCLR VM, JIT, interpreter, and BCL helpers accordingly.
- Add JIT support for creating a shared continuation layout across states and reusing an existing continuation instance on subsequent suspensions.
- Introduce `JitAsyncReuseContinuations` config knob (enabled by default) to gate the optimization.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/inc/corinfo.h | Redefines CorInfoContinuationFlags to encode slot indices and updates bit assignments. |
| src/coreclr/vm/object.h | Updates ContinuationObject helpers to compute storage addresses from encoded indices. |
| src/coreclr/vm/interpexec.cpp | Switches interpreter execution to use the new “*_StorageOrNull” helpers for exception/context storage. |
| src/coreclr/interpreter/compiler.cpp | Updates interpreter suspend emission to encode slot indices into flags. |
| src/coreclr/jit/jitconfigvalues.h | Adds JitAsyncReuseContinuations config setting. |
| src/coreclr/jit/async.h | Adds shared-layout builder API and updates return-lookup signature. |
| src/coreclr/jit/async.cpp | Implements shared layout creation and continuation-instance reuse, and encodes result/exception/context indices in flags. |
| src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.CoreCLR.cs | Updates managed continuation flag decoding to match the new encoded-index format. |
/ba-g Unknown failures were #125638
This adds support for generating a single shared continuation layout for each runtime async method. The shared continuation layout is compatible with the state that needs to be stored for all suspension points in the function. For that reason it uses more memory than the previous separated continuation layouts.
The benefit is that once a single layout is compatible with all suspension points we can reuse the same continuation instance every time we suspend. That means a single runtime async function only ends up allocating one continuation instance.
On suspension-heavy benchmarks this improves performance by about 30% by significantly reducing the amount of garbage generated.
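To make the memory trade-off concrete, here is a rough sketch of building one layout that covers every suspension point. It is not the JIT's algorithm: `LiveField`, `BuildSharedLayout`, and `SeparateLayoutSize` are hypothetical, and the sketch ignores alignment, GC reference tracking, and any overlapping of fields that the real implementation may perform.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical description of one local that is live at a suspension point.
struct LiveField
{
    int    localId; // illustrative identifier for a local variable
    size_t size;    // bytes needed to save it
};

// Gives every local that is live at any suspension point one fixed offset.
// Returns the total size; `offsets` maps localId to its offset in the shared layout.
inline size_t BuildSharedLayout(const std::vector<std::vector<LiveField>>& suspensionPoints,
                                std::map<int, size_t>& offsets)
{
    size_t total = 0;
    for (const std::vector<LiveField>& point : suspensionPoints)
    {
        for (const LiveField& field : point)
        {
            assert(field.size > 0);
            if (offsets.find(field.localId) == offsets.end())
            {
                offsets[field.localId] = total; // first time this local is seen
                total += field.size;
            }
        }
    }
    return total;
}

// Size that a separate layout for a single suspension point would need.
inline size_t SeparateLayoutSize(const std::vector<LiveField>& point)
{
    size_t total = 0;
    for (const LiveField& field : point)
    {
        total += field.size;
    }
    return total;
}
```

In this model the shared layout is never smaller than any single per-point layout, which is the extra memory cost mentioned above. In exchange, one instance fits every suspension point.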
A complication arises for return values. Before this change the continuation object always stored its single possible return value in a known location, and resumption stubs would propagate return values into the caller's continuation at that location. With this change the continuation reserves space for every possible return value type, and the offset to store at differs between suspension points. To handle that we now encode the offset in `Continuation.Flags`.
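A rough sketch of the return value handling described above: each suspension records in the flags where the awaited call's result belongs, and the code that propagates the result reads that offset back. The field position, the `Continuation` struct, and the `SetResultOffset` and `StoreResult` helpers are hypothetical. The real encoding lives in `CorInfoContinuationFlags` and is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical encoding: the result offset (in bytes) occupies the upper 16 bits of Flags.
constexpr uint32_t kResultOffsetShift = 16;
constexpr uint32_t kResultOffsetMask  = 0xFFFF;

// Hypothetical stand-in for a reused continuation instance.
struct Continuation
{
    uint32_t             flags = 0;
    std::vector<uint8_t> data; // shared-layout storage
};

// At each suspension, record where this state's awaited result should be stored.
inline void SetResultOffset(Continuation& c, uint32_t offset)
{
    assert(offset <= kResultOffsetMask);
    c.flags = (c.flags & ~(kResultOffsetMask << kResultOffsetShift)) | (offset << kResultOffsetShift);
}

// What a resumption stub would do in this model: copy the callee's return
// value into the caller's continuation at the offset decoded from its flags.
inline void StoreResult(Continuation& caller, const void* value, size_t size)
{
    uint32_t offset = (caller.flags >> kResultOffsetShift) & kResultOffsetMask;
    assert(offset + size <= caller.data.size());
    std::memcpy(caller.data.data() + offset, value, size);
}
```

Because the same instance is reused, the flags are rewritten on every suspension, so two suspension points with different result types can direct their results to different regions of the same buffer.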
Improves performance by about 30%. It also increases size, but I have ideas to reduce it again (e.g. by sharing the "alloc new continuation" path).
Based on #125497