[dotnet] Raise ThreadPool min threads to avoid Kestrel stalls #7085

Merged
NachoEchevarria merged 2 commits into main from nacho/AvoidThreadPoolStarvationDotnet
Jun 5, 2026

Conversation

Contributor

@NachoEchevarria NachoEchevarria commented Jun 4, 2026

Motivation

Test_SqlServiceNameSource intermittently fails in the INTEGRATIONS scenario (~2% of runs on dd-trace-dotnet master). Root cause: Kestrel's response flush stalls for ~5 s while the .NET ThreadPool grows. The request completes server-side in ~12 ms, but the bytes don't reach the client until after the test's 5 s read timeout.

The ThreadPool defaults its minimum to Environment.ProcessorCount for both worker and IOCP threads (so 2–4 on ubuntu-latest), and grows by 1 thread every ~500 ms. Under cumulative tracer + AppSec + IAST background load in this scenario, Kestrel's I/O queues behind tracer activity until the pool grows — long enough to trip the client timeout.
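As a rough illustration of the ramp-up arithmetic above, the sketch below models a pool that starts at its minimum and adds one thread per injection interval. The function name and the figure of 12 concurrently needed threads are hypothetical, chosen only so the result lines up with the ~5 s stall; the real ThreadPool heuristics are more involved than this linear model.

```python
def stall_seconds(min_threads: int, threads_needed: int,
                  injection_interval_s: float = 0.5) -> float:
    """Approximate time for the pool to grow from min_threads to threads_needed,
    assuming one new thread is injected every injection_interval_s seconds."""
    missing = max(0, threads_needed - min_threads)
    return missing * injection_interval_s


# Hypothetical burst needing 12 threads on a 2-core runner (minimum = 2).
print(stall_seconds(min_threads=2, threads_needed=12))   # 5.0

# With the minimum raised to 32, the same burst needs no growth at all.
print(stall_seconds(min_threads=32, threads_needed=12))  # 0.0
```

Under these assumptions, a floor above the burst size removes the wait entirely, which is the effect the change is aiming for.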

Changes

Set DOTNET_ThreadPool_ForceMinWorkerThreads=32 and DOTNET_ThreadPool_ForceMinIoCompletionThreads=32 in poc.Dockerfile and uds.Dockerfile. 32 is a comfortable floor above the burst; idle threads cost ~1 MB each, and on hosts with ≥32 vCPU the env vars are a no-op.

Verification

2 CI runs × 50 INTEGRATIONS attempts = 100 fresh scenario runs, 0 stalls (prior baseline ≈ 2 / 100). Diagnostic branch diag/dotnet-rasp-sqli-stall kept in case the flake reappears.
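One way to read these numbers, sketched under the assumption that runs are independent and that the pre-fix failure rate really was about 2%: the snippet computes how likely 100 clean runs would be if nothing had changed. The helper name is hypothetical.

```python
def prob_zero_failures(failure_rate: float, runs: int) -> float:
    """Probability of seeing no failures in `runs` independent attempts."""
    return (1.0 - failure_rate) ** runs


# Chance of 100 clean runs if the flake rate were still ~2%.
p = prob_zero_failures(failure_rate=0.02, runs=100)
print(f"{p:.2f}")  # roughly 0.13
```

Under that assumption, a clean 100-run sample is encouraging but not conclusive, which is consistent with keeping the diagnostic branch around.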

Contributor

github-actions Bot commented Jun 4, 2026

CODEOWNERS have been resolved as:

utils/build/docker/dotnet/poc.Dockerfile                                @DataDog/apm-dotnet @DataDog/asm-dotnet @DataDog/system-tests-core
utils/build/docker/dotnet/uds.Dockerfile                                @DataDog/apm-dotnet @DataDog/asm-dotnet @DataDog/system-tests-core

@NachoEchevarria NachoEchevarria changed the title from "Avoid thread pool starvation" to "[dotnet] Raise ThreadPool min threads to avoid Kestrel stalls" Jun 4, 2026
@NachoEchevarria NachoEchevarria requested a review from Copilot June 4, 2026 16:18
@NachoEchevarria NachoEchevarria requested a review from amarziali June 4, 2026 16:18

Copilot AI left a comment

Pull request overview

This PR aims to eliminate intermittent .NET integration test flakes caused by Kestrel response flush delays while the .NET ThreadPool ramps up under tracer/AppSec/IAST load, by forcing higher minimum ThreadPool worker and I/O completion thread counts in the Docker images used for these scenarios.

Changes:

  • Set DOTNET_ThreadPool_ForceMinWorkerThreads=32 in the .NET Docker runtime images used by the tests.
  • Set the I/O completion thread minimum via a DOTNET_ThreadPool_ForceMin*CompletionThreads environment variable in the same images.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
utils/build/docker/dotnet/uds.Dockerfile | Adds ThreadPool minimum thread environment variables to reduce Kestrel stalls in UDS scenario images.
utils/build/docker/dotnet/poc.Dockerfile | Adds ThreadPool minimum thread environment variables to reduce Kestrel stalls in PoC scenario images.

Comment thread utils/build/docker/dotnet/poc.Dockerfile Outdated
Comment thread utils/build/docker/dotnet/uds.Dockerfile Outdated
@NachoEchevarria NachoEchevarria marked this pull request as ready for review June 5, 2026 07:47
@NachoEchevarria NachoEchevarria requested review from a team as code owners June 5, 2026 07:47
Member

@andrewlock andrewlock left a comment

Worth a try, thanks!

@NachoEchevarria NachoEchevarria merged commit dd59f76 into main Jun 5, 2026
109 checks passed
@NachoEchevarria NachoEchevarria deleted the nacho/AvoidThreadPoolStarvationDotnet branch June 5, 2026 08:26
3 participants