Fix possible deadlock in UnixCertificateManager and MacOSCertificateManager #66727

Merged
Youssef1313 merged 6 commits into main from dev/ygerges/fix-deadlock
Jun 2, 2026

@Youssef1313

Member

Fixes #65518

When stdout/stderr are redirected, the child process writes into a buffer (typically 4 KB, at least on Windows), and the parent process is expected to read stdout/stderr. If the parent never reads them, the child blocks forever on any Console.Write* call once the buffer is full (that is, once 4 KB of data has been written).

The child process can only get unblocked when the parent process consumes the output, but the parent process never attempts to read the output.
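
A minimal sketch of the safe ordering for a single redirected stream, using only the long-standing `System.Diagnostics.Process` APIs rather than the new .NET 11 ones. The `DrainBeforeWait` class and the `seq` child command are illustrative assumptions (a Unix-like environment is assumed); this is not code from the PR:

```csharp
using System;
using System.Diagnostics;

public static class DrainBeforeWait
{
    // Runs a command with stdout redirected and returns its exit code and captured stdout.
    public static (int ExitCode, string Output) Run(string fileName, params string[] arguments)
    {
        var startInfo = new ProcessStartInfo(fileName)
        {
            RedirectStandardOutput = true,
            UseShellExecute = false, // redirection requires starting the process directly
        };
        foreach (var argument in arguments)
        {
            startInfo.ArgumentList.Add(argument);
        }

        using var process = Process.Start(startInfo)
            ?? throw new InvalidOperationException($"Failed to start '{fileName}'.");

        // Drain the pipe first. If WaitForExit() ran before any read, a child that
        // fills the pipe buffer would block in its next write and never exit.
        string output = process.StandardOutput.ReadToEnd();
        process.WaitForExit();

        return (process.ExitCode, output);
    }

    public static void Main()
    {
        // Assumed child: "seq 1 20000" prints well over 4 KB and then exits.
        var (exitCode, output) = Run("seq", "1", "20000");
        Console.WriteLine($"exit={exitCode}, captured {output.Length} chars");
    }
}
```

Swapping the two calls (`WaitForExit()` before `ReadToEnd()`) gives the pattern that can hang once the child writes more than the pipe buffer holds.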

RunAndCaptureText is a new API in .NET 11 that attempts to resolve that. See also the very recent blog post about this: https://devblogs.microsoft.com/dotnet/process-api-improvements-in-dotnet-11/

@adamsitnik I'm less familiar with the new APIs and haven't used them much yet. I was hesitating between RunAndCaptureText (which keeps the current behavior of redirecting stdout/stderr, likely done to keep the output from showing on the running terminal) and ProcessStartInfo.StartDetached, which prevents handle inheritance altogether and uses null/nop handles instead (probably faster, but a bigger deviation from the original code).

Copilot AI review requested due to automatic review settings May 18, 2026 16:30
@github-actions github-actions Bot added the needs-area-label Used by the dotnet-issue-labeler to label those issues which couldn't be triaged automatically label May 18, 2026

Copilot AI left a comment

Contributor

Pull request overview

This PR addresses a possible Unix development-certificate startup hang by replacing a redirected certutil process wait with Process.RunAndCaptureText, allowing stdout/stderr to be consumed while waiting.

Changes:

  • Updates the NSS database certificate lookup path to use Process.RunAndCaptureText.
  • Preserves existing exit-code-based success behavior for the lookup.

Comment thread src/Shared/CertificateGeneration/UnixCertificateManager.cs Outdated
@Youssef1313 Youssef1313 force-pushed the dev/ygerges/fix-deadlock branch from 9991cd8 to cf3abff Compare May 18, 2026 16:40
@Youssef1313 Youssef1313 force-pushed the dev/ygerges/fix-deadlock branch from 56470e3 to 5326fb3 Compare May 18, 2026 16:56
@Youssef1313 Youssef1313 changed the title Fix possible deadlock in UnixCertificateManager Fix possible deadlock in UnixCertificateManager and MacOSCertificateManager May 18, 2026
@Youssef1313 Youssef1313 requested a review from javiercn May 18, 2026 19:53

@adamsitnik adamsitnik left a comment

Member

I really like the fact that you have found other places that could run into the same bug and fixed them @Youssef1313 🥇

I also love the fact that you have tried the new APIs! But please keep in mind that in case you need to backport these changes to older versions, they won't be available.

PTAL at my comments, thanks!

  }
- using (var process = Process.Start(MacOSTrustCertificateCommandLine, MacOSTrustCertificateCommandLineArguments + tmpFile))
+ var exitStatus = Process.Run(new ProcessStartInfo(MacOSTrustCertificateCommandLine, MacOSTrustCertificateCommandLineArguments + tmpFile));

Member

thank you for using the new APIs @Youssef1313 👍

return checkTrustProcess.ExitCode == 0 ? TrustLevel.Full : TrustLevel.None;
};

var checkTrustProcessOutput = Process.RunAndCaptureText(checkTrustProcessStartInfo);

Member

For this particular use case, the best solution is to redirect standard output and standard error to the null file.

What we get:

  • no output is printed
  • no extra work is done to capture text that is then ignored

This is what it could look like (pseudocode; I have not compiled it):

            using SafeFileHandle nullHandle = File.OpenNullHandle();
            var checkTrustProcessStartInfo = new ProcessStartInfo(
                MacOSVerifyCertificateCommandLine,
                string.Format(CultureInfo.InvariantCulture, MacOSVerifyCertificateCommandLineArgumentsFormat, tmpFile))
            {
                // Do this to avoid showing output to the console when the cert is not trusted. It is trivial to export
                // the cert and replicate the command to see details.
                StandardOutputHandle = nullHandle,
                StandardErrorHandle = nullHandle
            };

            var checkTrustProcessOutput = Process.Run(checkTrustProcessStartInfo);

Member

I've created dotnet/runtime#128453 to make it a one-liner.

Member Author

Is it different from StartDetached?

Member

StartDetached

Yes. StartDetached focuses on ensuring the process can outlive the parent (it creates a new session on Unix, etc.), and in addition it redirects to null.


{output}");
}
{processOutput.StandardOutput}{processOutput.StandardError}");

Member

This problem existed before this PR; I just want to mention it: joining standard output and standard error like this may confuse users when it produces output in the wrong order.

Imagine that the child process produces the following output:

OUT: A
ERR: B
OUT: C

The combined output will be:

A
C
B

In the past I considered introducing a method that would redirect standard output and standard error to the same pipe and produce a single string, preserving the original ordering. I think it would be very useful in such scenarios.
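
The ordering problem can be reproduced without spawning a process. The sketch below is a simulation only: it models the child's writes as a tagged sequence and compares concatenating per-stream captures against a single shared sequence. `InterleavingDemo`, `Source`, and both method names are hypothetical and do not correspond to any .NET API:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InterleavingDemo
{
    public enum Source { Out, Err }

    // Models capturing stdout and stderr separately and appending stderr after stdout.
    public static string JoinSeparately(IEnumerable<(Source Source, string Text)> writes)
    {
        var all = writes.ToList();
        var stdout = all.Where(w => w.Source == Source.Out).Select(w => w.Text);
        var stderr = all.Where(w => w.Source == Source.Err).Select(w => w.Text);
        return string.Join("\n", stdout.Concat(stderr));
    }

    // Models both streams sharing one pipe, so the write order is preserved.
    public static string JoinSharedPipe(IEnumerable<(Source Source, string Text)> writes)
    {
        return string.Join("\n", writes.Select(w => w.Text));
    }

    public static void Main()
    {
        var writes = new[]
        {
            (Source.Out, "A"),
            (Source.Err, "B"),
            (Source.Out, "C"),
        };

        Console.WriteLine(JoinSeparately(writes)); // prints A, C, B on separate lines
        Console.WriteLine("---");
        Console.WriteLine(JoinSharedPipe(writes)); // prints A, B, C on separate lines
    }
}
```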

Member Author

Yup. As it's a pre-existing different issue, I would say it's out of scope for this PR for now.

- using var process = Process.Start(startInfo)!;
- process.WaitForExit();
- return process.ExitCode == 0;
+ return Process.RunAndCaptureText(startInfo).ExitStatus.ExitCode == 0;

Member

Similar to above, it would be better to redirect to the null file and then just call .Run.

- using var process = Process.Start(startInfo)!;
- process.WaitForExit();
- return process.ExitCode == 0;
+ return Process.RunAndCaptureText(startInfo).ExitStatus.ExitCode == 0;

Member

Similar to above, it would be better to redirect to the null file and then just call .Run.

- using var process = Process.Start(startInfo)!;
- process.WaitForExit();
- return process.ExitCode == 0;
+ return Process.RunAndCaptureText(startInfo).ExitStatus.ExitCode == 0;

Member

Similar to above, it would be better to redirect to the null file and then just call .Run.

@adamsitnik adamsitnik left a comment

Member

LGTM (assuming the tests are going to pass), thank you for addressing my feedback @Youssef1313 !

- RedirectStandardOutput = true
- });
+ RedirectStandardOutput = true,
+ RedirectStandardError = true,

Member

This changes behavior beyond just fixing the deadlock. Previously stderr from security find-certificate was inherited and visible on the user's console; now it's captured into findCertificateProcessOutput.StandardError and never read. In practice security find-certificate writes useful diagnostics there — e.g. SecKeychainSearchCopyNext: The specified item could not be found in the keychain., "keychain does not exist", or User interaction is not allowed. in locked / SSH / CI sessions — which have historically helped diagnose dev-certs trust issues.

Could we either (a) route stderr to File.OpenNullHandle() like the other call sites that don't care about it, to make the discard explicit, or (b) log findCertificateProcessOutput.StandardError via the existing Log.* event source when it's non-empty? Option (b) preserves the diagnostic signal even when EventSource listeners are attached. Not a blocker for the deadlock fix itself.

@Youssef1313 Youssef1313 May 29, 2026

Member Author

Thanks for catching this. I think it might be safer to just revert to the old code. At this specific call site, the original code read stdout and did not redirect stderr at all, so that path shouldn't deadlock.

The paths that can deadlock are those that either don't read the output at all, or redirect both stdout and stderr and read them sequentially.
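
To illustrate the second case: a sequential read blocks in `ReadToEnd()` on stdout until the child exits, while the child can be stuck writing to a full stderr pipe that nobody is draining. A minimal sketch of reading both streams concurrently with the long-standing APIs follows. The `ReadBothStreams` class and the `sh`/`seq` child command are illustrative assumptions (a Unix-like environment is assumed), not code from this PR:

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class ReadBothStreams
{
    // Runs a command with stdout and stderr redirected and drains both concurrently.
    public static async Task<(int ExitCode, string StdOut, string StdErr)> RunAsync(
        string fileName, params string[] arguments)
    {
        var startInfo = new ProcessStartInfo(fileName)
        {
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false, // redirection requires starting the process directly
        };
        foreach (var argument in arguments)
        {
            startInfo.ArgumentList.Add(argument);
        }

        using var process = Process.Start(startInfo)
            ?? throw new InvalidOperationException($"Failed to start '{fileName}'.");

        // Start both reads before awaiting either one, so neither pipe can fill up
        // while the parent is blocked on the other.
        Task<string> stdOutTask = process.StandardOutput.ReadToEndAsync();
        Task<string> stdErrTask = process.StandardError.ReadToEndAsync();

        await Task.WhenAll(stdOutTask, stdErrTask);
        await process.WaitForExitAsync();

        return (process.ExitCode, await stdOutTask, await stdErrTask);
    }

    public static async Task Main()
    {
        // Assumed child: writes well over 4 KB to stdout and then to stderr.
        var result = await RunAsync("sh", "-c", "seq 1 20000; seq 1 20000 >&2");
        Console.WriteLine(
            $"exit={result.ExitCode}, stdout={result.StdOut.Length} chars, stderr={result.StdErr.Length} chars");
    }
}
```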

@Youssef1313

Member Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

@rokonec rokonec left a comment

Member

LGTM

@Youssef1313 Youssef1313 merged commit 27a6a87 into main Jun 2, 2026
26 checks passed
@Youssef1313 Youssef1313 deleted the dev/ygerges/fix-deadlock branch June 2, 2026 12:55
@dotnet-policy-service dotnet-policy-service Bot added this to the 11.0-preview6 milestone Jun 2, 2026
wtgodbe added a commit to wtgodbe/aspnetcore that referenced this pull request Jun 16, 2026
…acOS VSD hang

Merge-downgrade of eng/Version.Details.{xml,props} + global.json to dotnet/dotnet
codeflow commit 8d5666c (keeps arcade/Wix + main-only deps at main, no dep removal).
Reverts the macOS PR-check Helix queue (dotnet#63531) back to OSX.15.Amd64.Open so the
hang-prone queue is exercised. Rolls the 2 CertificateGeneration files back to the
target commit to stay coherent with the older runtime (pre-dotnet#66727 Process.Run API).

Draft probe PR for runtime mac-hang bisection. Not for merge.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
wtgodbe added a commit to wtgodbe/aspnetcore that referenced this pull request Jun 16, 2026
…acOS VSD hang

Merge-downgrade of eng/Version.Details.{xml,props} + global.json to dotnet/dotnet
codeflow commit 8a68740 (keeps arcade/Wix + main-only deps at main, no dep removal).
Reverts the macOS PR-check Helix queue (dotnet#63531) back to OSX.15.Amd64.Open so the
hang-prone queue is exercised. Rolls the 2 CertificateGeneration files back to the
target commit to stay coherent with the older runtime (pre-dotnet#66727 Process.Run API).

Draft probe PR for runtime mac-hang bisection. Not for merge.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
wtgodbe added a commit to wtgodbe/aspnetcore that referenced this pull request Jun 16, 2026
…acOS VSD hang

Merge-downgrade of eng/Version.Details.{xml,props} + global.json to dotnet/dotnet
codeflow commit 2f91a9d (keeps arcade/Wix + main-only deps at main, no dep removal).
Reverts the macOS PR-check Helix queue (dotnet#63531) back to OSX.15.Amd64.Open so the
hang-prone queue is exercised. Rolls the 2 CertificateGeneration files back to the
target commit to stay coherent with the older runtime (pre-dotnet#66727 Process.Run API).

Draft probe PR for runtime mac-hang bisection. Not for merge.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
wtgodbe added a commit to wtgodbe/aspnetcore that referenced this pull request Jun 16, 2026
…acOS VSD hang

Merge-downgrade of eng/Version.Details.{xml,props} + global.json to dotnet/dotnet
codeflow commit 8d5666c (keeps arcade/Wix + main-only deps at main, no dep removal).
Adds the target-era darc-pub-dotnet-extensions feed to NuGet.config so rolled-back
package versions (e.g. Microsoft.Extensions.Caching.Hybrid 10.4.1) resolve.
Reverts the macOS PR-check Helix queue (dotnet#63531) back to OSX.15.Amd64.Open.
Targeted API-break fixes coherent with the older runtime: rolls the 2
CertificateGeneration files back to the target (pre-dotnet#66727 Process.Run) and removes
the JsonTypeInfoKind.Union block (pre-dotnet#67001) in OpenApi.

Draft probe PR for runtime mac-hang bisection. Not for merge.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

ASP.NET fails to start on Linux
