
Transactional inbox/outbox via psql with multiple replicas: Fails to gracefully shutdown, Error trying to release the leadership lock #2146

Description

@tos-ilex

During termination of an app that uses PostgreSQL for the transactional inbox/outbox, the following exceptions are thrown:

[2026-02-09 10:59:04.723] INFO - Microsoft.Hosting.Lifetime
Application is shutting down...

[2026-02-09 10:59:04.757] ERROR - Wolverine.Runtime.Agents.NodeAgentController
Error trying to release the leadership lock

Npgsql.NpgsqlException: Exception while reading from stream
   at Npgsql.Internal.NpgsqlReadBuffer.<Ensure>g__EnsureLong|54_0(NpgsqlReadBuffer buffer, Int32 count, Boolean async, Boolean readingNotifications)
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
   at Npgsql.Internal.NpgsqlConnector.ReadMessageLong(Boolean async, DataRowLoadingMode dataRowLoadingMode, Boolean readingNotifications, Boolean isReadingPrependedMessage)
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource<TResult>.GetResult(Int16 token)
   at Npgsql.NpgsqlDataReader.NextResult(Boolean async, Boolean isConsuming, CancellationToken cancellationToken)
   at Npgsql.NpgsqlDataReader.NextResult(Boolean async, Boolean isConsuming, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteReader(Boolean async, CommandBehavior behavior, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteReader(Boolean async, CommandBehavior behavior, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteNonQuery(Boolean async, CancellationToken cancellationToken)
   at Wolverine.Postgresql.AdvisoryLock.ReleaseLockAsync(Int32 lockId) in /home/runner/work/wolverine/wolverine/src/Persistence/Wolverine.Postgresql/PostgresqlNodePersistence.cs:line 439
   at Wolverine.Runtime.Agents.NodeAgentController.StopAsync(IMessageBus messageBus) in /home/runner/work/wolverine/wolverine/src/Wolverine/Runtime/Agents/NodeAgentController.cs:line 88

Inner Exception: System.IO.EndOfStreamException
   Message: Attempted to read past the end of the stream.
   at Npgsql.Internal.NpgsqlReadBuffer.<Ensure>g__EnsureLong|54_0(NpgsqlReadBuffer buffer, Int32 count, Boolean async, Boolean readingNotifications)

[2026-02-09 10:59:04.801] INFO - Wolverine.Transports.ListeningAgent
Stopped message listener at stub://replies/

[2026-02-09 10:59:04.816] INFO - Wolverine.Transports.ListeningAgent
Stopped message listener at rabbitmq://queue/apikey.subscriptions

[2026-02-09 10:59:05.328] INFO - Wolverine.Transports.ListeningAgent
Stopped message listener at dbcontrol://edd195fa-89a5-4ed9-8639-b979a18d43d9/

[2026-02-09 10:59:05.414] ERROR - Wolverine.Postgresql.PostgresqlMessageStore
Error trying to dispose of advisory locks for database WolverineEnvelopeStorage

System.InvalidOperationException: Connection is not open
   at Npgsql.ThrowHelper.ThrowInvalidOperationException(String message)
   at Npgsql.NpgsqlCommand.CheckAndGetConnection()
   at Npgsql.NpgsqlCommand.ExecuteReader(Boolean async, CommandBehavior behavior, CancellationToken cancellationToken)
   at Npgsql.NpgsqlCommand.ExecuteNonQuery(Boolean async, CancellationToken cancellationToken)
   at Wolverine.Postgresql.AdvisoryLock.DisposeAsync() in /home/runner/work/wolverine/wolverine/src/Persistence/Wolverine.Postgresql/PostgresqlNodePersistence.cs:line 459

This is from one of 3 k8s pod replicas. We're using the default 30-second grace period throughout, and the exceptions are thrown immediately when the termination signal is sent, so nothing is timing out.

Config:

        services.AddWolverine(opts =>
        {
            opts.ServiceName = ServiceName;
            opts.PersistMessagesWithPostgresql(settings.ConnectionStrings.Database, WolverineSchemaName);
            opts.Durability.MessageStorageSchemaName = WolverineSchemaName;
            opts.Policies.AutoApplyTransactions();
            opts.UseEntityFrameworkCoreTransactions();
            opts.Policies.UseDurableLocalQueues();
            opts.Policies.UseDurableOutboxOnAllSendingEndpoints();
            opts.Policies.UseDurableInboxOnAllListeners();

            // opts.UseRabbitMq ...
            // opts.ListenToRabbitQueue ...
            // opts.PublishMessage ...
        });

        services.AddDbContext<MyDbContext>(
            (serviceProvider, options) =>
            {
                options.UseNpgsql(
                    settings.ConnectionStrings.Database,
                    npgsqlOptions =>
                    {
                        npgsqlOptions.MigrationsHistoryTable("__EFMigrationsHistory", MyDbContext.SchemaName);
                        npgsqlOptions.EnableRetryOnFailure();
                    }
                );

                // ... adding some interceptor
            },
            optionsLifetime: ServiceLifetime.Singleton
        );

The database is an AWS Aurora PostgreSQL database. Everything else works fine and we haven't identified any functional issues; the only problem is the exceptions during shutdown.

Environment:

  • k8s v1.34.1-eks-3025e55
  • mcr.microsoft.com/dotnet/aspnet:9.0-noble-chiseled-extra
  • WolverineFx.EntityFrameworkCore:5.1.0
