Description
I have a scenario where a process runs in Azure for a long time, querying a database rather eagerly. I estimate the load to be 5-20 queries/second.
The issue
After running for about 4 hours, the job consistently fails to connect to the database. The relevant exception and stack trace are given below:
An exception of type Npgsql.NpgsqlException was thrown: Failed to establish a connection to 'database-host'.
Stack trace:
at Npgsql.NpgsqlClosedState.Open(NpgsqlConnector context, Int32 timeout)
at Npgsql.NpgsqlConnector.Open()
at Npgsql.NpgsqlConnectorPool.GetPooledConnector(NpgsqlConnection Connection)
at Npgsql.NpgsqlConnectorPool.RequestPooledConnectorInternal(NpgsqlConnection Connection)
at Npgsql.NpgsqlConnectorPool.RequestConnector(NpgsqlConnection Connection)
at Npgsql.NpgsqlConnection.Open()
at System.Data.Common.DbConnection.OpenAsync(CancellationToken cancellationToken)
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Data.Entity.Core.EntityClient.EntityConnection.d__8.MoveNext()
Inner exceptions:
An exception of type System.Net.Sockets.SocketException was thrown: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full
Stack trace:
at Npgsql.NpgsqlClosedState.Open(NpgsqlConnector context, Int32 timeout)
Possible causes
I have not identified a definite cause, but judging by this MSDN blog post, it seems that Npgsql exhausts either the TCP ephemeral ports or the memory available for TCP buffers.
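One way to check the ephemeral-port theory might be to capture `netstat -an` on the affected machine at intervals and count how many sockets to the database sit in each TCP state. The sketch below is only an illustration: it assumes the Windows `netstat -an` column layout (protocol, local address, foreign address, state) and the default PostgreSQL port 5432. The function name `count_tcp_states` and the sample addresses are hypothetical and not taken from the failing job.

```python
from collections import Counter


def count_tcp_states(netstat_output, remote_port=5432):
    """Count TCP connection states for sockets whose remote port matches.

    Assumes Windows-style `netstat -an` lines:
        TCP  <local addr:port>  <foreign addr:port>  <state>
    """
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and UDP rows (which have no state column).
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        # The port is whatever follows the last colon of the foreign address.
        if fields[2].rsplit(":", 1)[-1] != str(remote_port):
            continue
        states[fields[3]] += 1
    return states


# Hypothetical sample output, for illustration only.
SAMPLE = """
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    10.0.0.4:49733         10.0.0.9:5432          ESTABLISHED
  TCP    10.0.0.4:49734         10.0.0.9:5432          TIME_WAIT
  TCP    10.0.0.4:49735         10.0.0.9:5432          TIME_WAIT
  TCP    10.0.0.4:49800         10.0.0.7:443           ESTABLISHED
  UDP    0.0.0.0:5355           *:*
"""

if __name__ == "__main__":
    print(dict(count_tcp_states(SAMPLE)))
```

If the count of `TIME_WAIT` or `ESTABLISHED` sockets to the database keeps growing over the hours before the failure, that would be consistent with new sockets being opened rather than pooled ones being reused. A flat count would point away from port exhaustion and towards the buffer-memory explanation.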
Reproducing the issue
Sadly, I am unable to share any code. The essence is an EF6 DbContext, connected with pooling enabled (I also use SSL, if that matters), that queries the database eagerly over an extended period. Eventually, a call to Open should fail with the error above. A single DbContext instance is used throughout, and async/await is used rather heavily.