
Active Azure SQL connections are over the connection pool limit

We are fighting an issue in production where, once in a while, our Azure SQL database performance degrades significantly. We know we have locks on one of the tables, but these are not deadlocks; they are long-held locks, and after an hour or so performance returns to normal. We are trying to find every possible scenario that could produce these long locks (every query is very fast, and none of the performance analyzers could show us what causes them). The reason for this question is the picture below:

[image]

Our connection pool settings allow only 200 connections to be pooled, and most of the time we have about 10-20 open/pooled connections to the database. Then suddenly the number of active connections starts to grow and the pool is completely used up. While the number of pooled connections stays below 200, the number of active connections reported by sp_who2 reaches 1.5k-2k (sometimes 4k-5k).

I've built the same chart using the Azure Portal monitoring tools. It has a different aggregation period but shows the same issue:

[image]

The connection string we use:

Data Source=[server].database.windows.net;initial catalog=[database];persist security info=True;user id=[user];password=[password];MultipleActiveResultSets=True;Connection Timeout=30;Max Pool Size=200;Pooling=True;App=[AppName]

How is that possible, given the connection pool limit of 200 connections?

PS: there is no periodic task, long-running query, or other tool doing anything; we checked all the active connections to the database with sp_who2.

[this is more of a long comment than an answer]

I do have several hosts connected to the same database but each host has the same limitation of 200 connections

The connection pool is per (connection string, AppDomain). Each server might have multiple AppDomains, and each AppDomain will have one connection pool per connection string. So if you have different user/password combinations, they will generate different connection pools. With several hosts, each possibly holding several pools of up to 200 connections, there is no real mystery in how the database can see more than 200 connections.
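
To see how many client processes (and therefore at least how many pools) are feeding the database, you can count user sessions per client on the server side. This is only a rough sketch, assuming the login has VIEW DATABASE STATE permission. Each group approximates one client process, which may still hold more than one pool if it hosts several AppDomains or uses several connection strings:

```sql
-- Count user sessions per client process and login.
-- Many groups that each stay under Max Pool Size can still add up
-- to far more than 200 sessions in total.
SELECT
    s.host_name,
    s.host_process_id,
    s.program_name,   -- set by App= / Application Name in the connection string
    s.login_name,
    COUNT(*) AS session_count
FROM sys.dm_exec_sessions AS s
WHERE s.is_user_process = 1
GROUP BY s.host_name, s.host_process_id, s.program_name, s.login_name
ORDER BY session_count DESC;
```

If a single group is well above 200, that process is probably holding more than one pool.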

So why are you getting lots of connections? Possible causes:

Connection Leaks.

If you fail to Dispose a DbContext or a SqlConnection, that connection will linger on the managed heap until it is finalized and will not be available for reuse. When a connection pool reaches its limit, a new connection request will wait up to 30 seconds for a connection to become available and fail after that.

You will not see any waits or blocking on the server in this scenario. The sessions will all be idle, not waiting, and there would not be a large number of requests in

select *
from sys.dm_exec_requests 

Note that session wait stats are now live on Azure SQL DB, so it's much easier to see real-time blocking and waits.

select *
from sys.dm_exec_session_wait_stats

Blocking.

If incoming requests begin to be blocked by some transaction while new requests keep arriving, your number of sessions can grow: each new request gets a new session, starts a request, and becomes blocked. Here you would see lots of blocked requests in

select *
from sys.dm_exec_requests
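
A narrower version of that query can make the blocking chain easier to read. This is a sketch rather than a definitive diagnostic; it lists only the blocked requests, together with the session blocking each one and the statement being run:

```sql
-- List blocked requests with their blocker, wait details, and statement text.
SELECT
    r.session_id,
    r.blocking_session_id,
    r.wait_type,
    r.wait_time AS wait_time_ms,
    r.wait_resource,
    t.text AS sql_text
FROM sys.dm_exec_requests AS r
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0
ORDER BY r.wait_time DESC;
```

A blocking_session_id that never shows up as a blocked session_id is the head of the chain. It may not appear in sys.dm_exec_requests at all if it is an idle session holding an open transaction.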

Slow Queries.

If requests were just taking a long time to finish because of resource availability (CPU, disk, log), you could see this. But that's unlikely, as your DTU usage is low during this time.

So the next step for you is to see whether these connections are active on the server, which suggests blocking, or idle on the server, which suggests a connection pool problem.
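
One way to make that check during an incident is to classify the user sessions by whether they currently have a request and whether that request is blocked. This is a minimal sketch, assuming VIEW DATABASE STATE permission. Because MultipleActiveResultSets=True allows several requests per session, a session with both a running and a blocked request is counted in both rows:

```sql
-- Classify user sessions as idle, running, or blocked.
SELECT
    CASE
        WHEN r.session_id IS NULL THEN 'idle (no active request)'
        WHEN r.blocking_session_id <> 0 THEN 'blocked'
        ELSE 'running'
    END AS session_state,
    COUNT(DISTINCT s.session_id) AS session_count
FROM sys.dm_exec_sessions AS s
LEFT JOIN sys.dm_exec_requests AS r
    ON r.session_id = s.session_id
WHERE s.is_user_process = 1
GROUP BY
    CASE
        WHEN r.session_id IS NULL THEN 'idle (no active request)'
        WHEN r.blocking_session_id <> 0 THEN 'blocked'
        ELSE 'running'
    END
ORDER BY session_count DESC;
```

Mostly idle sessions point toward leaked or fragmented pools; mostly blocked sessions point toward a long-held lock.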

There are two things you can check on your DbContext objects to see whether you are using them correctly and disposing them so that the connection returns to the connection pool.

First, if you are creating the DbContext from code, check that there is a using statement around each creation scope of the DbContext object. Something like:

using (var context = new xxxContext()) {
    ...
}

This will dispose the context automatically when it goes out of scope.

Second, if you are using dependency injection to inject the DbContext object, make sure you register it as scoped:

services.AddScoped<xxxContext>();

Then the DI container will take care of disposing your context objects.

The next thing you can check is whether you have uncommitted transactions. Check that all your transactions are within using blocks, so that each one is either committed or rolled back by the time it goes out of scope.
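
On the server side, an uncommitted transaction usually shows up as a sleeping session that still has an open transaction. The query below is a sketch for spotting such sessions, assuming VIEW DATABASE STATE permission; idle_seconds is a column name chosen here for illustration:

```sql
-- Idle sessions that still hold an open transaction, oldest first.
SELECT
    s.session_id,
    s.host_name,
    s.program_name,
    s.open_transaction_count,
    s.last_request_end_time,
    DATEDIFF(SECOND, s.last_request_end_time, GETDATE()) AS idle_seconds
FROM sys.dm_exec_sessions AS s
WHERE s.is_user_process = 1
  AND s.status = 'sleeping'
  AND s.open_transaction_count > 0
ORDER BY s.last_request_end_time;
```

A session that has been idle for minutes with open_transaction_count above zero is a likely source of the long-held locks described in the question.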

The problem may be related to "pool fragmentation".

Pool fragmentation is a common problem in many Web applications where the application can create a large number of pools that are not freed until the process exits. This leaves a large number of connections open and consuming memory, which results in poor performance.

Pool Fragmentation Due to Integrated Security

Connections are pooled according to the connection string plus the user identity. Therefore, if you use Basic authentication or Windows Authentication on the Web site and an integrated security login, you get one pool per user. Although this improves the performance of subsequent database requests for a single user, that user cannot take advantage of connections made by other users. It also results in at least one connection per user to the database server. This is a side effect of a particular Web application architecture that developers must weigh against security and auditing requirements.

Source: https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/sql-server-connection-pooling
