
SQL Server query performance slows over time

I've seen this question asked in many ways all over the Internet, but despite implementing the abundance of advice (and some voodoo), I'm still struggling. I have a 100 GB+ database that is constantly inserting and updating records in very large transactions (200+ statements per transaction). After a system restart, the performance is amazing (data is written to a large SATA III SSD connected via USB 3.0). The SQL Server instance runs in a VM under VMware Workstation. The host is set to hold the entire VM in memory. The VM itself has a paging cache of 5000 MB. The SQL Server user has the 'Lock pages in memory' right. I have 5 GB of RAM allocated to the VM, and the max memory of the SQL Server instance is set to half a gigabyte.

I have played with every single one of these parameters to try to maintain consistent performance, but slowly and surely the performance degrades to the point where queries begin to time out. Here's the kicker, though: if I stop the application that's loading the database and then execute the stored procedure in Management Studio, it runs like lightning, which suggests the query itself is not the issue, and that memory management or paging probably isn't either. If I then restart the loader app, it still crawls. If I reboot the VM, however, the app once again runs like lightning... for a while...

Does anybody have any other suggestions based upon the symptoms presented?

  • Depending on how large your hot set is, 5 GB of memory may simply be too little for a 100+ GB database.

  • Check indices and query plans. We cannot help you without them. And I bet you are missing some indices - which is the standard performance issue people have.

  • Otherwise, once you have done your homework - head over to dba.stackexchange.com and ask there.

  • Generally - consider that 200 statements per transaction may simply indicate seriously sub-optimal programming. For example, you could bulk-load the data into a temp table and then merge it into the final one.
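
The staging-table idea from the last comment could look roughly like the sketch below. The table and column names (#Readings, #ReadingsStage, ReadingId, ReadingValue, UpdatedAt) are made up for illustration, and a temp table stands in for the real target so the snippet runs on its own. In the real loader the staging rows would presumably arrive through a bulk API such as BULK INSERT or SqlBulkCopy rather than a literal VALUES list.

```sql
-- Hypothetical stand-in for the real target table.
CREATE TABLE #Readings (
    ReadingId    INT            NOT NULL PRIMARY KEY,
    ReadingValue DECIMAL(18, 4) NOT NULL,
    UpdatedAt    DATETIME2      NOT NULL
);
INSERT INTO #Readings (ReadingId, ReadingValue, UpdatedAt)
VALUES (1, 9.0, SYSUTCDATETIME());

-- Staging table: the loader fills this in one round trip
-- instead of sending 200+ individual statements.
CREATE TABLE #ReadingsStage (
    ReadingId    INT            NOT NULL PRIMARY KEY,
    ReadingValue DECIMAL(18, 4) NOT NULL,
    UpdatedAt    DATETIME2      NOT NULL
);
INSERT INTO #ReadingsStage (ReadingId, ReadingValue, UpdatedAt)
VALUES (1, 10.5, SYSUTCDATETIME()),   -- existing key: will be updated
       (2, 11.0, SYSUTCDATETIME());   -- new key: will be inserted

BEGIN TRANSACTION;

-- One set-based statement replaces the per-row INSERT/UPDATE logic.
MERGE #Readings AS tgt
USING #ReadingsStage AS src
    ON tgt.ReadingId = src.ReadingId
WHEN MATCHED THEN
    UPDATE SET tgt.ReadingValue = src.ReadingValue,
               tgt.UpdatedAt    = src.UpdatedAt
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ReadingId, ReadingValue, UpdatedAt)
    VALUES (src.ReadingId, src.ReadingValue, src.UpdatedAt);

COMMIT TRANSACTION;

DROP TABLE #ReadingsStage;
DROP TABLE #Readings;
```

Whether this beats the existing 200-statement transactions depends on the actual schema and indexes, so it is worth measuring before committing to it.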

Actually, I may have a working theory. What I did was add some logic to the app so that when it times out, it sits for two minutes and then tries again, and voila! Back to full speed. I rubber-ducked it with my co-worker and came up with the idea that my perceived SSD write speeds were actually the write speed to the VMware host's virtual USB 3 buffer, and that the actual SSD write speeds were slower. I'm probably filling the host's buffer, and by forcing the app to wait 2 minutes, the host gets a chance to flush its backlog of buffered data to the SSD. Elementary, Watson :)

If this approach also fails to be sustainable, I'll report in.
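
One way to check the write-buffer theory from inside the guest is to look at the write stall SQL Server has recorded per file. This is only a sketch: the figures are cumulative since the instance started, so the useful comparison is between a snapshot taken shortly after a reboot and one taken once the timeouts begin.

```sql
-- Average write latency per database file, worst first.
SELECT
    DB_NAME(vfs.database_id) AS database_name,
    mf.physical_name,
    vfs.num_of_writes,
    vfs.io_stall_write_ms,
    CASE WHEN vfs.num_of_writes = 0 THEN 0
         ELSE vfs.io_stall_write_ms * 1.0 / vfs.num_of_writes
    END AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON  mf.database_id = vfs.database_id
    AND mf.file_id     = vfs.file_id
ORDER BY avg_write_latency_ms DESC;
```

If the average write latency on the data or log file climbs sharply between the two snapshots, that would be consistent with the storage path being the bottleneck rather than the query.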

Try executing this to determine your problem queries:

-- Top 20 cached statements by total CPU time.
-- The DMV reports times in microseconds. Dividing by 1000000.0 (not 1000000)
-- avoids integer division, which would truncate sub-second averages to 0.
SELECT TOP (20)
  qs.sql_handle,
  qs.execution_count,
  qs.total_worker_time AS Total_CPU,
  total_CPU_inSeconds = --Converted from microseconds
      qs.total_worker_time / 1000000.0,
  average_CPU_inSeconds = --Converted from microseconds
      qs.total_worker_time / 1000000.0 / qs.execution_count,
  qs.total_elapsed_time,
  total_elapsed_time_inSeconds = --Converted from microseconds
      qs.total_elapsed_time / 1000000.0,
  st.text,
  qp.query_plan
FROM
  sys.dm_exec_query_stats AS qs
  CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
  CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_worker_time DESC;

Then check the estimated and actual execution plans for the queries this command helps you pinpoint.

Source: "How do I find out what is hammering my SQL Server?" and the bottom of the page at http://technet.microsoft.com/en-us/magazine/2007.11.sqlquery.aspx

Beyond the excellent indexing suggestions already given, be sure to read up on parameter sniffing. That could be the cause of the problem.

SQL Server - parameter sniffing

http://www.sommarskog.se/query-plan-mysteries.html#compileps

As a result, you could have a bad query plan being re-used, or SQL Server's buffer could be filling up and writing pages out of memory to disk (maybe that's other allocated memory in your case).
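
A procedure that is fast in Management Studio but slow from the application is a symptom the linked article discusses: connections with different SET options (ARITHABORT is the usual one) get separate cached plans, so the two callers may not be running the same plan at all. The sketch below lists the set_options value for each cached procedure plan. It only reads the plan cache and assumes nothing about your schema.

```sql
-- Cached stored-procedure plans with the SET options they were compiled under.
-- The same procedure text appearing twice with different set_options values
-- means two callers (e.g. the app and SSMS) are using different plans.
SELECT
    cp.plan_handle,
    cp.usecounts,
    pa.value AS set_options,
    st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_plan_attributes(cp.plan_handle) AS pa
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE pa.attribute = 'set_options'
  AND cp.objtype = 'Proc'
ORDER BY cp.usecounts DESC;
```

If the loader's procedure shows up twice, comparing the two plans side by side is probably more informative than re-running it in Management Studio alone.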

You could run DBCC FREEPROCCACHE and DBCC FREESYSTEMCACHE ('ALL') to empty the caches and see if you get a performance boost.
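
For reference, the two cache-clearing commands in runnable form. Treat this as a diagnostic for a test or low-stakes instance: it discards every cached plan on the instance, so everything recompiles on next use.

```sql
-- Remove all cached query plans for the instance.
DBCC FREEPROCCACHE;

-- Release unused entries from all other caches.
-- FREESYSTEMCACHE requires an argument; 'ALL' targets every cache.
DBCC FREESYSTEMCACHE ('ALL');
```

If the loader speeds up right after this without a reboot, a bad cached plan becomes the more likely suspect than the storage path.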

You should give SQL Server more memory too - as much as you can while leaving room for other critical programs and the OS. You might have 5 GB of RAM on the VM, but SQL Server only gets to play with 0.5 GB, which seems REALLY small for what you're describing.
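
Raising the limit can be done with sp_configure, sketched below. The 3072 MB figure is an assumption for a 5 GB VM (leaving about 2 GB for the OS and anything else running in the guest), not a recommendation derived from your workload.

```sql
-- 'max server memory (MB)' is an advanced option, so expose those first.
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;

-- Assumed value: 3 GB of the VM's 5 GB for SQL Server.
EXEC sys.sp_configure N'max server memory (MB)', 3072;
RECONFIGURE;

-- Confirm the setting now in effect.
SELECT name, value_in_use
FROM sys.configurations
WHERE name = N'max server memory (MB)';
```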

If those things don't move you in the right direction, set up the SQL Server Management Data Warehouse so you can see exactly what is happening when your slowdown begins. Running it takes up additional memory, but it will give the DBAs more to go on.

In the end, what I did was a combination of two things: adding logic to recover when timeouts occur, and setting the VM's core count to reflect only the host's physical cores, not its logical cores. For example, the host has 2 hyper-threaded cores. When I set my VM to use 4 cores, it occasionally hangs in some infinite loop, but when I set it to 2 cores, it runs without fail. Still, aberrant behavior like this is difficult to mitigate reliably.

