SQL Performance, .Net Optimizations vs Best Practices

I need confirmation/explanation from you pros/gurus on the following, because my team is telling me "it doesn't matter" and it's frustrating me :)

Background: We have a SQL Server 2008 instance that is used by our main MVC3 / .Net4 web app. We have 200+ concurrent users at any given point. The server is being hit EXTREMELY hard (locks, timeouts, overall slowness) and I'm trying to apply things I learned throughout my career and at my last MS certification class. They are things we've all been drilled on ("close SQL connections STAT"), and I'm trying to explain to my team that these "little things", though no single one makes a difference alone, add up in the end.

I need to know whether the following have a real performance impact or are just "best practice":

1. Using the "USING" keyword. Most of their code is like this:

public string SomeMethod(string x, string y) {
    SomethingDataContext dc = new SomethingDataContext();
    var result = dc.StoredProcedure(x, y); // dc is never disposed here
    // ... build and return a string from result ...
}

While I'm trying to tell them that USING closes/frees up resources faster:

using (SomethingDataContext dc = new SomethingDataContext()) {
    var result = dc.StoredProcedure(x, y);
}

Their argument is that the GC does a good enough job cleaning up after the code is done executing, so USING doesn't have a huge impact. True or false, and why?

2. Connection Pools

I always heard that setting up connection pools can significantly speed up any website (at least .Net w/ MSSQL). I recommended we add the following to our connection strings in the web.config:

..."Pooling=True;Min Pool Size=3;Max Pool Size=100;Connection Timeout=10;"...

Their argument is that .Net/MSSQL already sets up the connection pools behind the scenes, so it is not necessary to put this in our web.config. True or false? Why does every other site say pooling should be added for optimal performance if it's already set up?

3. Minimize # of calls to DB

The Role/Membership provider that comes with the default .Net MVC project is nice - it's handy and does most of the legwork for you. But these guys are making serious use of UsersInRoles() and use it freely like a global variable (it hits the DB every time this method is called). I created a "user object" that loads all the roles upfront on every page load (along with some other user stuff, such as GUIDs, etc.), and I then query this object to check whether the user has a given role.

Other parts of the website have FOR statements that loop over 200 times and do 20-30 SQL queries on every pass = over 4,000 database calls. It somehow does this in a matter of seconds, but what I want to do is consolidate the 20-30 DB calls into one, so that it makes ONE call per pass (200 calls in total). But because SQL Profiler says each query took "0 seconds", their argument is that the queries are so fast and small that the server can handle this high number of DB queries.

My thinking is "yeah, these queries are running fast, but they're killing the overall SQL server's performance." Am I worrying about nothing, or is this a (significant) contributing factor to the server's overall performance issues?

4. Other code optimizations

The first one that comes to mind is using StringBuilder vs a simple string variable. I understand why I should use StringBuilder (especially in loops), but they say it doesn't matter - even if they need to write 10k+ lines, their argument is that the performance gain doesn't matter.

So all in all, are all the things we learn and have drilled into us ("minimize scope!") just "best practice" with no real performance gain, or does ignoring them contribute to a REAL/measurable performance loss?

EDIT *** Thanks guys for all your answers! I have a new (5th) question based on your answers: They in fact do not use "USING", so what does that mean is happening? If connection pooling is happening automatically, is it tying up connections from the pool until the GC comes around? Is it possible that each open connection to the SQL server is adding a little more burden to the server and slowing it down?

Based on your suggestions, I plan on doing some serious benchmarking/logging of connection times, because I suspect that a) the server is slow, b) they aren't closing connections and c) although Profiler says a query ran in 0 seconds, the slowness might be coming from the connection.

I really appreciate your help, guys. Thanks again.

Branch the code, make your changes, then benchmark and profile them against the current codebase. Then you'll have some proof to back up your claims.

As for your questions, here goes:

  1. You should always manually dispose of objects whose classes implement IDisposable. The GC never calls Dispose itself; if the class also implements a finalizer, the GC will eventually run that, but in most implementations the finalizer only cleans up unmanaged resources.

  2. It's true that the .NET framework already does connection pooling. I'm not sure what the defaults are, but the connection string values are only there to let you alter them.

  3. The execution time of the SQL statement is only part of the story. In SQL Profiler all you see is how long the database engine took to execute the query; what's missing is the time it takes the web server to connect to the database server and receive the results. So while each query may be quick, you can save a lot of IO and network latency by batching queries.

  4. This is a good one to profile, to show the extra memory used by concatenation compared with string builders.
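To make point 3 concrete without a real database, here is a minimal sketch that only counts round trips. FakeRoleStore and its methods are hypothetical stand-ins invented for this illustration, not the Membership/Role provider API; the idea is just that asking one question per call costs one trip each, while loading the roles once costs a single trip.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the database: every public call is one round trip.
public sealed class FakeRoleStore
{
    private readonly Dictionary<string, string[]> _rolesByUser =
        new Dictionary<string, string[]>
        {
            { "alice", new[] { "Admin", "Editor" } },
            { "bob", new[] { "Viewer" } }
        };

    public int RoundTrips { get; private set; }

    // Chatty style: one round trip per role check.
    public bool IsUserInRole(string user, string role)
    {
        RoundTrips++;
        string[] roles;
        return _rolesByUser.TryGetValue(user, out roles)
            && Array.IndexOf(roles, role) >= 0;
    }

    // Batched style: one round trip returns every role for the user.
    public HashSet<string> GetRolesForUser(string user)
    {
        RoundTrips++;
        string[] roles;
        return _rolesByUser.TryGetValue(user, out roles)
            ? new HashSet<string>(roles)
            : new HashSet<string>();
    }
}

public static class RoundTripDemo
{
    // Asks the store once per role; returns how many roles matched.
    public static int CountMatchesChatty(FakeRoleStore store, string user, string[] rolesToCheck)
    {
        int matches = 0;
        foreach (string role in rolesToCheck)
            if (store.IsUserInRole(user, role)) matches++;
        return matches;
    }

    // Loads the roles once, then answers every check from memory.
    public static int CountMatchesBatched(FakeRoleStore store, string user, string[] rolesToCheck)
    {
        HashSet<string> roles = store.GetRolesForUser(user);
        int matches = 0;
        foreach (string role in rolesToCheck)
            if (roles.Contains(role)) matches++;
        return matches;
    }

    public static void Main()
    {
        string[] checks = { "Admin", "Editor", "Viewer" };

        var chatty = new FakeRoleStore();
        int chattyMatches = CountMatchesChatty(chatty, "alice", checks);
        Console.WriteLine("chatty: {0} matches, {1} round trips", chattyMatches, chatty.RoundTrips);

        var batched = new FakeRoleStore();
        int batchedMatches = CountMatchesBatched(batched, "alice", checks);
        Console.WriteLine("batched: {0} matches, {1} round trips", batchedMatches, batched.RoundTrips);
    }
}
```

Both styles give the same answers; only the number of trips differs. Against a real server each trip also pays network latency and connection handling, which is the part Profiler's per-query duration does not show.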

Oye. For sure, you can't let the GC close your database connections for you. A GC might not happen for a LONG time... sometimes hours later. It doesn't happen as soon as a variable goes out of scope. Most people use the IDisposable using() { } syntax, which is great, but at the very least something, somewhere needs to be calling connection.Close().

  1. Objects that implement IDisposable and hold unmanaged resources usually also implement a finalizer that ensures the cleanup happens during GC. The problem is when it is called: the GC can take a long time to get to it, and you might need those resources before then. using calls Dispose as soon as you are done with the object.

  2. You can modify the pooling parameters in the web.config, but pooling is on by default now, so if you leave the default values you are not gaining anything.

  3. You have to think not only about how long the query takes to execute but also about the connection time between the application server and the database; even if they are on the same computer, it adds overhead.

  4. StringBuilder won't affect performance in most web applications; it only matters if you are concatenating to the same string many times. But I think it's a good idea to use it since it's easier to read.
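For point 4, a rough way to settle the argument is to time both approaches on the 10k-line case from the question. This is only a sketch: the class and method names are made up for the example, and a single Stopwatch run is a crude measurement, so treat the numbers as indicative rather than as a proper benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class ConcatBenchmark
{
    // Plain concatenation: each += allocates a new, longer string.
    public static string WithConcat(int lines)
    {
        string s = "";
        for (int i = 0; i < lines; i++)
            s += "line " + i + "\n";
        return s;
    }

    // StringBuilder appends into a growing buffer instead.
    public static string WithBuilder(int lines)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < lines; i++)
            sb.Append("line ").Append(i).Append('\n');
        return sb.ToString();
    }

    public static void Main()
    {
        const int lines = 10000;

        Stopwatch sw = Stopwatch.StartNew();
        string concat = WithConcat(lines);
        sw.Stop();
        Console.WriteLine("concat:  {0} ms", sw.ElapsedMilliseconds);

        sw.Restart();
        string built = WithBuilder(lines);
        sw.Stop();
        Console.WriteLine("builder: {0} ms", sw.ElapsedMilliseconds);

        // Both approaches must produce the same text.
        Console.WriteLine("same output: {0}", concat == built);
    }
}
```

Whatever the timings turn out to be on your hardware, this only affects the web server's CPU and memory, not the load on SQL Server.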

I think that you have two separate issues here.

  1. Performance of your code
  2. Performance of the SQL Server database

SQL Server

Do you have any monitoring in place for SQL Server? Do you know specifically which queries are being run that cause the deadlocks?

I would read this article on deadlocks and consider installing the brilliant Who is Active to find out what is really going on in your SQL Server. You might also consider installing sp_Blitz by Brent Ozar. This should give you an excellent idea of what is going on in your database and give you the tools to fix that problem first.

Other code issues

I can't really comment on the other code issues off the top of my head, so I would look at SQL Server first.

Remember

  1. Monitor
  2. Identify Problems
  3. Profile
  4. Fix
  5. Go to 1

Well, I'm not a guru, but I do have a suggestion: if they say you're wrong, tell them, "Prove it! Write me a test! Show me that 4000 calls are just as fast as 200 calls and have the same impact on the server!"

Ditto the other things. If you're not in a position to make them prove you right, prove them wrong, with clear, well-documented tests that show that what you're saying is right.

If they're not open even to hard evidence, gathered from their own server, with code they can look at and inspect, then you may be wasting your time on that team.

At the risk of just repeating what others here have said, here's my 2c on the matter.

Firstly, you should pick your battles carefully... I wouldn't go to war with your colleagues on all 4 points, because as soon as you fail to prove one of them, it's over, and from their perspective they're right and you're wrong. Also bear in mind that no-one likes to be told their beautiful code is an ugly baby, so I assume you'll be diplomatic - don't say "this is slow", say "I found a way to make this even faster" (of course your team could be perfectly reasonable, so I'm basing that on my own experience as well :). So you need to pick one of the 4 areas above to tackle first.

My money is on #3. 1, 2 and 4 can make a difference, but in my own experience, not that much - but what you described in #3 sounds like death by a thousand papercuts for the poor old server! The queries probably execute fast because they're parameterised and so their plans are cached, but you need to bear in mind that "0 seconds" in the profiler could be 900 milliseconds, if you see what I mean... add that up over many calls and things start getting slow. This could also be a primary source of the locks, because if each of these nested queries is hitting the same table over and over, no matter how fast it runs, with the number of users you mentioned it's certain you will have contention. Grab the SQL and run it in SSMS, but include Client Statistics so you can see not only the execution time but also the amount of data being sent back to the client; that will give you a clearer picture of what sort of overhead is involved.

Really the only way you can prove any of this is to set up a test and measure, as others have mentioned, but be certain to also run some profiling on the server - locks, IO queues, etc. - so that you can show not only that your way is faster, but that it places less load on the server.

To touch on your 5th question - I'm not sure, but I would guess that any SqlConnection that's not auto-disposed (via using) is counted as still "active" and is no longer available from the pool. That being said, the connection overhead is pretty low on the server unless the connection is actually doing something - but you can again prove this by using the SQL performance counters.

Best of luck with it, can't wait to find out how you get on.

The using clause is just syntactic sugar; you are essentially doing:

try
{
    resource.DoStuff();
}
finally
{
    // runs whether or not DoStuff() throws
    resource.Dispose();
}

Dispose is probably going to get called anyway when the object is garbage collected, but only if the framework programmers did a good job of implementing the disposable pattern. So the arguments against your colleagues here are:

i) if we get into the habit of using using, we make sure unmanaged resources are freed, because not all framework programmers are smart enough to implement the disposable pattern correctly.

ii) yes, the GC will eventually clean up that object, but it may take a while, depending on how old that object is. Gen 2 collections run far less often than gen 0 collections, and not on any fixed schedule.
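Both arguments can be seen in a small self-contained sketch. TrackedResource is a hypothetical class written for this illustration; it has no finalizer, so if nobody calls Dispose, nothing ever will, which is exactly the case argument i) warns about.

```csharp
using System;

// Hypothetical resource that only records whether Dispose has run.
// It deliberately has no finalizer, so the GC will never clean it up for us.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void DoStuff(bool fail)
    {
        if (fail) throw new InvalidOperationException("simulated failure");
    }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

public static class UsingDemo
{
    // Dispose runs at the end of the using block, even if DoStuff throws.
    public static TrackedResource RunWithUsing(bool fail)
    {
        var resource = new TrackedResource();
        try
        {
            using (resource)
            {
                resource.DoStuff(fail);
            }
        }
        catch (InvalidOperationException)
        {
            // swallowed so the caller can inspect the resource afterwards
        }
        return resource;
    }

    // Nobody calls Dispose here; the object is simply abandoned.
    public static TrackedResource RunWithoutUsing()
    {
        var resource = new TrackedResource();
        resource.DoStuff(false);
        return resource;
    }

    public static void Main()
    {
        Console.WriteLine(RunWithUsing(false).IsDisposed);  // True
        Console.WriteLine(RunWithUsing(true).IsDisposed);   // True
        Console.WriteLine(RunWithoutUsing().IsDisposed);    // False
    }
}
```

With a real SqlConnection or DataContext, the undisposed case is the one that keeps the underlying connection out of the pool until something else gets around to cleaning it up.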

So in short:

  1. see above

  2. yes, pooling is set to true by default, with a max pool size of 100

  3. you are correct, this is definitely the best area to push on for improvements.

  4. premature optimization is the root of all evil. Get #1 and #3 in first. Use SQL Profiler and db-specific methods (add indexes, defragment them, monitor deadlocks etc.).

  5. yes, could be. The best way is to measure it - look at the perf counter SQLServer: General Statistics - User Connections; here is an article describing how to do it.

Always measure your improvements, don't change the code without evidence!

I recently was dealing with a bug in the interaction between our web application and our email provider. When an email was sent, a protocol error occurred. But not right away.

I was able to determine that the error only occurred when the SmtpClient instance was closed, which happened when the SmtpClient was disposed, which was only happening during garbage collection.

And I noticed that this often took two minutes after the "Send" button was clicked...

Needless to say, the code now properly uses using blocks for both the SmtpClient and MailMessage instances.

Just a word to the wise...

1 has been addressed well above (I agree that using disposes nicely, and have found it to be a good practice).

2 is a bit of a hold-over from previous versions of ODBC, wherein SQL Server connections were configured independently with regard to pooling. It used to be non-default; now it's the default.
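If the team wants to see the pooling defaults rather than take anyone's word for it, a sketch like the following prints what an otherwise empty connection string resolves to. It assumes the System.Data.SqlClient provider is referenced (part of System.Data on .NET Framework, a separate package on newer runtimes); the server and database names are placeholders invented for the example, and no connection is actually opened.

```csharp
using System;
using System.Data.SqlClient;

public static class PoolingDefaults
{
    // Builds a connection string with no pooling keywords at all,
    // so the properties below report the provider's own defaults.
    public static SqlConnectionStringBuilder BuildDefault()
    {
        var builder = new SqlConnectionStringBuilder();
        builder.DataSource = "myServer";        // placeholder server name
        builder.InitialCatalog = "myDatabase";  // placeholder database name
        builder.IntegratedSecurity = true;
        return builder;
    }

    public static void Main()
    {
        SqlConnectionStringBuilder builder = BuildDefault();
        Console.WriteLine("Pooling:         {0}", builder.Pooling);
        Console.WriteLine("Min Pool Size:   {0}", builder.MinPoolSize);
        Console.WriteLine("Max Pool Size:   {0}", builder.MaxPoolSize);
        Console.WriteLine("Connect Timeout: {0}", builder.ConnectTimeout);
    }
}
```

So in the connection string from the question, Pooling=True and Max Pool Size=100 only restate the defaults; the Min Pool Size and Connection Timeout values are the parts that would actually change behaviour.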

As to 3 and 4: 4 isn't going to affect your SQL Server's performance. StringBuilder might help speed up processing in the UI, certainly, which may have the effect of releasing your SQL resources faster, but it won't reduce the load on the SQL Server.

3 sounds like the most logical place to concentrate, to me. I try to close off my database connections as quickly as possible, and to make the fewest calls possible. If you're using LINQ, pull everything into memory first (a list, an array, whatever; an IQueryable on its own is still deferred and will hit the database when it is enumerated) so that you can manipulate it and build whatever UI structures you need, releasing the connection before any of that hokum.

All of that said, it sounds like you need to spend some more quality time with the profiler. Rather than looking at the amount of time each execution took, look at the processor and memory usage. Just because the executions are fast doesn't mean they're not "hungry".
