
Comparing the Performance of Indexed Views and Stored Procedures in SQL Server

I've just recently become aware of the fact that you can now index your views in SQL Server (see http://technet.microsoft.com/en-us/library/cc917715.aspx ). I'm now trying to figure out when I'd get better performance from a query against an indexed view versus the same query inside a stored procedure that has had its execution plan cached.

Take for example the following:

SELECT colA, colB, sum(colC), sum(colD), colE
FROM myTable
WHERE colFDate < '9/30/2011'
GROUP BY colA, colB, colE

The date will be different every time it's run, so if this were a view, I wouldn't include the WHERE in the view and would instead have it as part of my select against the view. If it were a stored procedure, the date would be a parameter. Note that there are about 300,000 rows in the table. About 200,000 of them would meet the WHERE clause with the date, and 10,000 would be returned after the GROUP BY.
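For concreteness, the stored-procedure variant described above might look like the following sketch. The names `dbo.GetTotalsBefore` and `@cutoff` are hypothetical, and `colFDate` is assumed to be a `datetime` column.

```sql
-- Hypothetical procedure name and parameter; colFDate assumed to be datetime.
CREATE PROCEDURE dbo.GetTotalsBefore
    @cutoff datetime
AS
BEGIN
    SET NOCOUNT ON;

    SELECT colA, colB, SUM(colC) AS sumC, SUM(colD) AS sumD, colE
    FROM dbo.myTable
    WHERE colFDate < @cutoff      -- the date arrives as a parameter
    GROUP BY colA, colB, colE;
END;
GO

-- Example call, using an unambiguous yyyymmdd date literal.
EXEC dbo.GetTotalsBefore @cutoff = '20110930';
```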

If this were an indexed view, should I expect to get better performance out of it than from a stored procedure that has had an opportunity to cache its execution plan? Or would the proc be faster? Or would the difference be negligible? I know we could say "just try both out", but there are too many factors that could bias the results and lead me to a false conclusion, so I'd like to hear more of the theory behind it and what the expected outcomes are instead.

Thanks!

An indexed view can be regarded like a normal table - it's a materialized collection of rows.
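As a rough sketch of what "materialized" means here, an indexed view over the question's table could be defined as below. The view and index names are hypothetical. The sketch assumes `colC` and `colD` are declared NOT NULL (an indexed view with GROUP BY cannot use SUM over a nullable expression) and that the session SET options required for indexed views are in effect. Because the date filter is applied when querying the view, `colFDate` has to be part of the view's GROUP BY, so how much the view shrinks the data depends on how many distinct `colFDate` values exist per group.

```sql
-- Hypothetical names; colC and colD are assumed NOT NULL.
CREATE VIEW dbo.vMyTableTotals
WITH SCHEMABINDING                    -- required before a view can be indexed
AS
SELECT colA, colB, colE, colFDate,
       SUM(colC)    AS sumC,
       SUM(colD)    AS sumD,
       COUNT_BIG(*) AS rowCnt         -- required when the view uses GROUP BY
FROM dbo.myTable                      -- two-part name required by SCHEMABINDING
GROUP BY colA, colB, colE, colFDate;
GO

-- The unique clustered index is what materializes the view's rows.
CREATE UNIQUE CLUSTERED INDEX IX_vMyTableTotals
    ON dbo.vMyTableTotals (colA, colB, colE, colFDate);
GO

-- Querying the pre-aggregated rows and rolling them up past the date filter.
SELECT colA, colB, SUM(sumC) AS sumC, SUM(sumD) AS sumD, colE
FROM dbo.vMyTableTotals WITH (NOEXPAND)
WHERE colFDate < '20110930'
GROUP BY colA, colB, colE;
```

The `NOEXPAND` hint asks the optimizer to read the view's index directly; on some editions the index is only used when this hint is present.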

So the question really boils down to whether or not a "normal" query is faster than a stored procedure.

If you look at the steps SQL Server goes through to execute any query (stored procedure call or ad-hoc SQL statement), you'll find (roughly) these:

  1. syntactically check the query
  2. if it's okay - check the plan cache to see if there is already an execution plan for that query
  3. if there is an execution plan - that plan is (re-)used and the query is executed
  4. if there is no plan yet, an execution plan is determined
  5. that plan is stored in the plan cache for later reuse
  6. the query is executed
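One way to observe the plan cache from the steps above is to query the dynamic management views, as in this sketch. It assumes the login has permission to view server state, and the `LIKE` filter on `myTable` is only an illustrative way to narrow the output.

```sql
-- List cached plans whose text mentions myTable, with their reuse counts.
SELECT cp.objtype,       -- e.g. 'Proc', 'Prepared', or 'Adhoc'
       cp.usecounts,     -- how many times the cached plan has been used
       st.text           -- the statement or procedure text behind the plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE N'%myTable%'
ORDER BY cp.usecounts DESC;
```

A plan that is being reused shows a growing `usecounts`, whether it belongs to a stored procedure or to a parametrized statement.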

The point is: ad-hoc SQL and stored procedures are treated no differently.

If an ad-hoc SQL query is properly using parameters - as it should anyway, to prevent SQL injection attacks - its performance characteristics are no different from, and most definitely no worse than, those of a stored procedure.
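A parametrized ad-hoc query can be sketched in T-SQL with `sp_executesql`, as below. The parameter name `@cutoff` and its `datetime` type are assumptions; client libraries that support parameters typically send the statement in a comparable parametrized form.

```sql
-- The statement text stays identical across calls; only the parameter value changes,
-- so the same cached plan can be reused.
DECLARE @sql nvarchar(max) = N'
SELECT colA, colB, SUM(colC) AS sumC, SUM(colD) AS sumD, colE
FROM dbo.myTable
WHERE colFDate < @cutoff
GROUP BY colA, colB, colE;';

EXEC sys.sp_executesql
     @sql,
     N'@cutoff datetime',      -- parameter declaration (assumed type)
     @cutoff = '20110930';     -- value supplied separately from the SQL text
```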

Stored procedures have other benefits (no need to grant users direct table access, for instance), but in terms of performance, using properly parametrized ad-hoc SQL queries is just as efficient as using stored procedures.

Using stored procedures over non-parametrized queries is better for two main reasons:

  • since each non-parametrized query is a new, different query to SQL Server, it has to go through all the steps of determining the execution plan for each query (thus wasting time, and also wasting plan cache space, since storing the execution plan in the plan cache doesn't help when that particular query will probably not be executed again)

  • non-parametrized queries are at risk of SQL injection attacks and should be avoided at all costs

Now of course, if your indexed view can reduce the number of rows significantly (by using a GROUP BY clause), then that indexed view will be significantly faster than running a stored procedure against the whole dataset. But that's not because of the different approaches taken - it's just a matter of scale. Querying a few dozen or a few hundred rows will be faster than querying 200,000 or more rows, no matter which way you query.
