Query running longer by adding unused WHERE conditions

I've hit an interesting snag (interesting to me, at least). Below is the general shape of my query. Assume @AuthorType is an input to the stored procedure and that each comment stands in for a set of specialized conditions.

SELECT *
FROM TBooks
WHERE
(--...SOME CONDITIONS)
OR
(@AuthorType = 1 AND --...DIFFERENT CONDITIONS)
OR
(@AuthorType = 2 AND --...STILL MORE CONDITIONS)

What's interesting to me is that if I execute this SP with @AuthorType = 0, it runs slower than it does when I remove the last two sets of conditions (the ones that apply only to specific values of @AuthorType).

Shouldn't SQL Server realize at runtime that those conditions will never be met and ignore them entirely? The difference I'm experiencing is not small; it roughly doubles the running time of the query (from 1-2 seconds to 3-5 seconds).

Am I expecting SQL Server to optimize too much for me? Do I really need three separate SPs for the specialized conditions?

"Shouldn't SQL Server realize at runtime that those conditions will never be met and ignore them entirely?"

No, absolutely not. There are two factors at play here.

  1. SQL Server does not guarantee boolean operator short-circuiting. See "On SQL Server boolean operator short-circuit" for an example showing clearly how query optimization can reverse the order in which boolean expressions are evaluated. At first this looks like a bug to an imperative, C-like programming mindset, but it is the right thing to do in the declarative, set-oriented world of SQL.

  2. OR is the enemy of SQL SARGability. SQL statements are compiled into an execution plan, and then the plan is executed. The plan is cached and reused between invocations. As such, the SQL compiler has to generate one single plan that fits all the separate OR cases (@AuthorType = 1, @AuthorType = 2 and @AuthorType = 3 alike). In a sense, the query plan is generated exactly as if @AuthorType held all of its values at once. The result is almost always the worst possible plan, one that cannot benefit from any index because the various OR branches contradict each other, so it ends up scanning the whole table and checking rows one by one.
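
One way to check this on your own system is to wrap the combined query in a procedure and look at its I/O and actual plan. The following is only a sketch: the columns (BookId, Title, PublisherId, EditorId), the index names and the procedure name are hypothetical stand-ins for the elided conditions; only TBooks and @AuthorType come from the question. It assumes SQL Server 2016 SP1 or later for CREATE OR ALTER, and a client such as SSMS or sqlcmd that understands GO.

```sql
-- Hypothetical schema: only TBooks and @AuthorType come from the question.
IF OBJECT_ID(N'dbo.TBooks', N'U') IS NULL
BEGIN
    CREATE TABLE dbo.TBooks
    (
        BookId      int IDENTITY(1,1) NOT NULL PRIMARY KEY,
        Title       nvarchar(200) NOT NULL,
        PublisherId int NOT NULL,
        EditorId    int NULL
    );
    CREATE INDEX IX_TBooks_PublisherId ON dbo.TBooks (PublisherId);
    CREATE INDEX IX_TBooks_EditorId    ON dbo.TBooks (EditorId);
END;
GO

CREATE OR ALTER PROCEDURE dbo.GetBooks_Combined
    @AuthorType  int,
    @PublisherId int,
    @EditorId    int
AS
BEGIN
    SET NOCOUNT ON;

    -- A single statement, so a single cached plan has to serve
    -- every possible value of @AuthorType.
    SELECT BookId, Title
    FROM dbo.TBooks
    WHERE (PublisherId = @PublisherId)
       OR (@AuthorType = 1 AND EditorId = @EditorId)
       OR (@AuthorType = 2 AND Title LIKE N'A%');
END;
GO

-- Report logical reads for the call; also enable the actual execution plan
-- in your client to see which access path was chosen.
SET STATISTICS IO ON;
EXEC dbo.GetBooks_Combined @AuthorType = 0, @PublisherId = 42, @EditorId = NULL;
SET STATISTICS IO OFF;
```

On an empty or tiny table every plan looks equally cheap, so load a representative amount of data before comparing the logical reads with those of a version that has the last two OR branches removed.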

The best thing to do in your case, and in any other case that involves boolean OR, is to move the @AuthorType test outside the query:

IF (@AuthorType = 1)
  SELECT ... FROM ... WHERE ...
ELSE IF (@AuthorType = 2)
  SELECT ... FROM ... WHERE ...
ELSE ...

Because each branch is clearly separated into its own statement, SQL Server can create the proper access path for each individual case.
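
Filled in with some made-up conditions, the branching above might look like the sketch below. The columns (BookId, Title, PublisherId, EditorId) and the procedure name are hypothetical; only TBooks and @AuthorType come from the question, and CREATE OR ALTER assumes SQL Server 2016 SP1 or later.

```sql
-- Hypothetical schema: only TBooks and @AuthorType come from the question.
IF OBJECT_ID(N'dbo.TBooks', N'U') IS NULL
    CREATE TABLE dbo.TBooks
    (
        BookId      int IDENTITY(1,1) NOT NULL PRIMARY KEY,
        Title       nvarchar(200) NOT NULL,
        PublisherId int NOT NULL,
        EditorId    int NULL
    );
GO

CREATE OR ALTER PROCEDURE dbo.GetBooks_Branched
    @AuthorType  int,
    @PublisherId int,
    @EditorId    int
AS
BEGIN
    SET NOCOUNT ON;

    -- The @AuthorType test lives in procedural IF logic, not in the WHERE
    -- clause, so each SELECT is a separate statement with its own plan.
    IF (@AuthorType = 1)
        SELECT BookId, Title
        FROM dbo.TBooks
        WHERE PublisherId = @PublisherId
           OR EditorId = @EditorId;
    ELSE IF (@AuthorType = 2)
        SELECT BookId, Title
        FROM dbo.TBooks
        WHERE PublisherId = @PublisherId
           OR Title LIKE N'A%';
    ELSE
        -- Every other value only needs the general conditions.
        SELECT BookId, Title
        FROM dbo.TBooks
        WHERE PublisherId = @PublisherId;
END;
GO

EXEC dbo.GetBooks_Branched @AuthorType = 0, @PublisherId = 42, @EditorId = NULL;
```

The price is the duplicated SELECT list, which is the trade-off discussed further down the thread.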

The next best thing is to use UNION ALL, as chadhoc already suggested. It is the right approach in views or other places where a single statement is required (no IF is permitted).

It has to do with how difficult it is for the optimizer to handle "OR" type logic, along with issues related to parameter sniffing. Try changing your query above to a UNION approach like the one mentioned in the linked post. That is, you'll wind up with multiple statements unioned together, each with just a single @AuthorType = x AND ..., which allows the optimizer to rule out the portions where the AND logic doesn't match the given @AuthorType and to seek into the appropriate indexes in turn. It would look something like this:

SELECT *
FROM TBooks
WHERE
(--...SOME CONDITIONS)
AND (@AuthorType = 1 AND --...DIFFERENT CONDITIONS)
union all
SELECT *
FROM TBooks
WHERE
(--...SOME CONDITIONS)
AND (@AuthorType = 2 AND --...DIFFERENT CONDITIONS)
union all
...
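
With made-up conditions in place of the comments, the pattern above could be tried out as in the following sketch. The columns (BookId, Title, PublisherId, EditorId) and the sample values are hypothetical; only TBooks and @AuthorType come from the question.

```sql
-- Hypothetical schema: only TBooks and @AuthorType come from the question.
IF OBJECT_ID(N'dbo.TBooks', N'U') IS NULL
    CREATE TABLE dbo.TBooks
    (
        BookId      int IDENTITY(1,1) NOT NULL PRIMARY KEY,
        Title       nvarchar(200) NOT NULL,
        PublisherId int NOT NULL,
        EditorId    int NULL
    );

DECLARE @AuthorType  int = 1,
        @PublisherId int = 42,
        @EditorId    int = 7;

-- Each branch carries exactly one @AuthorType = x test next to its own
-- column predicates, so no branch mixes conditions meant for another type.
SELECT BookId, Title
FROM dbo.TBooks
WHERE PublisherId = @PublisherId
  AND @AuthorType = 1
  AND EditorId = @EditorId
UNION ALL
SELECT BookId, Title
FROM dbo.TBooks
WHERE PublisherId = @PublisherId
  AND @AuthorType = 2
  AND Title LIKE N'A%';
```

Because @AuthorType holds a single value per call, at most one branch can return rows here, so UNION ALL does not introduce duplicates. If your branches can overlap, you would need UNION or an extra exclusion predicate instead.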

I should fight the urge to reduce duplication... but man, that really doesn't feel right to me.

Would this "feel" better?

SELECT ... lots of columns and complicated stuff ...
FROM 
(
    SELECT MyPK
    FROM TBooks
    WHERE 
    (--...SOME CONDITIONS) 
    AND (@AuthorType = 1 AND --...DIFFERENT CONDITIONS) 
    union all 
    SELECT MyPK
    FROM TBooks
    WHERE 
    (--...SOME CONDITIONS) 
    AND (@AuthorType = 2 AND --...DIFFERENT CONDITIONS) 
    union all 
    ... 
) AS B1
JOIN TBooks AS B2
    ON B2.MyPK = B1.MyPK
JOIN ... other tables ...

The pseudo table B1 is just the WHERE clause that gets the PKs. It is then joined back to the original table (and any others that are required) to get the "presentation". This avoids duplicating the presentation columns in every UNION ALL branch.

You can take this a step further and insert the PKs into a temporary table first, and then join that to the other tables for the presentation aspect.

We do this for very large tables where the user has lots of choices on what to query on.

DECLARE @MyTempTable TABLE
(
    MyPK int NOT NULL,
    PRIMARY KEY
    (
        MyPK
    )
)

IF @LastName IS NOT NULL
BEGIN
   INSERT INTO @MyTempTable
   (
        MyPK
   )
   SELECT MyPK
   FROM MyNamesTable
   WHERE LastName = @LastName -- Lets say we have an efficient index for this
END
ELSE
IF @Country IS NOT NULL
BEGIN
   INSERT INTO @MyTempTable
   (
        MyPK
   )
   SELECT MyPK
   FROM MyNamesTable
   WHERE Country = @Country -- Got an index on this one too
END

... etc

SELECT ... presentation columns
FROM @MyTempTable AS T
    JOIN MyNamesTable AS N
        ON N.MyPK = T.MyPK -- a PK join, V. efficient
    JOIN ... other tables ...
        ON ....
WHERE     (@LastName IS NULL OR LastName = @LastName)
      AND (@Country IS NULL OR Country = @Country)

Note that all tests are repeated [technically you don't need the @LastName one :) ], including obscure ones which were (let's say) not in the original filters used to create @MyTempTable.

The creation of @MyTempTable is designed to make the best of whatever parameters are available. Perhaps if both @LastName AND @Country are available, filtering on both fills the table far more efficiently than filtering on either one alone, so we create a case for that scenario.
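
A case for that combined scenario might be added ahead of the single-parameter ones, roughly as sketched here. The table definition, data types and sample values are assumptions made so the script runs on its own, and the comment about a composite index on (LastName, Country) is likewise hypothetical; only the names MyNamesTable, MyPK, LastName, Country and @MyTempTable come from the example above.

```sql
-- Hypothetical table definition so the sketch is self-contained.
IF OBJECT_ID(N'dbo.MyNamesTable', N'U') IS NULL
    CREATE TABLE dbo.MyNamesTable
    (
        MyPK     int NOT NULL PRIMARY KEY,
        LastName nvarchar(100) NOT NULL,
        Country  nvarchar(100) NOT NULL
    );

DECLARE @LastName nvarchar(100) = N'Smith',
        @Country  nvarchar(100) = N'France';

DECLARE @MyTempTable TABLE (MyPK int NOT NULL PRIMARY KEY);

IF @LastName IS NOT NULL AND @Country IS NOT NULL
BEGIN
    -- Most specific case first; assumes an index on (LastName, Country).
    INSERT INTO @MyTempTable (MyPK)
    SELECT MyPK
    FROM dbo.MyNamesTable
    WHERE LastName = @LastName
      AND Country = @Country;
END
ELSE IF @LastName IS NOT NULL
BEGIN
    INSERT INTO @MyTempTable (MyPK)
    SELECT MyPK
    FROM dbo.MyNamesTable
    WHERE LastName = @LastName;
END
ELSE IF @Country IS NOT NULL
BEGIN
    INSERT INTO @MyTempTable (MyPK)
    SELECT MyPK
    FROM dbo.MyNamesTable
    WHERE Country = @Country;
END;

-- The collected keys would then be joined back for the presentation columns.
SELECT MyPK FROM @MyTempTable;
```

Ordering the IF chain from most to least specific means the combined case is never shadowed by a single-parameter branch.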

Scaling problems? Review what actual queries are being made and add cases for the ones that can be improved.
