Simple change causes SQL query execution time to dramatically increase
I run the following SQL query on my Microsoft SQL Server (2012 Express) database, and it works fine, executing in less than a second:
SELECT
    StringValue, COUNT(StringValue)
FROM Attributes
WHERE
    Name = 'Windows OS Version'
    AND StringValue IS NOT NULL
    AND ProductAssociation IN (
        SELECT ID
        FROM ProductAssociations
        WHERE ProductCode = 'MyProductCode'
    )
GROUP BY StringValue
I add a filter in the inner query and it continues to work fine, returning slightly fewer results (as expected) and also executing in less than a second.
SELECT
    StringValue, COUNT(StringValue)
FROM Attributes
WHERE
    Name = 'Windows OS Version'
    AND StringValue IS NOT NULL
    AND ProductAssociation IN (
        SELECT ID
        FROM ProductAssociations
        WHERE ProductCode = 'MyProductCode'
        AND ID IN (
            SELECT A2.ProductAssociation
            FROM Attributes A2
            WHERE A2.Name = 'Is test' AND A2.BooleanValue = 0
        )
    )
GROUP BY StringValue
But when I add a flag variable to enable me to "turn on/off" the filter in the inner query, and set the flag to zero, the query seems to execute indefinitely (I left it running for about 5 minutes and then force cancelled):
DECLARE @IsTestsIncluded bit
SET @IsTestsIncluded = 0
SELECT
    StringValue, COUNT(StringValue)
FROM Attributes
WHERE
    Name = 'Windows OS Version'
    AND StringValue IS NOT NULL
    AND ProductAssociation IN (
        SELECT ID
        FROM ProductAssociations
        WHERE ProductCode = 'MyProductCode'
        AND (
            @IsTestsIncluded = 1
            OR
            ID IN (
                SELECT A2.ProductAssociation
                FROM Attributes A2
                WHERE A2.Name = 'Is test' AND A2.BooleanValue = 0
            )
        )
    )
GROUP BY StringValue
Why? What am I doing wrong? I swear I've used this pattern in the past without a problem.
(When I set @IsTestsIncluded = 1 in the final query above, the filter is skipped and the execution time is normal. The delay only happens when @IsTestsIncluded = 0.)
EDIT
As per Joel's request in the comments, here is the execution plan for the first query:
And here is the execution plan for the second query:
(I can't post an execution plan for the 3rd query as it never completes, unless there is another way to get it in SSMS?)
Why? What am I doing wrong?
You are trying to compile a query that needs to satisfy multiple distinct conditions, based on the variable. The optimizer must come up with one plan that works in both cases.
Try to avoid this like the plague. Just issue two queries, one for each condition, so that the optimizer is free to optimize each query separately and compile an execution plan that is optimal for each case.
A lengthy discussion of the topic, with alternatives and pros and cons: Dynamic Search Conditions in T-SQL
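As a rough sketch of the "two queries" suggestion, assuming the tables and the @IsTestsIncluded flag from the question, the choice can be made with IF/ELSE so that each SELECT is a separate statement with its own plan. This is untested against the asker's data and only illustrates the shape:

```sql
DECLARE @IsTestsIncluded bit = 0;

IF @IsTestsIncluded = 1
BEGIN
    -- Tests included: no 'Is test' filter at all (first query from the question)
    SELECT StringValue, COUNT(StringValue)
    FROM Attributes
    WHERE Name = 'Windows OS Version'
      AND StringValue IS NOT NULL
      AND ProductAssociation IN (
          SELECT ID
          FROM ProductAssociations
          WHERE ProductCode = 'MyProductCode'
      )
    GROUP BY StringValue;
END
ELSE
BEGIN
    -- Tests excluded: filter always applied (second query from the question)
    SELECT StringValue, COUNT(StringValue)
    FROM Attributes
    WHERE Name = 'Windows OS Version'
      AND StringValue IS NOT NULL
      AND ProductAssociation IN (
          SELECT ID
          FROM ProductAssociations
          WHERE ProductCode = 'MyProductCode'
            AND ID IN (
                SELECT A2.ProductAssociation
                FROM Attributes A2
                WHERE A2.Name = 'Is test' AND A2.BooleanValue = 0
            )
      )
    GROUP BY StringValue;
END
```

The article linked above also covers keeping the single query and appending OPTION (RECOMPILE) after the GROUP BY, which makes SQL Server compile a fresh plan on each execution with the variable's current value known. Whether that does better than two separate statements in this case would need to be measured.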
Try this:
SELECT
    a.StringValue, COUNT(a.StringValue)
FROM Attributes a
INNER JOIN ProductAssociations p ON a.ProductAssociation = p.ID
    AND p.ProductCode = 'MyProductCode'
LEFT JOIN Attributes a2 ON a2.ProductAssociation = p.ID
    AND a2.Name = 'Is test' AND a2.BooleanValue = 0
WHERE
    -- Qualify with the alias: Attributes is joined twice, so bare column names are ambiguous
    a.Name = 'Windows OS Version'
    AND a.StringValue IS NOT NULL
    -- NULLIF(..., 0) yields NULL when the flag is 0, so the filter applies only in that case
    AND COALESCE(a2.ProductAssociation, NULLIF(@IsTestsIncluded, 0)) IS NOT NULL
GROUP BY a.StringValue
The COALESCE/NULLIF combination is not the easiest-to-follow thing I've ever written, but it should be functionally equivalent to what you have as long as the join conditions match 0 or 1 record on the joined table.
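To make the COALESCE/NULLIF logic easier to check, here is a small stand-alone sketch that enumerates the four cases without touching any real tables. The names Flag and MatchedId are hypothetical stand-ins for @IsTestsIncluded and a2.ProductAssociation, and NULLIF compares against 0 so that a flag of 1 keeps every row, matching the OR predicate in the question:

```sql
SELECT
    f.Flag,
    m.MatchedId,
    -- The predicate from the question: flag set, or a matching 'Is test' = 0 row exists
    CASE WHEN f.Flag = 1 OR m.MatchedId IS NOT NULL
         THEN 'kept' ELSE 'filtered out' END AS QuestionPredicate,
    -- The COALESCE/NULLIF form: NULLIF turns a 0 flag into NULL
    CASE WHEN COALESCE(m.MatchedId, NULLIF(f.Flag, 0)) IS NOT NULL
         THEN 'kept' ELSE 'filtered out' END AS CoalesceForm
FROM (VALUES (CAST(0 AS bit)), (CAST(1 AS bit))) AS f(Flag)
CROSS JOIN (VALUES (42), (NULL)) AS m(MatchedId);
```

The two result columns should agree on every row. The only row filtered out is the one where the flag is 0 and the LEFT JOIN found no match.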
Good answer from Joel, +1.
OR is hard to optimize.
Going back to the second query:
WHERE ... IN is hard for the optimizer to optimize.
Consider using JOINs instead of all those WHERE ... IN clauses.
This still has an OR that may cause a bad query plan, but it gives the optimizer a better chance at minimizing the OR:
SELECT A1.StringValue, COUNT(A1.StringValue)
FROM Attributes A1
JOIN ProductAssociations PA
    ON PA.ID = A1.ProductAssociation
    AND A1.Name = 'Windows OS Version'
    AND A1.StringValue IS NOT NULL
    AND PA.ProductCode = 'MyProductCode'
-- Caveat: when @IsTestsIncluded = 1 this join matches every Attributes row
-- for the association, which inflates the counts
JOIN Attributes A2
    ON A2.ProductAssociation = A1.ProductAssociation
    AND (   @IsTestsIncluded = 1
         OR (A2.Name = 'Is test' AND A2.BooleanValue = 0)
        )
GROUP BY A1.StringValue
If you refactor @IsTestsIncluded so that NULL means tests are excluded and any non-NULL value means they are included, you can maybe do this:
SELECT A1.StringValue, COUNT(A1.StringValue)
FROM Attributes A1
JOIN ProductAssociations PA
    ON PA.ID = A1.ProductAssociation
    AND A1.Name = 'Windows OS Version'
    AND A1.StringValue IS NOT NULL
    AND PA.ProductCode = 'MyProductCode'
LEFT JOIN Attributes A2
    ON A2.ProductAssociation = A1.ProductAssociation
    AND A2.Name = 'Is test'
    AND A2.BooleanValue = 0
WHERE ISNULL(@IsTestsIncluded, A2.ProductAssociation) IS NOT NULL
GROUP BY A1.StringValue