Why is 'IN' so much faster than '=' in a SQL SELECT?
This SELECT takes approximately 25 to 30 seconds to execute.
SELECT *
FROM custinfo cs
WHERE cs.idcust = (SELECT cust_id
FROM customers
WHERE id = 1230)
Execution plan for '=':
But if I change '=' to 'IN', it becomes much faster: about 0.040 to 0.060 seconds on average.
SELECT *
FROM custinfo cs
WHERE cs.idcust in (SELECT cust_id
FROM customers
WHERE id = 1230)
Execution plan for 'IN':
There have also been opposite cases, where '=' was faster than 'IN'.
Does anybody know why such a simple syntax change makes this much difference in performance and execution time?
When is '=' faster than 'IN', or vice versa? Are there conditions that determine which one to use in which case?
"IN" means "there might be more than one row returned by this subquery, please check them all", whereas "=" means "this subquery will return only one row"; if it returns more than one, the query fails with an error.
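A small sketch of that difference. The sample rows and the note column are hypothetical, made up purely for illustration; the question does not show any data. Here the subquery matches two rows, so the two forms behave differently:

```sql
-- Hypothetical sample data, for illustration only.
CREATE TABLE customers (id INTEGER, cust_id INTEGER);
CREATE TABLE custinfo  (idcust INTEGER, note VARCHAR(20));

INSERT INTO customers VALUES (1230, 10);
INSERT INTO customers VALUES (1230, 11);  -- second row for the same id
INSERT INTO custinfo  VALUES (10, 'first');
INSERT INTO custinfo  VALUES (11, 'second');

-- IN accepts any number of rows from the subquery:
-- both custinfo rows are returned.
SELECT *
FROM custinfo cs
WHERE cs.idcust IN (SELECT cust_id
                    FROM customers
                    WHERE id = 1230);

-- '=' needs a single-row subquery. With the data above, Oracle
-- rejects this statement with ORA-01427
-- (single-row subquery returns more than one row).
SELECT *
FROM custinfo cs
WHERE cs.idcust = (SELECT cust_id
                   FROM customers
                   WHERE id = 1230);
```

Not every engine is this strict, so it is worth checking how your database treats a multi-row scalar subquery before relying on the error.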
Knowing this, the optimizer builds different query plans. For the "=" query it executes the subquery first and then filters the custinfo table.
For the "IN" query the optimizer performs a join operation, as if you had written the following query:
SELECT *
FROM custinfo cs
JOIN customers c
ON cs.idcust = c.cust_id
WHERE c.id = 1230;
This is why the execution time differs. Either form can turn out to be the slower one, depending on your data selectivity, indexes, partitioning and so on.
UPD. From the execution plans you've uploaded I see the following:
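One caveat on the join rewrite: a plain JOIN returns the same rows as the IN query only if the subquery yields each cust_id at most once. If customers could contain duplicate cust_id values for id = 1230, the join would repeat custinfo rows while IN would not. A sketch of two forms that keep the IN semantics regardless of duplicates (this illustrates what the transformation amounts to; it is not a claim about what the optimizer literally executes):

```sql
-- Join against the de-duplicated subquery result.
SELECT cs.*
FROM custinfo cs
JOIN (SELECT DISTINCT cust_id
      FROM customers
      WHERE id = 1230) c
  ON cs.idcust = c.cust_id;

-- Equivalent semi-join written with EXISTS:
-- each custinfo row is returned at most once.
SELECT cs.*
FROM custinfo cs
WHERE EXISTS (SELECT 1
              FROM customers c
              WHERE c.id = 1230
                AND c.cust_id = cs.idcust);
```

Which of these performs best still depends on the data and indexes, so compare their plans rather than assuming one always wins.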
For the "=" query:
1.1. It completely scans the MT_OPERATION_OUT table (FULL TABLE SCAN) and captures the result.
1.2. Then it accesses another table on a remote DB, presumably scanning it too (REMOTE).
1.3. It filters the data it got from the remote DB.
For the "IN" query:
2.1. It completely scans the MT_OPERATION_OUT table (FULL TABLE SCAN) and captures the result.
2.2. It sorts what it got in the previous step (SORT UNIQUE).
2.3. Then it accesses another table on a remote DB, presumably scanning it too (REMOTE).
2.4. It performs a join (NESTED LOOPS).
So it seems to me that, for some reason, the DB needs more time to filter the data coming from the remote DB than to join it using the "nested loops" method.
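To compare the two variants yourself, you can ask the database for the plan it intends to use. A minimal sketch, assuming the database is Oracle (the operation names in the plans suggest it, but the question does not say so):

```sql
-- Assumes Oracle. EXPLAIN PLAN stores the plan without running the query.
EXPLAIN PLAN FOR
SELECT *
FROM custinfo cs
WHERE cs.idcust IN (SELECT cust_id
                    FROM customers
                    WHERE id = 1230);

-- Show the plan that was just stored.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Running the same two statements with "=" in place of "IN" gives the second plan to compare against; a filter step versus a join step is the difference to look for.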