
Why is ' in ' so much faster than ' = ' in SQL Select?

This SELECT takes approximately 25 to 30 seconds to execute.

SELECT *
  FROM custinfo cs
 WHERE cs.idcust = (SELECT cust_id
                        FROM customers
                       WHERE id = 1230)

Execution plan for ' = ':

[Image: execution plan for ' = ']

But if I change ' = ' to ' in ', it becomes much faster: about 0.040 to 0.060 seconds on average.

SELECT *
  FROM custinfo cs
 WHERE cs.idcust in (SELECT cust_id
                        FROM customers
                       WHERE id = 1230)

Execution plan for ' in ':

[Image: execution plan for ' in ']

There have also been opposite cases, where ' = ' was faster than ' in '.

Does anybody know why such a simple syntax change makes this much difference in performance and execution time?

When is ' = ' faster than ' in ', or vice versa? Are there conditions that determine which one to use in which case?

  • I'm using a dblink for my table. Maybe that's what's affecting my query?
  • Well, guys, here's the thing: now both of my queries run in about 0.10 sec, so I can't get an execution plan for the queries as they were when they ran slowly. I have absolutely no idea why the performance of my queries changed within a day. I can only guess that the problem was with our servers, but then why did it affect only one of my queries while the other ran normally? Still, here are my execution plans:

' = '

[Image: ' = ' execution plan]

' in '

[Image: ' in ' execution plan]

"IN" means "this subquery might return more than one row, please check them all", whereas "=" means "this subquery will return only one row"; if it returns more, the query fails with an error.
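To make the difference concrete, here is a minimal sketch. It assumes an Oracle database (the question mentions a dblink) and that `customers` holds rows for both ids; the id `1231` and that data are hypothetical, used only to force a multi-row subquery.

```sql
-- Assumption: customers contains rows for id 1230 and id 1231 (hypothetical data).

-- With "=", the subquery must return at most one row.
-- If it returns two, Oracle raises
-- ORA-01427: single-row subquery returns more than one row.
SELECT *
  FROM custinfo cs
 WHERE cs.idcust = (SELECT cust_id
                      FROM customers
                     WHERE id IN (1230, 1231));

-- With "IN", any number of rows is acceptable, so the same subquery works.
SELECT *
  FROM custinfo cs
 WHERE cs.idcust IN (SELECT cust_id
                       FROM customers
                      WHERE id IN (1230, 1231));
```

If the subquery returns no rows at all, neither form raises an error; both simply return no rows from `custinfo`.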

Given that information, the optimizer builds different query plans. For the "=" query, it executes the subquery first and then filters the custinfo table by the result.

For the "IN" query, the optimizer performs a join operation, as if you had written the following query:

SELECT *
  FROM custinfo cs
  JOIN customers c
    ON cs.idcust = c.cust_id
 WHERE c.id = 1230;

This is why the execution time differs. Either form can be the slower one, depending on your data selectivity, indexes, partitioning and so on.
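Since the outcome depends on the plan the optimizer picks, one way to compare the two forms is to ask Oracle for both plans. This is a sketch that assumes Oracle and the default `PLAN_TABLE`; the statement ids `eq_query` and `in_query` are arbitrary labels chosen for this example.

```sql
-- Store the plan for the "=" form under a label of our choosing.
EXPLAIN PLAN SET STATEMENT_ID = 'eq_query' FOR
SELECT *
  FROM custinfo cs
 WHERE cs.idcust = (SELECT cust_id
                      FROM customers
                     WHERE id = 1230);

-- Store the plan for the "IN" form under a second label.
EXPLAIN PLAN SET STATEMENT_ID = 'in_query' FOR
SELECT *
  FROM custinfo cs
 WHERE cs.idcust IN (SELECT cust_id
                       FROM customers
                      WHERE id = 1230);

-- Print each stored plan in readable form.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'eq_query', 'TYPICAL'));
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'in_query', 'TYPICAL'));
```

Note that `EXPLAIN PLAN` shows the estimated plan without running the query, so it will not tell you where the time actually went.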

UPD. From the execution plans you've uploaded, I see the following:

  1. For the "=" query:
     1.1. It completely scans the MT_OPERATION_OUT table (FULL TABLE SCAN) and captures the result.
     1.2. Then it accesses another table on the remote DB, presumably scanning it too (REMOTE).
     1.3. It filters the data it got from the remote DB.
  2. For the "IN" query:
     2.1. It completely scans the MT_OPERATION_OUT table (FULL TABLE SCAN) and captures the result.
     2.2. It sorts what it got in the previous step (SORT UNIQUE).
     2.3. Then it accesses another table on the remote DB, presumably scanning it too (REMOTE).
     2.4. It performs a join (NESTED LOOPS).

So to me it seems that, for some reason, the DB needs more time to filter the data from the remote DB than to join it using the "nested loops" method.
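To check that guess against actual numbers rather than estimates, one option is to run each query with runtime statistics enabled and then display the plan of the last executed statement. This is a sketch assuming Oracle and a session with read access to the relevant `V$` views; how much detail is reported for the REMOTE step may be limited, since that part runs on the other database.

```sql
-- Run the query with row-source statistics collected for this execution.
SELECT /*+ GATHER_PLAN_STATISTICS */ *
  FROM custinfo cs
 WHERE cs.idcust IN (SELECT cust_id
                       FROM customers
                      WHERE id = 1230);

-- In the same session, immediately afterwards: show the plan of the last
-- statement together with actual row counts and elapsed time per step.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Repeating this for the "=" form and comparing the per-step timings should show whether the filter on the remote data or the nested loops join dominates.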
