
SELECT DISTINCT Inside WHERE IN clause performance

I have a performance question about the following code:

SELECT * FROM GCL_Loans WHERE Loan_ID IN
(
    SELECT Loan_ID FROM GCL_Loan_Items
)

GCL_Loans has a list of loans with basic information. GCL_Loan_Items has information about a specific item in a loan. There can be duplicate Loan_ID values in GCL_Loan_Items.

Can anyone explain why this query would be faster or slower than the one above?

SELECT * FROM GCL_Loans WHERE Loan_ID IN
(
    SELECT DISTINCT Loan_ID FROM GCL_Loan_Items
)

The DISTINCT version is probably faster, because the IN clause will have a smaller data set to search to determine whether any given GCL_Loans.Loan_ID is in the set. Without the DISTINCT, the data set will be larger.

There's a reasonably good argument to be made that the query optimizer will automatically recognize that the IN test is a set-wise test, not a list-wise one, and do the DISTINCT during auto-indexing... but I've seen that fail before.
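To make the set-wise point concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine. The sample rows and the Borrower and Item_ID columns are invented for illustration; only the table names and Loan_ID come from the question. It only shows that IN ignores duplicates in the subquery result, so both forms return the same rows. It says nothing about which one is faster on your database.

```python
import sqlite3

# Hypothetical sample data; only the table names and Loan_ID come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GCL_Loans (Loan_ID INTEGER PRIMARY KEY, Borrower TEXT);
    CREATE TABLE GCL_Loan_Items (Item_ID INTEGER PRIMARY KEY, Loan_ID INTEGER);
    INSERT INTO GCL_Loans VALUES (1, 'A'), (2, 'B'), (3, 'C');
    -- Loan 1 appears three times, loan 3 has no items.
    INSERT INTO GCL_Loan_Items (Loan_ID) VALUES (1), (1), (1), (2);
""")

# Subquery without DISTINCT: the subquery result contains duplicates.
plain = conn.execute(
    "SELECT * FROM GCL_Loans WHERE Loan_ID IN "
    "(SELECT Loan_ID FROM GCL_Loan_Items) ORDER BY Loan_ID"
).fetchall()

# Subquery with DISTINCT: duplicates removed before the IN test.
distinct = conn.execute(
    "SELECT * FROM GCL_Loans WHERE Loan_ID IN "
    "(SELECT DISTINCT Loan_ID FROM GCL_Loan_Items) ORDER BY Loan_ID"
).fetchall()

print(plain)     # [(1, 'A'), (2, 'B')]
print(distinct)  # [(1, 'A'), (2, 'B')]
```

Since the results are identical, any difference between the two queries is purely a matter of how the engine chooses to execute them.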

Note that subselects can be a problem here too, because some databases (MySQL, for example) will execute the subselect for each row in the primary select.
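One workaround people often try when a subselect is being re-executed per row is to join against a derived table instead. The sketch below (again sqlite3 with made-up sample rows, and hypothetical Borrower and Item_ID columns) is not from the question. Whether the join is actually faster depends on the engine and version, so treat it as something to measure, not a guaranteed fix. Note that in the join form DISTINCT is no longer optional: without it, each loan is returned once per matching item.

```python
import sqlite3

# Hypothetical sample data; only the table names and Loan_ID come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GCL_Loans (Loan_ID INTEGER PRIMARY KEY, Borrower TEXT);
    CREATE TABLE GCL_Loan_Items (Item_ID INTEGER PRIMARY KEY, Loan_ID INTEGER);
    INSERT INTO GCL_Loans VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO GCL_Loan_Items (Loan_ID) VALUES (1), (1), (1), (2);
""")

# Original form from the question.
in_rows = conn.execute(
    "SELECT * FROM GCL_Loans WHERE Loan_ID IN "
    "(SELECT Loan_ID FROM GCL_Loan_Items) ORDER BY Loan_ID"
).fetchall()

# Join against a de-duplicated derived table: same rows as the IN form.
join_distinct_rows = conn.execute(
    "SELECT l.* FROM GCL_Loans AS l "
    "JOIN (SELECT DISTINCT Loan_ID FROM GCL_Loan_Items) AS i "
    "ON i.Loan_ID = l.Loan_ID ORDER BY l.Loan_ID"
).fetchall()

# Join without DISTINCT: loan 1 comes back once per item, so results differ.
join_plain_rows = conn.execute(
    "SELECT l.* FROM GCL_Loans AS l "
    "JOIN GCL_Loan_Items AS i ON i.Loan_ID = l.Loan_ID ORDER BY l.Loan_ID"
).fetchall()

print(in_rows == join_distinct_rows)  # True
print(len(join_plain_rows))           # 4
```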

The plans and performance are the same for both.
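If you want to check this on your own system, compare the execution plans instead of guessing. A rough sketch of that, using SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module (the `plan` helper and the Borrower and Item_ID columns are hypothetical; other engines have their own equivalents, such as an EXPLAIN statement or a graphical plan viewer, and their plans will look different):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GCL_Loans (Loan_ID INTEGER PRIMARY KEY, Borrower TEXT);
    CREATE TABLE GCL_Loan_Items (Item_ID INTEGER PRIMARY KEY, Loan_ID INTEGER);
""")

def plan(sql):
    """Hypothetical helper: return the plan's detail text, one string per step."""
    # The human-readable description is the last column of each plan row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plain_plan = plan(
    "SELECT * FROM GCL_Loans WHERE Loan_ID IN "
    "(SELECT Loan_ID FROM GCL_Loan_Items)"
)
distinct_plan = plan(
    "SELECT * FROM GCL_Loans WHERE Loan_ID IN "
    "(SELECT DISTINCT Loan_ID FROM GCL_Loan_Items)"
)

# Print both plans side by side; the exact text varies by SQLite version.
for label, steps in (("plain", plain_plan), ("distinct", distinct_plan)):
    print(label)
    for step in steps:
        print("   ", step)
```

If the two plans come out identical on your engine with realistic data and indexes, the DISTINCT makes no difference there. If they differ, time both queries before drawing conclusions.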

Because by selecting DISTINCT there are fewer criteria in the subquery (IN). My understanding is that SQL will run the subquery first to generate the list of items to be included in the IN.
