SQL Server join vs subquery performance question

I discovered that in some cases a query like

select 
   usertable.userid,
   (select top 1 name from nametable where userid = usertable.userid) as name 
from usertable 
where active = 1

takes an order of magnitude longer to complete in SQL Server 2008 R2 than the equivalent join query

select 
   usertable.userid,
   nametable.name 
from usertable 
left join nametable on nametable.userid = usertable.userid 
where usertable.active = 1

where both tables are indexed and have over 100k rows. Interestingly, inserting a top clause into the original query makes it perform on par with the join query:

select 
    top (select count(*) from usertable where active = 1) usertable.userid,
    (select top 1 name from nametable where userid = usertable.userid) as name 
from usertable 
where active = 1

Does anyone have any idea why the original query performs so poorly?

Well, the queries are different: unless the userid column in nametable is a primary key or has a uniqueness constraint, the second query could return more rows than the first.
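To see the row-count difference concretely, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for SQL Server, so LIMIT 1 replaces TOP 1, and the sample rows are made up for illustration; the table and column names are the ones from the question.

```python
import sqlite3

# Hypothetical sample data. SQLite stands in for SQL Server here,
# so LIMIT 1 is used where the question's query uses TOP 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE usertable (userid INTEGER PRIMARY KEY, active INTEGER);
    CREATE TABLE nametable (userid INTEGER, name TEXT);  -- userid is NOT unique
    INSERT INTO usertable VALUES (1, 1), (2, 1), (3, 0);
    INSERT INTO nametable VALUES (1, 'alice'), (1, 'alicia'), (2, 'bob');
""")

# Scalar subquery: at most one name per active user.
subquery_rows = conn.execute("""
    SELECT usertable.userid,
           (SELECT name FROM nametable
             WHERE nametable.userid = usertable.userid
             LIMIT 1) AS name
    FROM usertable
    WHERE usertable.active = 1
""").fetchall()

# LEFT JOIN: one row per matching name, so user 1 appears twice.
join_rows = conn.execute("""
    SELECT usertable.userid, nametable.name
    FROM usertable
    LEFT JOIN nametable ON nametable.userid = usertable.userid
    WHERE usertable.active = 1
""").fetchall()

print(len(subquery_rows))  # 2
print(len(join_rows))      # 3
```

With a uniqueness constraint on nametable.userid both queries would return the same two rows, which is the assumption the rest of this answer relies on.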

That said, assuming userid is a primary key or unique, try removing the TOP 1 part of the first subquery:

select 
   usertable.userid,
   (select name from nametable where userid = usertable.userid) as name 
from usertable 
where active = 1

It's a correlated subquery, which means it is logically evaluated once per row returned by the outer query, since it references a column of the outer query.

A JOIN runs once for the entire result set and gets merged. Your subquery runs the outer query, then for each returned row it runs the subquery again.

The original query will execute the subselect as many times as there are rows, hence the poor performance.

When you JOIN, you get the whole result set at once.
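The per-row versus whole-set distinction can be sketched with a toy model in Python. This is only a conceptual illustration with made-up in-memory data: it does not show what plan SQL Server actually chose for the original query, and with a usable index on nametable.userid the per-row lookup would be a seek rather than a scan. The function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the two tables.
usertable = [(uid, 1) for uid in range(1, 201)]             # (userid, active)
nametable = [(uid, f"name{uid}") for uid in range(1, 201)]  # (userid, name)

def per_row_lookup(users, names):
    """Rescan names for every outer row, like an unindexed correlated subquery."""
    rows_examined = 0
    result = []
    for userid, active in users:
        if active != 1:
            continue
        found = None
        for name_userid, name in names:
            rows_examined += 1
            if name_userid == userid:
                found = name
                break  # TOP 1: stop at the first match
        result.append((userid, found))
    return result, rows_examined

def hash_join(users, names):
    """Read names once into a hash table, then probe it for each outer row."""
    rows_examined = 0
    by_userid = defaultdict(list)
    for name_userid, name in names:
        rows_examined += 1
        by_userid[name_userid].append(name)
    result = []
    for userid, active in users:
        if active != 1:
            continue
        # LEFT JOIN semantics: keep users that have no matching name.
        for name in by_userid.get(userid) or [None]:
            result.append((userid, name))
    return result, rows_examined

sub_result, sub_examined = per_row_lookup(usertable, nametable)
join_result, join_examined = hash_join(usertable, nametable)
print(sub_result == join_result)
print(sub_examined, join_examined)
```

On this data the two strategies return identical rows, but the per-row version examines 20100 nametable rows against 200 for the single pass, which is the kind of gap the answers above are describing.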
