
SQL Query Performance Join with condition

Calling all SQL experts. I have the following select statement:

SELECT 1
  FROM table1 t1
  JOIN table2 t2 ON t1.id = t2.id
 WHERE t1.field = xyz

I'm a little bit worried about the performance here. Is the WHERE clause evaluated before or after the join? If it's evaluated after, is there a way to evaluate the WHERE clause first?

The whole table could easily contain more than a million entries, but after the WHERE clause only 1-10 entries may be left, so in my opinion it makes a big performance difference when the WHERE clause is evaluated.

Thanks in advance.

Dimi

You could rewrite your query like this:

SELECT 1
  FROM (SELECT * FROM table1 WHERE field = xyz) t1
  JOIN table2 t2 ON t1.id = t2.id

But depending on the database product, the optimiser might still decide that the best way to do this is to JOIN table1 to table2 and then apply the constraint.
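One way to find out what a particular database does with the derived-table rewrite is to ask it for the plan of both forms. The sketch below uses Python's built-in sqlite3 module purely as a convenient stand-in; the sample rows and the `plan` helper are made up for illustration, and other products expose their plans through their own EXPLAIN syntax and may choose differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, field TEXT);
    CREATE TABLE table2 (id INTEGER);
    INSERT INTO table1 VALUES (1, 'xyz'), (2, 'abc'), (3, 'xyz');
    INSERT INTO table2 VALUES (1), (2), (4);
""")

# The query as asked in the question.
PLAIN = """
    SELECT 1 FROM table1 t1
    JOIN table2 t2 ON t1.id = t2.id
    WHERE t1.field = 'xyz'
"""
# The rewrite that filters table1 inside a derived table.
DERIVED = """
    SELECT 1 FROM (SELECT * FROM table1 WHERE field = 'xyz') t1
    JOIN table2 t2 ON t1.id = t2.id
"""

def plan(sql):
    # Hypothetical helper: the last column of each
    # EXPLAIN QUERY PLAN row is the human-readable step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plain_rows = conn.execute(PLAIN).fetchall()
derived_rows = conn.execute(DERIVED).fetchall()

# Print both plans side by side to see whether the rewrite changed anything.
for name, sql in (("plain", PLAIN), ("derived", DERIVED)):
    print(name, plan(sql))
```

Both forms return the same rows; whether the printed plans differ is up to the optimizer, which is exactly the caveat above.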

For this query:

SELECT 1
FROM table1 t1 JOIN
     table2 t2
     ON t1.id = t2.id
WHERE t1.field = xyz;

The optimal indexes are table1(field, id) and table2(id).
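As a rough illustration of those two indexes, the following sketch creates them in an in-memory SQLite database and prints the plan for the query. SQLite, the index names, and the generated sample data are all assumptions chosen only to keep the example runnable; the CREATE INDEX statements themselves should carry over to most products with little change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, field TEXT);
    CREATE TABLE table2 (id INTEGER);
    -- Filter column first, join column second, as suggested above.
    CREATE INDEX idx_table1_field_id ON table1 (field, id);
    CREATE INDEX idx_table2_id ON table2 (id);
""")

# Made-up data: 5000 rows per table, only every 1000th row matches 'xyz'.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(i, "xyz" if i % 1000 == 0 else "other") for i in range(1, 5001)],
)
conn.executemany("INSERT INTO table2 VALUES (?)", [(i,) for i in range(1, 5001)])

query = """
    SELECT 1
    FROM table1 t1 JOIN
         table2 t2
         ON t1.id = t2.id
    WHERE t1.field = 'xyz'
"""

# The last column of each EXPLAIN QUERY PLAN row describes one plan step.
plan_steps = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
match_count = len(conn.execute(query).fetchall())

print(plan_steps)
print(match_count)
```

With the filter column leading the table1 index, the database can locate the few matching rows directly and then probe table2 by id, instead of reading either table in full.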

How the query is executed depends on the optimizer. It is tasked with choosing the best execution plan, given the table statistics and environment.

Each DBMS has its own query optimizer. Logically, in a case like yours, the WHERE filter will be applied first and the JOIN part of the query afterwards.

As mentioned in the comments and other answers, with performance the answer is always "it depends". Depending on your DBMS and the indexing of the base tables, the query may be fine as is, and the optimizer may evaluate the WHERE first. Or the join may be efficient anyway if the indexes cover the join requirements.

Alternatively, you can force the behavior you require by reducing the dataset of t1 before you do the join, either with a nested select as Richard suggested or by adding t1.field = xyz to the join condition, for example:

ON t1.field = xyz AND t1.id = t2.id

Personally, if I needed to reduce the dataset before the join, I would use a CTE:

With T1 AS 
(
   SELECT * FROM table1
   WHERE field = 'xyz'
)
SELECT 1 
FROM T1
JOIN Table2 T2
ON T1.Id = T2.Id
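To sanity-check that these rewrites only change how the query is expressed, not what it returns, the three forms can be run against the same data. This is a minimal sketch using Python's built-in sqlite3 module with made-up rows; for an inner join, a predicate in ON and the same predicate in WHERE are logically equivalent, so any speed difference comes from the plan a given optimizer picks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, field TEXT);
    CREATE TABLE table2 (id INTEGER);
    INSERT INTO table1 VALUES (1, 'xyz'), (2, 'abc'), (3, 'xyz'), (4, 'xyz');
    INSERT INTO table2 VALUES (1), (2), (4), (4);
""")

variants = {
    # Filter in the WHERE clause, as in the question.
    "where": """
        SELECT 1 FROM table1 t1
        JOIN table2 t2 ON t1.id = t2.id
        WHERE t1.field = 'xyz'
    """,
    # Filter moved into the join condition.
    "on": """
        SELECT 1 FROM table1 t1
        JOIN table2 t2 ON t1.field = 'xyz' AND t1.id = t2.id
    """,
    # Filter applied inside a CTE before the join.
    "cte": """
        WITH t1 AS (SELECT * FROM table1 WHERE field = 'xyz')
        SELECT 1 FROM t1
        JOIN table2 t2 ON t1.id = t2.id
    """,
}

results = {name: conn.execute(sql).fetchall() for name, sql in variants.items()}

for name, rows in results.items():
    print(name, len(rows))
```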
