Difference between these two SQL queries

So I'm testing 2 queries and I'm getting different results. I want to correct my understanding. Here are two generic SQL queries that, to my understanding, are equivalent, but they return different results when executed. Note that this is not a question about the difference between ANSI and non-ANSI SQL.

Query 1 (using LEFT JOIN):

SELECT * FROM person p LEFT JOIN person_log pl
ON p.person_id = pl.person_id
WHERE pl.person_id IS NULL
AND p.is_active = 1;

Query 2 (using 2 queries):

SELECT * FROM person
WHERE person.is_active = 1
AND person_id NOT IN (SELECT person_id FROM person_log);

To my understanding, both represent the same set in Venn diagram form. Also, is one more efficient than the other: a query on JOIN results, or 2 queries?

EDIT: Changed = to IS in query 1. Thanks to @Justin Samuel for spotting the = error that was causing the different results!

There is one bug in the original query 1: you cannot use "=" to check whether a value is NULL.

SELECT * FROM person p LEFT JOIN person_log pl
ON p.person_id = pl.person_id
WHERE pl.person_id = NULL
AND p.is_active = 1;

Ideally you should be using IS NULL:

SELECT * FROM person p LEFT JOIN person_log pl
ON p.person_id = pl.person_id
WHERE pl.person_id IS NULL
AND p.is_active = 1;
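
To see the difference on concrete data, here is a minimal sketch that runs both variants against an in-memory SQLite database through Python's sqlite3 module. The sample rows and the use of SQLite are assumptions made for illustration; the table and column names follow the question.

```python
import sqlite3

# Hypothetical sample data: person 1 has a log row, persons 2 and 3 do not,
# and person 3 is inactive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, is_active INTEGER);
    CREATE TABLE person_log (log_id INTEGER PRIMARY KEY, person_id INTEGER);
    INSERT INTO person VALUES (1, 1), (2, 1), (3, 0);
    INSERT INTO person_log (person_id) VALUES (1);
""")

template = """
    SELECT p.person_id
    FROM person p LEFT JOIN person_log pl ON p.person_id = pl.person_id
    WHERE pl.person_id {null_test} AND p.is_active = 1
"""

# "= NULL" never evaluates to true, so no row passes the filter.
rows_equals = conn.execute(template.format(null_test="= NULL")).fetchall()
# "IS NULL" is true for persons without a matching log row.
rows_is = conn.execute(template.format(null_test="IS NULL")).fetchall()

print(rows_equals)  # []
print(rows_is)      # [(2,)]
```

The "=" version returns nothing at all, which would explain why the two queries in the question disagreed before the edit.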

You can read more about NULL checks at https://www.simple-talk.com/sql/t-sql-programming/how-to-get-nulls-horribly-wrong-in-sql-server/

Both queries get you the same data.
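
As a quick check of that claim, the sketch below runs the anti join and the NOT IN query side by side on a small in-memory SQLite database. The sample rows are made up, and person_log.person_id is declared NOT NULL here, which is an assumption the equivalence relies on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, is_active INTEGER);
    CREATE TABLE person_log (log_id INTEGER PRIMARY KEY,
                             person_id INTEGER NOT NULL);
    -- Hypothetical rows: persons 2 and 4 are active and have no log entries.
    INSERT INTO person VALUES (1, 1), (2, 1), (3, 0), (4, 1);
    INSERT INTO person_log (person_id) VALUES (1), (1), (3);
""")

anti_join = conn.execute("""
    SELECT p.person_id
    FROM person p LEFT JOIN person_log pl ON p.person_id = pl.person_id
    WHERE pl.person_id IS NULL AND p.is_active = 1
    ORDER BY p.person_id
""").fetchall()

not_in = conn.execute("""
    SELECT person_id
    FROM person
    WHERE is_active = 1
      AND person_id NOT IN (SELECT person_id FROM person_log)
    ORDER BY person_id
""").fetchall()

print(anti_join)  # [(2,), (4,)]
print(not_in)     # [(2,), (4,)]
```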

The second query is the straightforward way to solve the problem: get all persons that have no entry in person_log. You can do the same with a NOT EXISTS clause instead of a NOT IN clause. (NOT IN is a bit leaner, but the values you select in the subquery must not be null, for otherwise you see no data at all. I usually prefer IN / NOT IN over EXISTS / NOT EXISTS for their simplicity, but that's a matter of personal preference.)
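
The NULL caveat is easy to reproduce. The following sketch assumes a nullable person_log.person_id and one log row where it is NULL (both are illustrative assumptions, as is the use of SQLite via Python's sqlite3 module).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, is_active INTEGER);
    CREATE TABLE person_log (log_id INTEGER PRIMARY KEY, person_id INTEGER);
    INSERT INTO person VALUES (1, 1), (2, 1);
    -- One ordinary log row and one with a NULL person_id.
    INSERT INTO person_log (person_id) VALUES (1), (NULL);
""")

# 2 NOT IN (1, NULL) evaluates to unknown, so person 2 is dropped as well.
not_in = conn.execute("""
    SELECT person_id FROM person
    WHERE is_active = 1
      AND person_id NOT IN (SELECT person_id FROM person_log)
""").fetchall()

# NOT EXISTS only asks whether a matching row exists, so the NULL is harmless.
not_exists = conn.execute("""
    SELECT p.person_id FROM person p
    WHERE p.is_active = 1
      AND NOT EXISTS (SELECT 1 FROM person_log pl
                      WHERE pl.person_id = p.person_id)
""").fetchall()

# Excluding NULLs in the subquery makes NOT IN behave as intended again.
not_in_filtered = conn.execute("""
    SELECT person_id FROM person
    WHERE is_active = 1
      AND person_id NOT IN (SELECT person_id FROM person_log
                            WHERE person_id IS NOT NULL)
""").fetchall()

print(not_in)           # []
print(not_exists)       # [(2,)]
print(not_in_filtered)  # [(2,)]
```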

The first query is called an anti join. It is a trick to achieve the same result as a NOT EXISTS or NOT IN query on weak database systems that don't implement these methods well. (The reason is that when a new database system is written, the programmers usually put all their effort into joins, because they are so important, and neglect EXISTS and IN for some time.)

Which of NOT IN, NOT EXISTS, or the anti join executes fastest depends on the DBMS. The ideal DBMS would arrive at the same execution plan no matter which syntax you choose.
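
One way to find out what your own DBMS does is to look at the plan it produces for each form. The sketch below does this for SQLite with EXPLAIN QUERY PLAN; the index name idx_person_log_person is hypothetical, and the plan text differs between SQLite versions and between database systems, so no particular output is implied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, is_active INTEGER);
    CREATE TABLE person_log (log_id INTEGER PRIMARY KEY,
                             person_id INTEGER NOT NULL);
    -- Hypothetical index on the lookup column.
    CREATE INDEX idx_person_log_person ON person_log (person_id);
""")

queries = {
    "anti join": """
        SELECT p.person_id
        FROM person p LEFT JOIN person_log pl ON p.person_id = pl.person_id
        WHERE pl.person_id IS NULL AND p.is_active = 1
    """,
    "NOT IN": """
        SELECT person_id FROM person
        WHERE is_active = 1
          AND person_id NOT IN (SELECT person_id FROM person_log)
    """,
    "NOT EXISTS": """
        SELECT p.person_id FROM person p
        WHERE p.is_active = 1
          AND NOT EXISTS (SELECT 1 FROM person_log pl
                          WHERE pl.person_id = p.person_id)
    """,
}

plans = {}
for name, sql in queries.items():
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row holds the human-readable step.
    plans[name] = [row[-1] for row in rows]
    print(name)
    for step in plans[name]:
        print("   ", step)
```

Other systems have their own equivalents (for example an EXPLAIN statement), and comparing the plans on realistic data is more reliable than a general rule.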

The anti join can produce large intermediate results. With a mature DBMS you shouldn't use anti joins, both for this reason and for the sake of readability.
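
To get a feel for the size of that intermediate result, this sketch counts the rows of the LEFT JOIN before the IS NULL filter is applied. The data is made up (one person with 1000 log rows), and the count describes the logical join only; whether a given DBMS actually materializes it depends on its optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, is_active INTEGER);
    CREATE TABLE person_log (log_id INTEGER PRIMARY KEY,
                             person_id INTEGER NOT NULL);
    INSERT INTO person VALUES (1, 1), (2, 1), (3, 1);
""")
# Hypothetical workload: person 1 has 1000 log entries, the others none.
conn.executemany("INSERT INTO person_log (person_id) VALUES (?)",
                 [(1,)] * 1000)

# Rows of the join before the anti-join filter: one per log row of person 1,
# plus one NULL-extended row each for persons 2 and 3.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM person p LEFT JOIN person_log pl ON p.person_id = pl.person_id
""").fetchone()[0]

# Rows that survive the filter.
result_rows = conn.execute("""
    SELECT COUNT(*)
    FROM person p LEFT JOIN person_log pl ON p.person_id = pl.person_id
    WHERE pl.person_id IS NULL AND p.is_active = 1
""").fetchone()[0]

print(joined_rows)  # 1002
print(result_rows)  # 2
```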

If you want to find rows that have no match in the second table, use NOT IN, especially if the column is not nullable.

The first one gets all the persons that have no log entries and then filters out the inactive ones.

The second query gets all the persons, then filters out the inactive ones, then gets all the log entries, and finally keeps only the persons that have no log entries.

They will both return the same info. However, for performance reasons, queries like the second one should be avoided if using a JOIN is possible. The main benefit of JOINs is their use of indexes: only one index will be used in the WHERE clause, but each JOIN will use one.
