SQL JOIN - WHERE clause vs. ON clause
After reading it, this is not a duplicate of Explicit vs Implicit SQL Joins. The answer may be related (or even the same) but the question is different.
What is the difference, and what should go in each?
If I understand the theory correctly, the query optimizer should be able to use both interchangeably.
They are not the same thing.
Consider these queries:
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345
and
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
AND Orders.ID = 12345
The first will return an order and its lines, if any, for order number 12345. The second will return all orders, but only order 12345 will have any lines associated with it.
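As a quick check of the two queries above, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. Only the table names and the join columns come from the queries; the LineID column and the sample rows (two orders with one line each) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (ID INTEGER PRIMARY KEY);
    CREATE TABLE OrderLines (LineID INTEGER PRIMARY KEY, OrderID INTEGER);
    -- Hypothetical sample data: two orders, each with one line.
    INSERT INTO Orders (ID) VALUES (12345), (67890);
    INSERT INTO OrderLines (LineID, OrderID) VALUES (1, 12345), (2, 67890);
""")

# Condition in WHERE: applied after the join, so only order 12345 survives.
where_rows = conn.execute("""
    SELECT Orders.ID, OrderLines.LineID
    FROM Orders
    LEFT JOIN OrderLines ON OrderLines.OrderID = Orders.ID
    WHERE Orders.ID = 12345
    ORDER BY Orders.ID
""").fetchall()

# Condition in ON: every order is kept, but only 12345 can match a line.
on_rows = conn.execute("""
    SELECT Orders.ID, OrderLines.LineID
    FROM Orders
    LEFT JOIN OrderLines ON OrderLines.OrderID = Orders.ID
                        AND Orders.ID = 12345
    ORDER BY Orders.ID
""").fetchall()

print(where_rows)  # [(12345, 1)]
print(on_rows)     # [(12345, 1), (67890, None)]
```

Order 67890 has a line in OrderLines, yet the second query still reports NULL for it, because the extra ON condition prevents any match for that order.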
With an INNER JOIN, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.
Does not matter for inner joins
Matters for outer joins
a. WHERE clause: after joining. Records will be filtered after the join has taken place.
b. ON clause: before joining. Records (from the right table) will be filtered before joining. This may end up as NULL in the result (since it is an OUTER join).
Example: consider the tables below.
documents:
| id | name      |
|----|-----------|
| 1  | Document1 |
| 2  | Document2 |
| 3  | Document3 |
| 4  | Document4 |
| 5  | Document5 |
downloads:
| id | document_id | username |
|----|-------------|----------|
| 1  | 1           | sandeep  |
| 2  | 1           | simi     |
| 3  | 2           | sandeep  |
| 4  | 2           | reya     |
| 5  | 3           | simi     |
a) Inside the WHERE clause:
SELECT documents.name, downloads.id
FROM documents
LEFT OUTER JOIN downloads
ON documents.id = downloads.document_id
WHERE username = 'sandeep'
For the above query, the intermediate join table will look like this:
| id (from documents) | name      | id (from downloads) | document_id | username |
|---------------------|-----------|---------------------|-------------|----------|
| 1                   | Document1 | 1                   | 1           | sandeep  |
| 1                   | Document1 | 2                   | 1           | simi     |
| 2                   | Document2 | 3                   | 2           | sandeep  |
| 2                   | Document2 | 4                   | 2           | reya     |
| 3                   | Document3 | 5                   | 3           | simi     |
| 4                   | Document4 | NULL                | NULL        | NULL     |
| 5                   | Document5 | NULL                | NULL        | NULL     |
After applying the WHERE clause and selecting the listed attributes, the result will be:
| name      | id |
|-----------|----|
| Document1 | 1  |
| Document2 | 3  |
b) Inside the JOIN clause:
SELECT documents.name, downloads.id
FROM documents
LEFT OUTER JOIN downloads
ON documents.id = downloads.document_id
AND username = 'sandeep'
For the above query, the intermediate join table will look like this:
| id (from documents) | name      | id (from downloads) | document_id | username |
|---------------------|-----------|---------------------|-------------|----------|
| 1                   | Document1 | 1                   | 1           | sandeep  |
| 2                   | Document2 | 3                   | 2           | sandeep  |
| 3                   | Document3 | NULL                | NULL        | NULL     |
| 4                   | Document4 | NULL                | NULL        | NULL     |
| 5                   | Document5 | NULL                | NULL        | NULL     |
Notice how the rows in documents that did not match both conditions are populated with NULL values.
After selecting the listed attributes, the result will be:
| name      | id   |
|-----------|------|
| Document1 | 1    |
| Document2 | 3    |
| Document3 | NULL |
| Document4 | NULL |
| Document5 | NULL |
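The documents/downloads example can be reproduced with a short sketch, again assuming Python's built-in sqlite3 module. The data is exactly the two tables listed above; the ORDER BY documents.id is an addition to make the row order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE downloads (id INTEGER PRIMARY KEY,
                            document_id INTEGER, username TEXT);
    INSERT INTO documents VALUES
        (1, 'Document1'), (2, 'Document2'), (3, 'Document3'),
        (4, 'Document4'), (5, 'Document5');
    INSERT INTO downloads VALUES
        (1, 1, 'sandeep'), (2, 1, 'simi'), (3, 2, 'sandeep'),
        (4, 2, 'reya'), (5, 3, 'simi');
""")

# Shared part of both queries; only the placement of the filter differs.
base = """
    SELECT documents.name, downloads.id
    FROM documents
    LEFT OUTER JOIN downloads
      ON documents.id = downloads.document_id
"""

# a) filter in WHERE: unmatched documents are removed after the join.
where_result = conn.execute(
    base + " WHERE username = 'sandeep' ORDER BY documents.id").fetchall()

# b) filter in ON: unmatched documents are kept, padded with NULL.
on_result = conn.execute(
    base + " AND username = 'sandeep' ORDER BY documents.id").fetchall()

print(where_result)  # [('Document1', 1), ('Document2', 3)]
print(on_result)
# [('Document1', 1), ('Document2', 3), ('Document3', None),
#  ('Document4', None), ('Document5', None)]
```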
On INNER JOINs they are interchangeable, and the optimizer will rearrange them at will.
On OUTER JOINs, they are not necessarily interchangeable, depending on which side of the join they depend on.
I put them in either place depending on the readability.
The way I do it is:
Always put the join conditions in the ON clause if you are doing an INNER JOIN. So, do not add any WHERE conditions to the ON clause; put them in the WHERE clause.
If you are doing a LEFT JOIN, add any WHERE conditions for the table on the right side of the join to the ON clause. This is a must, because adding a WHERE clause that references the right side of the join will convert the join to an INNER JOIN.
The exception is when you are looking for the records that are not in a particular table. You would add a reference to a unique identifier (one that is never NULL) from the right-hand table of the join to the WHERE clause this way: WHERE t2.idfield IS NULL. So, the only time you should reference a table on the right side of the join in the WHERE clause is to find those records which are not in that table.
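To make the two cases above concrete, here is a small sketch using Python's sqlite3 module. The tables t1 and t2, the column t1_id, and the sample rows are all hypothetical; only the name t2.idfield comes from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables; t2.idfield is a key and therefore never NULL.
    CREATE TABLE t1 (id INTEGER PRIMARY KEY);
    CREATE TABLE t2 (idfield INTEGER PRIMARY KEY, t1_id INTEGER NOT NULL);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (10, 1), (11, 3);
""")

# Anti-join: keep only the t1 rows that found no partner in t2.
missing = conn.execute("""
    SELECT t1.id
    FROM t1
    LEFT JOIN t2 ON t2.t1_id = t1.id
    WHERE t2.idfield IS NULL
    ORDER BY t1.id
""").fetchall()

# Any other WHERE test on t2 rejects the NULL-padded rows, so the
# LEFT JOIN behaves like an INNER JOIN.
like_inner = conn.execute("""
    SELECT t1.id
    FROM t1
    LEFT JOIN t2 ON t2.t1_id = t1.id
    WHERE t2.idfield > 0
    ORDER BY t1.id
""").fetchall()

print(missing)     # [(2,)]
print(like_inner)  # [(1,), (3,)]
```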
On an inner join, they mean the same thing. However, you will get different results in an outer join depending on whether you put the join condition in the WHERE or the ON clause. Take a look at this related question and this answer (by me).
I think it makes the most sense to be in the habit of always putting the join condition in the ON clause (unless it is an outer join and you actually do want it in the WHERE clause), as it makes it clearer to anyone reading your query what conditions the tables are being joined on, and it also helps prevent the WHERE clause from being dozens of lines long.
Considering we have the following post and post_comment tables:
The post table has the following records:
| id | title |
|----|-----------|
| 1 | Java |
| 2 | Hibernate |
| 3 | JPA |
and the post_comment table has the following three rows:
| id | review | post_id |
|----|-----------|---------|
| 1 | Good | 1 |
| 2 | Excellent | 1 |
| 3 | Awesome | 2 |
The SQL JOIN clause allows you to associate rows that belong to different tables. For instance, a CROSS JOIN will create a Cartesian Product containing all possible combinations of rows between the two joining tables.
While the CROSS JOIN is useful in certain scenarios, most of the time, you want to join tables based on a specific condition. And that's where INNER JOIN comes into play.
The SQL INNER JOIN allows us to filter the Cartesian Product of joining two tables based on a condition that is specified via the ON clause.
If you provide an "always true" condition, the INNER JOIN will not filter the joined records, and the result set will contain the Cartesian Product of the two joining tables.
For instance, if we execute the following SQL INNER JOIN query:
SELECT
p.id AS "p.id",
pc.id AS "pc.id"
FROM post p
INNER JOIN post_comment pc ON 1 = 1
We will get all combinations of post and post_comment records:
| p.id | pc.id |
|---------|------------|
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
So, if the ON clause condition is "always true", the INNER JOIN is simply equivalent to a CROSS JOIN query:
SELECT
p.id AS "p.id",
pc.id AS "pc.id"
FROM post p
CROSS JOIN post_comment pc
WHERE 1 = 1
ORDER BY p.id, pc.id
On the other hand, if the ON clause condition is "always false", then all the joined records are going to be filtered out and the result set will be empty.
So, if we execute the following SQL INNER JOIN query:
SELECT
p.id AS "p.id",
pc.id AS "pc.id"
FROM post p
INNER JOIN post_comment pc ON 1 = 0
ORDER BY p.id, pc.id
We won't get any result back:
| p.id | pc.id |
|---------|------------|
That's because the query above is equivalent to the following CROSS JOIN query:
SELECT
p.id AS "p.id",
pc.id AS "pc.id"
FROM post p
CROSS JOIN post_comment pc
WHERE 1 = 0
ORDER BY p.id, pc.id
The most common ON clause condition is the one that matches the Foreign Key column in the child table with the Primary Key column in the parent table, as illustrated by the following query:
SELECT
p.id AS "p.id",
pc.post_id AS "pc.post_id",
pc.id AS "pc.id",
p.title AS "p.title",
pc.review AS "pc.review"
FROM post p
INNER JOIN post_comment pc ON pc.post_id = p.id
ORDER BY p.id, pc.id
When executing the above SQL INNER JOIN query, we get the following result set:
| p.id | pc.post_id | pc.id | p.title | pc.review |
|---------|------------|------------|------------|-----------|
| 1 | 1 | 1 | Java | Good |
| 1 | 1 | 2 | Java | Excellent |
| 2 | 2 | 3 | Hibernate | Awesome |
So, only the records that match the ON clause condition are included in the query result set. In our case, the result set contains all the post rows along with their post_comment records. The post rows that have no associated post_comment are excluded since they cannot satisfy the ON clause condition.
Again, the above SQL INNER JOIN query is equivalent to the following CROSS JOIN query:
SELECT
p.id AS "p.id",
pc.post_id AS "pc.post_id",
pc.id AS "pc.id",
p.title AS "p.title",
pc.review AS "pc.review"
FROM post p, post_comment pc
WHERE pc.post_id = p.id
In the Cartesian Product below, only the rows where pc.post_id equals p.id (the first, second, and sixth) satisfy the WHERE clause, and only these records are going to be included in the result set. That's the best way to visualize how the INNER JOIN clause works.
| p.id | pc.post_id | pc.id | p.title   | pc.review |
|------|------------|-------|-----------|-----------|
| 1    | 1          | 1     | Java      | Good      |
| 1    | 1          | 2     | Java      | Excellent |
| 1    | 2          | 3     | Java      | Awesome   |
| 2    | 1          | 1     | Hibernate | Good      |
| 2    | 1          | 2     | Hibernate | Excellent |
| 2    | 2          | 3     | Hibernate | Awesome   |
| 3    | 1          | 1     | JPA       | Good      |
| 3    | 1          | 2     | JPA       | Excellent |
| 3    | 2          | 3     | JPA       | Awesome   |
An INNER JOIN statement can be rewritten as a CROSS JOIN with a WHERE clause matching the same condition you used in the ON clause of the INNER JOIN query.
Note that this only applies to INNER JOIN, not to OUTER JOIN.
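The equivalence can be checked with a minimal sketch, assuming Python's built-in sqlite3 module and the post and post_comment rows shown earlier. The variable names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post_comment (id INTEGER PRIMARY KEY,
                               review TEXT, post_id INTEGER);
    INSERT INTO post VALUES (1, 'Java'), (2, 'Hibernate'), (3, 'JPA');
    INSERT INTO post_comment VALUES
        (1, 'Good', 1), (2, 'Excellent', 1), (3, 'Awesome', 2);
""")

# INNER JOIN with the condition in ON.
inner_rows = conn.execute("""
    SELECT p.id, pc.id
    FROM post p
    INNER JOIN post_comment pc ON pc.post_id = p.id
    ORDER BY p.id, pc.id
""").fetchall()

# CROSS JOIN with the same condition moved to WHERE.
cross_rows = conn.execute("""
    SELECT p.id, pc.id
    FROM post p
    CROSS JOIN post_comment pc
    WHERE pc.post_id = p.id
    ORDER BY p.id, pc.id
""").fetchall()

# Size of the unfiltered Cartesian Product: 3 posts x 3 comments.
product_size = conn.execute(
    "SELECT COUNT(*) FROM post p CROSS JOIN post_comment pc").fetchone()[0]

print(inner_rows)    # [(1, 1), (1, 2), (2, 3)]
print(cross_rows)    # [(1, 1), (1, 2), (2, 3)]
print(product_size)  # 9
```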
There is a great difference between the WHERE clause and the ON clause when it comes to a left join.
Here is an example:
mysql> desc t1;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id | int(11) | NO | | NULL | |
| fid | int(11) | NO | | NULL | |
| v | varchar(20) | NO | | NULL | |
+-------+-------------+------+-----+---------+-------+
Here fid is the id of table t2.
mysql> desc t2;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id | int(11) | NO | | NULL | |
| v | varchar(10) | NO | | NULL | |
+-------+-------------+------+-----+---------+-------+
2 rows in set (0.00 sec)
Query with the condition in the ON clause:
mysql> SELECT * FROM `t1` left join t2 on fid = t2.id AND t1.v = 'K'
-> ;
+----+-----+---+------+------+
| id | fid | v | id | v |
+----+-----+---+------+------+
| 1 | 1 | H | NULL | NULL |
| 2 | 1 | B | NULL | NULL |
| 3 | 2 | H | NULL | NULL |
| 4 | 7 | K | NULL | NULL |
| 5 | 5 | L | NULL | NULL |
+----+-----+---+------+------+
5 rows in set (0.00 sec)
Query with the condition in the WHERE clause:
mysql> SELECT * FROM `t1` left join t2 on fid = t2.id where t1.v = 'K';
+----+-----+---+------+------+
| id | fid | v | id | v |
+----+-----+---+------+------+
| 4 | 7 | K | NULL | NULL |
+----+-----+---+------+------+
1 row in set (0.00 sec)
It is clear that the first query returns all rows from t1, but only the row with t1.v = 'K' can have an associated row from t2.
The second query returns only the row from t1 with t1.v = 'K', together with its dependent row from t2, if any.
Let's consider these tables:
A
id | SomeData
B
id | id_A | SomeOtherData
id_A being a foreign key to table A.
Writing this query:
SELECT *
FROM A
LEFT JOIN B
ON A.id = B.id_A;
Will provide this result:
/ : part of the result
B
+---------------------------------+
A | |
+---------------------+-------+ |
|/////////////////////|///////| |
|/////////////////////|///////| |
|/////////////////////|///////| |
|/////////////////////|///////| |
|/////////////////////+-------+-------------------------+
|/////////////////////////////|
+-----------------------------+
What is in A but not in B means that there are NULL values for B.
Now, let's consider a specific part in B.id_A, and highlight it from the previous result:
/ : part of the result
* : part of the result with the specific B.id_A
B
+---------------------------------+
A | |
+---------------------+-------+ |
|/////////////////////|///////| |
|/////////////////////|///////| |
|/////////////////////+---+///| |
|/////////////////////|***|///| |
|/////////////////////+---+---+-------------------------+
|/////////////////////////////|
+-----------------------------+
Writing this query:
SELECT *
FROM A
LEFT JOIN B
ON A.id = B.id_A
AND B.id_A = SpecificPart;
Will provide this result:
/ : part of the result
* : part of the result with the specific B.id_A
B
+---------------------------------+
A | |
+---------------------+-------+ |
|/////////////////////| | |
|/////////////////////| | |
|/////////////////////+---+ | |
|/////////////////////|***| | |
|/////////////////////+---+---+-------------------------+
|/////////////////////////////|
+-----------------------------+
Because this removes from the inner join part the values that do not satisfy B.id_A = SpecificPart, while every row of A is still kept.
Now, let's change the query to this:
SELECT *
FROM A
LEFT JOIN B
ON A.id = B.id_A
WHERE B.id_A = SpecificPart;
The result is now:
/ : part of the result
* : part of the result with the specific B.id_A
B
+---------------------------------+
A | |
+---------------------+-------+ |
| | | |
| | | |
| +---+ | |
| |***| | |
| +---+---+-------------------------+
| |
+-----------------------------+
Because the whole result is filtered against B.id_A = SpecificPart, removing the parts where B.id_A IS NULL, that is, the rows that are in A but not in B.
In terms of the optimizer, it shouldn't make a difference whether you define your join clauses with ON or WHERE.
However, IMHO, I think it's much clearer to use the ON clause when performing joins. That way you have a specific section of your query that dictates how the join is handled, rather than having it intermixed with the rest of the WHERE clauses.
Are you trying to join data or filter data?
For readability it makes the most sense to isolate these use cases to ON and WHERE respectively.
It can become very difficult to read a query where both the JOIN condition and a filtering condition exist in the WHERE clause.
Performance-wise you should not see a difference, though different types of SQL sometimes handle query planning differently, so it can be worth trying ¯\_(ツ)_/¯ (do be aware of caching affecting the query speed).
Also, as others have noted, if you use an outer join you will get different results if you place the filter condition in the ON clause, because it only affects one of the tables.
I wrote a more in-depth post about this here: https://dataschool.com/learn/difference-between-where-and-on-in-sql
In SQL, the WHERE and ON clauses are both kinds of conditional statements, but the major difference between them is that the WHERE clause is used in SELECT/UPDATE statements to specify conditions, whereas the ON clause is used in joins, where it checks whether records match between the target and source tables before the tables are joined.
For example, WHERE:
SELECT * FROM employee WHERE employee_id=101
For example, ON:
There are two tables, employee and employee_details; the matching column is employee_id.
SELECT * FROM employee
INNER JOIN employee_details
ON employee.employee_id = employee_details.employee_id
Hope I have answered your question. Reply if you need any clarification.
I think it's the join sequence effect. In the first LEFT JOIN case, SQL does the left join first and then applies the WHERE filter. In the second case, it finds Orders.ID = 12345 first and then does the join.
For an inner join, WHERE and ON can be used interchangeably. In fact, it's possible to use ON in a correlated subquery. For example:
update mytable
set myscore=100
where exists (
select 1 from table1
inner join table2
on (table2.key = mytable.key)
inner join table3
on (table3.key = table2.key and table3.key = table1.key)
...
)
This is (IMHO) utterly confusing to a human, and it's very easy to forget to link table1 to anything (because the "driver" table doesn't have an "on" clause), but it's legal.
For better performance, tables should have a dedicated indexed column to use for JOINs.
So if the column you condition on is not one of those indexed columns, then I suspect it is better to keep it in WHERE.
That way you JOIN using the indexed columns, and after the JOIN you apply the condition on the non-indexed column.
Normally, filtering is processed in the WHERE clause once the two tables have already been joined. It's possible, though, that you might want to filter one or both of the tables before joining them. I.e., the WHERE clause applies to the whole result set whereas the ON clause only applies to the join in question.
I think this distinction can best be explained via the logical order of operations in SQL, which is, simplified:
FROM (including joins)
WHERE
GROUP BY
HAVING
WINDOW
SELECT
DISTINCT
UNION, INTERSECT, EXCEPT
ORDER BY
OFFSET
FETCH
Joins are not a clause of the select statement, but an operator inside of FROM. As such, all ON clauses belonging to the corresponding JOIN operator have "already happened" logically by the time logical processing reaches the WHERE clause. This means that in the case of a LEFT JOIN, for example, the outer join's semantics have already happened by the time the WHERE clause is applied.
I've explained the following example more in depth in this blog post. When running this query:
SELECT a.actor_id, a.first_name, a.last_name, count(fa.film_id)
FROM actor a
LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id
WHERE film_id < 10
GROUP BY a.actor_id, a.first_name, a.last_name
ORDER BY count(fa.film_id) ASC;
The LEFT JOIN doesn't really have any useful effect, because even if an actor did not play in a film, the actor will be filtered out, as its FILM_ID will be NULL and the WHERE clause will filter such a row. The result is something like:
ACTOR_ID FIRST_NAME LAST_NAME COUNT
--------------------------------------
194 MERYL ALLEN 1
198 MARY KEITEL 1
30 SANDRA PECK 1
85 MINNIE ZELLWEGER 1
123 JULIANNE DENCH 1
I.e., just as if we had inner joined the two tables. If we move the filter predicate into the ON clause, it now becomes a criterion for the outer join:
SELECT a.actor_id, a.first_name, a.last_name, count(fa.film_id)
FROM actor a
LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id
AND film_id < 10
GROUP BY a.actor_id, a.first_name, a.last_name
ORDER BY count(fa.film_id) ASC;
Meaning the result will contain actors without any films, or without any films with FILM_ID < 10:
ACTOR_ID FIRST_NAME LAST_NAME COUNT
-----------------------------------------
3 ED CHASE 0
4 JENNIFER DAVIS 0
5 JOHNNY LOLLOBRIGIDA 0
6 BETTE NICHOLSON 0
...
1 PENELOPE GUINESS 1
200 THORA TEMPLE 1
2 NICK WAHLBERG 1
198 MARY KEITEL 1
Always put your predicate where it makes most sense, logically.
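The effect on the counts can be seen in a reduced sketch, assuming Python's sqlite3 module. The actor and film_actor tables here hold three made-up actors and three made-up film assignments, not the data behind the results quoted above, and an extra sort key is added so the order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor (actor_id INTEGER PRIMARY KEY,
                        first_name TEXT, last_name TEXT);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
    -- Hypothetical rows: actor 1 has films 1 and 23, actor 3 has
    -- film 17, actor 4 has no films at all.
    INSERT INTO actor VALUES
        (1, 'PENELOPE', 'GUINESS'), (3, 'ED', 'CHASE'),
        (4, 'JENNIFER', 'DAVIS');
    INSERT INTO film_actor VALUES (1, 1), (1, 23), (3, 17);
""")

# Predicate in WHERE: NULL-padded rows are dropped, so actors with
# no qualifying film disappear from the result.
counts_where = conn.execute("""
    SELECT a.actor_id, count(fa.film_id)
    FROM actor a
    LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id
    WHERE fa.film_id < 10
    GROUP BY a.actor_id
    ORDER BY count(fa.film_id) ASC, a.actor_id
""").fetchall()

# Predicate in ON: every actor is kept; count() skips the NULLs.
counts_on = conn.execute("""
    SELECT a.actor_id, count(fa.film_id)
    FROM actor a
    LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id
                           AND fa.film_id < 10
    GROUP BY a.actor_id
    ORDER BY count(fa.film_id) ASC, a.actor_id
""").fetchall()

print(counts_where)  # [(1, 1)]
print(counts_on)     # [(3, 0), (4, 0), (1, 1)]
```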
They are equivalent, literally.
In most open-source databases (the most notable examples being MySQL and PostgreSQL), the query planning is a variant of the classic algorithm appearing in Access Path Selection in a Relational Database Management System (Selinger et al., 1979). In this approach, the conditions are of two types
Especially in MySQL, you can see for yourself, by tracing the optimizer, that the join .. on conditions are replaced during parsing by the equivalent where conditions. A similar thing happens in PostgreSQL (though there's no way to see it through a log; you have to read the source description).
Anyway, the main point is that the difference between the two syntax variants is lost during the parsing/query-rewriting phase; it does not even reach the query planning and execution phase. So, there's no question about whether they are equivalent in terms of performance: they become identical long before they reach the execution phase.
You can use explain to verify that they produce identical plans. E.g., in Postgres, the plan will contain a join clause, even if you didn't use the join..on syntax anywhere.
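The same kind of check can be sketched with SQLite, which is used here only because it ships with Python; it is a stand-in, not one of the engines discussed above, and its EXPLAIN QUERY PLAN output looks different from Postgres or MySQL plans. The tables a and b, the column c, and the index b_c are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables with a join column c.
    CREATE TABLE a (c INTEGER);
    CREATE TABLE b (c INTEGER);
    CREATE INDEX b_c ON b (c);
""")

def plan(sql):
    """Return the human-readable detail column of each plan row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Same inner join, written once with ON and once with WHERE.
plan_on = plan("SELECT * FROM a INNER JOIN b ON a.c = b.c")
plan_where = plan("SELECT * FROM a, b WHERE a.c = b.c")

print(plan_on)
print(plan_where)
print(plan_on == plan_where)
```

For an inner join the two plans should come out the same; the exact wording of each plan step depends on the SQLite version.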
Oracle and SQL Server are not open source, but, as far as I know, they are based on equivalence rules (similar to those in relational algebra), and they also produce identical execution plans in both cases.
Obviously, the two syntax styles are not equivalent for outer joins; for those you have to use the join ... on syntax.
Regarding your question,
'on' and 'where' behave the same on an inner join, as long as your server accepts both:
select * from a inner join b on a.c = b.c
and
select * from a inner join b where a.c = b.c
Not all SQL engines accept the 'where' form, so it should probably be avoided. And of course the 'on' clause is clearer.
To add onto Joel Coehoorn's response, I'll add some SQLite-specific optimization info (other SQL flavors may behave differently). In the original example, the LEFT JOINs have a different outcome depending on whether you use JOIN ON ... WHERE or JOIN ON ... AND. Here is a slightly modified example to illustrate:
SELECT *
FROM Orders
LEFT JOIN OrderLines ON Orders.ID = OrderLines.OrderID
WHERE Orders.Username = OrderLines.Username
versus
SELECT *
FROM Orders
LEFT JOIN OrderLines ON Orders.ID = OrderLines.OrderID
AND Orders.Username = OrderLines.Username
Now, the original answer states that if you use a plain inner join instead of a left join, the outcome of both queries will be the same, but the execution plan will differ. I recently realized that the semantic difference between the two is that the former forces the query optimizer to use the index associated with the ON clause, while the latter allows the optimizer to choose any index within the ON ... AND clauses, depending on what it thinks will work best.
Occasionally, the optimizer will guess wrong and you'll want to force a certain execution plan. In this case, let's say that the SQLite optimizer wrongly concludes that the fastest way to perform this join would be to use the index on Orders.Username, when you know from empirical testing that the index on Orders.ID would deliver your query faster.
In this case, the former JOIN ON ... WHERE syntax essentially allows you to force the primary join operation to occur on the ID parameter, with secondary filtering on Username performed only after the main join is complete. In contrast, the JOIN ON ... AND syntax allows the optimizer to pick whether to use the index on Orders.ID or Orders.Username, and there is the theoretical possibility that it picks the one that ends up slower.
a. WHERE clause: after joining, records will be filtered.
b. ON clause: before joining, records (from the right table) will be filtered.
This is my solution.
SELECT song_ID,songs.fullname, singers.fullname
FROM music JOIN songs ON songs.ID = music.song_ID
JOIN singers ON singers.ID = music.singer_ID
GROUP BY songs.fullname
You must have the GROUP BY to get it to work.
Hope this helps.