简体   繁体   中英

MySQL placement of conditions in on-clauses of multiple outer joins

As a follow up to In SQL / MySQL, what is the difference between "ON" and "WHERE" in a join statement? and SQL join: where clause vs. on clause - it does matter if a condition is placed in the on-clause vs. the where-clause in an outer join.

However, does it matter which on-clause the condition is placed in when there are multiple outer joins?

For example, could these produce different results?

select * from t1 left join t2 on t1.fid=t2.id and t2.col=val
                 left join t3 on t2.fid=t3.id;

vs:

select * from t1 left join t2 on t1.fid=t2.id
                 left join t3 on t2.fid=t3.id and t2.col=val;

Absolutely they are different.

The fisrt query will only have t2 rows that satisfy t2.col=val

The second query will include all t2 rows and only list t3 when t2.col=val

The queries are not equivalent. It is easy to construct a counter example:

create table t1 (id int not null, val int not null);
create table t2 (id int not null, val int not null);
create table t3 (id int not null, val int not null);
insert into t1 (id, val) values (1,1);
insert into t2 (id, val) values (1,1);
insert into t3 (id, val) values (1,1);

select * from t1 
left join t2 
    on t1.id = t2.id 
    and t2.val = 2 
left join t3 
    on t2.id = t3.id;
+----+-----+------+------+------+------+
| id | val | id   | val  | id   | val  |
+----+-----+------+------+------+------+
|  1 |   1 | NULL | NULL | NULL | NULL |
+----+-----+------+------+------+------+

select * from t1 
left join t2 
    on t1.id = t2.id 
left join t3 
    on t2.id = t3.id 
    and t2.val = 2;
+----+-----+------+------+------+------+
| id | val | id   | val  | id   | val  |
+----+-----+------+------+------+------+
|  1 |   1 |    1 |    1 | NULL | NULL |
+----+-----+------+------+------+------+

This is one of the queries:

select *
from t1 left join
     t2
     on t1.fid = t2.id left join
     t3
     on t2.fid = t3.id and t2.col = val;

Yes, the results are different. If you were using inner join they would be the same, but the left join changes things -- because the join clause does not do any filtering of rows.

I think the simplest explanation is that the join between t1 and t2 will include all rows from t1 as well as all matching rows from t2 -- even those where t2.col <> val . These remain in the result set, because the next left join does not filter them out.

In fact, the condition t2.col = val in the second on clause does not affect which rows are in the result set. If there is a match, then the row from t3 stays based on the first condition. If there is no match, then the row from t3 is still in the result set -- but the t3 columns will be NULL .

In this version:

select *
from t1 left join
     t2
     on t1.fid = t2.id  and t2.col = val left join
     t3
     on t2.fid = t3.id;

The first join gets all rows from t1 and only matching rows from t2 where t2.col = val . The third join can then add more rows.

Note: there are definitely situations where the two queries would return the same results. But, the following data would generate different results (assume val = 0):

t1

fid
1

t2

fid   col
1     0
1     1

t3

id 
1

The query with the condition in the second on clause will return:

1    1    0    1
1    1    1    NULL

The query with the condition in the first on clause will return:

1    1    0    1

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM