
Simulating FULL OUTER JOIN: Performance of UNION of LEFT+RIGHT JOIN vs cross join

The Access / Jet database engine doesn't support FULL OUTER JOINs:

SELECT Table1.*, Table2.*
FROM Table1
FULL OUTER JOIN Table2 ON Table1.JoinField = Table2.JoinField

The commonly recommended alternative is to UNION the results of the LEFT and RIGHT JOINs; some variation on the following:

SELECT Table1.*, Table2.*
FROM Table1
LEFT JOIN Table2 ON Table1.JoinField = Table2.JoinField

UNION ALL 
SELECT Table1.*, Table2.*
FROM Table1
RIGHT JOIN Table2 ON Table1.JoinField = Table2.JoinField
WHERE Table1.JoinField IS NULL

However, isn't it also possible to use a cross join?

SELECT Table1.*, Table2.*
FROM Table1, Table2
WHERE Table1.JoinField = Table2.JoinField
    OR Table1.JoinField IS NULL
    OR Table2.JoinField IS NULL

Are there any performance penalties or other downsides to using a cross join in this way?

Your cross join isn't a FULL OUTER JOIN at all. It's an inner join that also matches NULL to all records.

In a CROSS JOIN, rows from one table are always matched with rows from the other table, while in a FULL OUTER JOIN, some rows are matched to nothing.

To illustrate, I created a small sample (T-SQL, but that's not relevant). You can see that a row whose join values are not equal is returned.
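The linked sample is not reproduced here, so the following is a minimal sketch of the same point. It assumes SQLite (through Python's sqlite3 module) as a stand-in for Access/Jet, and the table contents and the Val column are hypothetical.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for Access/Jet here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (JoinField INTEGER, Val TEXT);
    CREATE TABLE Table2 (JoinField INTEGER, Val TEXT);
    INSERT INTO Table1 VALUES (1, 'a'), (2, 'b'), (NULL, 'n1');
    INSERT INTO Table2 VALUES (2, 'x'), (3, 'y');
""")

def cross_join_rows():
    """Run the cross join from the question and return its rows."""
    return conn.execute("""
        SELECT t1.JoinField, t1.Val, t2.JoinField, t2.Val
        FROM Table1 AS t1, Table2 AS t2
        WHERE t1.JoinField = t2.JoinField
           OR t1.JoinField IS NULL
           OR t2.JoinField IS NULL
        ORDER BY t1.Val, t2.Val
    """).fetchall()

for row in cross_join_rows():
    print(row)
```

On this data the NULL row 'n1' is paired with every row of Table2, including (2, 'x'), while row (1, 'a') does not appear at all. A FULL OUTER JOIN would instead return (1, 'a') padded with NULLs and would leave 'n1' unmatched.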

You can, however, use a CROSS JOIN to emulate a FULL OUTER JOIN, provided the join fields contain no NULL values, by appending a NULL row to each table, using NOT EXISTS, and a few more tricks. As you'll see, this is a very elaborate solution, and the normal UNION is usually preferred:

SELECT *
FROM (SELECT * FROM #Table1 UNION ALL SELECT NULL, NULL) t1,
     (SELECT * FROM #Table2 UNION ALL SELECT NULL, NULL) t2
WHERE (t1.JoinField = t2.JoinField
    -- unmatched t1 rows pair with the appended NULL row of t2
    OR (NOT EXISTS(SELECT 1 FROM #Table2 WHERE #Table2.JoinField = t1.JoinField) AND t1.JoinField IS NOT NULL AND t2.JoinField IS NULL)
    -- unmatched t2 rows pair with the appended NULL row of t1
    OR (NOT EXISTS(SELECT 1 FROM #Table1 WHERE #Table1.JoinField = t2.JoinField) AND t2.JoinField IS NOT NULL AND t1.JoinField IS NULL))
-- drop the pairing of the two appended NULL rows
AND (t1.JoinField IS NOT NULL OR t2.JoinField IS NOT NULL)

(In the linked sample, you can see it in action)
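As a rough check of the query above, here is a sketch that runs it next to the usual UNION ALL emulation. It assumes SQLite via Python's sqlite3 module, ordinary tables instead of the #temp tables, and hypothetical two-column sample data with no NULL join fields. The second half of the UNION ALL uses a LEFT JOIN with the tables swapped in place of RIGHT JOIN, since older SQLite versions lack RIGHT JOIN.

```python
import sqlite3

# Hypothetical sample data without NULL join fields.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (JoinField INTEGER, Val TEXT);
    CREATE TABLE Table2 (JoinField INTEGER, Val TEXT);
    INSERT INTO Table1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO Table2 VALUES (2, 'x'), (3, 'y');
""")

def cross_join_emulation():
    """FULL OUTER JOIN emulated with a cross join plus appended NULL rows."""
    return conn.execute("""
        SELECT *
        FROM (SELECT * FROM Table1 UNION ALL SELECT NULL, NULL) t1,
             (SELECT * FROM Table2 UNION ALL SELECT NULL, NULL) t2
        WHERE (t1.JoinField = t2.JoinField
            OR (NOT EXISTS (SELECT 1 FROM Table2
                            WHERE Table2.JoinField = t1.JoinField)
                AND t1.JoinField IS NOT NULL AND t2.JoinField IS NULL)
            OR (NOT EXISTS (SELECT 1 FROM Table1
                            WHERE Table1.JoinField = t2.JoinField)
                AND t2.JoinField IS NOT NULL AND t1.JoinField IS NULL))
          AND (t1.JoinField IS NOT NULL OR t2.JoinField IS NOT NULL)
        ORDER BY COALESCE(t1.JoinField, t2.JoinField)
    """).fetchall()

def union_emulation():
    """FULL OUTER JOIN emulated with LEFT JOIN + anti-join, UNION ALL."""
    return conn.execute("""
        SELECT t1.JoinField, t1.Val, t2.JoinField, t2.Val
        FROM Table1 t1 LEFT JOIN Table2 t2 ON t1.JoinField = t2.JoinField
        UNION ALL
        SELECT t1.JoinField, t1.Val, t2.JoinField, t2.Val
        FROM Table2 t2 LEFT JOIN Table1 t1 ON t1.JoinField = t2.JoinField
        WHERE t1.JoinField IS NULL
    """).fetchall()

for row in cross_join_emulation():
    print(row)
print(set(cross_join_emulation()) == set(union_emulation()))
```

On this data both approaches return the same three rows. The cross-join version needs two correlated subqueries and a row-by-row filter over the full product, which is why the UNION form is usually the simpler choice.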

As I am using Redshift, there may be syntax differences.

With a as
(
Select 1 id union all
Select 2 union all
Select 3 
)
, b as 
(
Select 2 d union all
Select 4 union all
Select 5 
)
Select a.*,b.* 
From a full join b on id=d

Output is

id  d
1   NULL
2   2
3   NULL
NULL    4
NULL    5

If you run

Select a.*,b.* 
from a 
left join b on id=d 
union all
Select  a.*,b.* 
from b 
left join a on d=id

You get

id  d
1   NULL
2   2
3   NULL
2   2
NULL    4
NULL    5

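But if you use UNION instead of UNION ALL, the duplicate 2, 2 row is removed and you get the same result as the full join. Note that UNION removes all duplicate rows, including any that genuinely occur more than once in the source tables.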
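The difference between the two set operators can be sketched as follows. This assumes SQLite via Python's sqlite3 module in place of Redshift; the helper name run and its parameters are hypothetical. It also includes the anti-join variant (UNION ALL with a WHERE a.id IS NULL filter on the second half), which avoids the duplicate without deduplicating.

```python
import sqlite3

def run(a_ids, b_ids, set_op, anti_join=False):
    """Emulate a FULL JOIN b with two LEFT JOINs combined by set_op.

    set_op is "UNION" or "UNION ALL"; anti_join=True keeps only the
    unmatched rows of b in the second half.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (id INTEGER)")
    conn.execute("CREATE TABLE b (d INTEGER)")
    conn.executemany("INSERT INTO a VALUES (?)", [(i,) for i in a_ids])
    conn.executemany("INSERT INTO b VALUES (?)", [(i,) for i in b_ids])
    where = "WHERE a.id IS NULL" if anti_join else ""
    sql = f"""
        SELECT a.id, b.d FROM a LEFT JOIN b ON a.id = b.d
        {set_op}
        SELECT a.id, b.d FROM b LEFT JOIN a ON b.d = a.id
        {where}
    """
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows

# Same data as above: the matched row (2, 2) is doubled by UNION ALL.
print(len(run([1, 2, 3], [2, 4, 5], "UNION ALL")))
print(len(run([1, 2, 3], [2, 4, 5], "UNION")))
print(len(run([1, 2, 3], [2, 4, 5], "UNION ALL", anti_join=True)))

# With a repeated id in a, UNION collapses the two (1, NULL) rows,
# while the anti-join variant keeps both, as a real FULL JOIN would.
print(run([1, 1, 2, 3], [2, 4, 5], "UNION").count((1, None)))
print(run([1, 1, 2, 3], [2, 4, 5], "UNION ALL", anti_join=True).count((1, None)))
```

If the source tables can contain repeated rows, the anti-join form is the safer emulation; plain UNION is only equivalent when every result row is unique.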


 