
Inner Join versus Union All

Which version of the query is faster, and which is best practice? (Just curious.)

More importantly, are they equivalent?

Do these queries accomplish the same thing in this example?

1) INNER JOIN with two OR conditions:

SELECT
  DISTINCT (cat.id) as 'Cat ID:'
FROM
  cat
INNER JOIN cuteness_showdown ON
  (cat.id = cuteness_showdown.cat_1 OR cat.id = cuteness_showdown.cat_2);

2) Query each column separately and UNION ALL:

SELECT
  DISTINCT (table_1.id) as 'Cat ID:'
FROM
  (SELECT
    cuteness_showdown.cat_1 AS id
  FROM
    cuteness_showdown
  UNION ALL
  SELECT
    cuteness_showdown.cat_2 AS id
  FROM
    cuteness_showdown) AS table_1;

Now, which version is faster / best practice if I need a column from another table?

1) INNER JOIN with two OR conditions (no change):

SELECT
  DISTINCT (cat.id) as 'Cat ID:',
  cat.name as 'Cat Name:'
FROM
  cat
INNER JOIN cuteness_showdown ON
  (cat.id = cuteness_showdown.cat_1 OR cat.id = cuteness_showdown.cat_2);

2) Query each column separately and UNION ALL (needed to INNER JOIN cat table):

SELECT
  DISTINCT (table_1.id) as 'Cat ID:',
  cat.name as 'Cat Name:'
FROM
  (SELECT
    cuteness_showdown.cat_1 AS id
  FROM
    cuteness_showdown
  UNION ALL
  SELECT
    cuteness_showdown.cat_2 AS id
  FROM
    cuteness_showdown) AS table_1
INNER JOIN cat on
  (table_1.id = cat.id);

To find out which is faster, break out a terminal, write a script that runs each query 1,000 times, and compare the timings :)
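One way such a timing script could look, as a minimal sketch: it uses Python's built-in `sqlite3` and `timeit` modules against an in-memory database. SQLite is only a stand-in for whatever database you actually run, the table contents are made up, and the helper names `make_db` and `time_query` are hypothetical. Absolute numbers will differ on MySQL or any other engine, so treat the output as a relative comparison on toy data.

```python
import sqlite3
import timeit

# The two queries from the question, with the column aliases dropped for brevity.
JOIN_OR = """
SELECT DISTINCT cat.id
FROM cat
INNER JOIN cuteness_showdown
  ON (cat.id = cuteness_showdown.cat_1 OR cat.id = cuteness_showdown.cat_2)
"""

UNION_ALL = """
SELECT DISTINCT table_1.id
FROM (SELECT cat_1 AS id FROM cuteness_showdown
      UNION ALL
      SELECT cat_2 AS id FROM cuteness_showdown) AS table_1
"""


def make_db(n_cats=20, n_showdowns=100):
    """Build a small in-memory database with made-up data (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cat (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE cuteness_showdown (cat_1 INTEGER, cat_2 INTEGER)")
    conn.executemany(
        "INSERT INTO cat (id, name) VALUES (?, ?)",
        [(i, f"cat {i}") for i in range(1, n_cats + 1)],
    )
    # Deterministic pairs; every id refers to an existing cat.
    pairs = [
        ((i * 7) % n_cats + 1, (i * 13 + 5) % n_cats + 1)
        for i in range(n_showdowns)
    ]
    conn.executemany(
        "INSERT INTO cuteness_showdown (cat_1, cat_2) VALUES (?, ?)", pairs
    )
    conn.commit()
    return conn


def time_query(conn, sql, runs=1000):
    """Return the total seconds taken to run `sql` `runs` times."""
    return timeit.timeit(lambda: conn.execute(sql).fetchall(), number=runs)


if __name__ == "__main__":
    conn = make_db()
    for label, sql in (("INNER JOIN ... OR", JOIN_OR), ("UNION ALL", UNION_ALL)):
        print(f"{label}: {time_query(conn, sql):.3f} s for 1000 runs")
```

For a meaningful comparison, load data that resembles your real tables (row counts, indexes, key distribution) before trusting the numbers.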

As for whether they are equivalent, the query optimiser will very often come up with exactly the same execution plan for several SQL queries that do the same thing, so they may well be. I can't tell you whether these particular queries will get that treatment, but, assuming you have some data, you can use EXPLAIN to see the execution plans for yourself and compare them.
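To make the comparison concrete, here is a hedged sketch that checks both things on a toy database: the rows each query returns and the plan the engine picks. It uses SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3` module as a swapped-in stand-in for MySQL's `EXPLAIN`, so the plan output will not look like what your server prints. The sample rows and the helpers `make_db`, `fetch_ids`, and `query_plan` are assumptions made for illustration. The sample data deliberately includes a showdown that references cat 99, which has no row in `cat`.

```python
import sqlite3

JOIN_OR = """
SELECT DISTINCT cat.id
FROM cat
INNER JOIN cuteness_showdown
  ON (cat.id = cuteness_showdown.cat_1 OR cat.id = cuteness_showdown.cat_2)
"""

UNION_ALL = """
SELECT DISTINCT table_1.id
FROM (SELECT cat_1 AS id FROM cuteness_showdown
      UNION ALL
      SELECT cat_2 AS id FROM cuteness_showdown) AS table_1
"""


def make_db():
    """Create made-up sample data; cat 99 appears in a showdown but not in cat."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cat (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE cuteness_showdown (cat_1 INTEGER, cat_2 INTEGER);
        INSERT INTO cat (id, name) VALUES (1, 'Tom'), (2, 'Felix'), (3, 'Salem');
        INSERT INTO cuteness_showdown (cat_1, cat_2) VALUES (1, 2), (2, 3), (1, 99);
        """
    )
    return conn


def fetch_ids(conn, sql):
    """Run `sql` and return the first column of every row, sorted."""
    return sorted(row[0] for row in conn.execute(sql))


def query_plan(conn, sql):
    """Return the human-readable steps from SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


if __name__ == "__main__":
    conn = make_db()
    for label, sql in (("INNER JOIN ... OR", JOIN_OR), ("UNION ALL", UNION_ALL)):
        print(label, "ids:", fetch_ids(conn, sql))
        for step in query_plan(conn, sql):
            print("   ", step)
```

On this data the join version leaves out 99 because it only returns ids that exist in `cat`, while the UNION ALL version never reads `cat` and so includes it. The two forms therefore return the same rows only when every `cat_1` and `cat_2` value is non-NULL and refers to an existing cat, for example because of a foreign key.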

If the execution plans are indeed the same, best practice is to choose the more readable statement so that anyone who later maintains the code can do so easily. If they are not the same, you have to decide whether a harder-to-read statement is worth the performance gain, which depends on how much performance matters in your project. I'd argue that if you have a relatively small DB that is unlikely to grow much and you already see sub-10 ms response times, then performance isn't an issue, so just make it easy to maintain.

There is another option:

SELECT
  DISTINCT (cat.id) as 'Cat ID:'
FROM
  cat
INNER JOIN cuteness_showdown ON
  cat.id IN (cuteness_showdown.cat_1, cuteness_showdown.cat_2);

It is a very minor difference, but anecdotally IN() has always seemed to execute more efficiently against keys than OR. I will see if I can mock up some tables to get firm data one way or the other.
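Until such data exists, a small sketch like the following can at least confirm that the IN() form and the OR form return the same rows, and let you look at the plan each one gets. It runs on SQLite via Python's `sqlite3` module, which is an assumption made for convenience. It says nothing about how MySQL executes either form. The sample rows and the helpers `make_db`, `fetch_ids`, and `query_plan` are hypothetical.

```python
import sqlite3

JOIN_OR = """
SELECT DISTINCT cat.id
FROM cat
INNER JOIN cuteness_showdown
  ON (cat.id = cuteness_showdown.cat_1 OR cat.id = cuteness_showdown.cat_2)
"""

JOIN_IN = """
SELECT DISTINCT cat.id
FROM cat
INNER JOIN cuteness_showdown
  ON cat.id IN (cuteness_showdown.cat_1, cuteness_showdown.cat_2)
"""


def make_db():
    """Create made-up sample data, including a showdown with a NULL opponent."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cat (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE cuteness_showdown (cat_1 INTEGER, cat_2 INTEGER);
        INSERT INTO cat (id, name)
          VALUES (1, 'Tom'), (2, 'Felix'), (3, 'Salem'), (4, 'Garfield');
        INSERT INTO cuteness_showdown (cat_1, cat_2)
          VALUES (1, 2), (3, NULL), (2, 1);
        """
    )
    return conn


def fetch_ids(conn, sql):
    """Run `sql` and return the first column of every row, sorted."""
    return sorted(row[0] for row in conn.execute(sql))


def query_plan(conn, sql):
    """Return the human-readable steps from SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


if __name__ == "__main__":
    conn = make_db()
    for label, sql in (("OR", JOIN_OR), ("IN", JOIN_IN)):
        print(label, "ids:", fetch_ids(conn, sql))
        for step in query_plan(conn, sql):
            print("   ", step)
```

Cat 4 never appears in a showdown, so both forms omit it, and the NULL in the second showdown does not stop cat 3 from matching in either form.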
