SQL inner join multiple tables with one query

Question

I have a query like the one below:

SELECT
c.testID
FROM a
INNER JOIN b ON a.id=b.ID
INNER JOIN c ON b.r_ID=c.id
WHERE c.test IS NOT NULL;

Can this query be optimized further? I want the inner join between the three tables to happen only for rows that satisfy the WHERE clause.

Answer 1

A WHERE clause acts as a filter on the data that remains after all the JOINs. If you move the same restriction into the JOIN clause itself, the intent is to avoid a separate filter step after the join, that is, to join on already-filtered data instead.

SELECT c.testID
FROM a
INNER JOIN b ON a.id = b.ID
INNER JOIN c ON b.r_ID = c.id AND c.test IS NOT NULL;

Moreover, you should create an index on the column test in table c to speed up the query.

Also, learn to run the EXPLAIN command on your queries for best results.
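
As a rough sketch of the two suggestions above, the statements below create the index and then ask MySQL for the execution plan. The index name idx_c_test is hypothetical; the table and column names are taken from the question.

```sql
-- Hypothetical index name; c.test is the column filtered in the question.
CREATE INDEX idx_c_test ON c (test);

-- Prefixing a query with EXPLAIN shows the plan MySQL would use,
-- including which indexes (if any) it picks for each table.
EXPLAIN
SELECT c.testID
FROM a
INNER JOIN b ON a.id = b.ID
INNER JOIN c ON b.r_ID = c.id AND c.test IS NOT NULL;
```

Whether the optimizer actually uses the index depends on the data, so the EXPLAIN output is worth checking before and after creating it.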

Answer 2

Try the following:

SELECT
c.testID
FROM c 
INNER JOIN b ON c.test IS NOT NULL AND b.r_ID=c.testID 
INNER JOIN a ON a.id=b.r_ID;

I changed the order of the joins and conditions so that the first condition to be evaluated is c.test IS NOT NULL.

Disclaimer: You should use the EXPLAIN command to see how the query is actually executed. I'm fairly sure that even the minor change I just made may make no difference, because the MySQL optimizer works on all queries.

See the MySQL Documentation: Optimizing Queries with EXPLAIN

Three queries Compared

Have a look at the following fiddle: https://www.db-fiddle.com/f/fXsT8oMzJ1H31FwMHrxR3u/0

I ran three different queries and in the end, MySQL optimized and ran them the same way.

Three Queries:

EXPLAIN SELECT
c.testID
FROM c 
INNER JOIN b ON c.test IS NOT NULL AND b.r_ID=c.testID 
INNER JOIN a ON a.id=b.r_ID;


EXPLAIN SELECT c.testID
FROM a
INNER JOIN b ON a.id = b.r_id
INNER JOIN c ON b.r_ID = c.testID AND c.test IS NOT NULL;

EXPLAIN SELECT
c.testID
FROM a
INNER JOIN b ON a.id=b.r_ID
INNER JOIN c ON b.r_ID=c.testID
WHERE c.test IS NOT NULL;
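
To try the three queries locally rather than in the fiddle, a minimal schema along these lines should be enough. The column types and sample rows are assumptions made for illustration and are not taken from the fiddle; only the table and column names come from the queries above.

```sql
-- Assumed column types; only the names come from the queries above.
CREATE TABLE a (id INT PRIMARY KEY);
CREATE TABLE b (id INT PRIMARY KEY, r_ID INT);
CREATE TABLE c (testID INT PRIMARY KEY, test VARCHAR(20));

-- Illustrative rows: one c row passes the IS NOT NULL filter, one does not.
INSERT INTO a (id) VALUES (1), (2);
INSERT INTO b (id, r_ID) VALUES (10, 1), (11, 2);
INSERT INTO c (testID, test) VALUES (1, 'x'), (2, NULL);
```

With this data, each of the three SELECT statements (run without EXPLAIN) returns the single row testID = 1, since the row with testID = 2 has a NULL test.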

Answer 3

All tables should have a PRIMARY KEY. Assuming that id is the PRIMARY KEY of each table it appears in, you need these secondary indexes for maximal performance:

c:  INDEX(test, testID, id)  -- `test` must be first
b:  INDEX(r_ID)

Both of those are useful and "covering".
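
Written out as DDL, the two indexes could look like the sketch below. The index names are hypothetical, and the column on c is assumed to be testID, as in the question's SELECT list.

```sql
-- Hypothetical index names. `test` is listed first because it is the filtered column.
ALTER TABLE c ADD INDEX idx_c_test_testID_id (test, testID, id);
ALTER TABLE b ADD INDEX idx_b_r_ID (r_ID);
```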

Another thing to note: b and a are virtually unused in the query, so you may as well write the query this way:

SELECT c.testID
    FROM c
    WHERE c.test IS NOT NULL;

At that point, all you need is INDEX(test, testID).

I suspect you "simplified" your query by leaving out some uses of a and b. Well, I simplified it further, just as the Optimizer should have done. (However, eliminating tables is an optimization that it does not perform; it assumes the user would have done that already.)

On the other hand, b and a are not totally useless. The JOINs verify that there are corresponding rows, possibly many such rows, in those tables. Again, I think you had some other purpose.
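
If the only purpose of b and a is that existence check, one possible variant (not something the question asked for, and using the join columns from the question) is an EXISTS subquery. Unlike the joins, it returns each qualifying c row once, no matter how many matching rows b and a contain.

```sql
-- Sketch: keep c rows with a non-NULL test that have at least one
-- matching b row, which in turn has a matching a row.
SELECT c.testID
FROM c
WHERE c.test IS NOT NULL
  AND EXISTS (
      SELECT 1
      FROM b
      INNER JOIN a ON a.id = b.ID
      WHERE b.r_ID = c.id
  );
```

If the duplicates produced by the joins are actually wanted, the original join form is the one to keep.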

3 answers

solution1
2 2020-02-05 08:59:04

solution2
1 ACCEPTED 2020-02-05 09:06:09

solution3
0 2020-02-11 21:38:20
