SQL inner join multiple tables with one query

I have a query like the one below:

SELECT
c.testID
FROM a
INNER JOIN b ON a.id=b.ID
INNER JOIN c ON b.r_ID=c.id
WHERE c.test IS NOT NULL;

Can this query be optimized further? I want the inner join between the three tables to happen only for rows that satisfy the WHERE clause.

Conceptually, the WHERE clause filters the rows produced after all JOINs, whereas if you put the same restriction in the JOIN clause itself, the filter is applied as part of the join rather than afterwards. That is, the join runs on filtered data instead.

SELECT c.testID
FROM a
INNER JOIN b ON a.id = b.ID
INNER JOIN c ON b.r_ID = c.id AND c.test IS NOT NULL;
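For an INNER JOIN, moving the condition from WHERE into ON does not change which rows come back; only the execution plan could differ. A minimal sketch to check this, assuming SQLite (via Python's sqlite3 module) as a stand-in for MySQL and a made-up schema and rows that follow the column names in the question:

```python
import sqlite3

# SQLite stands in for MySQL here; the tables and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (ID INTEGER PRIMARY KEY, r_ID INTEGER);
    CREATE TABLE c (id INTEGER PRIMARY KEY, testID INTEGER, test TEXT);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO c VALUES (10, 100, 'x'), (20, 200, NULL), (30, 300, 'y');
""")

# Filter written in the WHERE clause.
where_sql = """
    SELECT c.testID
    FROM a
    INNER JOIN b ON a.id = b.ID
    INNER JOIN c ON b.r_ID = c.id
    WHERE c.test IS NOT NULL
    ORDER BY c.testID
"""

# Same filter moved into the ON clause of the last join.
on_sql = """
    SELECT c.testID
    FROM a
    INNER JOIN b ON a.id = b.ID
    INNER JOIN c ON b.r_ID = c.id AND c.test IS NOT NULL
    ORDER BY c.testID
"""

where_rows = conn.execute(where_sql).fetchall()
on_rows = conn.execute(on_sql).fetchall()

print(where_rows)
print(on_rows)
```

Both statements return the same rows for this data. Whether one of them runs faster on a real MySQL server is a separate question that only EXPLAIN on the actual tables can answer.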

Moreover, you should create an index on the column test in table c to speed up the query.

Also, learn to run the EXPLAIN command on your queries for best results.

Try the following:

SELECT
c.testID
FROM c 
INNER JOIN b ON c.test IS NOT NULL AND b.r_ID=c.testID 
INNER JOIN a ON a.id=b.r_ID;

I changed the order of the joins and conditions so that the first condition to be evaluated is c.test IS NOT NULL.

Disclaimer: You should use the EXPLAIN command to see how the query is actually executed. I'm fairly sure that even the minor change I just made might make no difference, because the MySQL optimizer works on all queries.

See the MySQL Documentation: Optimizing Queries with EXPLAIN

Three Queries Compared

Have a look at the following fiddle: https://www.db-fiddle.com/f/fXsT8oMzJ1H31FwMHrxR3u/0

I ran three different queries and in the end, MySQL optimized and ran them the same way.

Three Queries:

EXPLAIN SELECT
c.testID
FROM c 
INNER JOIN b ON c.test IS NOT NULL AND b.r_ID=c.testID 
INNER JOIN a ON a.id=b.r_ID;


EXPLAIN SELECT c.testID
FROM a
INNER JOIN b ON a.id = b.r_ID
INNER JOIN c ON b.r_ID = c.testID AND c.test IS NOT NULL;

EXPLAIN SELECT
c.testID
FROM a
INNER JOIN b ON a.id=b.r_ID
INNER JOIN c ON b.r_ID=c.testID
WHERE c.test IS NOT NULL;
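If you want to repeat this kind of comparison locally, the loop below is one way to do it. It is only a sketch: it assumes SQLite (Python's sqlite3 module) instead of MySQL, so it uses SQLite's EXPLAIN QUERY PLAN rather than MySQL's EXPLAIN, and the tables and rows are invented. The plans it prints say nothing about what MySQL would choose; the point is the method of running each variant and looking at the plan next to the result.

```python
import sqlite3

# SQLite stand-in with made-up data; column names follow the three queries above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, r_ID INTEGER);
    CREATE TABLE c (id INTEGER PRIMARY KEY, testID INTEGER, test TEXT);
    INSERT INTO a VALUES (100), (200);
    INSERT INTO b VALUES (1, 100), (2, 200);
    INSERT INTO c VALUES (1, 100, 'x'), (2, 200, NULL);
""")

queries = {
    "c first, filter in ON": """
        SELECT c.testID FROM c
        INNER JOIN b ON c.test IS NOT NULL AND b.r_ID = c.testID
        INNER JOIN a ON a.id = b.r_ID""",
    "a first, filter in ON": """
        SELECT c.testID FROM a
        INNER JOIN b ON a.id = b.r_ID
        INNER JOIN c ON b.r_ID = c.testID AND c.test IS NOT NULL""",
    "a first, filter in WHERE": """
        SELECT c.testID FROM a
        INNER JOIN b ON a.id = b.r_ID
        INNER JOIN c ON b.r_ID = c.testID
        WHERE c.test IS NOT NULL""",
}

results = {}
plans = {}
for label, sql in queries.items():
    # Sort the rows so the comparison does not depend on output order.
    results[label] = sorted(conn.execute(sql).fetchall())
    # The last column of each EXPLAIN QUERY PLAN row is the readable plan step.
    plans[label] = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
    print(label, results[label], plans[label])
```

On MySQL, the equivalent check is to prefix each statement with EXPLAIN, as in the three queries above, and compare the output rows.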

All tables should have a PRIMARY KEY. Assuming that id is the PRIMARY KEY of each table that has it, you need these secondary keys for maximal performance:

c:  INDEX(test, testID, id)  -- `test` must be first
b:  INDEX(r_ID)

Both of those are useful and "covering".
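As a rough sketch, the two indexes could be created with plain CREATE INDEX statements. The example assumes SQLite (Python's sqlite3 module) as a stand-in for MySQL; the index names idx_c_test_testID_id and idx_b_r_ID and the column types are hypothetical. The CREATE INDEX name ON table (columns) form is also accepted by MySQL.

```python
import sqlite3

# SQLite stand-in; column types and index names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (id INTEGER PRIMARY KEY, r_ID INTEGER);
    CREATE TABLE c (id INTEGER PRIMARY KEY, testID INTEGER, test TEXT);

    -- `test` comes first so the index can serve the IS NOT NULL filter.
    CREATE INDEX idx_c_test_testID_id ON c (test, testID, id);

    -- Supports the join on b.r_ID.
    CREATE INDEX idx_b_r_ID ON b (r_ID);
""")

# List the indexes that now exist in the schema.
index_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
print(index_names)
```

Whether the server actually uses these indexes for a given query still has to be confirmed with EXPLAIN on the real data.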

Another thing to note: b and a are virtually unused in the query, so you may as well write the query this way:

SELECT c.testID
    FROM c
    WHERE c.test IS NOT NULL;

At that point, all you need is INDEX(test, testID).

I suspect you "simplified" your query by leaving out some uses of a and b. Well, I simplified it from there, just as the Optimizer should have done. (However, elimination of tables is an optimization that it does not do; it figures that is something the user would have done.)

On the other hand, b and a are not totally useless. The JOINs verify that there are corresponding rows, possibly many such rows, in those tables. Again, I think you had some other purpose.
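To make that difference concrete, here is a small sketch, assuming SQLite (Python's sqlite3 module) as a stand-in for MySQL and invented rows. One row of c has two matching rows in b, and another has none, so the joined query and the single-table query return different results.

```python
import sqlite3

# SQLite stand-in with made-up data chosen to expose the difference.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, r_ID INTEGER);
    CREATE TABLE c (id INTEGER PRIMARY KEY, testID INTEGER, test TEXT);
    INSERT INTO a VALUES (100);
    INSERT INTO b VALUES (1, 100), (2, 100);   -- two b rows point at testID 100
    INSERT INTO c VALUES (1, 100, 'x'),        -- matched twice through b
                         (2, 200, 'y'),        -- no matching row in b
                         (3, 300, NULL);       -- removed by IS NOT NULL
""")

# Query that goes through b and a.
join_rows = conn.execute("""
    SELECT c.testID
    FROM c
    INNER JOIN b ON b.r_ID = c.testID
    INNER JOIN a ON a.id = b.r_ID
    WHERE c.test IS NOT NULL
    ORDER BY c.testID
""").fetchall()

# Query that reads only c.
single_rows = conn.execute("""
    SELECT c.testID
    FROM c
    WHERE c.test IS NOT NULL
    ORDER BY c.testID
""").fetchall()

print(join_rows)    # testID 100 appears twice, 200 is missing
print(single_rows)  # testID 100 and 200, once each
```

So dropping the joins is only safe when neither the existence check nor the row multiplication matters for the result you need.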
