MySQL joins + sub-query confusion

I have a question about joins that is really confusing me.

I am sure this used to work for me in the past, but I seem to be missing something:

This query returns data from both tables where client_id is 1. It works fine.

/* gets data for one client_id*/

approach 1A

SELECT * 
FROM clients  LEFT JOIN client_modules 
                ON client_modules.client_id = clients.client_id
WHERE clients.client_id = 1;

I would expect this query to return the same result, because the sub-query in the join filters first, i.e. it only fetches the modules for client_id 1. For some reason, though, the final result also contains data for other client_ids.

/* gets data for one client_id sub-query approach*/

approach 2A

SELECT * 
FROM clients LEFT JOIN (SELECT client_id, module_name 
                        FROM client_modules 
                        WHERE client_modules.client_id = 1) 
             AS client_moduless ON client_moduless.client_id = clients.client_id;



/* gets data for all client_ids */

approach 1B

SELECT * FROM clients
LEFT JOIN client_modules ON client_modules.client_id = clients.client_id;



/* gets data for all client_ids*/

approach 2B

SELECT * 
FROM clients LEFT JOIN (SELECT client_id, module_name 
                        FROM client_modules) AS client_moduless
              ON client_moduless.client_id = clients.client_id;

Questions:

1) Within each pair (1A vs 2A, 1B vs 2B), which approach is more efficient with a large amount of data?

2) Why does approach 2A return rows for client_ids other than 1, although the sub-query in the join returns the right rows when run on its own?

3) Will the sub-query in 2B execute for every record from the parent table, given that it has no WHERE clause?

4) What if I change the 1A query to

SELECT * FROM clients
JOIN client_modules ON client_modules.client_id = clients.client_id AND client_modules.client_id = 1

I have just removed the WHERE clause on the clients table and put the condition in the join clause on the child table. Is this more efficient than the WHERE clause?

Regards

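2A pulls in extra clients because you've used a left join, i.e. all records from the left side, but only related records from the right side (NULL where there is no match). The filter inside the sub-query only restricts the right side. You should use a plain (inner) 'join', not 'left join'. A 'full join' would not help here, and MySQL does not support it anyway.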
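To make the difference concrete, here is a minimal sketch that runs approaches 1A and 2A side by side. It assumes made-up sample rows (three clients, one of them without modules) and uses SQLite through Python's sqlite3 module as a stand-in for MySQL; the LEFT JOIN semantics shown are the same, but the data and the name column are purely illustrative.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (client_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE client_modules (client_id INTEGER, module_name TEXT);
    INSERT INTO clients VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO client_modules VALUES (1, 'billing'), (1, 'reports'), (2, 'billing');
""")

# Approach 1A: the WHERE clause filters the left-hand (clients) table.
rows_1a = conn.execute("""
    SELECT clients.client_id, client_modules.module_name
    FROM clients
    LEFT JOIN client_modules ON client_modules.client_id = clients.client_id
    WHERE clients.client_id = 1
    ORDER BY clients.client_id, client_modules.module_name
""").fetchall()

# Approach 2A: only the right-hand side is filtered, so every client survives.
rows_2a = conn.execute("""
    SELECT clients.client_id, m.module_name
    FROM clients
    LEFT JOIN (SELECT client_id, module_name
               FROM client_modules
               WHERE client_id = 1) AS m
           ON m.client_id = clients.client_id
    ORDER BY clients.client_id, m.module_name
""").fetchall()

print(rows_1a)
print(rows_2a)
```

With this data, 1A returns only the two module rows for client 1, while 2A additionally returns clients 2 and 3 with a NULL module_name.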

Not sure about the other questions, but my preference would be for 1B over sub-selects

Edit to add - to understand what's happening with a left join, consider one of its many uses - which clients don't have any records in client_modules?

It's tempting to write:

SELECT * FROM clients WHERE client_id NOT IN (SELECT DISTINCT client_id FROM client_modules)

However, the following is probably more efficient:

SELECT * FROM clients
     LEFT JOIN  client_modules ON clients.client_id = client_modules.client_id
WHERE client_modules.client_id IS NULL

(i.e. only show the records from clients that can't be joined to a client_modules row)
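As a rough check that the two forms answer the same question, the sketch below runs both against made-up sample rows. SQLite (via Python's sqlite3) is assumed here as a stand-in for MySQL, the column is spelled client_id as in the question's schema, and nothing in it says anything about which form is faster on a real MySQL server.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (client_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE client_modules (client_id INTEGER, module_name TEXT);
    INSERT INTO clients VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO client_modules VALUES (1, 'billing'), (1, 'reports'), (2, 'billing');
""")

# Sub-select form: clients whose id never appears in client_modules.
not_in_rows = conn.execute("""
    SELECT client_id FROM clients
    WHERE client_id NOT IN (SELECT client_id FROM client_modules)
    ORDER BY client_id
""").fetchall()

# Anti-join form: keep only the client rows that found no partner.
anti_join_rows = conn.execute("""
    SELECT clients.client_id
    FROM clients
    LEFT JOIN client_modules ON clients.client_id = client_modules.client_id
    WHERE client_modules.client_id IS NULL
    ORDER BY clients.client_id
""").fetchall()

print(not_in_rows, anti_join_rows)
```

Both queries return only client 3, the one client without any module rows in this sample.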

None of them. In my opinion you should not be using a LEFT JOIN here. You should use a JOIN, like this:

SELECT * FROM clients
JOIN client_modules ON client_modules.client_id = clients.client_id
WHERE clients.client_id = 1;

That is more straightforward. Because the WHERE clause already limits the result to client 1, it has the same effect as the LEFT JOIN, as long as that client has at least one row in client_modules.

SELECT * 
FROM clients LEFT JOIN (SELECT client_id, module_name 
                        FROM client_modules 
                        WHERE client_modules.client_id = 1) 
             AS client_moduless ON client_moduless.client_id = clients.client_id;

With this query you get every row of the clients table, and for the rows where client_moduless.client_id = clients.client_id matches you also get the module columns. This is not a limiting JOIN; it is a LEFT JOIN, which means that when nothing matches, the module columns come back as NULL. The sub-query is not correlated with the outer query, so MySQL does not need to re-run it for every client row. To restrict the result to client 1 you can write it like this:

SELECT * 
FROM clients JOIN (SELECT client_id, module_name 
                        FROM client_modules 
                        WHERE client_modules.client_id = 1) 
             AS client_moduless ON client_moduless.client_id = clients.client_id;

Now this limits the clients table as well, and you only get the rows that have a match in client_modules. But I cannot see the point of doing that; I would go with the straightforward join instead.
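A small sketch of that comparison, under the assumption of made-up sample rows and with SQLite (Python's sqlite3) standing in for MySQL: the inner join against the filtered sub-query and the straightforward join with a WHERE clause return the same rows.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (client_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE client_modules (client_id INTEGER, module_name TEXT);
    INSERT INTO clients VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO client_modules VALUES (1, 'billing'), (1, 'reports'), (2, 'billing');
""")

# Inner join against a sub-query that is already filtered to client 1.
derived_rows = conn.execute("""
    SELECT clients.client_id, m.module_name
    FROM clients
    JOIN (SELECT client_id, module_name
          FROM client_modules
          WHERE client_id = 1) AS m
      ON m.client_id = clients.client_id
    ORDER BY m.module_name
""").fetchall()

# Straightforward inner join with the filter in the WHERE clause.
plain_rows = conn.execute("""
    SELECT clients.client_id, client_modules.module_name
    FROM clients
    JOIN client_modules ON client_modules.client_id = clients.client_id
    WHERE clients.client_id = 1
    ORDER BY client_modules.module_name
""").fetchall()

print(derived_rows == plain_rows, derived_rows)
```

Since the results match, the simpler query is usually the easier one to read and maintain.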

It also depends on what you are interested in. If you only need the columns from the clients table, you can do this:

SELECT * 
FROM clients
WHERE EXISTS
(
    SELECT 
        NULL
    FROM
        client_modules
    WHERE
        client_modules.client_id = 1
        AND client_modules.client_id = clients.client_id
)

So if you want all the columns from both the clients table and client_modules, go with the join. Otherwise go with EXISTS.
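The practical difference can be sketched as follows, again assuming made-up sample rows and SQLite (Python's sqlite3) in place of MySQL: EXISTS returns each matching client once, while the join repeats the client row for every module it has.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (client_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE client_modules (client_id INTEGER, module_name TEXT);
    INSERT INTO clients VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO client_modules VALUES (1, 'billing'), (1, 'reports'), (2, 'billing');
""")

# EXISTS: only asks whether at least one module row exists for the client.
exists_rows = conn.execute("""
    SELECT clients.client_id, clients.name
    FROM clients
    WHERE EXISTS (
        SELECT NULL
        FROM client_modules
        WHERE client_modules.client_id = 1
          AND client_modules.client_id = clients.client_id
    )
""").fetchall()

# JOIN: produces one output row per matching module row.
join_rows = conn.execute("""
    SELECT clients.client_id, clients.name
    FROM clients
    JOIN client_modules ON client_modules.client_id = clients.client_id
    WHERE clients.client_id = 1
""").fetchall()

print(exists_rows)
print(join_rows)
```

With two module rows for client 1, the EXISTS query yields a single row and the join yields two copies of it.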

EDIT

I think this:

SELECT * FROM clients
JOIN client_modules ON client_modules.client_id = clients.client_id 
AND client_modules.client_id = 1

And this:

SELECT * FROM clients
JOIN client_modules ON client_modules.client_id = clients.client_id
WHERE client_modules.client_id = 1

are the same. They will almost certainly result in the same query plan.
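The result equivalence (not the query plan) can be checked with a short sketch. It assumes made-up sample rows and uses SQLite via Python's sqlite3 as a stand-in; on a real MySQL server, running EXPLAIN on both statements would be the way to compare the plans.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (client_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE client_modules (client_id INTEGER, module_name TEXT);
    INSERT INTO clients VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO client_modules VALUES (1, 'billing'), (1, 'reports'), (2, 'billing');
""")

# Filter placed in the ON clause of an inner join.
on_rows = conn.execute("""
    SELECT clients.client_id, client_modules.module_name
    FROM clients
    JOIN client_modules ON client_modules.client_id = clients.client_id
                       AND client_modules.client_id = 1
    ORDER BY client_modules.module_name
""").fetchall()

# Same filter placed in the WHERE clause.
where_rows = conn.execute("""
    SELECT clients.client_id, client_modules.module_name
    FROM clients
    JOIN client_modules ON client_modules.client_id = clients.client_id
    WHERE client_modules.client_id = 1
    ORDER BY client_modules.module_name
""").fetchall()

print(on_rows == where_rows, on_rows)
```

This equivalence holds for an inner JOIN; with a LEFT JOIN, moving a right-side condition between ON and WHERE changes which rows come back.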
