Make more efficient SELECT query with LEFT join and conditions on the Join

This question follows on from one I asked earlier today. The answer I got solved my problem: when I limit the rows returned, I can see that the query does what I want.

But when I try to run the whole query with no limit, so that I can export the result to Excel for analysis, I can't get anywhere. I get kicked off: MySQL Workbench asks me for my password again and the query stops running. I'm not sure whether that is a separate issue and a distraction from my real question, which is "How can I get this query to run faster, if at all?" Currently it runs for at least 5 minutes before kicking me off.

When I EXPLAIN the query here is what is provided:

id  select_type  table  type    possible_keys  key      key_len  ref              rows    Extra
1   SIMPLE       co     ALL                                                       185610  Using temporary; Using filesort
1   SIMPLE       my     ref     PRIMARY        PRIMARY  4        bm_emails.co.id  23
1   SIMPLE       nvk    eq_ref  PRIMARY        PRIMARY  4        bm_emails.co.id  1

Presumably the "Using temporary" step is the problem, but I'm unsure how to avoid it while keeping my query intact. The actual query is here:

SELECT 
    co.email,
    nvk.nvk_medium,
    CAST(MIN(co.created) AS DATE) AS first_contact,
    MIN(CASE WHEN my.my_id = 581 THEN my.data END) AS WA_Created,
    MIN(CASE WHEN my.my_id = 3347 THEN my.data END) AS WA_Upgraded
FROM bm_emails.cid208 co
LEFT JOIN bm_emails.my208 my ON co.id = my.eid AND (my_id = 581 OR my_id = 3347)
LEFT JOIN bm_emails.nvk208 nvk ON nvk.eid = co.id
GROUP BY email

UNION ALL is often a faster choice than using an OR in a join condition. Check the results against your data; I think an inner join might make more sense with the UNION, but I would have to see the data. I would also want to know whether you want to see records from bm_emails.cid208 that do not join to any record for my_id 581 or my_id 3347.
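One way to answer that last question before choosing between LEFT JOIN and INNER JOIN is to count the contacts that have no matching row for either id. This is only a sketch, and it assumes the column names used in the question's query (eid, my_id) are right:

```sql
-- Count contacts in cid208 that have no my208 row for my_id 581 or 3347.
-- A result of 0 would mean the LEFT JOIN to my208 never produces an
-- unmatched row, so an INNER JOIN on the same condition keeps every contact.
SELECT COUNT(*) AS contacts_without_match
FROM bm_emails.cid208 co
LEFT JOIN bm_emails.my208 my
       ON my.eid = co.id
      AND my.my_id IN (581, 3347)
WHERE my.eid IS NULL;
```

If the count is large and those contacts still need to appear in the export, the LEFT JOIN has to stay.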

Try this:

SELECT email,
       nvk_medium,
       CAST(MIN(created) AS DATE) AS first_contact,
       MIN(WA_Created) AS WA_Created,
       MIN(WA_Upgraded) AS WA_Upgraded
FROM 
(
    -- Branch 1: rows for my_id = 581 only
    SELECT 
        co.email,
        nvk.nvk_medium,
        co.created,
        my.data AS WA_Created,
        NULL AS WA_Upgraded
    FROM bm_emails.cid208 co
    LEFT JOIN bm_emails.my208 my ON co.id = my.eid AND my.my_id = 581 
    LEFT JOIN bm_emails.nvk208 nvk ON nvk.eid = co.id
    UNION ALL
    -- Branch 2: rows for my_id = 3347 only
    SELECT 
        co.email,
        nvk.nvk_medium,
        co.created,
        NULL AS WA_Created,
        my.data AS WA_Upgraded
    FROM bm_emails.cid208 co
    LEFT JOIN bm_emails.my208 my ON co.id = my.eid AND my.my_id = 3347
    LEFT JOIN bm_emails.nvk208 nvk ON nvk.eid = co.id
) a
-- Aggregate the two branches back into one row per email and medium
GROUP BY email, nvk_medium

I would also consider whether CAST(MIN(created) AS DATE) should be MIN(CAST(created AS DATE)), depending on the datatype of the created field. If it is some kind of string field, then 10/20/2014 compares as less than 2/24/2013 and would be selected as the minimum. If it is stored in a datetime field and you are simply cutting the time off, the current form is fine.
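The string-ordering pitfall can be reproduced without touching the real tables. The sketch below uses two literal values and assumes a month/day/year text format; the real format of created, if it is text at all, is not known from the question:

```sql
-- Compared as text, '1' sorts before '2', so the 2014 value wins.
SELECT MIN(d) AS min_as_text
FROM (SELECT '10/20/2014' AS d UNION ALL SELECT '2/24/2013') t;

-- Parsed into real DATE values first, the 2013 value is the minimum.
SELECT MIN(STR_TO_DATE(d, '%m/%d/%Y')) AS min_as_date
FROM (SELECT '10/20/2014' AS d UNION ALL SELECT '2/24/2013') t;
```

If created is already a DATETIME column, neither conversion is needed and the order of CAST and MIN does not change the result.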

If I assume that bm_emails.cid208 contains one row per email, then this might go faster:

select co.email,
       (select nvk.nvk_medium from bm_emails.nvk208 nvk where nvk.eid = co.id limit 1) as nvk_medium,
       co.created,
       (select min(my.data) from bm_emails.my208 my where co.id = my.eid and my.my_id = 581) as WA_Created,
       (select min(my.data) from bm_emails.my208 my where co.id = my.eid and my.my_id = 3347) as WA_Upgraded
from bm_emails.cid208 co;

This can take advantage of the following indexes:

bm_emails.nvk208(eid, nvk_medium)
bm_emails.my208(eid, my_id, data)
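As a rough sketch, those two indexes could be created as shown below. The index names are made up for illustration, and the second statement assumes the id column is called my_id, as in the question's query. If data is a TEXT or BLOB column, MySQL needs a prefix length on it, for example data(50).

```sql
-- Hypothetical index names; column lists follow the two lines above.
CREATE INDEX idx_nvk208_eid_medium
    ON bm_emails.nvk208 (eid, nvk_medium);

CREATE INDEX idx_my208_eid_myid_data
    ON bm_emails.my208 (eid, my_id, data);
```

The EXPLAIN output in the question shows both tables already being read through their PRIMARY key on co.id, so it is worth checking the existing keys first. These indexes mainly help if they let the subqueries be answered from the index alone.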

Of course, this rests on that first assumption, that the first table has one row per email.
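That assumption is cheap to test. The following sketch lists a few emails that appear more than once in cid208; an empty result would support the one-row-per-email assumption:

```sql
-- Show up to 10 emails that occur on more than one cid208 row.
SELECT email, COUNT(*) AS row_count
FROM bm_emails.cid208
GROUP BY email
HAVING COUNT(*) > 1
LIMIT 10;
```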

EDIT:

Even with multiple rows per email, I would still try this:

select cn.email, cn.nvk_medium, cn.created,
       -- co.id is not visible outside the derived table, so correlate on email
       (select min(my.data)
        from bm_emails.cid208 co2 join bm_emails.my208 my on my.eid = co2.id
        where co2.email = cn.email and my.my_id = 581) as WA_Created,
       (select min(my.data)
        from bm_emails.cid208 co2 join bm_emails.my208 my on my.eid = co2.id
        where co2.email = cn.email and my.my_id = 3347) as WA_Upgraded
from (select co.email, nvk.nvk_medium, min(co.created) as created
      from bm_emails.cid208 co left join
           bm_emails.nvk208 nvk 
           on nvk.eid = co.id 
      group by co.email, nvk.nvk_medium
     ) cn;
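Before running any of these rewrites without a LIMIT, it may be worth comparing their plans with the one in the question. As a sketch, EXPLAIN can be run on the inner aggregate alone, to see whether the full scan of cid208 with "Using temporary; Using filesort" is still there:

```sql
-- Inspect the plan for the derived table only; nothing is executed in full.
EXPLAIN
SELECT co.email, nvk.nvk_medium, MIN(co.created) AS created
FROM bm_emails.cid208 co
LEFT JOIN bm_emails.nvk208 nvk ON nvk.eid = co.id
GROUP BY co.email, nvk.nvk_medium;
```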
