MySQL Query optimization with JOIN and COUNT

I have the following MySQL Query:

SELECT t1.id, t1.releaseid, t1.site, t1.date, t2.pos FROM `tracers` as t1
LEFT JOIN (
    SELECT `releaseid`, `date`, COUNT(*) AS `pos` 
    FROM `tracers` GROUP BY `releaseid`
) AS t2 ON t1.releaseid = t2.releaseid AND t2.date <= t1.date 
ORDER BY `date` DESC , `pos` DESC LIMIT 0 , 100

The idea being to select a release and count how many other sites had also released it prior to the recorded date, to get the position.

Explain says:

id  select_type  table       type   possible_keys  key        key_len  ref   rows    Extra
1   PRIMARY      t1          ALL    NULL           NULL       NULL     NULL  498422  Using temporary; Using filesort
1   PRIMARY      <derived2>  ALL    NULL           NULL       NULL     NULL  91661
2   DERIVED      tracers     index  NULL           releaseid  4        NULL  498422

Any suggestions on how to eliminate the Using temporary; Using filesort? It takes a loooong time. The indexes I have thought of and tried haven't helped anything.

This may not change the EXPLAIN output. However, if sorting is your main bottleneck (you can confirm this by checking whether the query runs faster with the ORDER BY clause removed), try sorting the derived table inside the subquery first. The query becomes:

SELECT t1.id, t1.releaseid, t1.site, t1.date, t2.pos FROM `tracers` as t1
LEFT JOIN (
    SELECT `releaseid`, `date`, COUNT(*) AS `pos` 
    FROM `tracers` GROUP BY `releaseid`
    ORDER BY `pos` DESC -- additional order
) AS t2 ON t1.releaseid = t2.releaseid AND t2.date <= t1.date 
ORDER BY `date` DESC , `pos` DESC LIMIT 0 , 100

Note: I tested this on mysql-5.0.96-x64; other versions may give different results.

Try adding an index on tracers.releaseid and another on tracers.date.

  1. Make sure you have an index on releaseid.
  2. Flip your JOIN: the sub-query must be on the left side of the LEFT JOIN.
  3. Put the ORDER BY and LIMIT clauses inside the sub-query.
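
One possible reading of the three steps above, as a rough sketch rather than the answerer's exact query: select the 100 most recent rows in a derived table first, then join back to tracers to count earlier releases for only those rows. It assumes id is the primary key of tracers. Because the LIMIT is applied before pos is known, rows that tie on date at the cutoff may differ from those the original query would return.

```sql
-- Sketch: limit to the 100 newest rows first, then count only for those.
-- Assumes tracers.id is the primary key.
SELECT t1.id, t1.releaseid, t1.site, t1.`date`,
       COUNT(t2.id) AS pos              -- rows of the same release dated on or before t1
FROM (
    SELECT id, releaseid, site, `date`
    FROM tracers
    ORDER BY `date` DESC
    LIMIT 100                           -- ORDER BY and LIMIT moved inside the sub-query
) AS t1
LEFT JOIN tracers AS t2
    ON  t2.releaseid = t1.releaseid
    AND t2.`date` <= t1.`date`
GROUP BY t1.id, t1.releaseid, t1.site, t1.`date`
ORDER BY t1.`date` DESC, pos DESC;
```

With an index on (releaseid, date), the join should only need to touch the rows belonging to those 100 releases instead of grouping the whole table.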

Try having two indices, one on (date) and one on (releaseid, date) .
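
As a minimal sketch, the two indexes could be created like this. The index names idx_date and idx_releaseid_date are made up for illustration. The EXPLAIN output in the question suggests a single-column index named releaseid already exists; the composite index can serve the same lookups, so the old one may become redundant.

```sql
-- Hypothetical index names; adjust to your own naming scheme.
ALTER TABLE tracers
    ADD INDEX idx_date (`date`),                        -- helps ORDER BY `date` DESC ... LIMIT
    ADD INDEX idx_releaseid_date (releaseid, `date`);   -- helps releaseid = ? AND `date` <= ?
```

Re-run EXPLAIN afterwards to check whether the optimizer actually picks them up.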

Another thing is that your query does not seem to be doing what you describe it does. Does it actually count correctly?

Try rewriting it as:

SELECT t1.id, t1.releaseid, t1.site, t1.`date`
     , COUNT(*) AS pos
FROM tracers AS t1
  JOIN tracers AS t2
    ON  t2.releaseid = t1.releaseid
    AND t2.`date` <= t1.`date` 
GROUP BY t1.id  -- one result row per tracers row; assumes id is the primary key
ORDER BY t1.`date` DESC
       , pos DESC
LIMIT 0 , 100

or as:

SELECT t1.id, t1.releaseid, t1.site, t1.`date`
     , ( SELECT COUNT(*)
         FROM tracers AS t2
         WHERE t2.releaseid = t1.releaseid
           AND t2.`date` <= t1.`date`
       ) AS pos
FROM tracers AS t1
ORDER BY t1.`date` DESC
       , pos DESC
LIMIT 0 , 100
