
MySQL Sub-Select Query Optimization

I'm running a query daily to compile stats, but it seems really inefficient. This is the query:

SELECT a.id, tstamp, label_id, (SELECT author_id FROM b WHERE b.tid = a.id ORDER BY b.tstamp DESC LIMIT 1) AS author_id
FROM a, b
WHERE (status = '2' OR status = '3') 
AND category != 6
AND a.id = b.tid
AND (b.type = 'C' OR b.type = 'R')
AND a.tstamp1 BETWEEN {$timestamp_start} AND {$timestamp_end}
ORDER BY b.tstamp DESC
LIMIT 500

This query runs really slowly. Apologies for the crap naming - I've been asked not to reveal the actual table names.

The reason for the sub-select is that the outer select gets one row from table a and one row from table b. But I also need to know the latest author_id from table b, so I run a sub-select to return it. I don't want to run another select inside a PHP loop, as that is also inefficient.

It works correctly - I just need to find a much faster way of getting this data set.

Try:

  SELECT a.id,
         b.tstamp,
         label_id,
         y.author_id
    FROM TABLE_A a
    JOIN TABLE_B b ON b.tid = a.id
    JOIN (SELECT b.tid,
                 MAX(b.tstamp) AS m_tstamp
            FROM TABLE_B b
        GROUP BY b.tid) x ON x.tid = a.id
    -- no GROUP BY here: y must keep every row so the newest one can match
    JOIN (SELECT b.tid,
                 b.author_id,
                 b.tstamp
            FROM TABLE_B b) y ON y.tid = a.id
                         AND y.tstamp = x.m_tstamp
   WHERE status IN ('2', '3')
     AND b.type IN ('C', 'R')
     AND category != 6
     AND a.tstamp1 BETWEEN {$timestamp_start} AND {$timestamp_end}
ORDER BY b.tstamp DESC 
   LIMIT 500
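The core of this answer is joining each tid to its own maximum tstamp. The following is a minimal sketch of that pattern on toy data, not the asker's real schema: it uses Python's built-in sqlite3 module instead of MySQL, and the table contents and the helper name latest_authors are made up for illustration.

```python
import sqlite3


def latest_authors(conn):
    """Return (tid, author_id) for the newest row of b within each tid."""
    sql = """
        SELECT y.tid, y.author_id
          FROM b AS y
          JOIN (SELECT tid, MAX(tstamp) AS m_tstamp
                  FROM b
              GROUP BY tid) AS x
            ON x.tid = y.tid
           AND x.m_tstamp = y.tstamp
      ORDER BY y.tid
    """
    return conn.execute(sql).fetchall()


# Hypothetical toy data: two threads (tid 1 and 2), two rows each.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE b (tid INTEGER, author_id INTEGER, tstamp INTEGER, type TEXT)"
)
conn.executemany(
    "INSERT INTO b VALUES (?, ?, ?, ?)",
    [(1, 10, 100, "C"), (1, 11, 200, "R"), (2, 12, 150, "C"), (2, 13, 120, "X")],
)

rows = latest_authors(conn)
print(rows)  # [(1, 11), (2, 12)]
```

The derived table x is computed once for the whole of b rather than once per outer row, which is the intended difference from the correlated sub-select in the question. Whether that is actually faster on the real tables depends on their indexes and sizes, which the thread does not show.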

If b.tstamp is unique within b.tid, take OMG Ponies' solution.
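To see why uniqueness matters, here is a small sketch (toy data, SQLite via Python's sqlite3 rather than MySQL) in which two rows of the same tid share the maximum tstamp. A join against MAX(tstamp) then matches both rows, so that tid appears twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b (tid INTEGER, author_id INTEGER, tstamp INTEGER)")
conn.executemany(
    "INSERT INTO b VALUES (?, ?, ?)",
    # Hypothetical rows: authors 11 and 12 both posted at tstamp 200.
    [(1, 10, 100), (1, 11, 200), (1, 12, 200)],
)

tied = conn.execute("""
    SELECT y.tid, y.author_id
      FROM b AS y
      JOIN (SELECT tid, MAX(tstamp) AS m_tstamp
              FROM b
          GROUP BY tid) AS x
        ON x.tid = y.tid
       AND x.m_tstamp = y.tstamp
  ORDER BY y.author_id
""").fetchall()

print(tied)  # [(1, 11), (1, 12)]
```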

Otherwise you could try this solution. It sorts the whole result by b.author_id, b.tstamp DESC and adds a ranking per author_id. The outer select takes only the row with rank = 1, which is the one with the greatest tstamp per author_id.

SELECT id, tstamp, label_id, author_id
  FROM (SELECT id,
               tstamp,
               label_id,
               author_id,
               CASE
                 WHEN @author_id != author_id THEN @row_num := 1 
                 ELSE @row_num := @row_num + 1
               END AS rank,
               @author_id := b.author_id
          FROM a,
               b,
               (SELECT @row_num := 0, @author_id := NULL) y
          WHERE a.id = b.tid
          AND (status = '2' OR status = '3') 
          AND category != 6
          AND (b.type = 'C' OR b.type = 'R')
          AND a.tstamp1 BETWEEN {$timestamp_start} AND {$timestamp_end}
          ORDER BY b.author_id, b.tstamp DESC
  ) x
 WHERE x.rank = 1
LIMIT 500

I have not tried it, so please comment if it does not work.
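The same "rank the rows, keep rank 1" idea can also be written with a window function instead of user variables. This is a swapped-in technique, not what the answer above uses: it assumes MySQL 8.0+ or, as in this sketch, SQLite 3.25+ through Python's sqlite3. The toy rows and the alias row_rank are made up; the alias avoids the name rank, which is a reserved word in MySQL 8.0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b (tid INTEGER, author_id INTEGER, tstamp INTEGER)")
conn.executemany(
    "INSERT INTO b VALUES (?, ?, ?)",
    # Hypothetical rows: author 11 has two rows with the same tstamp.
    [(1, 10, 100), (2, 10, 300), (1, 11, 200), (3, 11, 200)],
)

# Number the rows of each author from newest to oldest, then keep number 1.
# The extra "tid DESC" sort key is an assumed tie-breaker for equal tstamps.
newest_per_author = conn.execute("""
    SELECT tid, author_id, tstamp
      FROM (SELECT tid, author_id, tstamp,
                   ROW_NUMBER() OVER (PARTITION BY author_id
                                      ORDER BY tstamp DESC, tid DESC) AS row_rank
              FROM b) AS ranked
     WHERE row_rank = 1
  ORDER BY author_id
""").fetchall()

print(newest_per_author)  # [(2, 10, 300), (3, 11, 200)]
```

Unlike the join against MAX(tstamp), ROW_NUMBER() returns exactly one row per partition even when timestamps tie. Partitioning by tid instead of author_id would give the newest row per tid, which may be closer to what the question asks for.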


 