
MySQL query optimization. Avoiding temporary & filesort

Currently I have a table with close to 1 million rows that I need to query. What I need to do is stack rank packages by the number of products they include from a given list of product IDs.

SELECT count(productID) AS commonProducts, packageID
FROM supply
WHERE productID IN (2,3,4,5,6,7,8,9,10) 
GROUP BY packageID 
ORDER BY commonProducts DESC
LIMIT 10

The query works fine, but I would like to improve on it. I tried a multi-column index on productID and packageID, but it seemed to examine more rows than having a separate index on each of the columns.
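For reference, a multi-column index of the kind described above could be declared roughly as follows. This is only a sketch: the index name `idx_product_package` and the column order are assumptions, not the asker's actual definition. Because such an index contains both columns the query reads, MySQL can usually answer it from the index alone, which EXPLAIN reports as `Using index` in the Extra column. `Using temporary; Using filesort` would typically still appear, since the grouping and the sort on the aggregate remain.

```sql
-- Hypothetical index name; the column order (productID first) is an assumption.
ALTER TABLE supply
    ADD INDEX idx_product_package (productID, packageID);

-- Re-check the plan: compare the "key", "rows" and "Extra" columns
-- with the EXPLAIN output from before the index was added.
EXPLAIN
SELECT COUNT(productID) AS commonProducts, packageID
FROM supply
WHERE productID IN (2,3,4,5,6,7,8,9,10)
GROUP BY packageID
ORDER BY commonProducts DESC
LIMIT 10;
```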

MySQL EXPLAIN

select_type: SIMPLE
table: supply
type: range
possible_keys: supplyID
key: supplyID
key_len: 3
ref: null
rows: 996
extra: Using where; Using temporary; Using filesort

My main concern is that the query is using a temporary table and a filesort. How could I go about optimizing this query? I presume that the biggest issues are count() and the ORDER BY on the result of count().

You can remove the temp table using a dependent subquery:

SELECT *
FROM
  (
   -- per package, count the rows whose product is in the given list
   SELECT count(s.productID) AS commonProducts, s.packageID
   FROM supply AS s
   WHERE EXISTS
   (
      -- correlated on productID, so the subquery runs against the outer row
      SELECT 1 FROM supply AS innerS
        WHERE innerS.productID IN (2,3,4,5,6,7,8,9,10)
          AND s.productID = innerS.productID
   )
   GROUP BY s.packageID
  ) AS t
ORDER BY t.commonProducts DESC
LIMIT 10

The inner query links to the outer query and preserves the index. You'll find that any query that sorts on commonProducts, including the one above, will use a filesort, because the result of count(*) is not indexed. But fear not: filesort is just a fancy word for sort. MySQL can choose to use an efficient in-memory sort, and whether you sort now or as a mergesort on the way to an indexed temporary table, you'll have to pay for that sorting somewhere. This case is pretty good, though, because filesort stops sorting once it has found the number of rows given by the LIMIT you've put in place. It will not sort the entire list of commonProducts.
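One way to check how the sort actually ran is to look at the session's sort counters around the query. This is a minimal sketch using the query from the question. It assumes nothing else runs in the same session in between, and the reading of the counters in the comments is a rule of thumb rather than a guarantee.

```sql
-- Note the counter values before running the query.
SHOW SESSION STATUS LIKE 'Sort%';

SELECT COUNT(productID) AS commonProducts, packageID
FROM supply
WHERE productID IN (2,3,4,5,6,7,8,9,10)
GROUP BY packageID
ORDER BY commonProducts DESC
LIMIT 10;

-- Compare with the values noted above.
-- Sort_merge_passes unchanged: the sort fit in the sort buffer,
--   so no merge passes over temporary files were needed.
-- Sort_rows: how many rows the sort produced.
SHOW SESSION STATUS LIKE 'Sort%';
```

If Sort_merge_passes does grow on every run, the sort is spilling out of memory, and the sort_buffer_size setting is the usual thing to look at.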

Update

If this query is going to be run all the time, I would recommend (without getting too fancy) setting triggers on the supply table that update a regular, permanent table tracking counters like this one.
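A rough sketch of that idea follows. Everything named here is hypothetical: the table `package_product_counts`, the trigger names, and the column types are assumptions, and the sketch only works if the product list is fixed, since the list is hard-coded in the triggers. It also leaves out a trigger for UPDATE on supply and the one-time backfill of existing rows.

```sql
-- Hypothetical summary table: one row per package, indexed on the counter
-- so that the top-10 lookup can read the index in order.
CREATE TABLE package_product_counts (
    packageID      INT NOT NULL PRIMARY KEY,
    commonProducts INT NOT NULL DEFAULT 0,
    KEY idx_commonProducts (commonProducts)
);

DELIMITER //

-- Increment the counter when a row for a listed product is added.
CREATE TRIGGER supply_after_insert
AFTER INSERT ON supply
FOR EACH ROW
BEGIN
    IF NEW.productID IN (2,3,4,5,6,7,8,9,10) THEN
        INSERT INTO package_product_counts (packageID, commonProducts)
        VALUES (NEW.packageID, 1)
        ON DUPLICATE KEY UPDATE commonProducts = commonProducts + 1;
    END IF;
END//

-- Decrement the counter when such a row is removed.
CREATE TRIGGER supply_after_delete
AFTER DELETE ON supply
FOR EACH ROW
BEGIN
    IF OLD.productID IN (2,3,4,5,6,7,8,9,10) THEN
        UPDATE package_product_counts
        SET commonProducts = commonProducts - 1
        WHERE packageID = OLD.packageID;
    END IF;
END//

DELIMITER ;

-- The ranking query then reads precomputed counters.
SELECT packageID, commonProducts
FROM package_product_counts
ORDER BY commonProducts DESC
LIMIT 10;
```

The trade-off is extra work on every write to supply in exchange for a cheap read, which fits the "run all the time" case described above.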

Creating a temporary result set:

SELECT TMP.*
FROM (
    SELECT count(productID) AS commonProducts, packageID
    FROM supply
    WHERE productID IN (2,3,4,5,6,7,8,9,10)
    GROUP BY packageID
) AS TMP
ORDER BY commonProducts DESC
LIMIT 10

Perhaps it's not the most elegant way, and I cannot guarantee it will be faster, because everything depends on your particular data. But in some cases this gives much better results:

SELECT count(*) AS commonProducts, packageID
FROM (
    SELECT packageID FROM supply WHERE productID = 2
    UNION ALL
    SELECT packageID FROM supply WHERE productID = 3
    UNION ALL
    -- ... one SELECT per remaining productID, each followed by UNION ALL ...
    SELECT packageID FROM supply WHERE productID = 10
) AS t
GROUP BY packageID
ORDER BY commonProducts DESC
LIMIT 10

