How can I optimize my SQL query? It takes 2 minutes 25 seconds to run

Question

I have a table with 3 GB of data (and it will keep growing). I need to display the total sales, the top category and the top product (the value that occurs most often in the column). The following query gives me that result:

select t.category, 
       sum(t.sale) sales,
        (select product 
        from demo 
        where  category = t.category
        group by product
        order by count(*) desc
        limit 1) top_product
from demo t
group by t.category

The query above takes approximately 2 minutes and 25 seconds. I couldn't find any way to optimize it. Can anyone recommend another approach?

Example table:

category  product    sale 
C1         P1        10
C2         P2        12
C3         P1        14
C1         P2        15
C1         P1        02
C2         P2        10
C2         P3        22
C3         P1        01
C3         P2        27
C3         P3        02

Output:

category  Top product  Total sales
C1        P1           27
C2        P2           44
C3        P1           44
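To try the queries in this thread against the sample rows, the example table can be recreated with a short setup script. This is only a sketch: the question does not show the real schema, so the column types and the absence of keys or indexes are assumptions.

```sql
-- Column types are assumptions; the question does not show the real schema.
CREATE TABLE demo (
  category VARCHAR(10) NOT NULL,
  product  VARCHAR(10) NOT NULL,
  sale     INT NOT NULL
);

-- The ten sample rows from the example table.
INSERT INTO demo (category, product, sale) VALUES
  ('C1', 'P1', 10),
  ('C2', 'P2', 12),
  ('C3', 'P1', 14),
  ('C1', 'P2', 15),
  ('C1', 'P1', 2),
  ('C2', 'P2', 10),
  ('C2', 'P3', 22),
  ('C3', 'P1', 1),
  ('C3', 'P2', 27),
  ('C3', 'P3', 2);
```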

Answer 1

Your query could be written like this:

SELECT g1.category, g1.sum_sale, g2.product
FROM (
    SELECT category, SUM(sale) AS sum_sale
    FROM demo
    GROUP BY category
) AS g1
INNER JOIN (
    SELECT category, product, COUNT(*) AS product_count
    FROM demo
    GROUP BY category, product
) AS g2 ON g1.category = g2.category
INNER JOIN (
    SELECT category, MAX(product_count) AS product_count_max
    FROM (
        SELECT category, product, COUNT(*) AS product_count
        FROM demo
        GROUP BY category, product
    ) AS x
    GROUP BY category
) AS g3 ON g2.category = g3.category AND g2.product_count = g3.product_count_max

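Basically, it finds the maximum COUNT(*) per category and then joins back to pick the product that has that count. Note that if two products tie for the maximum count in a category, this query returns one row for each of them. It could benefit from appropriate indexes, most likely a composite index that starts with category and product, since every subquery groups on those columns.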
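As one possible reading of "appropriate indexes", a composite index over the three columns the queries touch might help, because the grouping by category and product could then be answered from the index alone. The index name below is hypothetical, and whether the optimizer actually uses the index depends on the data and the MySQL version, so it is worth confirming with EXPLAIN.

```sql
-- Hypothetical index name; column order follows GROUP BY category, product.
-- Including sale lets SUM(sale) be read from the index as well.
CREATE INDEX idx_demo_category_product_sale
  ON demo (category, product, sale);

-- Check whether the grouped subquery now reads from the index
-- (look for "Using index" in the Extra column of the plan).
EXPLAIN
SELECT category, product, COUNT(*) AS product_count
FROM demo
GROUP BY category, product;
```

On a 3 GB table, building the index takes time and disk space of its own, so it is best tried on a copy first.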

Answer 2

A MySQL-only hack is to use GROUP_CONCAT in combination with nested SUBSTRING_INDEX calls to get the first element of an ordered, comma-separated string.

It is not an ideal approach, but it reduces the number of subqueries required and may be efficient for your particular case.

You will also need to run SET SESSION group_concat_max_len = @@max_allowed_packet; so that long concatenated strings are not cut off at the default limit of 1024 bytes.

We first compute the sales total and the number of occurrences for each category and product combination. This result set is then used as a derived table, and the GROUP_CONCAT() hack picks the product with the highest count in each category.
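To see how the string trick works in isolation, SUBSTRING_INDEX can be run on a literal. The string 'P1,P2,P3' is only an illustration of what GROUP_CONCAT might produce for one category; the relative order of products with equal counts is not guaranteed.

```sql
-- A positive count keeps everything to the left of the n-th delimiter.
SELECT SUBSTRING_INDEX('P1,P2,P3', ',', 1);   -- 'P1'

-- A negative count keeps everything to the right of the n-th delimiter
-- counted from the end.
SELECT SUBSTRING_INDEX('P1,P2,P3', ',', -1);  -- 'P3'

-- Nesting the two picks the n-th element; here n = 2.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('P1,P2,P3', ',', 2), ',', -1);  -- 'P2'
```

With n = 1 the inner call already returns the first element, so the outer call changes nothing; the nested form matters only when a later element is wanted. The trick also assumes that product names never contain a comma.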

SET SESSION group_concat_max_len = @@max_allowed_packet;

SELECT 
  dt.category, 
  SUM(dt.sale_per_category_product) AS total_sales, 
  SUBSTRING_INDEX(
    SUBSTRING_INDEX(
      GROUP_CONCAT(dt.product ORDER BY dt.product_count_per_category DESC),
      ',',
      1
    ),
    ',',
    -1
  ) AS top_product 
FROM 
(
  SELECT 
    category, 
    product, 
    SUM(sale) AS sale_per_category_product, 
    COUNT(*) AS product_count_per_category 
  FROM demo 
  GROUP BY category, product 
) AS dt 
GROUP BY dt.category

Schema (MySQL v5.7)

| category | total_sales | top_product |
| -------- | ----------- | ------------|
| C1       | 27          | P1          |
| C2       | 44          | P2          |
| C3       | 44          | P1          |

View on DB Fiddle
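Neither answer uses window functions, which MySQL 5.7 does not have. As a hedged alternative to the string hack, and assuming MySQL 8.0 or later, the same derived table could be ranked with ROW_NUMBER() instead. The alias names ranked and rn are illustrative, and the extra ORDER BY on product is an assumption added only to make ties deterministic.

```sql
-- Assumes MySQL 8.0+; window functions are not available in 5.7.
SELECT
  category,
  SUM(sale_per_category_product) AS total_sales,
  -- Only the row ranked first in its category contributes a product name.
  MAX(CASE WHEN rn = 1 THEN product END) AS top_product
FROM (
  SELECT
    dt.*,
    ROW_NUMBER() OVER (
      PARTITION BY category
      ORDER BY product_count_per_category DESC, product
    ) AS rn
  FROM (
    -- Same derived table as in the GROUP_CONCAT version.
    SELECT
      category,
      product,
      SUM(sale) AS sale_per_category_product,
      COUNT(*) AS product_count_per_category
    FROM demo
    GROUP BY category, product
  ) AS dt
) AS ranked
GROUP BY category;
```

This form does not depend on group_concat_max_len or on product names being free of commas. Whether it runs faster on the real 3 GB table is untested here.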


Answer 1: 1, accepted, 2018-10-29 09:47:13

Answer 2: 1, 2018-10-29 09:49:37
