MySQL count(DISTINCT) very slow - better with subqueries?
I have a table with about 10 million rows and 15 columns per row. There are indexes on column_1, column_2, column_3 and my_time.
SELECT Date(my_time) my_time,
count(DISTINCT column_1) c_c1,
count(DISTINCT column_2) c_c2
FROM `table_name`
WHERE `column_3` in (10,11,100,50,213,756)
AND Date(my_time) > '2016-09-01'
AND Date(my_time) < '2016-09-30'
GROUP BY Date(my_time)
ORDER BY Date(my_time) ASC
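The query takes about 20-30 seconds to return.
Does anyone know how to improve this query with subqueries? If subqueries are the way to go, could you show me an example query that performs better?
Thanks!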
You can speed this up with an appropriate index:
create index idx_speedy on table_name(column_3, my_time);
Or, even better, a covering index:
create index idx_speedy on table_name(column_3, my_time, column_1, column_2);
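Both statements above use the name idx_speedy, so they are alternatives: MySQL rejects a second index with the same name on the same table. If you already created the two-column version and want to move to the covering one, the switch could look roughly like this. It is only a sketch; I do not know which indexes already exist on your table, so list them first.

```sql
-- List the indexes that currently exist on the table.
SHOW INDEX FROM table_name;

-- Replace the two-column index with the covering one.
-- On a table of this size, building the index will take a while.
DROP INDEX idx_speedy ON table_name;
CREATE INDEX idx_speedy ON table_name (column_3, my_time, column_1, column_2);
```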
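To make better use of the index, avoid applying functions to columns in the WHERE clause; here that means avoiding Date(my_time) and comparing my_time directly against a range.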
SELECT Date(my_time) my_time,
COUNT(DISTINCT column_1) AS c_c1,
COUNT(DISTINCT column_2) AS c_c2
FROM table_name
WHERE column_3 in (10, 11, 100, 50, 213, 756)
AND my_time >= '2016-09-02'
AND my_time < '2016-09-30'
GROUP BY Date(my_time)
ORDER BY Date(my_time) ASC;
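To check whether the rewritten predicate actually uses the index, you can prefix the query with EXPLAIN. This is a sketch rather than a guaranteed outcome: the plan depends on your data, your MySQL version, and which of the indexes above you created.

```sql
-- Show the execution plan instead of running the query.
-- In the output, "key" names the index MySQL chose, and
-- "Using index" in the "Extra" column means the query is answered
-- from the index alone (the covering-index case).
EXPLAIN
SELECT Date(my_time) my_time,
       COUNT(DISTINCT column_1) AS c_c1,
       COUNT(DISTINCT column_2) AS c_c2
FROM table_name
WHERE column_3 IN (10, 11, 100, 50, 213, 756)
  AND my_time >= '2016-09-02'
  AND my_time <  '2016-09-30'
GROUP BY Date(my_time)
ORDER BY Date(my_time) ASC;
```

Running the same EXPLAIN on the original query, with Date(my_time) in the WHERE clause, gives you a baseline to compare against.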
If MySQL supported function-based indexes, we could keep Date(my_time) in the WHERE clause and create an index like this for your query:
create index idx_speedy on table_name(column_3, Date(my_time), column_1, column_2);
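Since MySQL does not support this (at least not before 8.0.13, which added functional key parts), you can create a generated column instead: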
alter table table_name add my_date date generated always as ( Date(my_time) );
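The statement above does not say VIRTUAL or STORED, so MySQL creates a VIRTUAL column, which is computed when read and takes no space in the row. If you would rather spell the choice out, a variant could look like this. Picking STORED here is purely my assumption for illustration; nothing in the query requires it.

```sql
-- Alternative to the ALTER TABLE above: run one or the other, not both.
-- STORED materializes the date in each row, which costs disk space and
-- rebuilds the table; write VIRTUAL instead to make the default explicit.
ALTER TABLE table_name
    ADD my_date DATE GENERATED ALWAYS AS (DATE(my_time)) STORED;

-- Confirm how the column ended up being defined.
SHOW CREATE TABLE table_name;
```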
Then create the index on it:
create index idx_speedy on table_name(column_3, my_date, column_1, column_2);
And rewrite the query to use the new column:
SELECT my_date,
COUNT(DISTINCT column_1) AS c_c1,
COUNT(DISTINCT column_2) AS c_c2
FROM table_name
WHERE column_3 in (10, 11, 100, 50, 213, 756)
AND my_date BETWEEN '2016-09-02' AND '2016-09-29'
GROUP BY my_date
ORDER BY my_date ASC;
If I remember correctly, generated columns are available from MySQL 5.7.6 onward.
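As for the subquery idea from the question, here is one possible shape: compute each distinct count in its own derived table and join the two results on the day. This is a sketch I have not benchmarked against your data, so it may or may not beat the single query with a covering index; compare both with EXPLAIN and real timings. The aliases my_day, s1, s2, d1 and d2 are names I made up for the example.

```sql
SELECT d1.my_day, d1.c_c1, d2.c_c2
FROM (
    -- Distinct (day, column_1) pairs, then count them per day.
    -- COUNT(column_1) skips NULLs, matching COUNT(DISTINCT column_1).
    SELECT my_day, COUNT(column_1) AS c_c1
    FROM (
        SELECT DISTINCT Date(my_time) AS my_day, column_1
        FROM table_name
        WHERE column_3 IN (10, 11, 100, 50, 213, 756)
          AND my_time >= '2016-09-02'
          AND my_time <  '2016-09-30'
    ) AS s1
    GROUP BY my_day
) AS d1
JOIN (
    -- Same pattern for column_2.
    SELECT my_day, COUNT(column_2) AS c_c2
    FROM (
        SELECT DISTINCT Date(my_time) AS my_day, column_2
        FROM table_name
        WHERE column_3 IN (10, 11, 100, 50, 213, 756)
          AND my_time >= '2016-09-02'
          AND my_time <  '2016-09-30'
    ) AS s2
    GROUP BY my_day
) AS d2 ON d2.my_day = d1.my_day
ORDER BY d1.my_day ASC;
```

Both derived tables use the same WHERE clause, so they produce the same set of days and the inner join does not drop any rows.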