mysql - group by indexed columns + where by indexed column caused speed decrease
I have a table statistics with the following structure:
+-------------------+----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+----------------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| created_at | datetime | YES | MUL | NULL | |
| year_in_tz | smallint(5) unsigned | YES | MUL | NULL | |
| month_in_tz | tinyint(3) unsigned | YES | MUL | NULL | |
+-------------------+----------------------+------+-----+---------+----------------+
With keys on created_at, year_in_tz, month_in_tz and on (year_in_tz, month_in_tz):
ALTER TABLE `statistics` ADD INDEX created_at (created_at);
alter table statistics add index year_in_tz (year_in_tz);
alter table statistics add index month_in_tz (month_in_tz);
alter table statistics add index year_month_in_tz(year_in_tz,month_in_tz);
Some example queries:
mysql> SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz
FROM `statistics`
GROUP BY year_in_tz, month_in_tz;
+-----------+------------+-------------+
| count_all | year_in_tz | month_in_tz |
+-----------+------------+-------------+
| 467890 | 2011 | 11 |
| 7339389 | 2011 | 12 |
+-----------+------------+-------------+
2 rows in set (5.04 sec)
mysql> describe SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `statistics` GROUP BY year_in_tz, month_in_tz;
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| 1 | SIMPLE | statistics | index | NULL | year_month_in_tz | 5 | NULL | 7797984 | Using index |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
1 row in set (0.01 sec)
mysql> SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz
FROM `statistics`
WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59')
GROUP BY year_in_tz, month_in_tz;
+-----------+------------+-------------+
| count_all | year_in_tz | month_in_tz |
+-----------+------------+-------------+
| 467890 | 2011 | 11 |
| 7339389 | 2011 | 12 |
+-----------+------------+-------------+
2 rows in set (1 min 33.46 sec)
mysql> describe SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `statistics` WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59') GROUP BY year_in_tz, month_in_tz;
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| 1 | SIMPLE | statistics | index | created_at | year_month_in_tz | 5 | NULL | 7797984 | Using where |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
1 row in set (0.07 sec)
So if I use a WHERE clause on an indexed column together with GROUP BY on indexed columns, the query becomes extremely slow. Does anyone know how to make the last query faster?
PS: After playing with indexes, I found that a new index on (created_at, year_in_tz, month_in_tz) made the query run faster, but I want 0-1 seconds per query, not 10 seconds:
alter table statistics add index created_at_with_year_and_month_in_tz (created_at,year_in_tz,month_in_tz);
mysql> describe SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `statistics` WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59') GROUP BY year_in_tz, month_in_tz;
+----+-------------+--------------------+-------+-------------------------------------------------+--------------------------------------+---------+------+---------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+-------------------------------------------------+--------------------------------------+---------+------+---------+-----------------------------------------------------------+
| 1 | SIMPLE | statistics | range | created_at,created_at_with_year_and_month_in_tz | created_at_with_year_and_month_in_tz | 9 | NULL | 3612208 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+--------------------+-------+-------------------------------------------------+--------------------------------------+---------+------+---------+-----------------------------------------------------------+
1 row in set (0.05 sec)
mysql> SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `statistics` WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59') GROUP BY year_in_tz, month_in_tz;
+-----------+------------+-------------+
| count_all | year_in_tz | month_in_tz |
+-----------+------------+-------------+
| 467890 | 2011 | 11 |
| 7339389 | 2011 | 12 |
+-----------+------------+-------------+
2 rows in set (10.62 sec)
Add the id field to your index created_at_with_year_and_month_in_tz and then change your select statement to use
select count(id) ....
In MySQL 5.6 the index condition pushdown (ICP) feature might help in this case, because all fields accessed are part of the index. I believe that MySQL might read the actual data record when you specify count(*), so it would need to read the index file as well as the data file.
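The suggestion above could be tried roughly as follows. This is only a sketch: the index name created_at_year_month_id is made up for illustration, and the statement adds a separate index rather than altering the existing one. Note that the EXPLAIN output in the question already reports "Using index" for the three-column index, and if the table uses InnoDB every secondary index already carries the primary key, so the extra column may make no measurable difference. Compare the plans before keeping the index.

```sql
-- Hypothetical index name; same columns as the question's composite index plus id.
ALTER TABLE statistics
  ADD INDEX created_at_year_month_id (created_at, year_in_tz, month_in_tz, id);

-- Same query as in the question, counting the indexed id column instead of *.
-- id is the NOT NULL primary key, so COUNT(id) returns the same number as COUNT(*).
EXPLAIN
SELECT COUNT(id) AS count_all, year_in_tz, month_in_tz
FROM statistics
WHERE created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59'
GROUP BY year_in_tz, month_in_tz;
```

If the key, rows, and Extra columns of the plan are unchanged compared with the three-column index, the additional index only costs write time and disk space and can be dropped again.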
Try this; there is a known MySQL issue with datetime indexes:
WHERE
created_at BETWEEN
CAST('2011-10-31 20:00:00' AS datetime) AND
CAST('2011-12-31 19:59:59' AS datetime)
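Put into the full query from the question, the fragment above would look like the sketch below. It is prefixed with EXPLAIN so the plan can be compared with the one for the plain string literals. Whether the cast changes anything depends on the MySQL version, so treat it as an experiment rather than a guaranteed fix.

```sql
-- Question's query with both range bounds cast explicitly to DATETIME.
EXPLAIN
SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz
FROM statistics
WHERE created_at BETWEEN CAST('2011-10-31 20:00:00' AS DATETIME)
                     AND CAST('2011-12-31 19:59:59' AS DATETIME)
GROUP BY year_in_tz, month_in_tz;
```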
Slow COUNT(*) queries are a common problem in MySQL and PostgreSQL (and other RDBMSs), because a sequential table scan is performed during query execution. Consider caching your aggregated data somewhere else: memcached, redis, etc.
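The same caching idea can also be kept inside MySQL with a small summary table instead of memcached or redis. The sketch below is one possible shape, not the answer's own method: the table name statistics_monthly_counts is hypothetical, and it assumes that year_in_tz and month_in_tz always describe the same local month that the created_at range is meant to select, which the identical counts in the question suggest but do not prove.

```sql
-- Hypothetical summary table holding one pre-computed row per local month.
CREATE TABLE statistics_monthly_counts (
  year_in_tz  SMALLINT UNSIGNED NOT NULL,
  month_in_tz TINYINT UNSIGNED  NOT NULL,
  count_all   BIGINT UNSIGNED   NOT NULL,
  PRIMARY KEY (year_in_tz, month_in_tz)
);

-- Refresh step, run periodically (for example from cron or a scheduled event).
-- Rows with NULL year or month are skipped because the key columns are NOT NULL.
INSERT INTO statistics_monthly_counts (year_in_tz, month_in_tz, count_all)
SELECT year_in_tz, month_in_tz, COUNT(*)
FROM statistics
WHERE year_in_tz IS NOT NULL AND month_in_tz IS NOT NULL
GROUP BY year_in_tz, month_in_tz
ON DUPLICATE KEY UPDATE count_all = VALUES(count_all);

-- The report then reads a handful of rows instead of scanning millions.
SELECT count_all, year_in_tz, month_in_tz
FROM statistics_monthly_counts
WHERE year_in_tz = 2011 AND month_in_tz IN (11, 12);
```

The trade-off is freshness: the counts are only as current as the last refresh, so this fits reporting queries better than anything that needs an exact live number.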