How to optimize a query on a large table

Question

I have a table with 18,310,298 records right now.

And the following query:

SELECT COUNT(obj_id) AS cnt
FROM
`common`.`logs`
WHERE 
`event` = '11' AND
`obj_type` = '2' AND
`region` = 'us' AND 
DATE(`date`) = DATE('20120213010502');

With the following structure:

CREATE TABLE `logs` (
  `log_id` int(11) NOT NULL AUTO_INCREMENT,
  `event` tinyint(4) NOT NULL,
  `obj_type` tinyint(1) NOT NULL DEFAULT '0',
  `obj_id` int(11) unsigned NOT NULL DEFAULT '0',
  `region` varchar(3) NOT NULL DEFAULT '',
  `date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`log_id`),
  KEY `event` (`event`),
  KEY `obj_type` (`obj_type`),
  KEY `region` (`region`),
  KEY `for_stat` (`event`,`obj_type`,`obj_id`,`region`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=83126347 DEFAULT CHARSET=utf8 COMMENT='Logs table';

And MySQL EXPLAIN shows the following:

+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
| id | select_type | table | type | possible_keys                  | key      | key_len | ref         | rows   | filtered | Extra                    |
+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
|  1 | SIMPLE      | logs  | ref  | event,obj_type,region,for_stat | for_stat | 2       | const,const | 837216 |   100.00 | Using where; Using index |
+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

Running this query during daily peak usage takes about 5 seconds.

What can I do to make it faster?

UPDATED: Following the comments, I modified the index and removed the DATE function from the WHERE clause.

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| logs  |          0 | PRIMARY  |            1 | log_id      | A         |    15379109 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | event    |            1 | event       | A         |          14 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | obj_type |            1 | obj_type    | A         |          14 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | region   |            1 | region      | A         |          14 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            1 | event       | A         |         157 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            2 | obj_type    | A         |         157 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            3 | region      | A         |         157 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            4 | date        | A         |         157 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+


    mysql> explain extended SELECT COUNT(obj_id) as cnt 
    ->     FROM `common`.`logs` 
    ->     WHERE `event`= '11' AND 
    ->     `obj_type` = '2' AND 
    ->     `region`= 'est' AND 
    ->     date between '2012-11-25 00:00:00' and '2012-11-25 23:59:59';
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+
| id | select_type | table | type  | possible_keys                  | key      | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | logs  | range | event,obj_type,region,for_stat | for_stat | 21      | NULL | 9674 |    75.01 | Using where |
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+

It seems to be running faster. Thanks, everyone.

Answer 1

The EXPLAIN output shows that the query uses only the first two columns of the for_stat index: key_len is 2, one byte each for the TINYINT columns event and obj_type.

This is because the query doesn't use obj_id in the WHERE clause, so the index columns that come after obj_id (region and date) cannot be used for the lookup. If you create a new key without obj_id (or modify the existing key to reorder the columns), more of the key can be used and you may see better performance:

KEY `for_stat2` (`event`,`obj_type`,`region`,`date`)

If it's still too slow, changing the last condition so that it no longer wraps the column in DATE(), as Salman and Sashi suggested, might improve things.
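
As a rough sketch of the change described above, the index could be added to the existing table as follows. The name for_stat2 comes from the key definition above; whether the old for_stat index should be kept is left open. The byte counts in the comments are assumptions based on the storage sizes of the column types for a MySQL version of that era (8-byte DATETIME, up to 3 bytes per utf8 character).

```sql
-- Add the four-column index without obj_id.
-- Building it on a table of this size can take a while.
ALTER TABLE `common`.`logs`
  ADD INDEX `for_stat2` (`event`, `obj_type`, `region`, `date`);

-- Re-check the plan. In the original plan key_len was 2:
--   1 byte (event TINYINT) + 1 byte (obj_type TINYINT).
-- If all four columns are used, key_len should be about
--   1 + 1 + 11 (region VARCHAR(3) utf8: 3*3 bytes + 2 length bytes)
--         + 8  (date DATETIME) = 21.
EXPLAIN
SELECT COUNT(obj_id) AS cnt
FROM `common`.`logs`
WHERE `event` = 11
  AND `obj_type` = 2
  AND `region` = 'us'
  AND `date` >= '2012-02-13'
  AND `date` <  '2012-02-14';
```

A key_len of 21 is also what the updated EXPLAIN in the question reports after the index was changed.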

Answer 2

Wrapping the date column in DATE() prevents MySQL from using the index for the date condition, so it has to check every row that matches the other conditions. Try this:

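SELECT COUNT(obj_id) AS cnt
FROM `common`.`logs`
WHERE `event` = 11
  AND `obj_type` = 2
  AND `region` = 'us'
  -- Compare the bare column against a range; `date` = DATE(...) would
  -- match only rows stamped exactly 00:00:00.
  AND `date` >= DATE('20120213010502')
  AND `date` <  DATE('20120213010502') + INTERVAL 1 DAY;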

Answer 3

@Joni already explained what is wrong with your index. As for the query, I assume that your example query selects all records for 2012-02-13 regardless of time. You can change the WHERE clause to use >= and < instead of the DATE() cast:

SELECT COUNT(obj_id) AS cnt
FROM
`common`.`logs`
WHERE 
`event` = 11 AND
`obj_type` = 2 AND
`region` = 'us' AND 
`date` >= DATE('20120213010502') AND
`date` <  DATE('20120213010502') + INTERVAL 1 DAY
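
A possible variation, not part of the answer above: because obj_id is declared NOT NULL, COUNT(*) returns the same number as COUNT(obj_id). With an index on (event, obj_type, region, date) that does not contain obj_id, COUNT(*) may let MySQL answer the query from the index alone instead of reading the rows. The literal dates assume the intended day is 2012-02-13.

```sql
SELECT COUNT(*) AS cnt
FROM `common`.`logs`
WHERE `event` = 11
  AND `obj_type` = 2
  AND `region` = 'us'
  -- Half-open range: everything from midnight up to, but not including,
  -- midnight of the next day.
  AND `date` >= '2012-02-13 00:00:00'
  AND `date` <  '2012-02-14 00:00:00';
```

The half-open range also avoids an upper bound such as '23:59:59', which would miss rows if the column ever stored fractional seconds.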

Answer 4

As logging (inserts) needs to be fast too, use as few indexes as possible.

Evaluation queries can be allowed to take longer, since that is an administrative task and does not necessarily need its own indexes.

CREATE TABLE `logs` (
  `log_id` int(11) NOT NULL AUTO_INCREMENT,
  `event` tinyint(4) NOT NULL,
  `obj_type` tinyint(1) NOT NULL DEFAULT '0',
  `obj_id` int(11) unsigned NOT NULL DEFAULT '0',
  `region` varchar(3) NOT NULL DEFAULT '',
  `date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`log_id`),
  KEY `for_stat` (`event`,`obj_type`,`region`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=83126347 DEFAULT CHARSET=utf8 COMMENT='Logs table';

As for the date search, @SashiKant and @SalmanA have already answered.
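
To get from the original table to the definition above without recreating it, something like the following might work. This is a sketch that assumes no other query depends on the single-column indexes; the event index is redundant in any case, because event is the leftmost column of for_stat.

```sql
-- Drop the single-column indexes and rebuild for_stat without obj_id.
-- Every index removed is one less structure to maintain on each INSERT.
ALTER TABLE `common`.`logs`
  DROP INDEX `event`,
  DROP INDEX `obj_type`,
  DROP INDEX `region`,
  DROP INDEX `for_stat`,
  ADD INDEX `for_stat` (`event`, `obj_type`, `region`, `date`);
```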

Answer 5

In MySQL you should order index columns by cardinality: columns with fewer distinct values in the table go closer to the left. You can also try changing the region column to ENUM() and searching on date with a BETWEEN clause. MySQL is not using the third column in the index because using it would take more effort than just filtering (this is common in MySQL).
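
To see how many distinct values each candidate column actually has before deciding on an order, one option is a query along these lines. The column aliases are made up for illustration.

```sql
-- Exact counts. This scans the data, so on ~18 million rows it is
-- itself a slow query and is better run off-peak.
SELECT COUNT(DISTINCT `event`)    AS event_values,
       COUNT(DISTINCT `obj_type`) AS obj_type_values,
       COUNT(DISTINCT `region`)   AS region_values
FROM `common`.`logs`;

-- Cheaper estimate from index statistics (see the Cardinality column).
SHOW INDEX FROM `common`.`logs`;
```

The Cardinality values reported by SHOW INDEX are estimates, so they can differ from the exact counts.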


5 answers

Answer 1: score 2, 2012-11-30 11:22:31
Answer 2: score 0, 2012-11-30 11:12:21
Answer 3: score 0, accepted, 2012-11-30 11:19:27
Answer 4: score 0, 2012-11-30 11:21:02
Answer 5: score 0, 2012-11-30 11:37:31
