
How to optimize a query on a big table

I have a table with 18,310,298 records right now.

And the following query:

SELECT COUNT(obj_id) AS cnt
FROM
`common`.`logs`
WHERE 
`event` = '11' AND
`obj_type` = '2' AND
`region` = 'us' AND 
DATE(`date`) = DATE('20120213010502');

The table has the following structure:

CREATE TABLE `logs` (
  `log_id` int(11) NOT NULL AUTO_INCREMENT,
  `event` tinyint(4) NOT NULL,
  `obj_type` tinyint(1) NOT NULL DEFAULT '0',
  `obj_id` int(11) unsigned NOT NULL DEFAULT '0',
  `region` varchar(3) NOT NULL DEFAULT '',
  `date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`log_id`),
  KEY `event` (`event`),
  KEY `obj_type` (`obj_type`),
  KEY `region` (`region`),
  KEY `for_stat` (`event`,`obj_type`,`obj_id`,`region`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=83126347 DEFAULT CHARSET=utf8 COMMENT='Logs table';

MySQL EXPLAIN shows the following:

+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
| id | select_type | table | type | possible_keys                  | key      | key_len | ref         | rows   | filtered | Extra                    |
+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
|  1 | SIMPLE      | logs  | ref  | event,obj_type,region,for_stat | for_stat | 2       | const,const | 837216 |   100.00 | Using where; Using index |
+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

Running this query during daily peak usage time takes about 5 seconds.

What can I do to make it faster?

UPDATED: Following the comments, I modified the index and removed the DATE function from the WHERE clause:

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| logs  |          0 | PRIMARY  |            1 | log_id      | A         |    15379109 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | event    |            1 | event       | A         |          14 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | obj_type |            1 | obj_type    | A         |          14 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | region   |            1 | region      | A         |          14 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            1 | event       | A         |         157 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            2 | obj_type    | A         |         157 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            3 | region      | A         |         157 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            4 | date        | A         |         157 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+


    mysql> explain extended SELECT COUNT(obj_id) as cnt 
    ->     FROM `common`.`logs` 
    ->     WHERE `event`= '11' AND 
    ->     `obj_type` = '2' AND 
    ->     `region`= 'est' AND 
    ->     date between '2012-11-25 00:00:00' and '2012-11-25 23:59:59';
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+
| id | select_type | table | type  | possible_keys                  | key      | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | logs  | range | event,obj_type,region,for_stat | for_stat | 21      | NULL | 9674 |    75.01 | Using where |
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+

It seems to be running faster. Thanks, everyone.

The EXPLAIN output shows that the query is using only the first two columns of the for_stat index (key_len is 2, one byte for each tinyint column).

This is because the query doesn't use obj_id in the WHERE clause, and obj_id is the third column of the index, so the columns after it cannot be used for the lookup. If you create a new key without obj_id (or modify the existing key to reorder the columns), more of the key can be used and you may see better performance:

KEY `for_stat2` (`event`,`obj_type`,`region`,`date`)
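As a minimal sketch of how that key could be added and checked, assuming the table and query from the question and enough free time for the index build on a table of this size:

```sql
-- Add the narrower index next to the existing one (name for_stat2 as suggested above).
ALTER TABLE `common`.`logs`
  ADD INDEX `for_stat2` (`event`, `obj_type`, `region`, `date`);

-- Re-check the plan for the original query. If more leading columns of the
-- index are used for the lookup, key_len should be larger than the 2 seen before.
EXPLAIN
SELECT COUNT(obj_id) AS cnt
FROM `common`.`logs`
WHERE `event` = 11
  AND `obj_type` = 2
  AND `region` = 'us'
  AND DATE(`date`) = DATE('20120213010502');
```

Whether the optimizer actually picks for_stat2 depends on the data and the MySQL version, so the EXPLAIN output is the thing to look at rather than the index definition alone.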

If it's still too slow, changing the last condition, where you use DATE(), as suggested by Salman and Sashi, might improve things.

Applying the DATE() function to the date column prevents MySQL from using the index for that column, so every row matching the other conditions has to be examined. Try this:

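SELECT COUNT(obj_id) AS cnt
FROM
    `common`.`logs`
WHERE
    `event`      = 11
AND
    `obj_type`   = 2
AND
    `region`     = 'us'
AND
    -- half-open range covering the whole day, so rows at any time on 2012-02-13 match
    `date` >= '2012-02-13 00:00:00'
AND
    `date` <  '2012-02-14 00:00:00'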

@Joni already explained what is wrong with your index. As for the query, I assume that your example query selects all records for 2012-02-13 regardless of time. You can change the WHERE clause to use >= and < instead of the DATE cast:

SELECT COUNT(obj_id) AS cnt
FROM
`common`.`logs`
WHERE 
`event` = 11 AND
`obj_type` = 2 AND
`region` = 'us' AND 
`date` >= DATE('20120213010502') AND
`date` <  DATE('20120213010502') + INTERVAL 1 DAY
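To see whether the range form really lets MySQL use the date part of the index, the plan can be inspected with EXPLAIN. The sketch below assumes an index on (event, obj_type, region, date) already exists. It also swaps in COUNT(*) for COUNT(obj_id): since obj_id is declared NOT NULL the two return the same number, and COUNT(*) does not force a read of a column that is missing from that index.

```sql
-- Assumes an index on (event, obj_type, region, date) is in place.
EXPLAIN
SELECT COUNT(*) AS cnt            -- same result as COUNT(obj_id), because obj_id is NOT NULL
FROM `common`.`logs`
WHERE `event` = 11
  AND `obj_type` = 2
  AND `region` = 'us'
  AND `date` >= '2012-02-13 00:00:00'
  AND `date` <  '2012-02-14 00:00:00';
```

A type of range with a key_len covering all four columns would suggest the date bounds are being applied inside the index; the exact numbers vary by MySQL version.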

As logging (inserts) needs to be fast too, use as few indexes as possible.

Evaluation queries may take longer, since they are administrative work and do not necessarily need indexes of their own.

CREATE TABLE `logs` (
  `log_id` int(11) NOT NULL AUTO_INCREMENT,
  `event` tinyint(4) NOT NULL,
  `obj_type` tinyint(1) NOT NULL DEFAULT '0',
  `obj_id` int(11) unsigned NOT NULL DEFAULT '0',
  `region` varchar(3) NOT NULL DEFAULT '',
  `date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`log_id`),
  KEY `for_stat` (`event`,`obj_type`,`region`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=83126347 DEFAULT CHARSET=utf8 COMMENT='Logs table';
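One possible way to get from the original table to this structure without recreating it is sketched below. It assumes that no other query depends on the single-column indexes, which is worth verifying before dropping them.

```sql
-- Drop the single-column indexes and the old composite index.
-- Assumption: nothing else relies on the event, obj_type or region indexes.
ALTER TABLE `common`.`logs`
  DROP INDEX `event`,
  DROP INDEX `obj_type`,
  DROP INDEX `region`,
  DROP INDEX `for_stat`;

-- Recreate for_stat without obj_id, so region and date directly follow obj_type.
ALTER TABLE `common`.`logs`
  ADD INDEX `for_stat` (`event`, `obj_type`, `region`, `date`);
```

Both statements rebuild index data on a large table, so running them outside peak hours is probably sensible.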

As for the date search, @SashiKant and @SalmanA have already answered that.

In MySQL you should order index columns by cardinality: the fewer distinct values a column has in the table, the closer to the left it should be placed. You can also try changing the region column to ENUM() and searching date with a BETWEEN clause. MySQL is not using the third column of the index because using it takes more effort than just filtering (this is a common thing in MySQL).
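To compare how many distinct values each candidate column has, something like the following could be used. It is only a sketch: the column aliases are made up for illustration, and the first statement scans a lot of data on a table this large.

```sql
-- Exact distinct counts per column (expensive on ~18 million rows).
SELECT COUNT(DISTINCT `event`)    AS event_values,
       COUNT(DISTINCT `obj_type`) AS obj_type_values,
       COUNT(DISTINCT `region`)   AS region_values
FROM `common`.`logs`;

-- Cheaper alternative: the Cardinality column here is an estimate kept by the storage engine.
SHOW INDEX FROM `common`.`logs`;
```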
