MySQL InnoDB is very slow on SELECT query

I have a MySQL table with the following structure:

mysql> show create table logs \G;

Create Table: CREATE TABLE `logs` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `request` text,
  `response` longtext,
  `msisdn` varchar(255) DEFAULT NULL,
  `username` varchar(255) DEFAULT NULL,
  `shortcode` varchar(255) DEFAULT NULL,
  `response_code` varchar(255) DEFAULT NULL,
  `response_description` text,
  `transaction_name` varchar(250) DEFAULT NULL,
  `system_owner` varchar(250) DEFAULT NULL,
  `request_date_time` datetime DEFAULT NULL,
  `response_date_time` datetime DEFAULT NULL,
  `comments` text,
  `user_type` varchar(255) DEFAULT NULL,
  `channel` varchar(20) DEFAULT 'WEB',

  /**

  other columns here....

  other 18 columns here, with Type varchar and Text

  **/

  PRIMARY KEY (`id`),
  KEY `transaction_name` (`transaction_name`) USING BTREE,
  KEY `msisdn` (`msisdn`) USING BTREE,
  KEY `username` (`username`) USING BTREE,
  KEY `request_date_time` (`request_date_time`) USING BTREE,
  KEY `system_owner` (`system_owner`) USING BTREE,
  KEY `shortcode` (`shortcode`) USING BTREE,
  KEY `response_code` (`response_code`) USING BTREE,
  KEY `channel` (`channel`) USING BTREE,
  KEY `request_date_time_2` (`request_date_time`),
  KEY `response_date_time` (`response_date_time`)
) ENGINE=InnoDB AUTO_INCREMENT=59582405 DEFAULT CHARSET=utf8

It holds more than 30 million records:

mysql> select count(*) from logs;
+----------+
| count(*) |
+----------+
| 38962312 |
+----------+
1 row in set (1 min 17.77 sec)

The problem is that it is very slow: a SELECT takes ages to fetch records from the table.

The following query, which uses a subquery, takes almost 30 minutes to fetch the records for one day:

    SELECT 
    COUNT(sub.id) AS count,
    DATE(sub.REQUEST_DATE_TIME) AS transaction_date,
    sub.SYSTEM_OWNER,
    sub.transaction_name,
    sub.response,
    MIN(sub.response_time),
    MAX(sub.response_time),
    AVG(sub.response_time),
    sub.channel
FROM
    (SELECT 
        id,
            REQUEST_DATE_TIME,
            RESPONSE_DATE_TIME,
            TIMESTAMPDIFF(SECOND, REQUEST_DATE_TIME, RESPONSE_DATE_TIME) AS response_time,
            SYSTEM_OWNER,
            transaction_name,
            (CASE
                WHEN response_code IN ('0' , '00', 'EIL000') THEN 'Success'
                ELSE 'Failure'
            END) AS response,
            channel
    FROM
        logs
    WHERE
        response_code != ''
            AND DATE(REQUEST_DATE_TIME) BETWEEN '2016-10-26 00:00:00' AND '2016-10-27 00:00:00'
            AND SYSTEM_OWNER != '') sub
GROUP BY DATE(sub.REQUEST_DATE_TIME) , sub.channel , sub.SYSTEM_OWNER , sub.transaction_name , sub.response
ORDER BY DATE(sub.REQUEST_DATE_TIME) DESC , sub.SYSTEM_OWNER , sub.transaction_name , sub.response DESC;

I have also added indexes to the table, but it is still very slow.

Any help on how I can make it fast?

EDIT: I ran the above query using EXPLAIN:

+----+-------------+------------+------+----------------------------+------+---------+------+----------+---------------------------------+
| id | select_type | table      | type | possible_keys              | key  | key_len | ref  | rows     | Extra                           |
+----+-------------+------------+------+----------------------------+------+---------+------+----------+---------------------------------+
|  1 | PRIMARY     | <derived2> | ALL  | NULL                       | NULL | NULL    | NULL | 16053297 | Using temporary; Using filesort |
|  2 | DERIVED     | logs       | ALL  | system_owner,response_code | NULL | NULL    | NULL | 32106592 | Using where                     |
+----+-------------+------------+------+----------------------------+------+---------+------+----------+---------------------------------+

As it stands, the query must scan the entire table.

But first, let's air a possible bug:

AND DATE(REQUEST_DATE_TIME) BETWEEN '2016-10-26 00:00:00'
                                AND '2016-10-27 00:00:00'

This gives you the logs for two days: all of the 26th and all of the 27th. Or is that what you really wanted? (BETWEEN is inclusive.)

But the performance problem is that the index will not be used, because request_date_time is hidden inside a function (DATE).

Here is a better way to phrase it:

AND REQUEST_DATE_TIME >= '2016-10-26'
AND REQUEST_DATE_TIME  < '2016-10-26' + INTERVAL 1 DAY
  • A DATETIME can be compared against a date.
  • Midnight at the start of the 26th is included, but midnight at the start of the 27th is not.
  • You can easily change 1 to however many days you wish, without having to deal with leap days, etc.
  • This formulation allows the use of the index on request_date_time, thereby severely cutting back the amount of data to be scanned.
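
One way to check the effect, offered as a sketch and not as part of the original answer, is to run EXPLAIN on the range predicate by itself. If the optimizer picks the index, the type column should read range and the key column should name request_date_time (or its duplicate, request_date_time_2) in place of the ALL / NULL shown in the question. The actual plan depends on your data and MySQL version.

```sql
-- Check whether the rewritten predicate can use the index on request_date_time.
-- No function wraps the column, so a range scan on the index becomes possible.
EXPLAIN
SELECT COUNT(*)
FROM logs
WHERE request_date_time >= '2016-10-26'
  AND request_date_time <  '2016-10-26' + INTERVAL 1 DAY;
```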

As for other tempting areas:

  • != does not optimize well, so no 'composite' index is likely to be beneficial.
  • Since we can't really get past the WHERE, no index is useful for GROUP BY or ORDER BY.
  • My comments about DATE() in WHERE do not apply to GROUP BY; no change is needed there.

Why have the subquery? I think it can be done in a single layer. This will eliminate a rather large temp table. (Yeah, it means 3 uses of TIMESTAMPDIFF(), but that is probably a lot cheaper than the temp table.)
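
A single-layer version might look like the sketch below. It is not tested against the real table. The aliases transaction_date and response_status are my own choices and assume that none of the unlisted columns in logs has either name: MySQL resolves names in GROUP BY against table columns before select aliases, so reusing the alias response would silently group by the longtext column response. The date range covers the 26th only; widen the interval if you really want two days.

```sql
SELECT
    COUNT(*)                 AS count,   -- same as COUNT(id), since id is NOT NULL
    DATE(request_date_time)  AS transaction_date,
    system_owner,
    transaction_name,
    CASE
        WHEN response_code IN ('0', '00', 'EIL000') THEN 'Success'
        ELSE 'Failure'
    END                      AS response_status,
    -- The three uses of TIMESTAMPDIFF() replace the derived response_time column.
    MIN(TIMESTAMPDIFF(SECOND, request_date_time, response_date_time)) AS min_response_time,
    MAX(TIMESTAMPDIFF(SECOND, request_date_time, response_date_time)) AS max_response_time,
    AVG(TIMESTAMPDIFF(SECOND, request_date_time, response_date_time)) AS avg_response_time,
    channel
FROM logs
WHERE response_code != ''
  AND system_owner  != ''
  -- Index-friendly range: the column is not wrapped in a function.
  AND request_date_time >= '2016-10-26'
  AND request_date_time <  '2016-10-26' + INTERVAL 1 DAY
GROUP BY transaction_date, channel, system_owner, transaction_name, response_status
ORDER BY transaction_date DESC, system_owner, transaction_name, response_status DESC;
```

Note that the status column comes back as response_status here, not response, so any code reading the result set by column name would need the matching change.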

How much RAM do you have? What is the value of innodb_buffer_pool_size?
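
The current setting can be read from the server itself. The statements below use standard MySQL syntax; the column alias buffer_pool_mb is just a label chosen for this example.

```sql
-- Show the configured InnoDB buffer pool size, in bytes.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- The same value, converted to megabytes for easier reading.
SELECT @@innodb_buffer_pool_size / 1024 / 1024 AS buffer_pool_mb;
```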

If my comments are not enough, and if you frequently run a query like this (over a day or over a date range), then we can talk about building and maintaining a Summary table, which might give you a 10x speedup.
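
To make the summary-table idea concrete, here is a rough sketch under several assumptions. The table name logs_daily_summary, its column names, and the once-a-day refresh are all hypothetical; the answer does not specify a design. It also assumes logs has no columns named transaction_date or response_status among the unlisted ones. The sum and the count of timed rows are stored instead of the average, so that averages can still be computed correctly over multi-day ranges.

```sql
-- Hypothetical summary table: one row per day and grouping combination.
CREATE TABLE logs_daily_summary (
    transaction_date   DATE NOT NULL,
    system_owner       VARCHAR(250) NOT NULL,
    transaction_name   VARCHAR(250) DEFAULT NULL,
    channel            VARCHAR(20)  DEFAULT NULL,
    response_status    ENUM('Success', 'Failure') NOT NULL,
    request_count      BIGINT NOT NULL,
    min_response_time  BIGINT DEFAULT NULL,
    max_response_time  BIGINT DEFAULT NULL,
    sum_response_time  BIGINT DEFAULT NULL,
    timed_count        BIGINT NOT NULL,  -- rows with a non-NULL response time
    KEY transaction_date (transaction_date)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Refresh one day: remove any earlier summary for that day, then rebuild it.
START TRANSACTION;

DELETE FROM logs_daily_summary
WHERE transaction_date = '2016-10-26';

INSERT INTO logs_daily_summary
    (transaction_date, system_owner, transaction_name, channel, response_status,
     request_count, min_response_time, max_response_time,
     sum_response_time, timed_count)
SELECT
    DATE(request_date_time) AS transaction_date,
    system_owner,
    transaction_name,
    channel,
    CASE WHEN response_code IN ('0', '00', 'EIL000')
         THEN 'Success' ELSE 'Failure' END AS response_status,
    COUNT(*),
    MIN(TIMESTAMPDIFF(SECOND, request_date_time, response_date_time)),
    MAX(TIMESTAMPDIFF(SECOND, request_date_time, response_date_time)),
    SUM(TIMESTAMPDIFF(SECOND, request_date_time, response_date_time)),
    COUNT(TIMESTAMPDIFF(SECOND, request_date_time, response_date_time))
FROM logs
WHERE response_code != ''
  AND system_owner  != ''
  AND request_date_time >= '2016-10-26'
  AND request_date_time <  '2016-10-26' + INTERVAL 1 DAY
GROUP BY transaction_date, system_owner, transaction_name, channel, response_status;

COMMIT;

-- The report then reads the small summary table instead of logs.
SELECT
    transaction_date, system_owner, transaction_name, response_status,
    request_count, min_response_time, max_response_time,
    sum_response_time / NULLIF(timed_count, 0) AS avg_response_time,
    channel
FROM logs_daily_summary
WHERE transaction_date >= '2016-10-26'
  AND transaction_date <  '2016-10-26' + INTERVAL 1 DAY
ORDER BY transaction_date DESC, system_owner, transaction_name, response_status DESC;
```

The refresh still has to read one day of logs, but it does so once per day instead of once per report, and every later report touches only a handful of summary rows.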
