how to optimize query to big table

Question

I have a table with 18,310,298 records right now.

And next query

SELECT COUNT(obj_id) AS cnt
FROM
`common`.`logs`
WHERE 
`event` = '11' AND
`obj_type` = '2' AND
`region` = 'us' AND 
DATE(`date`) = DATE('20120213010502');

With next structure

CREATE TABLE `logs` (
  `log_id` int(11) NOT NULL AUTO_INCREMENT,
  `event` tinyint(4) NOT NULL,
  `obj_type` tinyint(1) NOT NULL DEFAULT '0',
  `obj_id` int(11) unsigned NOT NULL DEFAULT '0',
  `region` varchar(3) NOT NULL DEFAULT '',
  `date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`log_id`),
  KEY `event` (`event`),
  KEY `obj_type` (`obj_type`),
  KEY `region` (`region`),
  KEY `for_stat` (`event`,`obj_type`,`obj_id`,`region`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=83126347 DEFAULT CHARSET=utf8 COMMENT='Logs table' |

and MySQL explain show the next

+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
| id | select_type | table | type | possible_keys                  | key      | key_len | ref         | rows   | filtered | Extra                    |
+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
|  1 | SIMPLE      | logs  | ref  | event,obj_type,region,for_stat | for_stat | 2       | const,const | 837216 |   100.00 | Using where; Using index |
+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

Running such query in daily peak usage time take about 5 seconds.

What can I do to make it faster ?

UPDATED: Regarding all comments I modified INDEX and take off DATE function in WHERE clause

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| logs  |          0 | PRIMARY  |            1 | log_id      | A         |    15379109 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | event    |            1 | event       | A         |          14 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | obj_type |            1 | obj_type    | A         |          14 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | region   |            1 | region      | A         |          14 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            1 | event       | A         |         157 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            2 | obj_type    | A         |         157 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            3 | region      | A         |         157 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            4 | date        | A         |         157 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+


    mysql> explain extended SELECT COUNT(obj_id) as cnt 
    ->     FROM `common`.`logs` 
    ->     WHERE `event`= '11' AND 
    ->     `obj_type` = '2' AND 
    ->     `region`= 'est' AND 
    ->     date between '2012-11-25 00:00:00' and '2012-11-25 23:59:59';
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+
| id | select_type | table | type  | possible_keys                  | key      | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | logs  | range | event,obj_type,region,for_stat | for_stat | 21      | NULL | 9674 |    75.01 | Using where |
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+

It seems it's running faster. Thanks everyone.

Answer 1

The EXPLAIN output shows that the query is using only the first two columns of the for_stat index.

This is because the query doesn't use obj_id in the WHERE clause. If you create a new key without obj_id (or modify the existing key to reorder the columns), more of the key can be used and you may see better performance:

KEY `for_stat2` (`event`,`obj_type`,`region`,`date`)

If it's still too slow, changing the last condition, where you use DATE() , as said by Salman and Sashi, might improve things.

Answer 2

The date function on the date column is making the full table scan. Try this ::

SELECT COUNT(obj_id) as cnt
                FROM
                    `common`.`logs` 
                WHERE 
                    `event`      = 11
                AND
                    `obj_type`   = 2

                AND
                    `region`     = 'us'
                AND
                    `date` = DATE('20120213010502')

Answer 3

@Joni already explained what is wrong with your index. For query, I assume that your example query selects all records for 2012-02-13 regardless of time. You can change the where clause to use >= and < instead of DATE cast:

SELECT COUNT(obj_id) AS cnt
FROM
`common`.`logs`
WHERE 
`event` = 11 AND
`obj_type` = 2 AND
`region` = 'us' AND 
`date` >= DATE('20120213010502') AND
`date` <  DATE('20120213010502') + INTERVAL 1 DAY

Answer 4

As logging (inserts) needs to be fast too, use as less indices as possible.

Evaluation may take long as that is admin, not necessarily needing indices.

CREATE TABLE `logs` (
  `log_id` int(11) NOT NULL AUTO_INCREMENT,
  `event` tinyint(4) NOT NULL,
  `obj_type` tinyint(1) NOT NULL DEFAULT '0',
  `obj_id` int(11) unsigned NOT NULL DEFAULT '0',
  `region` varchar(3) NOT NULL DEFAULT '',
  `date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`log_id`),
  KEY `for_stat` (`event`,`obj_type`,`region`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=83126347 DEFAULT CHARSET=utf8 COMMENT='Logs table' |

And about the date search @SashiKant and @SalmanA already answered.

Answer 5

Is Mysql you should place index columns by collation count; less possible values in table - placed closer to the left. Also you can try to change column region to enum() and try to search date with BETWEEN clause. Mysql is not using third column in the index because it's usage takes more efforts then just filtering (it's a common thing in Mysql).

how to optimize query to big table

Question

5 answers

solution1
2 2012-11-30 11:22:31

solution2
0 2012-11-30 11:12:21

solution3
0 ACCPTED 2012-11-30 11:19:27

solution4
0 2012-11-30 11:21:02

solution5
0 2012-11-30 11:37:31

how to optimize query to big table

Question

5 answers

solution1 2 2012-11-30 11:22:31

solution2 0 2012-11-30 11:12:21

solution3 0 ACCPTED 2012-11-30 11:19:27

solution4 0 2012-11-30 11:21:02

solution5 0 2012-11-30 11:37:31

solution1
2 2012-11-30 11:22:31

solution2
0 2012-11-30 11:12:21

solution3
0 ACCPTED 2012-11-30 11:19:27

solution4
0 2012-11-30 11:21:02

solution5
0 2012-11-30 11:37:31