
MySQL: GROUP BY on a joined column is too slow

I have two tables, events and event_params.

The first table stores the events, with these columns:

events | CREATE TABLE `events` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `project` varchar(24) NOT NULL,
  `event` varchar(24) NOT NULL,
  `date` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `project` (`project`,`event`)
) ENGINE=InnoDB AUTO_INCREMENT=2915335 DEFAULT CHARSET=latin1

The second table stores the parameters for each event, with these columns:

event_params | CREATE TABLE `event_params` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `event_id` int(10) unsigned NOT NULL,
  `name` varchar(24) NOT NULL,
  `value` varchar(524) CHARACTER SET utf8 NOT NULL,
  PRIMARY KEY (`id`),
  KEY `name` (`name`),
  KEY `event_id` (`event_id`),
  KEY `value` (`value`)
) ENGINE=InnoDB AUTO_INCREMENT=20789391 DEFAULT CHARSET=latin1

Now I want to count the events for each distinct value of a given parameter.

I wrote this query for the campaign parameter, but it is too slow (it takes 15 seconds to respond):

SELECT
    event_params.value AS campaign,
    COUNT(*) AS count
FROM `events`
    LEFT JOIN event_params ON event_params.event_id = events.id
                          AND event_params.name = 'campaign'
WHERE events.project = 'foo'
GROUP BY event_params.value

Here is the EXPLAIN output for that query:

+----+-------------+--------------+------------+------+---------------------+----------+---------+------------------+------+----------+----------------------------------------------+
| id | select_type | table        | partitions | type | possible_keys       | key      | key_len | ref              | rows | filtered | Extra                                        |
+----+-------------+--------------+------------+------+---------------------+----------+---------+------------------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | events       | NULL       | ref  | project             | project  | 26      | const            |    1 |   100.00 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | event_params | NULL       | ref  | name,event_id,value | event_id | 4       | events.events.id |    4 |   100.00 | Using where                                  |
+----+-------------+--------------+------------+------+---------------------+----------+---------+------------------+------+----------+----------------------------------------------+

Can I speed up this query?

You may try adding the following index on the event_params table, which might speed up the join:

CREATE INDEX idx1 ON event_params (event_id, name, value);

The aggregation step probably can't be optimized much, because COUNT(*) still has to visit every joined row.
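
One way to check whether the index helps is to rerun EXPLAIN on the original query after creating idx1. This is only a sketch: the expectation that the optimizer picks idx1 for event_params and reports Using index in the Extra column is an assumption, not a measured result.

```sql
-- Rerun after: CREATE INDEX idx1 ON event_params (event_id, name, value);
EXPLAIN
SELECT
    event_params.value AS campaign,
    COUNT(*) AS count
FROM events
    LEFT JOIN event_params ON event_params.event_id = events.id
                          AND event_params.name = 'campaign'
WHERE events.project = 'foo'
GROUP BY event_params.value;
```

If idx1 is chosen, the join can be resolved from the index alone, without reading the event_params rows. The temporary table and filesort for the GROUP BY would be expected to remain.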

Move the campaign value into the main table as a VARCHAR column of suitable length. The query then becomes:

SELECT
    campaign,
    COUNT(*) AS count
FROM `events`
WHERE project = 'foo'
GROUP BY campaign

And add this index to events:

INDEX(project, campaign)
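
The move could look roughly like the following. It is a sketch under assumptions: the column length VARCHAR(64) and the index name project_campaign are made up for illustration, and the utf8 character set is copied from event_params.value in the question.

```sql
-- Hypothetical migration: VARCHAR(64) is an assumed length, not taken from the question.
ALTER TABLE events
    ADD COLUMN campaign VARCHAR(64) CHARACTER SET utf8 NULL;

-- Copy the existing values across. If an event has several 'campaign' rows,
-- which one ends up in events.campaign is not defined.
UPDATE events
    JOIN event_params ON event_params.event_id = events.id
                     AND event_params.name = 'campaign'
SET events.campaign = event_params.value;

-- Index that supports WHERE project = ... GROUP BY campaign.
ALTER TABLE events
    ADD INDEX project_campaign (project, campaign);
```

Events without a campaign parameter keep NULL in the new column, so they still land in the NULL group, as they do with the LEFT JOIN. On a table of this size the UPDATE may be worth running in batches by id range.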

A bit of advice when tempted to use EAV (entity-attribute-value): move the important values into the main table, and leave only the rarely used or rarely set values in the other table. Also, assuming there are no duplicate (event_id, name) pairs, give event_params this primary key:

PRIMARY KEY(event_id, name)
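
A possible way to make that change on the existing table is sketched below. It assumes the id column should be kept; because id is AUTO_INCREMENT it still needs an index of its own, and the index name id is chosen here for illustration.

```sql
-- Check the "no dups" assumption first; this should return no rows.
SELECT event_id, name, COUNT(*) AS copies
FROM event_params
GROUP BY event_id, name
HAVING COUNT(*) > 1
LIMIT 10;

-- Swap the primary key. id stays AUTO_INCREMENT, so it must remain indexed.
ALTER TABLE event_params
    DROP PRIMARY KEY,
    ADD PRIMARY KEY (event_id, name),
    ADD INDEX id (id);
```

Once the primary key starts with event_id, the separate KEY event_id (event_id) is redundant and could be dropped.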

More discussion: http://mysql.rjweb.org/doc.php/eav


 