简体   繁体   中英

MySQL GROUP BY with Using Temporary unnecessarily?

I am trying to optimize a query. Using EXPLAIN tells me it is Using temporary . This is really inefficient given the size of the table (20m+ records). Looking at the MySQL documentation Internal Temporary Tables I don't see anything that would imply the need for a Temporary table in my query. I also tried setting the ORDER BY to the same as the GROUP BY, but still says Using Temporary and query takes forever to run. I am using MySQL 5.7.

Is there a way to avoid using a temporary table for this query:

SELECT url,count(*) as sum 
FROM `digital_pageviews` as `dp` 
WHERE `publisher_uuid` = '8b83120e-3e19-4c34-8556-7b710bd7b812' 
GROUP BY url 
ORDER BY NULL;

This is my table schema:

create table digital_pageviews
(
  id             int unsigned auto_increment
    primary key,
  visitor_uuid   char(36)            null,
  publisher_uuid char(36) default '' not null,
  property_uuid  char(36)            null,
  ip_address     char(15)            not null,
  referrer       text                null,
  url_delete     text                null,
  url            varchar(255)        null,
  url_tmp        varchar(255)        null,
  meta           text                null,
  date_created   timestamp           not null,
  date_updated   timestamp           null
)
  collate = utf8_unicode_ci;

create index digital_pageviews_url_index
  on digital_pageviews (url);

create index ndx_date_created
  on digital_pageviews (date_created);

create index ndx_property_uuid
  on digital_pageviews (property_uuid);

create index ndx_publisher_uuid
  on digital_pageviews (publisher_uuid);

create index ndx_visitor_uuid_page
  on digital_pageviews (visitor_uuid);

The reason it needs a temporary table is that it cannot both filter by publisher_uuid and sort on a column without an index to do both. The first step is to filter by publisher_uuid , so it uses the index on publisher_uuid .

However, next it has to group by and order the records, which will require a temporary table because it cannot use an index which will do this. The reason it cannot use an index is that it already used the publisher_uuid , which is not indexed on the url field to do the group by or on the field you are ordering by.

To filter where publisher_uuid = '8b83120e-3e19-4c34-8556-7b710bd7b812' , group by url , and order by url , create an index with these fields in this order:

  • publisher_uuid
  • url
create index ndx_publisher_uuid
  on digital_pageviews (publisher_uuid, url);

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM