
MySQL GROUP BY with Using Temporary unnecessarily?

I am trying to optimize a query. EXPLAIN tells me it is Using temporary, which is really inefficient given the size of the table (20m+ records). Looking at the MySQL documentation on Internal Temporary Tables, I don't see anything that would imply the need for a temporary table in my query. I also tried setting the ORDER BY to the same column as the GROUP BY, but EXPLAIN still says Using temporary and the query takes forever to run. I am using MySQL 5.7.

Is there a way to avoid using a temporary table for this query:

SELECT url,count(*) as sum 
FROM `digital_pageviews` as `dp` 
WHERE `publisher_uuid` = '8b83120e-3e19-4c34-8556-7b710bd7b812' 
GROUP BY url 
ORDER BY NULL;

This is my table schema:

create table digital_pageviews
(
  id             int unsigned auto_increment
    primary key,
  visitor_uuid   char(36)            null,
  publisher_uuid char(36) default '' not null,
  property_uuid  char(36)            null,
  ip_address     char(15)            not null,
  referrer       text                null,
  url_delete     text                null,
  url            varchar(255)        null,
  url_tmp        varchar(255)        null,
  meta           text                null,
  date_created   timestamp           not null,
  date_updated   timestamp           null
)
  collate = utf8_unicode_ci;

create index digital_pageviews_url_index
  on digital_pageviews (url);

create index ndx_date_created
  on digital_pageviews (date_created);

create index ndx_property_uuid
  on digital_pageviews (property_uuid);

create index ndx_publisher_uuid
  on digital_pageviews (publisher_uuid);

create index ndx_visitor_uuid_page
  on digital_pageviews (visitor_uuid);

The reason it needs a temporary table is that no single index lets it both filter by publisher_uuid and read the matching rows in url order. The first step is to filter by publisher_uuid, so it uses the index on publisher_uuid.

However, next it has to group and order the matching records, which requires a temporary table because no available index returns them already grouped. The reason it cannot use an index for the grouping is that it has already chosen the publisher_uuid index, and that index does not contain the url column you are grouping (and ordering) by.
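To see this in the plan, the original query can be prefixed with EXPLAIN. This is only a sketch of what to look for; the actual output depends on the data and on the optimizer's statistics.

```sql
-- Inspect the plan of the original query (MySQL 5.7).
EXPLAIN
SELECT url, COUNT(*) AS sum
FROM `digital_pageviews` AS `dp`
WHERE `publisher_uuid` = '8b83120e-3e19-4c34-8556-7b710bd7b812'
GROUP BY url
ORDER BY NULL;

-- Columns worth checking in the result:
--   key   : the index chosen for the lookup (here, presumably the one on publisher_uuid)
--   Extra : "Using temporary" means the GROUP BY is resolved through an
--           internal temporary table rather than by reading an index in order
```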

To filter where publisher_uuid = '8b83120e-3e19-4c34-8556-7b710bd7b812', group by url, and order by url, create an index with these fields in this order:

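  • publisher_uuid
  • url

-- An index named ndx_publisher_uuid already exists on (publisher_uuid) alone,
-- so the composite index needs a different name (this one is only a suggestion).
create index ndx_publisher_uuid_url
  on digital_pageviews (publisher_uuid, url);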
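Once a composite index on (publisher_uuid, url) exists, the same checks can confirm whether the optimizer actually uses it. This is a minimal sketch, assuming the index was built successfully and the table statistics are current; the comments describe what one would hope to see, not guaranteed output.

```sql
-- List the indexes on the table and confirm that one of them covers
-- publisher_uuid (Seq_in_index = 1) followed by url (Seq_in_index = 2).
SHOW INDEX FROM digital_pageviews;

-- Re-run the plan for the original query.
EXPLAIN
SELECT url, COUNT(*) AS sum
FROM `digital_pageviews` AS `dp`
WHERE `publisher_uuid` = '8b83120e-3e19-4c34-8556-7b710bd7b812'
GROUP BY url
ORDER BY NULL;

-- If the composite index is chosen (see the key column), Extra should no
-- longer contain "Using temporary". "Using index" in Extra would indicate
-- that the query is answered from the index alone, without reading table rows.
```

Because the composite index starts with publisher_uuid, it can also serve lookups on publisher_uuid alone, so the older single-column index on that column may become redundant once the new one is in place.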
