Improving performance of LEFT JOIN / GROUP BY

I have two tables, snippets and platforms. Each snippet belongs to a platform (fork_id is nullable and links to another record in the same table). Structure:

PLATFORMS (id, name, slug, syntax)
SNIPPETS (id, platform_id, fork_id, private etc.) 

I'm now trying to run a query to get the total number of snippets for each platform. The query is slow (10 to 20 seconds) when the snippets table has a million records.

SELECT platforms.id, name, slug, syntax, COUNT(*) AS total FROM platforms 
LEFT JOIN snippets on platforms.id = snippets.platform_id
WHERE fork_id IS NULL
AND private = 0
GROUP BY platforms.id, name
ORDER BY total DESC, name asc;

Some additional information:

  • snippets.id and platforms.id have primary key indexes.
  • fork_id and platform_id have foreign key indexes.
  • private and platforms.name have indexes.

Running EXPLAIN on the query gives the following:

[image: EXPLAIN query output]

How can I get the performance to an acceptable level? Thanks!

In MySQL, this type of query can be faster with a correlated subquery:

SELECT p.id, p.name, p.slug, p.syntax,
       (SELECT COUNT(*)
        FROM snippets s
        WHERE p.id = s.platform_id AND
              s.fork_id IS NULL AND
              s.private = 0
       ) AS total
FROM platforms p
ORDER BY total DESC, p.name ASC;

Then, you want an index on snippets(platform_id, fork_id, private).
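As a rough, runnable sketch of the correlated-subquery approach together with that index: the example below uses Python's built-in sqlite3 module with an in-memory database, only so it can be run anywhere. SQLite stands in for MySQL here, and the sample rows and the index name idx_snippets_platform_fork_private are made up for illustration. In MySQL the same CREATE INDEX statement form applies, but the query plan and timings will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE platforms (id INTEGER PRIMARY KEY, name TEXT, slug TEXT, syntax TEXT);
CREATE TABLE snippets (id INTEGER PRIMARY KEY, platform_id INTEGER,
                       fork_id INTEGER, private INTEGER NOT NULL DEFAULT 0);

-- Index suggested in the answer; the index name is an assumption.
CREATE INDEX idx_snippets_platform_fork_private
    ON snippets (platform_id, fork_id, private);

-- Hypothetical sample data: 'Go' has no snippets at all.
INSERT INTO platforms VALUES (1, 'Python', 'python', 'py'),
                             (2, 'Ruby', 'ruby', 'rb'),
                             (3, 'Go', 'go', 'go');
INSERT INTO snippets (id, platform_id, fork_id, private) VALUES
    (1, 1, NULL, 0),
    (2, 1, NULL, 0),
    (3, 1, 1, 0),      -- a fork, not counted
    (4, 2, NULL, 1),   -- private, not counted
    (5, 2, NULL, 0);
""")

subquery_sql = """
SELECT p.id, p.name,
       (SELECT COUNT(*)
        FROM snippets s
        WHERE p.id = s.platform_id AND
              s.fork_id IS NULL AND
              s.private = 0
       ) AS total
FROM platforms p
ORDER BY total DESC, p.name ASC
"""

subquery_totals = conn.execute(subquery_sql).fetchall()
for row in subquery_totals:
    print(row)

# Inspect how SQLite plans the query (MySQL's EXPLAIN output looks different).
subquery_plan = conn.execute("EXPLAIN QUERY PLAN " + subquery_sql).fetchall()
for step in subquery_plan:
    print(step)
```

One thing to be aware of: the subquery form returns every platform, with a total of 0 for platforms that have no matching snippets, whereas the original query leaves those platforms out.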

I should note that your original query is equivalent to:

SELECT p.id, p.name, p.slug, p.syntax, COUNT(*) AS total
FROM platforms p JOIN
     snippets s
     on p.id = s.platform_id
WHERE s.fork_id IS NULL AND s.private = 0
GROUP BY p.id, p.name
ORDER BY total DESC, name asc;

This is because the WHERE clause turns the LEFT JOIN into an INNER JOIN. For this query, you can try an index on snippets(private, fork_id, platform_id).
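The reason is that a platform with no snippets gets a row whose snippets columns are all NULL, and private = 0 is never true for NULL, so the WHERE clause discards that row. A small sketch of the effect follows, again using Python's built-in sqlite3 module as a stand-in for MySQL, with made-up sample rows. The third query is not from the original posts: it is one possible way to keep platforms with zero snippets, by moving the filters into the ON clause and counting s.id instead of *.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE platforms (id INTEGER PRIMARY KEY, name TEXT, slug TEXT, syntax TEXT);
CREATE TABLE snippets (id INTEGER PRIMARY KEY, platform_id INTEGER,
                       fork_id INTEGER, private INTEGER NOT NULL DEFAULT 0);

-- Hypothetical sample data: 'Go' has no snippets at all.
INSERT INTO platforms VALUES (1, 'Python', 'python', 'py'),
                             (2, 'Ruby', 'ruby', 'rb'),
                             (3, 'Go', 'go', 'go');
INSERT INTO snippets (id, platform_id, fork_id, private) VALUES
    (1, 1, NULL, 0),
    (2, 1, NULL, 0),
    (3, 1, 1, 0),      -- a fork, not counted
    (4, 2, NULL, 1),   -- private, not counted
    (5, 2, NULL, 0);
""")

# Original shape: LEFT JOIN, but filtered in WHERE.
left_join_where = conn.execute("""
    SELECT p.name, COUNT(*) AS total
    FROM platforms p
    LEFT JOIN snippets s ON p.id = s.platform_id
    WHERE s.fork_id IS NULL AND s.private = 0
    GROUP BY p.id, p.name
    ORDER BY total DESC, p.name ASC
""").fetchall()

# Same filters with a plain INNER JOIN.
inner_join = conn.execute("""
    SELECT p.name, COUNT(*) AS total
    FROM platforms p
    JOIN snippets s ON p.id = s.platform_id
    WHERE s.fork_id IS NULL AND s.private = 0
    GROUP BY p.id, p.name
    ORDER BY total DESC, p.name ASC
""").fetchall()

# Filters moved into ON; COUNT(s.id) ignores the NULL-extended row.
left_join_on = conn.execute("""
    SELECT p.name, COUNT(s.id) AS total
    FROM platforms p
    LEFT JOIN snippets s ON p.id = s.platform_id
                        AND s.fork_id IS NULL
                        AND s.private = 0
    GROUP BY p.id, p.name
    ORDER BY total DESC, p.name ASC
""").fetchall()

print(left_join_where)  # 'Go' is missing
print(inner_join)       # identical to the query above
print(left_join_on)     # 'Go' appears with a total of 0
```

Whether platforms with zero snippets should be listed is a choice for the report; the point is only that the first two queries cannot list them.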

I can see two things going on here. One is the counting, and the other is the display of the details from the platforms table.

Let's count first.

SELECT platform_id, COUNT(*) AS snips
  FROM snippets
 WHERE fork_id IS NULL
   AND private = 0
 GROUP BY platform_id

To make this as fast as possible, create a compound index on the (private, fork_id, platform_id) columns of the snippets table. Every column the query touches is then in the index, so the inner query can be answered by an index scan without reading the table rows.

Now that we have the counts, let's report the details.

   SELECT a.id, a.name, a.slug, a.syntax, b.snips
     FROM platforms a
LEFT JOIN (
            SELECT platform_id, COUNT(*) AS snips
              FROM snippets
             WHERE fork_id IS NULL
               AND private = 0
             GROUP BY platform_id
          ) b ON a.id = b.platform_id
 ORDER BY b.snips DESC, a.name ASC;

The trick is: simplify, simplify, simplify when aggregating (grouping) large tables.
