
Optimize an SQL query with several JOINs

I have an SQL query with a nested join:

SELECT rh.host, rh.report, COUNT(results.id), COUNT(results_2.id), COUNT(results_3.id), COUNT(results_4.id)
FROM report_hosts rh
INNER JOIN report_results rr ON rh.report = rr.report
LEFT OUTER JOIN results ON rr.result = results.id AND results.type =  'Hole' AND results.host = rh.host
LEFT OUTER JOIN results results_2 ON rr.result = results_2.id AND results_2.type =  'Warning' AND results_2.host = rh.host
LEFT OUTER JOIN results results_3 ON rr.result = results_3.id AND results_3.type =  'Note' AND results_3.host = rh.host
LEFT OUTER JOIN results results_4 ON rr.result = results_4.id AND results_4.type =  'Log' AND results_4.host = rh.host
GROUP BY rh.host

The query as-is takes about 5 seconds, with 99.7% of that time spent copying to a temp table. An EXPLAIN of the full query shows:

+----+-------------+-----------+--------+---------------+---------+---------+-------------------+------+---------------------------------+
| id | select_type | table     | type   | possible_keys | key     | key_len | ref               | rows | Extra                           |
+----+-------------+-----------+--------+---------------+---------+---------+-------------------+------+---------------------------------+
|  1 | SIMPLE      | rr        | ALL    | report        | NULL    | NULL    | NULL              | 3139 | Using temporary; Using filesort |
|  1 | SIMPLE      | rh        | ref    | report        | report  | 5       | openvas.rr.report |  167 | Using where                     |
|  1 | SIMPLE      | results   | eq_ref | PRIMARY,type  | PRIMARY | 4       | openvas.rr.result |    1 |                                 |
|  1 | SIMPLE      | results_2 | eq_ref | PRIMARY,type  | PRIMARY | 4       | openvas.rr.result |    1 |                                 |
|  1 | SIMPLE      | results_3 | eq_ref | PRIMARY,type  | PRIMARY | 4       | openvas.rr.result |    1 |                                 |
|  1 | SIMPLE      | results_4 | eq_ref | PRIMARY,type  | PRIMARY | 4       | openvas.rr.result |    1 |                                 |
+----+-------------+-----------+--------+---------------+---------+---------+-------------------+------+---------------------------------+

When I remove the LEFT JOINs, the query executes in about 1 second; each LEFT JOIN adds roughly one more second of execution time.

My question: can anyone explain why the copy-to-temp-table step takes longer when there are more LEFT JOINs? Is MySQL copying the temp table several times for each JOIN?

How can I avoid this? Am I missing an index?

What I intend to accomplish: I have a table with scan results for several hosts. Each result is classified by type ("Hole", "Warning", "Note" or "Log"). I want to select each host together with its number of Holes, Warnings, Notes and Logs. One restriction: not every host has every type of result.

You're joining a single table several times, which is effectively like joining multiple tables. You should be able to handle that with a few CASE expressions and a WHERE clause instead. (In fact, you may not need the WHERE clause.)

SELECT rh.host, rh.report, 
 COUNT(CASE WHEN results.type = 'Hole' THEN 1 ELSE NULL END) as Holes, 
 COUNT(CASE WHEN results.type = 'Warning' THEN 1 ELSE NULL END) as Warnings,
 COUNT(CASE WHEN results.type = 'Note' THEN 1 ELSE NULL END) as Notes, 
 COUNT(CASE WHEN results.type = 'Log' THEN 1 ELSE NULL END) as Logs
FROM 
 report_hosts rh
INNER JOIN 
 report_results rr 
ON 
 rh.report = rr.report
LEFT OUTER JOIN 
 results 
ON 
 rr.result = results.id 
 AND results.host = rh.host
WHERE
 results.type = 'Hole' 
 OR results.type = 'Warning' 
 OR results.type = 'Note' 
 OR results.type = 'Log'
GROUP BY rh.host, rh.report

CASE expressions, in my experience, are not the fastest construct, but avoiding the data bloat from the many joins may offset that and make this version perform better.
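As a minimal sketch of the variant without the WHERE clause: the CASE expressions already ignore any other type, while a WHERE filter on results.type discards the rows where the LEFT JOIN found no match, so hosts without results would vanish from the output. The table definitions and sample rows below are assumptions made up for illustration (the real schema is not shown in the question); only the final SELECT mirrors the query above.

```sql
-- Hypothetical minimal schema; column types are assumptions.
CREATE TABLE report_hosts   (host VARCHAR(64), report INT);
CREATE TABLE report_results (report INT, result INT);
CREATE TABLE results        (id INT PRIMARY KEY, host VARCHAR(64), type VARCHAR(16));

-- Made-up sample data: host 10.0.0.1 has three results, 10.0.0.2 has none.
INSERT INTO report_hosts   VALUES ('10.0.0.1', 1), ('10.0.0.2', 1);
INSERT INTO report_results VALUES (1, 1), (1, 2), (1, 3);
INSERT INTO results        VALUES (1, '10.0.0.1', 'Hole'),
                                  (2, '10.0.0.1', 'Log'),
                                  (3, '10.0.0.1', 'Log');

-- One join against results; each COUNT only counts rows of its own type,
-- because CASE yields NULL for every other row and COUNT skips NULLs.
SELECT rh.host, rh.report,
       COUNT(CASE WHEN r.type = 'Hole'    THEN 1 END) AS Holes,
       COUNT(CASE WHEN r.type = 'Warning' THEN 1 END) AS Warnings,
       COUNT(CASE WHEN r.type = 'Note'    THEN 1 END) AS Notes,
       COUNT(CASE WHEN r.type = 'Log'     THEN 1 END) AS Logs
FROM report_hosts rh
INNER JOIN report_results rr ON rh.report = rr.report
LEFT OUTER JOIN results r ON rr.result = r.id AND r.host = rh.host
GROUP BY rh.host, rh.report
ORDER BY rh.host;

-- host      report  Holes  Warnings  Notes  Logs
-- 10.0.0.1  1       1      0         0      2
-- 10.0.0.2  1       0      0         0      0
```

With the WHERE clause on results.type added back, the 10.0.0.2 row would be filtered out, since all of its joined rows have a NULL type.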

Using a lot of data (in your case, an extra LEFT JOIN) means storing it in memory.

If you exhaust the buffers, the intermediate result of your query has to be written to a temporary table on disk.

Try using the same number of LEFT JOINs but limiting the number of rows with LIMIT. If the query then runs faster, that should confirm that the problem lies in the buffers.
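Another way to look at the buffers, sketched here under the assumption that you are on a MySQL server and are allowed to run SHOW statements: compare the temporary-table limits with the session counters before and after running the query. The variable and counter names are standard MySQL ones, but whether they account for the slowdown in this particular case is not something confirmed here.

```sql
-- Size limits that govern in-memory temporary tables.
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';

-- Note the counters, run the slow query, then check them again.
-- A rise in Created_tmp_disk_tables means a temp table was written to disk.
SHOW SESSION STATUS LIKE 'Created_tmp%tables';
```

If Created_tmp_disk_tables does not increase while the query runs, the temp table stayed in memory and the extra time is coming from somewhere else.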
