简体   繁体   中英

Optimize SQL-Query several JOINs

I have an SQL query with a nested join:

SELECT rh.host, rh.report, COUNT(results.id), COUNT(results_2.id), COUNT(results_3.id), COUNT(results_4.id)
FROM report_hosts rh
INNER JOIN report_results rr ON rh.report = rr.report
LEFT OUTER JOIN results ON rr.result = results.id AND results.type =  'Hole' AND results.host = rh.host
LEFT OUTER JOIN results results_2 ON rr.result = results_2.id AND results_2.type =  'Warning' AND results_2.host = rh.host
LEFT OUTER JOIN results results_3 ON rr.result = results_3.id AND results_3.type =  'Note' AND results_3.host = rh.host
LEFT OUTER JOIN results results_4 ON rr.result = results_4.id AND results_4.type =  'Log' AND results_4.host = rh.host
GROUP BY rh.host

The query as-is takes about 5sec with 99.7% copying to temp table . An EXPLAIN of the full query shows as:

+----+-------------+-----------+--------+---------------+---------+---------+-------------------+------+---------------------------------+
| id | select_type | table     | type   | possible_keys | key     | key_len | ref               | rows | Extra                           |
+----+-------------+-----------+--------+---------------+---------+---------+-------------------+------+---------------------------------+
|  1 | SIMPLE      | rr        | ALL    | report        | NULL    | NULL    | NULL              | 3139 | Using temporary; Using filesort |
|  1 | SIMPLE      | rh        | ref    | report        | report  | 5       | openvas.rr.report |  167 | Using where                     |
|  1 | SIMPLE      | results   | eq_ref | PRIMARY,type  | PRIMARY | 4       | openvas.rr.result |    1 |                                 |
|  1 | SIMPLE      | results_2 | eq_ref | PRIMARY,type  | PRIMARY | 4       | openvas.rr.result |    1 |                                 |
|  1 | SIMPLE      | results_3 | eq_ref | PRIMARY,type  | PRIMARY | 4       | openvas.rr.result |    1 |                                 |
|  1 | SIMPLE      | results_4 | eq_ref | PRIMARY,type  | PRIMARY | 4       | openvas.rr.result |    1 |                                 |
+----+-------------+-----------+--------+---------------+---------+---------+-------------------+------+---------------------------------+

When I remove the LEFT JOIN s, the query executes in about 1s, each LEFT JOIN adds about one additional second execution time.

My question: Can anyone explain, why the copy to temp table task of one join takes longer if there are more LEFT JOIN s? Is MySQL copying the temp table several times for each JOIN?

How can I avoid this? Am I missing an index?

What I intend to accomplish: I have a table with scanning results of several hosts. Each result is classified in types ( "Hole", "Warning", "Note" or "Log"). I want to select each host and the corresponding amount of Holes, Warnings, Notes and Logs. As a "restriction" I have the fact, that not each host has each type of results.

You're joining a single table several times, which effectively is like joining multiple tables. You should be able to handle that with some case statements and a where clause instead. (In fact you may not need the where clause.)

SELECT rh.host, rh.report, 
 COUNT(CASE WHEN results.type = 'Hole' THEN 1 ELSE NULL END) as Holes, 
 COUNT(CASE WHEN results.type = 'Warning' THEN 1 ELSE NULL END) as Warnings,
 COUNT(CASE WHEN results.type = 'Note' THEN 1 ELSE NULL END) as Notes, 
 COUNT(CASE WHEN results.type = 'Log' THEN 1 ELSE NULL END) as Logs
FROM 
 report_hosts rh
INNER JOIN 
 report_results rr 
ON 
 rh.report = rr.report
LEFT OUTER JOIN 
 results 
ON 
 rr.result = results.id 
 AND results.host = rh.host
WHERE
 results.type = 'Hole' 
 OR results.type = 'Warning' 
 OR results.type = 'Note' 
 OR results.type = 'Log'
GROUP BY rh.host, rh.report

Case statements, IME, are not the greatest performers, but your data bloat from the many joins may offset that and give this better performance.

Using a lot of data (in your case an extra left join ) will mean storing it in memory.

If you deplete the buffers your query will need to be stored to a temporary result table on drive.

Try using the same number of left join s but limiting the number of rows with a limit . It should confirm that the problem lies in the buffers (meaning it will run faster).

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM