Optimizing a MySQL query with several joins

One of the queries used by a web app we're running is as follows:

SELECT
       p.id, r.id AS report_id, tr.result_id,
       r.report_date, r.department, r.reportStatus, rs.specimen,
       tr.name, tr.value, tr.flag, tr.unit, tr.reference_range
FROM patients AS p
INNER JOIN
    patients_reports AS pr ON pr.patient_id = p.id
INNER JOIN
    reports AS r ON pr.report_id = r.id
INNER JOIN
    results AS rs ON r.id = rs.report_id
INNER JOIN
    test_results AS tr ON rs.id = tr.result_id
WHERE pr.patient_id = 17548
ORDER BY rs.specimen, tr.name, r.report_date;

The EXPLAIN plan looks like this:

+----+-------------+-------+--------+---------------+-----------+---------+-------------------+--------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key       | key_len | ref               | rows   | Extra                                        |
+----+-------------+-------+--------+---------------+-----------+---------+-------------------+--------+----------------------------------------------+
|  1 | SIMPLE      | p     | const  | PRIMARY       | PRIMARY   | 4       | const             |      1 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | rs    | ALL    | PRIMARY       | NULL      | NULL    | NULL              | 152817 |                                              |
|  1 | SIMPLE      | r     | eq_ref | PRIMARY       | PRIMARY   | 4       | demo.rs.report_id |      1 |                                              |
|  1 | SIMPLE      | pr    | eq_ref | PRIMARY       | PRIMARY   | 8       | const,demo.r.id   |      1 | Using where; Using index                     |
|  1 | SIMPLE      | tr    | ref    | result_id     | result_id | 5       | demo.rs.id        |      1 | Using where                                  |
+----+-------------+-------+--------+---------------+-----------+---------+-------------------+--------+----------------------------------------------+

The query returns 27371 rows. There are 152730 rows in test_results at the moment. This is just a small amount of demo data.

I've tried to make the query more efficient, but I'm having trouble getting it to execute more quickly. I've looked at the documentation, various articles, and questions on Stack Overflow, but have not been able to fix this.

I tried removing one of the joins as follows:

SELECT
       pr.patient_id, r.id AS report_id, tr.result_id,
       r.report_date, r.department, r.reportStatus, rs.specimen,
       tr.name, tr.value, tr.flag, tr.unit, tr.reference_range
FROM patients_reports AS pr
INNER JOIN
    reports AS r ON pr.report_id = r.id
INNER JOIN
    results AS rs ON r.id = rs.report_id
INNER JOIN
    test_results AS tr ON rs.id = tr.result_id
WHERE pr.patient_id = 17548
ORDER BY rs.specimen, tr.name, r.report_date;

The query plan is then as follows:

+----+-------------+-------+--------+---------------+-----------+---------+-------------------+--------+---------------------------------+
| id | select_type | table | type   | possible_keys | key       | key_len | ref               | rows   | Extra                           |
+----+-------------+-------+--------+---------------+-----------+---------+-------------------+--------+---------------------------------+
|  1 | SIMPLE      | rs    | ALL    | PRIMARY       | NULL      | NULL    | NULL              | 152817 | Using temporary; Using filesort |
|  1 | SIMPLE      | r     | eq_ref | PRIMARY       | PRIMARY   | 4       | demo.rs.report_id |      1 |                                 |
|  1 | SIMPLE      | pr    | eq_ref | PRIMARY       | PRIMARY   | 8       | const,demo.r.id   |      1 | Using where; Using index        |
|  1 | SIMPLE      | tr    | ref    | result_id     | result_id | 5       | demo.rs.id        |      1 | Using where                     |
+----+-------------+-------+--------+---------------+-----------+---------+-------------------+--------+---------------------------------+

So not much different.

I've tried rearranging the query and using STRAIGHT_JOIN, amongst other things, but I'm not getting anywhere.

I'd appreciate some suggestions on how to optimize the query. Thanks.

EDIT: Argh! I did not have an index on results.report_id. I have added one, but it does not seem to have helped:

+----+-------------+-------+--------+-------------------+-----------+---------+-------------------+--------+---------------------------------+
| id | select_type | table | type   | possible_keys     | key       | key_len | ref               | rows   | Extra                           |
+----+-------------+-------+--------+-------------------+-----------+---------+-------------------+--------+---------------------------------+
|  1 | SIMPLE      | rs    | ALL    | PRIMARY,report_id | NULL      | NULL    | NULL              | 152817 | Using temporary; Using filesort |
|  1 | SIMPLE      | r     | eq_ref | PRIMARY           | PRIMARY   | 4       | demo.rs.report_id |      1 |                                 |
|  1 | SIMPLE      | pr    | eq_ref | PRIMARY           | PRIMARY   | 8       | const,demo.r.id   |      1 | Using where; Using index        |
|  1 | SIMPLE      | tr    | ref    | result_id         | result_id | 5       | demo.rs.id        |      1 | Using where                     |
+----+-------------+-------+--------+-------------------+-----------+---------+-------------------+--------+---------------------------------+

EDIT2:

patients_reports looks like this:

+------------+---------+------+-----+---------+-------+
| Field      | Type    | Null | Key | Default | Extra |
+------------+---------+------+-----+---------+-------+
| patient_id | int(11) | NO   | PRI | 0       |       |
| report_id  | int(11) | NO   | PRI | 0       |       |
+------------+---------+------+-----+---------+-------+

EDIT3:

After adding the results.report_id index and trying the STRAIGHT_JOIN again as suggested by @DRapp:

SELECT STRAIGHT_JOIN
       r.id AS report_id, tr.result_id,
       r.report_date, r.department, r.reportStatus, rs.specimen,
       tr.name, tr.value, tr.flag, tr.unit, tr.reference_range
FROM patients_reports AS pr
INNER JOIN
    reports AS r ON pr.report_id = r.id
INNER JOIN
    results AS rs ON r.id = rs.report_id
INNER JOIN
    test_results AS tr ON rs.id = tr.result_id
WHERE pr.patient_id = 17548
ORDER BY rs.specimen, tr.name, r.report_date;

the plan looks like this:

+----+-------------+-------+--------+-------------------+-----------+---------+-------------------+------+----------------------------------------------+
| id | select_type | table | type   | possible_keys     | key       | key_len | ref               | rows | Extra                                        |
+----+-------------+-------+--------+-------------------+-----------+---------+-------------------+------+----------------------------------------------+
|  1 | SIMPLE      | pr    | ref    | PRIMARY           | PRIMARY   | 4       | const             | 3646 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | r     | eq_ref | PRIMARY           | PRIMARY   | 4       | demo.pr.report_id |    1 |                                              |
|  1 | SIMPLE      | rs    | ref    | PRIMARY,report_id | report_id | 5       | demo.r.id         |  764 | Using where                                  |
|  1 | SIMPLE      | tr    | ref    | result_id         | result_id | 5       | demo.rs.id        |    1 | Using where                                  |
+----+-------------+-------+--------+-------------------+-----------+---------+-------------------+------+----------------------------------------------+

So I think that looks much better, but I'm not sure exactly how to tell. Also, the query still seems to take about the same amount of time as before.
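One rough way to compare the two plans is the rule of thumb from the MySQL manual: multiply the values in the rows column of each plan to estimate how many row combinations the server expects to examine. The figures below are copied from the EXPLAIN output above; they are optimizer estimates, not measurements, so treat this as a sketch rather than a verdict.

```sql
-- Product of the "rows" column for each plan (estimates taken from EXPLAIN).
SELECT
    152817 * 1 * 1 * 1 AS est_rows_original_plan,      -- rs, r, pr, tr
    3646 * 1 * 764 * 1 AS est_rows_straight_join_plan; -- pr, r, rs, tr
-- est_rows_original_plan      = 152817
-- est_rows_straight_join_plan = 2785544
```

By that measure the STRAIGHT_JOIN plan avoids the full scan of results but is not estimated to examine fewer rows overall, which would be consistent with the similar run time.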

I would use STRAIGHT_JOIN and go with your second query, which has the patients_reports table first, and only secondarily join to the patients table for the name info. Also, in case I missed it: is there an index on the patients_reports table on the PATIENT_ID column, either by itself or as the first element of a compound index key?
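A minimal sketch of that suggestion, reusing only the tables and columns from the question: patients_reports drives the join, and patients is joined right after it. STRAIGHT_JOIN asks MySQL to join the tables in the order they are listed, so whether this order is actually the fastest for your data is an assumption to check with EXPLAIN.

```sql
-- Drive the join from patients_reports, then pick up the patient row.
-- STRAIGHT_JOIN forces the join order to follow the FROM clause.
SELECT STRAIGHT_JOIN
       p.id, r.id AS report_id, tr.result_id,
       r.report_date, r.department, r.reportStatus, rs.specimen,
       tr.name, tr.value, tr.flag, tr.unit, tr.reference_range
FROM patients_reports AS pr
INNER JOIN
    patients AS p ON p.id = pr.patient_id
INNER JOIN
    reports AS r ON pr.report_id = r.id
INNER JOIN
    results AS rs ON r.id = rs.report_id
INNER JOIN
    test_results AS tr ON rs.id = tr.result_id
WHERE pr.patient_id = 17548
ORDER BY rs.specimen, tr.name, r.report_date;
```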

Additionally, ensure RESULTS has an index on Report_ID, and likewise that TEST_RESULTS has an index on Result_ID.
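Something along these lines should show which indexes exist and add the missing ones. The index names are assumptions chosen to match the column names; run the ALTER TABLE statements only for indexes that SHOW INDEX does not already list, since adding an index under an existing name fails.

```sql
-- List the existing indexes on each table involved in the joins.
SHOW INDEX FROM patients_reports;
SHOW INDEX FROM results;
SHOW INDEX FROM test_results;

-- Add the join-column indexes if they are missing.
-- Index names (report_id, result_id) are illustrative assumptions.
ALTER TABLE results ADD INDEX report_id (report_id);
ALTER TABLE test_results ADD INDEX result_id (result_id);
```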

Is results.report_id indexed? It looks like it's failing to find a key and doing a table scan. I'm assuming results.id is actually the primary key.

Also, if report_id were the primary key and the table is InnoDB, the table would be clustered on that index, so I have no clue why it wouldn't be screaming fast if it is configured that way.
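To check both assumptions (the storage engine and which column is the primary key), something like the following should do:

```sql
-- The Engine column shows the storage engine (for example InnoDB or MyISAM).
SHOW TABLE STATUS LIKE 'results';

-- Prints the full table definition, including PRIMARY KEY and other indexes.
SHOW CREATE TABLE results;
```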
