How to optimize a query using inner join

Question

My MySQL query is too slow and I don't know how to optimize it. My web app can't load this query because it takes too long to run and the web server has a time limit for getting the result.

    SELECT rc.trial_id,
    rc.created,
    rc.date_registration,
    rc.agemin_value,
    rc.agemin_unit,
    rc.agemax_value,
    rc.agemax_unit,
    rc.exclusion_criteria,
    rc.study_design,
    rc.expanded_access_program,
    rc.number_of_arms,
    rc.enrollment_start_actual,
    rc.target_sample_size,
    (select name from repository_institution where id = rc.primary_sponsor_id) as primary_sponsor,
    (select label from vocabulary_studytype where id = rc.study_type_id) as study_type,
    (select label from vocabulary_interventionassigment where id = rc.intervention_assignment_id) as intervention_assignment,
    (select label from vocabulary_studypurpose where id = rc.purpose_id) as study_purpose,
    (select label from vocabulary_studymasking where id = rc.masking_id) as study_mask,
    (select label from vocabulary_studyallocation where id = rc.allocation_id) as study_allocation,
    (select label from vocabulary_studyphase where id = rc.phase_id) as phase,
    (select label from vocabulary_recruitmentstatus where id = rc.recruitment_status_id) as recruitment_status,
    GROUP_CONCAT(vi.label)
    FROM repository_clinicaltrial rc
    inner JOIN repository_clinicaltrial_i_code rcic ON rcic.clinicaltrial_id = rc.id
    JOIN vocabulary_interventioncode vi ON vi.id = rcic.interventioncode_id
    GROUP BY rc.id;

Could using INNER JOIN instead of JOIN be a solution?

Answer 1

Changing from a separate subselect for every row to JOINs will definitely help. Also, since you are using MySQL, the keyword "STRAIGHT_JOIN" tells MySQL to join the tables in the order they are listed in the query. Since your "rc" table is the primary table and all the others are lookups, this makes MySQL start from "rc" rather than letting the optimizer pick some other lookup table as the basis for the rest of the joins.

    SELECT STRAIGHT_JOIN
        rc.trial_id,
        rc.created,
        rc.date_registration,
        rc.agemin_value,
        rc.agemin_unit,
        rc.agemax_value,
        rc.agemax_unit,
        rc.exclusion_criteria,
        rc.study_design,
        rc.expanded_access_program,
        rc.number_of_arms,
        rc.enrollment_start_actual,
        rc.target_sample_size,
        ri.name primary_sponsor,
        st.label study_type,
        via.label intervention_assignment,
        vsp.label study_purpose,
        vsm.label study_mask,
        vsa.label study_allocation,
        vsph.label phase,
        vrs.label recruitment_status,
        GROUP_CONCAT(vi.label) 
    FROM
        repository_clinicaltrial rc 
            JOIN repository_clinicaltrial_i_code rcic 
                ON rc.id = rcic.clinicaltrial_id
                JOIN vocabulary_interventioncode vi 
                    ON rcic.interventioncode_id = vi.id
            JOIN repository_institution ri
                on rc.primary_sponsor_id = ri.id
            JOIN vocabulary_studytype st
                on rc.study_type_id = st.id
            JOIN vocabulary_interventionassigment via 
                on rc.intervention_assignment_id = via.id
            JOIN vocabulary_studypurpose vsp 
                ON rc.purpose_id = vsp.id
            JOIN vocabulary_studymasking vsm 
                ON rc.masking_id = vsm.id
            JOIN vocabulary_studyallocation vsa 
                ON rc.allocation_id = vsa.id
            JOIN vocabulary_studyphase vsph
                ON rc.phase_id = vsph.id
            JOIN vocabulary_recruitmentstatus vrs 
                ON rc.recruitment_status_id = vrs.id 
    GROUP BY 
        rc.id;
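
One caveat with the rewrite above: the scalar subqueries in the original query return NULL when a foreign key such as rc.phase_id is NULL or has no match, while an inner JOIN drops the whole trial from the result. The question does not say whether any of these columns is nullable, so treat that as an assumption to check. If some are, a sketch along these lines (shortened to two lookups, with an illustrative alias for the concatenated column) keeps the original behaviour by using LEFT JOIN for the lookups:

```sql
-- Sketch only: assumes some lookup foreign keys on rc may be NULL.
-- LEFT JOIN keeps the trial row and yields NULL for the missing label,
-- which matches what the original scalar subqueries returned.
SELECT
        rc.trial_id,
        ri.name    AS primary_sponsor,
        vsph.label AS phase,
        GROUP_CONCAT(vi.label) AS intervention_codes  -- alias is illustrative
    FROM
        repository_clinicaltrial rc
            JOIN repository_clinicaltrial_i_code rcic
                ON rc.id = rcic.clinicaltrial_id
            JOIN vocabulary_interventioncode vi
                ON rcic.interventioncode_id = vi.id
            LEFT JOIN repository_institution ri
                ON rc.primary_sponsor_id = ri.id
            LEFT JOIN vocabulary_studyphase vsph
                ON rc.phase_id = vsph.id
    GROUP BY
        rc.id;
```

The two joins to the intervention-code tables stay as inner joins because the original query already used inner joins there.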

One final note. You are using a GROUP BY together with GROUP_CONCAT(), which is fine. However, a proper GROUP BY lists all non-aggregate columns, which in this case is every other column in the select list. You may know this, and also that the lookup values are always the same for a given "rc" row, but leaving those columns out of the GROUP BY is not good practice.
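
As a sketch of what this note describes, the GROUP BY would name every selected column that is not inside an aggregate. The query is shortened to a few columns for illustration, and the intervention_codes alias is made up; the full query would list all of its columns the same way. Written like this, the statement is also accepted when MySQL's ONLY_FULL_GROUP_BY SQL mode is enabled.

```sql
-- Shortened illustration: every non-aggregate column in the select list
-- also appears in the GROUP BY.
SELECT
        rc.trial_id,
        rc.created,
        ri.name  AS primary_sponsor,
        st.label AS study_type,
        GROUP_CONCAT(vi.label) AS intervention_codes  -- alias is illustrative
    FROM
        repository_clinicaltrial rc
            JOIN repository_clinicaltrial_i_code rcic
                ON rc.id = rcic.clinicaltrial_id
            JOIN vocabulary_interventioncode vi
                ON rcic.interventioncode_id = vi.id
            JOIN repository_institution ri
                ON rc.primary_sponsor_id = ri.id
            JOIN vocabulary_studytype st
                ON rc.study_type_id = st.id
    GROUP BY
        rc.id,          -- keeps one result row per trial
        rc.trial_id,
        rc.created,
        ri.name,
        st.label;
```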

Answer 2

Your joins and subqueries are probably not the problem. Assuming you have correct indexes on the tables, they are fast. "Correct indexes" here means that the id column of each table is its primary key, which is a very reasonable assumption.
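
One way to check that assumption, as a rough sketch: SHOW INDEX lists the indexes on a table, and EXPLAIN shows how MySQL would access the table for a lookup by id. The value 1 is a placeholder; use an id that actually exists in the table.

```sql
-- A primary key appears in the output with Key_name = 'PRIMARY'.
SHOW INDEX FROM vocabulary_studytype;
SHOW INDEX FROM repository_institution;

-- For a lookup by primary key, the plan typically reports key = PRIMARY.
EXPLAIN SELECT label FROM vocabulary_studytype WHERE id = 1;
```

Repeating the SHOW INDEX check for the remaining lookup tables covers every subquery in the original statement.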

My guess is that the GROUP BY is the performance issue. So, I would suggest structuring the query with no GROUP BY:

    select . . .
           (select group_concat(vi.label)
            from repository_clinicaltrial_i_code rcic join
                 vocabulary_interventioncode vi
                 on vi.id = rcic.interventioncode_id
            where rcic.clinicaltrial_id = rc.id
           )
    from repository_clinicaltrial rc;

For this, you want indexes on:

repository_clinicaltrial_i_code(clinicaltrial_id, interventioncode_id)
vocabulary_interventioncode(id, label)
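
A minimal sketch of creating these two indexes follows. The index names are made up for illustration, and the statements assume label is a column type that can be indexed in full (a TEXT column would need a prefix length).

```sql
-- Index names are illustrative; pick names that fit your conventions.
-- Lets the correlated subquery find a trial's codes from the index alone.
CREATE INDEX idx_rcic_trial_code
    ON repository_clinicaltrial_i_code (clinicaltrial_id, interventioncode_id);

-- Covers the id -> label lookup.
CREATE INDEX idx_vi_id_label
    ON vocabulary_interventioncode (id, label);
```

It is worth running SHOW INDEX on both tables first, since an equivalent index may already exist. If the tables use InnoDB and id is the primary key, the clustered primary key already stores label alongside id, so the second index may add little.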
