
Optimizing sub-queries, making two queries become one

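In this example, the query below performs a member search using only the last name. When searching for a full matching name, the query returns within a few seconds; but if :LastName = 'S', the query takes more than 12 seconds to return.

How can I speed this query up? If I can get the result with two queries in about a second, shouldn't I be able to write a single query that is just as fast? Because of plugins and other methods I use, it is easiest for me to have this as one query, hence my question.

The Member table contains every member we have ever had. Some of those members have no registration with us, so they exist only in this table and not in Registration or Registration_History. Registration_History has extra information on most of the members that I want to display. Registration has mostly the same information as RH (RH has some fields that Reg does not), but it sometimes has members that RH does not, which is why it is joined here. EDIT: Members can have multiple rows in Registration. I want to fill in the columns from Registration_History; however, some old members exist only in Registration. Unlike other members, these old members have only 1 row in Registration, so I do not need to worry about how Registration is sorted; I can just grab 1 row from there.

SQL Fiddle with sample database design

MemberID is indexed in all 3 tables. Before I put in the SELECT RHSubSelect.rehiId sub-query, this query took almost a full minute to return.

If I split the query into 2 queries, first doing this: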

SELECT
    MemberID
FROM
    Member
WHERE 
    Member.LastName LIKE CONCAT('%', :LastName, '%')

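and then putting those MemberIDs into an array and passing that array to RHSubSelect.MemberID IN ($theArray) (instead of the Member sub-query), the results come back very quickly (about a second).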
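For illustration, the second step of that two-query approach might look roughly like the sketch below. The values 101, 102 and 103 are placeholders standing in for whatever the first query returned in $theArray; they are not taken from the sample data.

```sql
-- Step 2 of the two-query approach (sketch only).
-- 101, 102, 103 are placeholder MemberIDs standing in for $theArray.
SELECT
    RHSubSelect.rehiId
FROM
    Registration_History AS RHSubSelect
WHERE
    RHSubSelect.MemberID IN (101, 102, 103)
ORDER BY
    RHSubSelect.EffectiveDate DESC
LIMIT 0, 1;
```

With a literal list, the optimizer can look the IDs up directly through the MemberID index instead of re-evaluating the LIKE sub-query.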

Full query (the complete SELECT list is in the Fiddle; SELECT * is used here for brevity):

SELECT
    *
FROM
    Member
    LEFT JOIN
        Registration_History FORCE INDEX (PRIMARY)
            ON
                Registration_History.rehiId = (
                                                SELECT
                                                    RHSubSelect.rehiId
                                                FROM
                                                    Registration_History AS RHSubSelect
                                                WHERE
                                                    RHSubSelect.MemberID IN (
                                                                                SELECT
                                                                                    Member.MemberID
                                                                                FROM
                                                                                    Member
                                                                                WHERE 
                                                                                    Member.LastName LIKE CONCAT('%', :LastName, '%')
                                                                            )                                                                   
                                                ORDER BY 
                                                    RHSubSelect.EffectiveDate DESC
                                                LIMIT 0, 1
                                            )                                   
    LEFT JOIN
        Registration FORCE INDEX(MemberID)
            ON
                Registration.MemberID = Member.MemberID
WHERE 
    Member.LastName LIKE CONCAT('%', :LastName, '%') 
GROUP BY
    Member.MemberID
ORDER BY 
    Relevance ASC,LastName ASC,FirstName asc 
LIMIT 0, 1000

MySQL EXPLAIN output, with FORCE INDEX() in the query:

(If the image with the EXPLAIN output does not display, it is also available here: http://oi41.tinypic.com/2iw4t8l.jpg)

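The main thing you seem to be checking is the last name, with a pattern that starts with %. A leading wildcard stops the index on that column from being used for a lookup, and your SQL performs that search twice.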
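To illustrate the point about the leading wildcard, the two predicates below can be compared with EXPLAIN. This is only a sketch: it assumes an index on Member.LastName, given the hypothetical name idx_member_lastname here, whereas the question only states that MemberID is indexed.

```sql
-- Assumption: an index on LastName exists (the index name is hypothetical).
CREATE INDEX idx_member_lastname ON Member (LastName);

-- Prefix match: the optimizer can do a range scan on the index.
EXPLAIN SELECT MemberID FROM Member WHERE LastName LIKE 'S%';

-- Leading wildcard: there is no prefix to seek on, so every LastName
-- value has to be examined (a full table scan or a full index scan).
EXPLAIN SELECT MemberID FROM Member WHERE LastName LIKE '%S%';
```

Comparing the type and rows columns of the two EXPLAIN results should show the difference on real data.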

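I am not sure exactly what you are trying to do. Your SQL appears to find all members whose name matches, and then fetch the single latest registration_history record across those members. That record could belong to any one of the matching members, which looks odd unless you only ever expect to match a single member.

If that is the case, the following minor tidy-up (removing the IN and changing it to a JOIN) might improve things slightly.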

SELECT
    COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
    COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
    Member.MemberID,
    Member.LastName,
    Member.FirstName,
    CASE
        WHEN Member.LastNameTrimmed = :LastName
        THEN 1
        WHEN Member.LastNameTrimmed LIKE CONCAT(:LastName, '%')
        THEN 2
        ELSE 3
    END AS Relevance 
    FROM Member
    LEFT JOIN Registration_History FORCE INDEX (PRIMARY)
    ON Registration_History.rehiId = 
    (
        SELECT RHSubSelect.rehiId
        FROM Registration_History AS RHSubSelect
        INNER JOIN Member 
        ON RHSubSelect.MemberID = Member.MemberID
        WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
        ORDER BY RHSubSelect.EffectiveDate DESC
        LIMIT 0, 1
    )                                   
    LEFT JOIN Registration FORCE INDEX(MemberID)
    ON  Registration.MemberID = Member.MemberID
    WHERE Member.LastName LIKE CONCAT('%', :LastName, '%') 
    GROUP BY Member.MemberID
    ORDER BY Relevance ASC,LastName ASC,FirstName asc 
    LIMIT 0, 1000

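However, if that is not what you want, further changes may be possible.

Cleaning up a bit more, and eliminating one of the LIKEs with a leading wildcard: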

SELECT
    COALESCE(NULLIF(Sub2.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
    COALESCE(NULLIF(Sub2.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
    Member.MemberID,
    Member.LastName,
    Member.FirstName,
    CASE
        WHEN Member.LastNameTrimmed = :LastName
        THEN 1
        WHEN Member.LastNameTrimmed LIKE CONCAT(:LastName, '%')
        THEN 2
        ELSE 3
    END AS Relevance 
FROM Member
LEFT OUTER JOIN Registration 
ON  Registration.MemberID = Member.MemberID
LEFT OUTER JOIN
(
    SELECT Registration_History.MemberID, Registration_History.rehiID, Registration_History.RegYear, Registration_History.RegNumber
    FROM Registration_History
    INNER JOIN
    (
        SELECT RHSubSelect.MemberID, MAX(RHSubSelect.EffectiveDate) AS EffectiveDate
        FROM Registration_History AS RHSubSelect
        GROUP BY RHSubSelect.MemberID
    ) Sub1
    ON Registration_History.MemberID = Sub1.MemberID AND Registration_History.EffectiveDate = Sub1.EffectiveDate
) Sub2
ON  Sub2.MemberID = Member.MemberID
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%') 
GROUP BY Member.MemberID
ORDER BY Relevance ASC,LastName ASC,FirstName asc 
LIMIT 0, 1000

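This gets all members with a matching name, their matching registration records, and their registration_history record with the latest effective date.

I think the final GROUP BY is unnecessary (assuming a 1 to 1 relationship between Member and Registration; if that is not the case, you probably want something other than GROUP BY), but I have left it in for now.

I am afraid that without table declarations and some sample data I cannot really test it.

EDIT: Having a play, trying to reduce the number of rows processed early on in the select: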

SELECT
    COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Sub1.Year, '')) AS RegYear,
    COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Sub1.RegNumber, ''), NULLIF(Sub1.MemberID, '')) AS RegNumber,
    Sub1.MemberID,
    Sub1.LastName,
    Sub1.FirstName,
    CASE
        WHEN Sub1.LastName = :LastName
        THEN 1
        WHEN Sub1.LastName LIKE CONCAT(:LastName, '%')
        THEN 2
        ELSE 3
    END AS Relevance 
FROM
(
    SELECT 
        Member.MemberID,
        Member.LastName,
        Member.FirstName,
        Registration.Year,
        Registration.RegNumber,
        MAX(Registration_History.EffectiveDate) AS EffectiveDate
    FROM Member
    LEFT OUTER JOIN Registration 
    ON  Registration.MemberID = Member.MemberID
    LEFT OUTER JOIN Registration_History 
    ON Registration_History.MemberID = Member.MemberID
    WHERE Member.LastName LIKE CONCAT('%', :LastName, '%') 
    GROUP BY Member.MemberID,
        Member.LastName,
        Member.FirstName,
        Registration.Year,
        Registration.RegNumber
) Sub1
LEFT OUTER JOIN Registration_History
ON Registration_History.MemberID = Sub1.MemberID AND Registration_History.EffectiveDate = Sub1.EffectiveDate
ORDER BY Relevance ASC,LastName ASC,FirstName asc 
LIMIT 0, 1000

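Edited again.

Give this a try. The columns you are sorting on all come from the Member table, so it may make sense to exclude as many rows as possible, as early as possible, in a sub-select.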

SELECT
    COALESCE(NULLIF(Registration_History2.EffectiveDate, ''), NULLIF(Registration2.Year, '')) AS RegYear,
    COALESCE(NULLIF(Registration_History2.RegNumber, ''), NULLIF(Registration2.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
    Member.MemberID,
    Member.LastName,
    Member.FirstName,
    Member.Relevance 
    FROM
    (
        SELECT Member.MemberID,
                Member.LastName,
                Member.FirstName,
                CASE
                    WHEN Member.LastName = :LastName
                    THEN 1
                    WHEN Member.LastName LIKE CONCAT(:LastName, '%')
                    THEN 2
                    ELSE 3
                END AS Relevance 
        FROM Member
        WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
        ORDER BY Relevance ASC,LastName ASC,FirstName asc 
        LIMIT 0, 1000
    ) Member
    LEFT OUTER JOIN 
    (
        SELECT MemberID, MAX(EffectiveDate) AS EffectiveDate
        FROM Registration_History 
        GROUP BY MemberID
    ) Registration_History
    ON Registration_History.MemberID = Member.MemberID
    LEFT OUTER JOIN Registration_History Registration_History2
    ON Registration_History2.MemberID = Registration_History.MemberID
    AND Registration_History2.EffectiveDate = Registration_History.EffectiveDate
    LEFT OUTER JOIN 
    (
        SELECT MemberID, MAX(Year) AS Year
        FROM Registration 
        GROUP BY MemberID
    ) Registration
    ON Registration.MemberID = Member.MemberID
    LEFT OUTER JOIN 
    (
        SELECT MemberID, Year, MAX(RegNumber) AS RegNumber
        FROM Registration 
        GROUP BY MemberID, Year
    ) Registration2
    ON Registration2.MemberID = Member.MemberID
    AND Registration2.Year = Registration.Year

Edited again.

I have not tested the following, so it is just another idea for approaching the problem, using a little trick with GROUP_CONCAT:

SELECT
    COALESCE(NULLIF(Registration_History.EffectiveDate, ''), NULLIF(Registration.Year, '')) AS RegYear,
    COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
    Member.MemberID,
    Member.LastName,
    Member.FirstName,
    Member.Relevance 
    FROM
    (
        SELECT Member.MemberID,
                Member.LastName,
                Member.FirstName,
                CASE
                    WHEN Member.LastName = :LastName
                    THEN 1
                    WHEN Member.LastName LIKE CONCAT(:LastName, '%')
                    THEN 2
                    ELSE 3
                END AS Relevance 
        FROM Member
        WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
        ORDER BY Relevance ASC,LastName ASC,FirstName asc 
        LIMIT 0, 1000
    ) Member
    LEFT OUTER JOIN 
    (
        SELECT MemberID, 
                SUBSTRING_INDEX(GROUP_CONCAT(EffectiveDate ORDER BY EffectiveDate DESC), ',', 1) AS EffectiveDate,
                SUBSTRING_INDEX(GROUP_CONCAT(RegNumber ORDER BY EffectiveDate DESC), ',', 1) AS RegNumber
        FROM Registration_History 
        GROUP BY MemberID
    ) Registration_History
    ON Registration_History.MemberID = Member.MemberID
    LEFT OUTER JOIN 
    (
        SELECT MemberID, 
                SUBSTRING_INDEX(GROUP_CONCAT(Year ORDER BY Year DESC), ',', 1) AS Year,
                SUBSTRING_INDEX(GROUP_CONCAT(RegNumber ORDER BY Year DESC), ',', 1) AS RegNumber
        FROM Registration 
        GROUP BY MemberID
    ) Registration
    ON Registration.MemberID = Member.MemberID

My suggestion is a query like this:

SELECT *
FROM Member
LEFT JOIN Registration USING (MemberID)
LEFT JOIN Registration_History ON rehiID = (
  SELECT rehiID
  FROM Registration_History AS RHSubSelect
  WHERE RHSubSelect.MemberID = Member.MemberID
  ORDER BY EffectiveDate DESC
  LIMIT 1
)
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')

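The way it works is that it starts by selecting from the Member table, matching on LastName. You can then simply LEFT JOIN the Registration table, since a particular member can have at most 1 entry in that table. Finally, you LEFT JOIN the Registration_History table by way of a sub-select.

The sub-select looks for the most recent EffectiveDate matching the current MemberID and returns the rehiID of that record. The LEFT JOIN must then match that rehiID exactly. If there is no entry in Registration_History for the member, nothing is joined.

In theory, this should be relatively fast, because the LIKE comparison is only performed in the main query. The Registration join should be quick, since the table is indexed on MemberID. However, I suspect you will need an additional index on Registration_History to get the best performance.

You already have the primary key rehiID, which is indexed, and that is what we need for the LEFT JOIN on rehiID. However, the sub-query needs to match MemberID in the WHERE clause as well as sort by EffectiveDate. For best performance, I think you need an additional index that combines the MemberID and EffectiveDate columns.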
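A composite index of that kind could be created roughly as follows. This is a sketch: the index name idx_rh_member_effdate is hypothetical, and 42 is a placeholder MemberID used only to inspect the plan.

```sql
-- Hypothetical index name; the columns are the ones the sub-select
-- filters on (MemberID) and sorts by (EffectiveDate).
CREATE INDEX idx_rh_member_effdate
    ON Registration_History (MemberID, EffectiveDate);

-- Check whether the sub-select picks up the new index.
-- 42 is a placeholder MemberID.
EXPLAIN
SELECT rehiID
FROM Registration_History
WHERE MemberID = 42
ORDER BY EffectiveDate DESC
LIMIT 1;
```

Putting MemberID first lets the equality match narrow the range, and EffectiveDate second lets the rows within that range be read in sorted order rather than sorted afterwards. If the table uses InnoDB and rehiID is the primary key, the secondary index also carries rehiID, so this sub-select may be answerable from the index alone.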

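Note that my sample query is just the bare minimum, to keep things simple. Obviously you will need to replace * with all the fields you want returned (the same as in the original query). You will also need to add the ORDER BY and LIMIT clauses. The GROUP BY, however, should not be needed.

SQL Fiddle link: http://sqlfiddle.com/#!2/4a947a/1

The fiddle above shows the full query, except that the last name is hard-coded. I have modified your original sample data to include more records and changed some of the values. I have also added the additional index on the Registration_History table.

Optimizing for LIMIT

If you plan to run the timings again, I would be curious to see how the query performs when it first sub-selects from the Member table, using the modification suggested by Kickstart, before joining the Registration and Registration_History tables.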

SELECT
    COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
    COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
    Member.MemberID,
    Member.LastName,
    Member.FirstName,
    Member.Relevance
FROM (
  SELECT MemberID, LastName, FirstName,
    CASE
      WHEN Member.LastNameTrimmed = :LastName THEN 1
      WHEN Member.LastNameTrimmed LIKE CONCAT(:LastName, '%') THEN 2
      ELSE 3
    END AS Relevance 
  FROM Member
  WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
  ORDER BY Relevance ASC,LastName ASC,FirstName ASC
  LIMIT 0, 1000
) Member
LEFT JOIN Registration USING (MemberID)
LEFT JOIN Registration_History ON rehiID = (
  SELECT rehiID
  FROM Registration_History AS RHSubSelect
  WHERE RHSubSelect.MemberID = Member.MemberID
  ORDER BY EffectiveDate DESC
  LIMIT 1
)

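When LIMIT is used, this should perform noticeably better than my original query, because it does not have to perform a bunch of unnecessary joins for records that the LIMIT excludes.

If I understand your problem correctly, you just need to select particular users together with their most recent history record, right? If so, your problem is actually a very easy variant of the greatest-n-per-group problem. No sub-query is needed:

Query #1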

SELECT Member.*, rh1.*
FROM Member
LEFT JOIN Registration_History AS rh1 USING (MemberID)
LEFT JOIN Registration_History AS rh2
    ON rh1.MemberId = rh2.MemberId AND rh1.EffectiveDate < rh2.EffectiveDate
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%') 
    AND rh2.MemberId IS NULL
ORDER BY Relevance ASC,LastName ASC,FirstName ASC
LIMIT 0, 1000

Query #3

(#2 has been deleted; #3 keeps its number to avoid confusion in the comments.)

SELECT Member.*, max(rh1.EffectiveDate), rh1.*
FROM Member
LEFT JOIN Registration_History AS rh1 USING (MemberID)
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%') 
GROUP BY Member.MemberID
ORDER BY Relevance ASC,LastName ASC,FirstName ASC
LIMIT 0, 1000

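Query #4

This is inspired by James's query, but with the limit and order by removed from the sub-query. (Note that you should define an index on EffectiveDate, not only for this query but for all of them, so that they run efficiently!)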

select *
from Member
left join Registration_History AS rh1 on rh1.MemberID = Member.MemberID
    and rh1.EffectiveDate = (select max(rh2.EffectiveDate)
                             from Registration_History as rh2
                             where rh2.MemberID = Member.MemberID)
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%') 
ORDER BY Relevance ASC,LastName ASC,FirstName ASC
LIMIT 0, 1000

Please post the actual durations measured on your database!

Try this query:

set @lastname = 'Smith1';

-- explain extended
SELECT  
    COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
    COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
    Member.MemberID,
    Member.LastName,
    Member.FirstName,
    CASE
      WHEN Member.LastNameTrimmed = @lastname THEN 1
      WHEN Member.LastNameTrimmed LIKE CONCAT(@lastname, '%') THEN 2
      ELSE 3
    END AS Relevance 
FROM (
    SELECT  Member.*,
        ( SELECT RHSubSelect.rehiId
            FROM  Registration_History AS RHSubSelect
            WHERE RHSubSelect.MemberID = Member.MemberID                                         
            ORDER BY RHSubSelect.EffectiveDate DESC
            LIMIT 0,1
         ) rh_MemberId
    FROM Member
    WHERE Member.LastName LIKE CONCAT('%', @lastname, '%')
) Member
LEFT JOIN  Registration_History 
    ON Registration_History.rehiId = Member.rh_MemberId
LEFT JOIN Registration -- FORCE INDEX(MemberID)
    ON Registration.MemberID = Member.MemberID
GROUP BY Member.MemberID
ORDER BY Relevance ASC,LastName ASC,FirstName asc 
LIMIT 0, 1000
;

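OK, here is my shot at it, and I have used a number of tricks. First, I had to pull out the "Relevance" field, since you did not indicate how it was meant to work. Next, since you want the most recent entry in the registration history for a given member (if one exists in R/H), it appears that the effective date is correlated with the ReHiID, so I used that, as it looked like a good key to work with for the subsequent left join.

So, the inner query does a preliminary pass on just the name criteria you are looking for, applies the relevance, and limits the result to 1000 entries there. That way, it does not have to carry 20,000 entries through the joins at the outer level, only the 1000 that qualify.

That result is then left-joined to the other tables as indicated: to Registration for just one entry (if it exists), and to R/H on the member and the max ReHiID.

To apply the name you are looking for, just change the (select @LookForMe := 'S') sqlvars line in the query...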

select *
   from
      ( select
              M.*,
              max( RH.EffectiveDate ) as MaxEffectiveDate,
              max( R.RegNumber ) as MaxRegNumber,
              CASE WHEN M.LastNameTrimmed = @LookForMe THEN 1
              WHEN M.LastNameTrimmed LIKE CONCAT(@LookForMe, '%') THEN 2
              ELSE 3 END AS Relevance 
           from
              ( select @LookForMe := 'S' ) sqlvars,
              Member M
                 LEFT JOIN Registration_History RH
                    on M.MemberID = RH.MemberID
                 LEFT JOIN Registration R
                    on M.MemberID = R.MemberID
           where 
              M.LastName LIKE CONCAT('%', @LookForMe, '%')
           group by
              M.MemberID
           order by
              Relevance, 
              M.LastName,
              M.FirstName
           limit
              0,1000 ) PreQuery
      LEFT JOIN Registration R2
         ON PreQuery.MemberID = R2.MemberID
         AND PreQuery.MaxRegNumber = R2.RegNumber
      LEFT JOIN Registration_History RH2
         ON PreQuery.MemberID = RH2.MemberID
         AND PreQuery.MaxEffectiveDate = RH2.EffectiveDate

Let's see how quickly this runs against your production data, and how close we are.

