Optimizing sub-queries, making two queries become one
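In this example, the following query performs a member search using only the last name. When searching for a full matching name, the query returns within a few seconds; but if :LastName = 'S', it takes more than 12 seconds to return.
How can I speed this query up? If I can do it with two queries in about a second, shouldn't I be able to make a single query that is just as fast? Because of plugins and other constraints, having this as one query is easiest for me, hence my question.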
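The Member table contains every member we have ever had. Some of those members have no registration on record, so they exist only in this table and not in Registration or Registration_History. Registration_History holds extra information on most of the members I want to display. Registration holds mostly the same information as RH (RH has some fields that Reg does not), but it sometimes contains members that RH does not, which is why it is joined here.
Edit: A member can have multiple rows in Registration. I want to fill the columns from Registration_History, but some legacy members exist only in Registration. Unlike other members, these legacy members have exactly 1 row in Registration, so I do not need to worry about how Registration is sorted; I can just grab 1 row from there.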
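MemberID is indexed in all 3 tables. Before I added the SELECT RHSubSelect.rehiId subquery, this query took almost a full minute to return.
If I split the query into 2 queries, first running this: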
SELECT
MemberID
FROM
Member
WHERE
Member.LastName LIKE CONCAT('%', :LastName, '%')
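and then putting those MemberIDs into an array and passing that array to RHSubSelect.MemberID IN ($theArray) (in place of the Member subquery), the results come back very quickly (in about a second).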
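The two-step approach described above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in so it runs without a MySQL server, the tables carry just the columns needed, the rows are made up, and the names make_demo_db and latest_history_for_last_name are hypothetical.

```python
import sqlite3

def make_demo_db():
    """Build a tiny in-memory stand-in for Member and Registration_History (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Member (MemberID INTEGER PRIMARY KEY, LastName TEXT);
        CREATE TABLE Registration_History (
            rehiId INTEGER PRIMARY KEY, MemberID INTEGER, EffectiveDate TEXT);
        INSERT INTO Member VALUES (1, 'Smith'), (2, 'Smithson'), (3, 'Jones');
        INSERT INTO Registration_History VALUES
            (10, 1, '2012-01-01'), (11, 1, '2013-05-01'),
            (12, 2, '2013-06-01'), (13, 3, '2014-01-01');
    """)
    return conn

def latest_history_for_last_name(conn, last_name):
    """Two-step lookup: matching MemberIDs first, then the newest history row among them."""
    # Step 1: the wildcard search runs once, against Member only.
    ids = [row[0] for row in conn.execute(
        "SELECT MemberID FROM Member WHERE LastName LIKE '%' || ? || '%'",
        (last_name,))]
    if not ids:
        return ids, None
    # Step 2: bind the IDs as parameters rather than pasting them into the SQL text.
    placeholders = ", ".join("?" * len(ids))
    row = conn.execute(
        "SELECT rehiId FROM Registration_History "
        f"WHERE MemberID IN ({placeholders}) "
        "ORDER BY EffectiveDate DESC LIMIT 1", ids).fetchone()
    return ids, (row[0] if row else None)

conn = make_demo_db()
print(latest_history_for_last_name(conn, "Smi"))
```

Binding one placeholder per ID keeps the second statement parameterized. Database drivers cap the number of bound parameters, so a very broad search such as 'S' may need the ID list split into batches.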
Full query: (the complete SELECT list is in the fiddle; SELECT * is used here for brevity)
SELECT
*
FROM
Member
LEFT JOIN
Registration_History FORCE INDEX (PRIMARY)
ON
Registration_History.rehiId = (
SELECT
RHSubSelect.rehiId
FROM
Registration_History AS RHSubSelect
WHERE
RHSubSelect.MemberID IN (
SELECT
Member.MemberID
FROM
Member
WHERE
Member.LastName LIKE CONCAT('%', :LastName, '%')
)
ORDER BY
RHSubSelect.EffectiveDate DESC
LIMIT 0, 1
)
LEFT JOIN
Registration FORCE INDEX(MemberID)
ON
Registration.MemberID = Member.MemberID
WHERE
Member.LastName LIKE CONCAT('%', :LastName, '%')
GROUP BY
Member.MemberID
ORDER BY
Relevance ASC,LastName ASC,FirstName asc
LIMIT 0, 1000
MySQL EXPLAIN output, with FORCE INDEX() in the query:
(If the image with the EXPLAIN output does not display, it is also available here: http://oi41.tinypic.com/2iw4t8l.jpg)
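The main thing you seem to be checking is the last name, with a pattern that starts with %. A leading wildcard prevents the index on that column from being used, and your SQL performs that search twice.
I am not sure exactly what you are trying to do. Your SQL appears to find all members whose name matches, and then fetch the single latest registration_history record across those members. That record could belong to any one of the matching members, which seems odd unless you only ever expect one member to match.
If that is the case, the following minor tidy-up (removing the IN and changing it to a JOIN) may improve things slightly.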
SELECT
COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
Member.MemberID,
Member.LastName,
Member.FirstName,
CASE
WHEN Member.LastNameTrimmed = :LastName
THEN 1
WHEN Member.LastNameTrimmed LIKE CONCAT(:LastName, '%')
THEN 2
ELSE 3
END AS Relevance
FROM Member
LEFT JOIN Registration_History FORCE INDEX (PRIMARY)
ON Registration_History.rehiId =
(
SELECT RHSubSelect.rehiId
FROM Registration_History AS RHSubSelect
INNER JOIN Member
ON RHSubSelect.MemberID = Member.MemberID
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
ORDER BY RHSubSelect.EffectiveDate DESC
LIMIT 0, 1
)
LEFT JOIN Registration FORCE INDEX(MemberID)
ON Registration.MemberID = Member.MemberID
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
GROUP BY Member.MemberID
ORDER BY Relevance ASC,LastName ASC,FirstName asc
LIMIT 0, 1000
However, if that is not what you want, further changes may be possible.
A bit more cleaning up, eliminating one of the uses of the leading wildcard:
SELECT
COALESCE(NULLIF(Sub2.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
COALESCE(NULLIF(Sub2.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
Member.MemberID,
Member.LastName,
Member.FirstName,
CASE
WHEN Member.LastNameTrimmed = :LastName
THEN 1
WHEN Member.LastNameTrimmed LIKE CONCAT(:LastName, '%')
THEN 2
ELSE 3
END AS Relevance
FROM Member
LEFT OUTER JOIN Registration
ON Registration.MemberID = Member.MemberID
LEFT OUTER JOIN
(
SELECT Registration_History.MemberID, Registration_History.rehiID, Registration_History.RegYear, Registration_History.RegNumber
FROM Registration_History
INNER JOIN
(
SELECT RHSubSelect.MemberID, MAX(RHSubSelect.EffectiveDate) AS EffectiveDate
FROM Registration_History AS RHSubSelect
GROUP BY RHSubSelect.MemberID
) Sub1
ON Registration_History.MemberID = Sub1.MemberID AND Registration_History.EffectiveDate = Sub1.EffectiveDate
) Sub2
ON Sub2.MemberID = Member.MemberID
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
GROUP BY Member.MemberID
ORDER BY Relevance ASC,LastName ASC,FirstName asc
LIMIT 0, 1000
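This returns all members with a matching name, their matching registration records, and their registration_history record with the latest EffectiveDate.
I think the final GROUP BY is unnecessary (assuming a 1-to-1 relationship between Member and Registration; if that is not the case, you probably want something other than GROUP BY), but I have left it in for now.
I am afraid that without the table declarations and some sample data I cannot really test it.
EDIT: had a play with it, trying to reduce the number of rows being processed early on in the selects: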
SELECT
COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Sub1.Year, '')) AS RegYear,
COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Sub1.RegNumber, ''), NULLIF(Sub1.MemberID, '')) AS RegNumber,
Sub1.MemberID,
Sub1.LastName,
Sub1.FirstName,
CASE
WHEN Sub1.LastName = :LastName
THEN 1
WHEN Sub1.LastName LIKE CONCAT(:LastName, '%')
THEN 2
ELSE 3
END AS Relevance
FROM
(
SELECT
Member.MemberID,
Member.LastName,
Member.FirstName,
Registration.Year,
Registration.RegNumber,
MAX(Registration_History.EffectiveDate) AS EffectiveDate
FROM Member
LEFT OUTER JOIN Registration
ON Registration.MemberID = Member.MemberID
LEFT OUTER JOIN Registration_History
ON Registration_History.MemberID = Member.MemberID
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
GROUP BY Member.MemberID,
Member.LastName,
Member.FirstName,
Registration.Year,
Registration.RegNumber
) Sub1
LEFT OUTER JOIN Registration_History
ON Registration_History.MemberID = Sub1.MemberID AND Registration_History.EffectiveDate = Sub1.EffectiveDate
ORDER BY Relevance ASC,LastName ASC,FirstName asc
LIMIT 0, 1000
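EDIT again.
Give this a try. The items you sort on all come from the Member table, so it may make sense to exclude as many rows as possible early, in a sub-select.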
SELECT
COALESCE(NULLIF(Registration_History2.EffectiveDate, ''), NULLIF(Registration2.Year, '')) AS RegYear,
COALESCE(NULLIF(Registration_History2.RegNumber, ''), NULLIF(Registration2.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
Member.MemberID,
Member.LastName,
Member.FirstName,
Member.Relevance
FROM
(
SELECT Member.MemberID,
Member.LastName,
Member.FirstName,
CASE
WHEN Member.LastName = :LastName
THEN 1
WHEN Member.LastName LIKE CONCAT(:LastName, '%')
THEN 2
ELSE 3
END AS Relevance
FROM Member
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
ORDER BY Relevance ASC,LastName ASC,FirstName asc
LIMIT 0, 1000
) Member
LEFT OUTER JOIN
(
SELECT MemberID, MAX(EffectiveDate) AS EffectiveDate
FROM Registration_History
GROUP BY MemberID
) Registration_History
ON Registration_History.MemberID = Member.MemberID
LEFT OUTER JOIN Registration_History Registration_History2
ON Registration_History2.MemberID = Registration_History.MemberID
AND Registration_History2.EffectiveDate = Registration_History.EffectiveDate
LEFT OUTER JOIN
(
SELECT MemberID, MAX(Year) AS Year
FROM Registration
GROUP BY MemberID
) Registration
ON Registration.MemberID = Member.MemberID
LEFT OUTER JOIN
(
SELECT MemberID, Year, MAX(RegNumber) AS RegNumber
FROM Registration
GROUP BY MemberID, Year
) Registration2
ON Registration2.MemberID = Member.MemberID
AND Registration2.Year = Registration.Year
EDIT again.
I have not tested the following, so it is only an idea for another way to approach the problem. It uses a small trick with GROUP_CONCAT:
SELECT
COALESCE(NULLIF(Registration_History.EffectiveDate, ''), NULLIF(Registration.Year, '')) AS RegYear,
COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
Member.MemberID,
Member.LastName,
Member.FirstName,
Member.Relevance
FROM
(
SELECT Member.MemberID,
Member.LastName,
Member.FirstName,
CASE
WHEN Member.LastName = :LastName
THEN 1
WHEN Member.LastName LIKE CONCAT(:LastName, '%')
THEN 2
ELSE 3
END AS Relevance
FROM Member
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
ORDER BY Relevance ASC,LastName ASC,FirstName asc
LIMIT 0, 1000
) Member
LEFT OUTER JOIN
(
SELECT MemberID,
SUBSTRING_INDEX(GROUP_CONCAT(EffectiveDate ORDER BY EffectiveDate DESC), ",", 1) AS EffectiveDate,
SUBSTRING_INDEX(GROUP_CONCAT(RegNumber ORDER BY EffectiveDate DESC), ",", 1) AS RegNumber
FROM Registration_History
GROUP BY MemberID
) Registration_History
ON Registration_History.MemberID = Member.MemberID
LEFT OUTER JOIN
(
SELECT MemberID,
SUBSTRING_INDEX(GROUP_CONCAT(Year ORDER BY Year DESC), ",", 1) AS Year,
SUBSTRING_INDEX(GROUP_CONCAT(RegNumber ORDER BY Year DESC), ",", 1) AS RegNumber
FROM Registration
GROUP BY MemberID
) Registration
ON Registration.MemberID = Member.MemberID
My suggestion is a query like this:
SELECT *
FROM Member
LEFT JOIN Registration USING (MemberID)
LEFT JOIN Registration_History ON rehiID = (
SELECT rehiID
FROM Registration_History AS RHSubSelect
WHERE RHSubSelect.MemberID = Member.MemberID
ORDER BY EffectiveDate DESC
LIMIT 1
)
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
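It works by starting from the rows of the Member table that match LastName. You can then simply LEFT JOIN the Registration table, because a given member can have at most 1 entry in that table. Finally, you LEFT JOIN the Registration_History table through a sub-select.
The sub-select finds the row with the most recent EffectiveDate for the current MemberID and returns that record's rehiID. The LEFT JOIN must then match that rehiID exactly. If the member has no entry in Registration_History, nothing is joined.
In theory this should be relatively fast, because the LIKE comparison is performed only in the main query. The Registration join should be fast because that table is indexed on MemberID. However, I suspect you will need an additional index on Registration_History to get the best performance.
You already have the primary key, rehiID, which is indexed, and that is what the LEFT JOIN on rehiID needs. The sub-query, however, has to match MemberID in its WHERE clause and also sort by EffectiveDate. For best performance, I think you need an additional index that combines the MemberID and EffectiveDate columns.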
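A minimal sketch of the combined index and the correlated sub-select, under some assumptions: Python's built-in sqlite3 module stands in for MySQL so the example is runnable as-is, the index name idx_rh_member_date is hypothetical, the tables carry only the columns needed, and the rows are made up. The CREATE INDEX statement itself uses syntax that MySQL also accepts, though how each optimizer uses the index may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Member (MemberID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
    CREATE TABLE Registration_History (
        rehiID INTEGER PRIMARY KEY, MemberID INTEGER, EffectiveDate TEXT, RegNumber TEXT);
    -- Composite index: the equality column first, the sort column second.
    CREATE INDEX idx_rh_member_date ON Registration_History (MemberID, EffectiveDate);
    INSERT INTO Member VALUES (1, 'Smith', 'Al'), (2, 'Smithson', 'Bo'), (3, 'Jones', 'Cy');
    INSERT INTO Registration_History VALUES
        (10, 1, '2012-01-01', 'A-1'), (11, 1, '2013-05-01', 'A-2'),
        (12, 3, '2014-01-01', 'C-1');
""")

# For each matching member, join only the history row with the latest EffectiveDate.
rows = conn.execute("""
    SELECT Member.MemberID, Registration_History.rehiID, Registration_History.RegNumber
    FROM Member
    LEFT JOIN Registration_History ON Registration_History.rehiID = (
        SELECT RHSubSelect.rehiID
        FROM Registration_History AS RHSubSelect
        WHERE RHSubSelect.MemberID = Member.MemberID
        ORDER BY RHSubSelect.EffectiveDate DESC
        LIMIT 1)
    WHERE Member.LastName LIKE '%' || :LastName || '%'
    ORDER BY Member.MemberID
""", {"LastName": "Smi"}).fetchall()
print(rows)
```

Member 1 is paired with its newest history row (rehiID 11), while member 2, who has no history, still appears with NULLs because of the LEFT JOIN.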
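Note that my example query is the bare minimum, to keep things simple. Obviously you need to replace * with all the fields you want returned (the same as in your original query). You also need to add the ORDER BY and LIMIT clauses. A GROUP BY, however, is not needed.
SQL Fiddle link: http://sqlfiddle.com/#!2/4a947a/1
The fiddle above shows the full query, except that the last name is hard-coded. I modified your original sample data to include more records and changed some of the values. I also added the extra index on the Registration_History table.
Optimising for LIMIT
If you are going to run the timings again, I would be curious to know how the query performs when it first sub-selects from the Member table, using the modification Kickstart suggested, before joining the Registration and Registration_History tables.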
SELECT
COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
Member.MemberID,
Member.LastName,
Member.FirstName,
Member.Relevance
FROM (
SELECT MemberID, LastName, FirstName,
CASE
WHEN Member.LastNameTrimmed = :LastName THEN 1
WHEN Member.LastNameTrimmed LIKE CONCAT(:LastName, '%') THEN 2
ELSE 3
END AS Relevance
FROM Member
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
ORDER BY Relevance ASC,LastName ASC,FirstName ASC
LIMIT 0, 1000
) Member
LEFT JOIN Registration USING (MemberID)
LEFT JOIN Registration_History ON rehiID = (
SELECT rehiID
FROM Registration_History AS RHSubSelect
WHERE RHSubSelect.MemberID = Member.MemberID
ORDER BY EffectiveDate DESC
LIMIT 1
)
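With a LIMIT, this should perform significantly better than my original query, because it does not have to perform a bunch of unnecessary joins for records that the LIMIT excludes.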
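The limit-first idea can be sketched as follows, again with assumptions spelled out: sqlite3 stands in for MySQL, the search is run against LastName rather than LastNameTrimmed, the aliases m, rh and s and the :MaxRows parameter are hypothetical, and the rows are made up. The sketch repeats the ORDER BY on the outer query, because the order of a derived table is not guaranteed to survive the joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Member (MemberID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
    CREATE TABLE Registration_History (
        rehiID INTEGER PRIMARY KEY, MemberID INTEGER, EffectiveDate TEXT);
    INSERT INTO Member VALUES
        (1, 'Smith', 'Al'), (2, 'Smithson', 'Bo'), (3, 'Goldsmith', 'Cy'), (4, 'Jones', 'Di');
    INSERT INTO Registration_History VALUES
        (10, 1, '2012-01-01'), (11, 1, '2013-05-01'), (12, 3, '2014-01-01');
""")

# Rank and limit Member rows first; only the surviving rows are joined to history.
rows = conn.execute("""
    SELECT m.MemberID, m.LastName, m.Relevance, rh.rehiID
    FROM (
        SELECT MemberID, LastName, FirstName,
               CASE WHEN LastName = :LastName THEN 1
                    WHEN LastName LIKE :LastName || '%' THEN 2
                    ELSE 3 END AS Relevance
        FROM Member
        WHERE LastName LIKE '%' || :LastName || '%'
        ORDER BY Relevance, LastName, FirstName
        LIMIT :MaxRows
    ) AS m
    LEFT JOIN Registration_History AS rh ON rh.rehiID = (
        SELECT s.rehiID
        FROM Registration_History AS s
        WHERE s.MemberID = m.MemberID
        ORDER BY s.EffectiveDate DESC
        LIMIT 1)
    ORDER BY m.Relevance, m.LastName, m.FirstName
""", {"LastName": "Smith", "MaxRows": 2}).fetchall()
print(rows)
```

With a limit of 2, 'Goldsmith' (relevance 3) is cut before any join is attempted, so the history lookup runs for two members only.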
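If I understand your problem correctly, you just need to select specific users together with their most recent history record, right? If so, your problem is actually a very easy variant of the greatest-record-per-group problem. No sub-queries are needed: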
SELECT Member.*, rh1.*
FROM Member
LEFT JOIN Registration_History AS rh1 USING (MemberID)
LEFT JOIN Registration_History AS rh2
ON rh1.MemberId = rh2.MemberId AND rh1.EffectiveDate < rh2.EffectiveDate
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
AND rh2.MemberId IS NULL
ORDER BY Relevance ASC,LastName ASC,FirstName ASC
LIMIT 0, 1000
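A small runnable sketch of this self-join pattern, under the same kind of assumptions as before: sqlite3 stands in for MySQL, the tables are cut down to the columns involved, the rows are made up, and the Relevance ordering is left out because this query does not compute it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Member (MemberID INTEGER PRIMARY KEY, LastName TEXT);
    CREATE TABLE Registration_History (
        rehiID INTEGER PRIMARY KEY, MemberID INTEGER, EffectiveDate TEXT);
    INSERT INTO Member VALUES (1, 'Smith'), (2, 'Smithson'), (3, 'Jones');
    INSERT INTO Registration_History VALUES
        (10, 1, '2012-01-01'), (11, 1, '2013-05-01'), (12, 3, '2014-01-01');
""")

# rh2 looks for a newer row of the same member; keep rh1 only when none exists.
rows = conn.execute("""
    SELECT Member.MemberID, rh1.rehiID
    FROM Member
    LEFT JOIN Registration_History AS rh1 ON rh1.MemberID = Member.MemberID
    LEFT JOIN Registration_History AS rh2
        ON rh2.MemberID = rh1.MemberID AND rh1.EffectiveDate < rh2.EffectiveDate
    WHERE Member.LastName LIKE '%' || ? || '%'
      AND rh2.MemberID IS NULL
    ORDER BY Member.MemberID
""", ("Smi",)).fetchall()
print(rows)
```

One property worth knowing: if two history rows of the same member share the latest EffectiveDate, both survive the IS NULL filter, so that member appears twice.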
(#2 has been removed; the next query keeps the number #3 to avoid confusion with the comments)
SELECT Member.*, max(rh1.EffectiveDate), rh1.*
FROM Member
LEFT JOIN Registration_History AS rh1 USING (MemberID)
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
GROUP BY Member.MemberID
ORDER BY Relevance ASC,LastName ASC,FirstName ASC
LIMIT 0, 1000
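This one is inspired by James's query, but with the limit and order by removed from the sub-query (note that you should define an index on EffectiveDate, not only for this query but for all of them, to make them efficient!)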
select *
from Member
left join Registration_History AS rh1 on rh1.MemberID = Member.MemberID
and rh1.EffectiveDate = (select max(rh2.EffectiveDate)
from Registration_History as rh2
where rh2.MemberID = Member.MemberID)
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
ORDER BY Relevance ASC,LastName ASC,FirstName ASC
LIMIT 0, 1000
Please post the actual durations measured on your database!
Try this query:
set @lastname = 'Smith1';
-- explain extended
SELECT
COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
Member.MemberID,
Member.LastName,
Member.FirstName,
CASE
WHEN Member.LastNameTrimmed = @lastname THEN 1
WHEN Member.LastNameTrimmed LIKE CONCAT(@lastname, '%') THEN 2
ELSE 3
END AS Relevance
FROM (
SELECT Member.*,
( SELECT RHSubSelect.rehiId
FROM Registration_History AS RHSubSelect
WHERE RHSubSelect.MemberID = Member.MemberID
ORDER BY RHSubSelect.EffectiveDate DESC
LIMIT 0,1
) rh_MemberId
FROM Member
WHERE Member.LastName LIKE CONCAT('%', @lastname, '%')
) Member
LEFT JOIN Registration_History
ON Registration_History.rehiId = Member.rh_MemberId
LEFT JOIN Registration -- FORCE INDEX(MemberID)
ON Registration.MemberID = Member.MemberID
GROUP BY Member.MemberID
ORDER BY Relevance ASC,LastName ASC,FirstName asc
LIMIT 0, 1000
;
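OK, here is my shot at it, and I have used a few tricks. First, since you did not indicate how it is derived, I had to work the Relevance field in myself. Next, you want the latest entry in Registration History for a given member (if one exists in R/H), and it appears that the effective date correlates with the ReHiID, so I used that, as it looks like a good key to work with for the subsequent left joins.
So the inner query does a preliminary pass on just the criteria for the name you are looking for, applies the relevance and limits the result to 1000 entries right there. That way the outer level does not have to run through 20,000 entries and join them, only the 1000 that qualify.
That result is then left-joined to the other tables as indicated: to Registration for just one entry (if it exists), and to R/H on the member and the max ReHiID.
To apply the name you are looking for, just change the ( select @LookForMe := 'S' ) sqlvars line in the query...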
select *
from
( select
M.*,
max( RH.EffectiveDate ) as MaxEffectiveDate,
max( R.RegNumber ) as MaxRegNumber,
CASE WHEN M.LastNameTrimmed = @LookForMe THEN 1
WHEN M.LastNameTrimmed LIKE CONCAT(@LookForMe, '%') THEN 2
ELSE 3 END AS Relevance
from
( select @LookForMe := 'S' ) sqlvars,
Member M
LEFT JOIN Registration_History RH
on M.MemberID = RH.MemberID
LEFT JOIN Registration R
on M.MemberID = R.MemberID
where
M.LastName LIKE CONCAT('%', @LookForMe, '%')
group by
M.MemberID
order by
Relevance,
M.LastName,
M.FirstName
limit
0,1000 ) PreQuery
LEFT JOIN Registration R2
on PreQuery.MemberID = R2.MemberID
AND PreQuery.MaxRegNumber = R2.RegNumber
LEFT JOIN Registration_History RH2
ON PreQuery.MemberID = RH2.MemberID
AND PreQuery.MaxEffectiveDate = RH2.EffectiveDate
Let's see how fast it runs against your production data and how close we are.