How to speed up an SQL query?

Question

I have an SQL query as follows:

SELECT p.Id1,p.Id2,p.Id3 
FROM dataset1 p
WHERE p.Id2 IN (
    SELECT r.Id4 
    FROM dataset1 r 
    WHERE r.Id5=125 AND r.Id6>=100000000000000 AND r.Id6<1000000000000000
) 
ORDER BY p.Id1 DESC, p.Id2 DESC

However, there appears to be a huge amount of data with Id6 in this range, so the query takes quite a long time to run. I only have one hour to compute it, so I am wondering whether someone could help me improve its performance.

Thanks.

Answer 1

Since the filtering seems to be done on r, arrange for it to be looked at first:

SELECT  p.Id1, p.Id2, p.Id3
    FROM  ( SELECT id4
       FROM dataset1 AS r
       WHERE  r.id5 = 125
         AND  r.Id6 >= 100000000000000
         AND  r.Id6 <  100000000000000 ) AS x
    JOIN dataset1 AS p  ON p.id2 = x.id4
    ORDER BY  p.Id1 DESC, p.Id2 DESC;

For that, these indexes should be beneficial:

INDEX(id5, id6, id4)   -- covering
INDEX(id2, id1, id3)   -- covering
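
Written out as full statements, those two indexes might look like the sketch below. It uses the common CREATE INDEX form, which MySQL, PostgreSQL, and SQLite all accept; the index names are made up for illustration and are not from the question.

```sql
-- Index names (idx_r_filter, idx_p_join) are hypothetical; pick your own.

-- Serves the derived table: the equality column (id5) comes first, then the
-- range column (id6), then id4 so the subquery can be answered from the
-- index alone.
CREATE INDEX idx_r_filter ON dataset1 (id5, id6, id4);

-- Serves the join back to p: id2 is the lookup column; id1 and id3 are
-- included so the selected columns can also be read from the index.
CREATE INDEX idx_p_join ON dataset1 (id2, id1, id3);
```

Whether the optimizer actually uses them depends on the engine and on the data distribution, so it is worth checking the query plan after creating them.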

You have a "range" test on id6, yet the range is empty. I assume that was a mistake. Please don't simplify a query too much; we might give you advice that does not apply. I am assuming that the range is really a range.

Answer 2

IN tends to optimize poorly when the subquery returns a lot of data. You can try using EXISTS instead:

SELECT p.Id1, p.Id2, p.Id3 
FROM dataset1 p
WHERE EXISTS (
    SELECT 1
    FROM dataset1 r 
    WHERE 
        r.Id4 = p.Id2
        AND r.Id5 = 125 
        AND r.Id6 >= 100000000000000 
        AND r.Id6 <  100000000000000
) 
ORDER BY p.Id1 DESC, p.Id2 DESC

Then, consider a multi-column index on (Id4, Id5, Id6) to speed up the subquery. The idea is to put the more restrictive criteria first, so the range column Id6 goes last, but it is worth trying both orders of the first two columns to see which one performs better.

Side note: both the lower and upper bound for Id6 have the same value in your query. I take this as a typo (otherwise your query would always return no rows).
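
One way to try both column orders is to create both indexes and ask the optimizer which one it picks. This is only a sketch: the index names are hypothetical, the upper bound is the one from the question rather than from the query above, and the EXPLAIN keyword is the MySQL/PostgreSQL form (SQLite uses EXPLAIN QUERY PLAN instead).

```sql
-- Index names are hypothetical.
CREATE INDEX idx_r_456 ON dataset1 (Id4, Id5, Id6);
CREATE INDEX idx_r_546 ON dataset1 (Id5, Id4, Id6);

-- Show the plan for the EXISTS query to see which index is chosen.
EXPLAIN
SELECT p.Id1, p.Id2, p.Id3
FROM dataset1 p
WHERE EXISTS (
    SELECT 1
    FROM dataset1 r
    WHERE r.Id4 = p.Id2
      AND r.Id5 = 125
      AND r.Id6 >= 100000000000000
      AND r.Id6 <  1000000000000000   -- upper bound as in the question
)
ORDER BY p.Id1 DESC, p.Id2 DESC;
```

Once one of the two is clearly preferred, the other can be dropped so it does not slow down writes.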

Answer 3

To improve performance, don't use an inner query. You can get your desired result by using an inner join too:

SELECT 
    p.Id1, p.Id2, p.Id3 
FROM 
    dataset1 p 
INNER JOIN 
    dataset1 r ON p.Id2 = r.Id4 
               AND r.Id5 = 125 
               AND r.Id6 >= 100000000000000 
               AND r.Id6 < 100000000000000
ORDER BY 
    p.Id1 DESC, p.Id2 DESC
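
One caveat worth checking with a plain join: if several rows of r match the same p.Id2, the join returns that p row once per match, whereas the IN and EXISTS forms return it once. A sketch of one way to keep the original semantics is to deduplicate the r side before joining. The upper bound below is the one from the question, and the alias x is arbitrary.

```sql
SELECT p.Id1, p.Id2, p.Id3
FROM dataset1 p
INNER JOIN (
    -- DISTINCT ensures each matching Id4 value appears once, so each p row
    -- is returned at most once, as with IN or EXISTS.
    SELECT DISTINCT r.Id4
    FROM dataset1 r
    WHERE r.Id5 = 125
      AND r.Id6 >= 100000000000000
      AND r.Id6 <  1000000000000000   -- upper bound as in the question
) x ON p.Id2 = x.Id4
ORDER BY p.Id1 DESC, p.Id2 DESC;
```

If Id4 is already unique among the rows that pass the filter, the DISTINCT is unnecessary and the plain join gives the same result.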

3 answers

Answer 1
2 2020-07-18 01:14:09

Answer 2
1 2020-07-18 00:57:13

Answer 3
0 2020-07-18 01:10:35
