Why is a query by id with literal ids much faster than a query by subquery?
I built a query that sifts through more than 200k records and returns 28 records in 7s. It is the following:
select idcustomer
from sale s
inner join customer c on c.idCustomer = s.fkCustomer
WHERE s.dateSale between '1700-01-01' and '2022-01-26'
and c.fkCompany = 'b92c5957-9275-4fa5-9970-1a41eb524328'
This query uses the following index:
create index customer_fkCompanyx on dactai.customer(fkCompany, idcustomer);
I also wanted to query the columns c.firstname and c.lastname, but when I add them to the above query it takes 40s to return. I read that this might be because of RID lookups: the database has to read the actual table instead of just the index to get the additional columns. I then tried adding those two columns to the index to create a covering index, but the columns are too large and exceed the maximum size of the index.
I then tried a different approach: adding this query as a subquery, like so:
select firstname from customer where idcustomer in (
select idcustomer
from sale s
inner join customer c on c.idCustomer = s.fkCustomer
WHERE s.dateSale between '1700-01-01' and '2022-01-26'
and c.fkCompany = 'b92c5957-9275-4fa5-9970-1a41eb524328'
)
This approach takes 20 seconds, so 13s more than the subquery on its own. The curious thing, and the reason for this post, is that the outer query with all 28 literal ids returns instantly.
Why would the outer query that takes 0s plus the subquery that takes 7s result in a query that takes 20s, when the outer query with the same 28 ids as literals takes 0 seconds? Also, is there a way to add the first and last names to the original query without it taking 40s?
c: INDEX(fkCompany, idCustomer)
s: INDEX(dateSale, fkCustomer)
s: INDEX(fkCustomer, dateSale)
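Spelled out as DDL, the two sale indexes in the shorthand above might look like the sketch below. The index names are made up for illustration, and placing sale in the dactai schema is an assumption based on the question's CREATE INDEX statement. The customer index (fkCompany, idCustomer) appears to match the customer_fkCompanyx index the question already has, so it is not repeated.

```sql
-- Hypothetical index names; the dactai schema is assumed from the question.
-- Leading with dateSale favors the date range filter.
CREATE INDEX sale_dateSale_fkCustomer ON dactai.sale (dateSale, fkCustomer);

-- Leading with fkCustomer favors the join from customer into sale.
CREATE INDEX sale_fkCustomer_dateSale ON dactai.sale (fkCustomer, dateSale);
```

With both in place the optimizer can choose whichever table to start from; the one it does not use can be dropped afterwards.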
Also, try:
select c.firstname
from customer AS c
JOIN sale AS s ON c.idCustomer = s.fkCustomer
WHERE s.dateSale between '1700-01-01' AND '2022-01-26'
and c.fkCompany = 'b92c5957-9275-4fa5-9970-1a41eb524328'
And change my first index suggestion to:
c: INDEX(fkCompany, idCustomer, firstname, lastname)
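As DDL, that covering index could be written roughly as follows. The index name is hypothetical. Note that the question reports hitting the maximum index size when adding these columns, so this statement would presumably only succeed if firstname and lastname are declared narrowly enough to fit within the key length limit.

```sql
-- Hypothetical index name; may fail with a "key too long" style error
-- if firstname/lastname are declared too wide, as the question describes.
CREATE INDEX customer_fkCompany_cover
    ON dactai.customer (fkCompany, idCustomer, firstname, lastname);
```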
For further discussion, please provide SHOW CREATE TABLE for each table.
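A minimal sketch of gathering that information, assuming both tables live in the dactai schema as the question's CREATE INDEX suggests. The EXPLAIN line is an addition beyond what was asked for; it shows the plan chosen for the slow subquery form, which can be compared with the plan for the literal-id version.

```sql
-- Table definitions, including column types and existing indexes.
SHOW CREATE TABLE dactai.sale;
SHOW CREATE TABLE dactai.customer;

-- Execution plan for the 20s query from the question.
EXPLAIN
select firstname from customer where idcustomer in (
    select idcustomer
    from sale s
    inner join customer c on c.idCustomer = s.fkCustomer
    WHERE s.dateSale between '1700-01-01' and '2022-01-26'
    and c.fkCompany = 'b92c5957-9275-4fa5-9970-1a41eb524328'
);
```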