Outer query is very slow when inner query returns no results
I'm trying to fetch a row from a table called export using weighted random selection. It should then fetch one row from another table, export_chunk, which references the first row. This is the query:
SELECT * FROM export_chunk
WHERE export_id=(
SELECT id FROM export
WHERE schedulable=1
ORDER BY -LOG(1 - RAND())/export.weight LIMIT 1)
AND status='PENDING'
LIMIT 2;
The export table can have 1000 rows, while the export_chunk table can have millions of rows.
The query is very fast when the inner query returns a row. However, if there are no rows with schedulable=1, the outer query performs a full table scan on export_chunk. Why does this happen, and is there any way to prevent it?
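One way to check what the optimizer does in each case is to prefix the query with EXPLAIN. This is a minimal sketch using the same tables and columns as the query above. The plan you get depends on the MySQL version and on the indexes that exist; an index on export_chunk.export_id is an assumption here, since the question does not show the schema.

```sql
-- Show the execution plan instead of the result rows.
-- Assumption: export_chunk has an index on export_id (not stated in the question).
EXPLAIN
SELECT * FROM export_chunk
WHERE export_id = (
    SELECT id FROM export
    WHERE schedulable = 1
    ORDER BY -LOG(1 - RAND()) / export.weight
    LIMIT 1)
AND status = 'PENDING'
LIMIT 2;
```

In the output, type = ALL on export_chunk indicates a full table scan. If the subquery is listed with select_type UNCACHEABLE SUBQUERY, it is re-evaluated for each row of the outer query, which RAND() can cause. That would be consistent with the scan running to the end of the table when no row ever matches, but it is a hypothesis to verify against the real plan.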
EDIT: Trying COALESCE()
Akina suggested in the comments using COALESCE, i.e.:
SELECT * FROM export_chunk
WHERE export_id=COALESCE((
SELECT id FROM export
WHERE schedulable=1
ORDER BY -LOG(1 - RAND())/export.weight LIMIT 1)
,-1)
AND status='PENDING'
LIMIT 2;
This should work. When I run:
SELECT COALESCE((SELECT id FROM export WHERE schedulable=1 ORDER BY -LOG(1-RAND())/export.weight LIMIT 1), -1) FROM export;
It does return -1 for each row, as Akina predicted. And if I manually search for -1 instead of using the inner query, it returns no rows very quickly. However, when I wrap the inner query in COALESCE, it is still really slow. I do not understand why.
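Since searching for a literal -1 is fast, one possible workaround is to run the random pick once, store the result in a user variable, and then query export_chunk with that value. This is a sketch only, not tested against the real data. The variable name @export_id is made up for illustration, and the two statements must run in the same session.

```sql
-- Step 1: evaluate the weighted random pick exactly once.
-- COALESCE falls back to -1 when no export row has schedulable = 1.
SET @export_id := COALESCE(
    (SELECT id FROM export
     WHERE schedulable = 1
     ORDER BY -LOG(1 - RAND()) / export.weight
     LIMIT 1),
    -1);

-- Step 2: look up chunks by the stored value, as in the manual -1 test.
SELECT * FROM export_chunk
WHERE export_id = @export_id
AND status = 'PENDING'
LIMIT 2;
```

The idea is that the second statement no longer contains RAND(), so the comparison value is fixed for the whole statement. Whether it is as fast as the literal should still be confirmed with EXPLAIN.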
Test this:
SELECT export_chunk.*
FROM export_chunk
JOIN ( SELECT id
FROM export
WHERE schedulable=1
ORDER BY -LOG(1 - RAND())/export.weight
LIMIT 1 ) AS random_row ON export_chunk.export_id=random_row.id
WHERE export_chunk.status='PENDING'
LIMIT 2;
Does this match the needed logic? In particular, when the subquery returns no matching rows, do you need no output rows (as now) or any 2 rows?
PS. LIMIT without ORDER BY in the outer query is strange.
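To illustrate the PS: without an ORDER BY, which 2 rows LIMIT returns is not defined. Below is a sketch of the same join with an explicit ordering. It assumes export_chunk has a primary key column named id, which the question does not show; substitute whatever column defines the order you want.

```sql
SELECT export_chunk.*
FROM export_chunk
JOIN ( SELECT id
       FROM export
       WHERE schedulable = 1
       ORDER BY -LOG(1 - RAND()) / export.weight
       LIMIT 1 ) AS random_row ON export_chunk.export_id = random_row.id
WHERE export_chunk.status = 'PENDING'
-- Assumption: export_chunk.id exists; it makes the 2 returned rows deterministic.
ORDER BY export_chunk.id
LIMIT 2;
```

The ordering may add a sort step, depending on the available indexes.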