Postgres WHERE NOT IN taking a long time to execute
I have 2 tables, location and distance.
This query takes a very long time to execute:
SELECT source.name AS source, source.id AS source_id, dest.name AS destination, dest.id AS destination_id
FROM location AS source, location AS dest
WHERE (source.id, dest.id) NOT IN (
SELECT source_id, destination_id FROM distance
)
Even with LIMIT 100, it takes >30 seconds to return results (the total result set is ~15k rows).
The 2 queries individually run almost instantly:
SELECT source.name AS source, source.id AS source_id, dest.name AS destination, dest.id AS destination_id
FROM location AS source, location AS dest
and
SELECT source_id, destination_id FROM distance
Also, rewriting the query with EXCEPT fixes the runtime:
SELECT source.id AS source_id, dest.id AS destination_id
FROM location AS source, location AS dest
EXCEPT
SELECT source_id, destination_id FROM distance
But I want all 4 columns, not just the 2.
How can I fix this?
The answer is to use WHERE NOT EXISTS! (reference)
I just modified my query like this:
SELECT source.name AS source, source.id AS source_id, dest.name AS destination, dest.id AS destination_id
FROM location AS source, location AS dest
WHERE NOT EXISTS (
SELECT 1 FROM distance
WHERE source_id = source.id AND destination_id = dest.id
)
This runs instantly!
Big thanks to RhodiumToad on #postgres IRC!
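One way to see where the time goes is to compare the two plans with EXPLAIN. This is only a sketch, assuming the location and distance tables from the question. The planner can generally turn NOT EXISTS into an anti join, while NOT IN is typically evaluated as a subplan filter against every candidate row. The exact plan depends on the Postgres version, table statistics, and settings, so the output will vary.

```sql
-- Plain EXPLAIN only plans the query; it does not execute it,
-- so it is safe to run even on the slow NOT IN form.
EXPLAIN
SELECT source.id AS source_id, dest.id AS destination_id
FROM location AS source
CROSS JOIN location AS dest
WHERE (source.id, dest.id) NOT IN (
    SELECT source_id, destination_id FROM distance
);

-- Same result set expressed with NOT EXISTS; look for an anti join node in the plan.
EXPLAIN
SELECT source.id AS source_id, dest.id AS destination_id
FROM location AS source
CROSS JOIN location AS dest
WHERE NOT EXISTS (
    SELECT 1 FROM distance
    WHERE source_id = source.id AND destination_id = dest.id
);
```

Adding ANALYZE (for example EXPLAIN (ANALYZE, BUFFERS)) executes the query and reports actual timings, which is only practical for the fast form.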
Turns out you should never use NOT IN. It's in Postgres' Don't do this:
Don't use NOT IN, or any combination of NOT and IN such as NOT (x IN (select…)). (If you think you wanted NOT IN (select …) then you should rewrite to use NOT EXISTS instead.)
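The quoted advice does not spell out why, but one well-known reason is how NOT IN interacts with NULL under SQL's three-valued logic. The following is a minimal, self-contained illustration (the literal values and the alias v(x) are made up for the example and do not come from the question's tables):

```sql
SELECT 3 NOT IN (1, 2)    AS not_in_without_null,  -- true
       3 NOT IN (1, NULL) AS not_in_with_null,     -- NULL, which a WHERE clause treats as "not true"
       NOT EXISTS (
           SELECT 1
           FROM (VALUES (1), (NULL)) AS v(x)
           WHERE v.x = 3
       )                  AS not_exists_with_null; -- true
```

Because a single NULL in the subquery result can make NOT IN reject every row, the planner cannot simply treat it as an anti join, whereas NOT EXISTS has no such edge case.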
Another option is to use a LEFT JOIN as an anti-join. Also, you want to use proper, explicit join syntax instead of old-school, implicit joins:
SELECT source.name AS source, source.id AS source_id, dest.name AS destination, dest.id AS destination_id
FROM location AS source
CROSS JOIN location AS dest
LEFT JOIN distance dist
ON dist.source_id = source.id
AND dist.destination_id = dest.id
WHERE dist.source_id IS NULL
For performance, consider the following indexes:
location(id)
distance(source_id, destination_id)
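As a sketch, those indexes could be created as follows. The index names are hypothetical, and if location.id is already the primary key (the question does not say), it is indexed implicitly and the first statement is unnecessary.

```sql
-- Hypothetical index names; skip this one if location.id is already a primary key.
CREATE INDEX IF NOT EXISTS location_id_idx
    ON location (id);

-- Composite index matching the anti-join condition on both columns.
CREATE INDEX IF NOT EXISTS distance_source_destination_idx
    ON distance (source_id, destination_id);
```

If each (source_id, destination_id) pair appears only once in distance, a unique index or constraint on that pair would serve the same purpose.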
Finally: unless you expect routes where the starting and ending points are the same, you can use an INNER JOIN instead of a CROSS JOIN:
SELECT source.name AS source, source.id AS source_id, dest.name AS destination, dest.id AS destination_id
FROM location AS source
INNER JOIN location AS dest
ON source.id <> dest.id
LEFT JOIN distance dist
ON dist.source_id = source.id
AND dist.destination_id = dest.id
WHERE dist.source_id IS NULL