Postgres WHERE NOT IN taking a long time to execute

I have two tables, location and distance.

This query takes a very long time to execute:

SELECT source.name AS source, source.id AS source_id, dest.name AS destination, dest.id AS destination_id
FROM location AS source, location AS dest
WHERE (source.id, dest.id) NOT IN (
    SELECT source_id, destination_id FROM distance
)

Even with LIMIT 100 it takes more than 30 seconds to return results (the full result set is about 15k rows).

The two queries each run almost instantly on their own:

SELECT source.name AS source, source.id AS source_id, dest.name AS destination, dest.id AS destination_id
FROM location AS source, location AS dest

and

SELECT source_id, destination_id FROM distance

Also, rewriting the query with EXCEPT fixes the runtime:

SELECT source.id AS source_id, dest.id AS destination_id
FROM location AS source, location AS dest
EXCEPT
SELECT source_id, destination_id FROM distance

But I want all four columns, not just the two.

How can I fix this?

The answer is to use WHERE NOT EXISTS! (reference)

I just modified my query like this:

SELECT source.name AS source, source.id AS source_id, dest.name AS destination, dest.id AS destination_id
FROM location AS source, location AS dest
WHERE NOT EXISTS (
    SELECT 1 FROM distance
    WHERE source_id = source.id AND destination_id = dest.id
)

This runs instantly!
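To check the difference on your own data, you could compare the two query plans with EXPLAIN. Plain EXPLAIN only plans the query, so the slow version is not actually executed. This is a sketch using the tables from the question; the exact plans depend on your data and Postgres version, but I would expect an Anti Join node for the NOT EXISTS form and a SubPlan filter for the NOT IN form.

```sql
-- Plan for the NOT EXISTS version (typically planned as an anti-join).
EXPLAIN
SELECT source.name AS source, source.id AS source_id,
       dest.name AS destination, dest.id AS destination_id
FROM location AS source, location AS dest
WHERE NOT EXISTS (
    SELECT 1 FROM distance
    WHERE source_id = source.id AND destination_id = dest.id
);

-- Plan for the original NOT IN version, for comparison.
EXPLAIN
SELECT source.name AS source, source.id AS source_id,
       dest.name AS destination, dest.id AS destination_id
FROM location AS source, location AS dest
WHERE (source.id, dest.id) NOT IN (
    SELECT source_id, destination_id FROM distance
);
```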

Big thanks to RhodiumToad on #postgres IRC!

It turns out you should never use NOT IN. It's in Postgres' Don't Do This:

Don't use NOT IN, or any combination of NOT and IN such as NOT (x IN (select…)).

(If you think you wanted NOT IN (select …) then you should rewrite to use NOT EXISTS instead.)
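Besides speed, NOT IN also behaves surprisingly when the subquery can return NULLs, which is part of why the planner cannot simply treat it as an anti-join. A small sketch of that behavior follows; demo_pairs is a hypothetical throwaway table, not the asker's schema.

```sql
-- Hypothetical temp table standing in for "distance", with one all-NULL row.
CREATE TEMP TABLE demo_pairs (source_id int, destination_id int);
INSERT INTO demo_pairs VALUES (1, 2), (NULL, NULL);

-- NOT IN: comparing (1, 3) with the NULL row yields NULL (unknown),
-- so the predicate is never true and this returns zero rows.
SELECT 'kept by NOT IN' AS result
WHERE (1, 3) NOT IN (SELECT source_id, destination_id FROM demo_pairs);

-- NOT EXISTS: no row matches (1, 3), so this returns one row.
SELECT 'kept by NOT EXISTS' AS result
WHERE NOT EXISTS (
    SELECT 1 FROM demo_pairs
    WHERE source_id = 1 AND destination_id = 3
);
```

If both columns of the real table are declared NOT NULL the two forms return the same rows, but NOT EXISTS is still the form that is easier for the planner to optimize.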

Another option is to use an anti-join written as a LEFT JOIN. Also, you want to use proper, explicit join syntax instead of old-school, implicit joins:

SELECT source.name AS source, source.id AS source_id, dest.name AS destination, dest.id AS destination_id
FROM location AS source
CROSS JOIN location AS dest
LEFT JOIN distance dist
    ON  dist.source_id = source.id 
    AND dist.destination_id = dest.id
WHERE dist.source_id IS NULL

For performance, consider the following indexes:

location(id)
distance(source_id, destination_id)
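As a sketch, those indexes could be created as follows. The index names are made up for illustration, and if location.id is already the primary key it has an index and the first statement is unnecessary.

```sql
-- Index names are hypothetical; pick whatever fits your naming scheme.
-- Skip this one if location.id is already a PRIMARY KEY (it is indexed then).
CREATE INDEX IF NOT EXISTS idx_location_id
    ON location (id);

-- Composite index matching the join condition on both columns.
CREATE INDEX IF NOT EXISTS idx_distance_source_destination
    ON distance (source_id, destination_id);
```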

Finally, unless you expect routes whose starting and ending points are the same, you can use an INNER JOIN instead of a CROSS JOIN to leave those pairs out:

SELECT source.name AS source, source.id AS source_id, dest.name AS destination, dest.id AS destination_id
FROM location AS source
INNER JOIN location AS dest
    ON source.id <> dest.id
LEFT JOIN distance dist
    ON  dist.source_id = source.id 
    AND dist.destination_id = dest.id
WHERE dist.source_id IS NULL
