Optimizing SQL query on table of 10 million rows: neverending query
I have two tables:
CREATE TABLE routing
(
id integer NOT NULL,
link_geom geometry,
source integer,
target integer,
traveltime_min double precision,
CONSTRAINT routing_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
CREATE INDEX routing_id_idx
ON routing
USING btree
(id);
CREATE INDEX routing_link_geom_gidx
ON routing
USING gist
(link_geom);
CREATE INDEX routing_source_idx
ON routing
USING btree
(source);
CREATE INDEX routing_target_idx
ON routing
USING btree
(target);
and
CREATE TABLE test
(
link_id character varying,
link_geom geometry,
id integer NOT NULL,
-- .. (some more attributes here)
traveltime_min double precision,
CONSTRAINT id PRIMARY KEY (id),
CONSTRAINT test_link_id_key UNIQUE (link_id)
)
WITH (
OIDS=FALSE
);
ALTER TABLE test
OWNER TO postgres;
and I am trying to run the following query:
update routing
set traveltime_min = t2.traveltime_min
from test t2
where t2.id = routing.id
Both tables have nearly 10 million rows. The problem is that this query never finishes. Here is what EXPLAIN shows:
Update on routing  (cost=601725.94..1804772.15 rows=9712264 width=208)
  ->  Hash Join  (cost=601725.94..1804772.15 rows=9712264 width=208)
        Hash Cond: (routing.id = t2.id)
        ->  Seq Scan on routing  (cost=0.00..366200.23 rows=9798223 width=194)
        ->  Hash  (cost=423414.64..423414.64 rows=9712264 width=18)
              ->  Seq Scan on test t2  (cost=0.00..423414.64 rows=9712264 width=18)
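I cannot understand what could be causing such a slow response. Could the problem be caused by the server settings? I am using the default PostgreSQL 9.3 settings.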
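Drop all indexes on routing before running the UPDATE, then add them again afterwards. This will bring a huge improvement.
Set work_mem high in the session that runs the UPDATE. This will help with the hash.
Set shared_buffers to 1/4 of the available memory, but no more than 1GB.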
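The first two suggestions could be combined roughly as follows. This is a sketch rather than a tested recipe: the index names are taken from the schema in the question, the `256MB` value is only an illustrative assumption that should be sized to the machine, and wrapping everything in a single transaction is an added choice so that a failure rolls back the index drops.

```sql
BEGIN;

-- Assumed value for illustration; SET LOCAL applies only to this transaction.
SET LOCAL work_mem = '256MB';

-- Drop the secondary indexes so the UPDATE does not have to maintain them.
-- The primary key (routing_pkey) is left in place.
DROP INDEX routing_id_idx;
DROP INDEX routing_link_geom_gidx;
DROP INDEX routing_source_idx;
DROP INDEX routing_target_idx;

-- The original statement from the question.
UPDATE routing
SET traveltime_min = t2.traveltime_min
FROM test t2
WHERE t2.id = routing.id;

-- Recreate the indexes in one pass over the finished table.
CREATE INDEX routing_link_geom_gidx ON routing USING gist (link_geom);
CREATE INDEX routing_source_idx ON routing USING btree (source);
CREATE INDEX routing_target_idx ON routing USING btree (target);
-- Note: this one duplicates the index behind routing_pkey; it is recreated
-- here only to match the original schema.
CREATE INDEX routing_id_idx ON routing USING btree (id);

COMMIT;
```

Keep in mind that DROP INDEX takes an exclusive lock on the table, which in this sketch is held until COMMIT, so other sessions using routing will wait. shared_buffers cannot be set per session: it is configured in postgresql.conf and only takes effect after a server restart.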
-- these could be needed if the update would be more selective...
VACUUM analyze routing;
VACUUM analyze test;
UPDATE routing dst
SET traveltime_min = src.traveltime_min
FROM test src
WHERE dst.id = src.id
-- avoid useless updates and row-versions
AND dst.traveltime_min IS DISTINCT FROM src.traveltime_min
;
-- VACUUM analyze routing;
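To judge how much the IS DISTINCT FROM filter saves, it may help to count beforehand how many rows would actually change. A minimal sketch, using the same tables and aliases as the statement above; the column aliases in the output are made up for illustration. Unlike <>, IS DISTINCT FROM treats NULL as a comparable value, so a row where only one side is NULL is counted and a row where both sides are NULL is not.

```sql
-- NULL-safe comparison versus the plain operator.
SELECT 1::double precision IS DISTINCT FROM NULL::double precision    AS one_vs_null,    -- true
       NULL::double precision IS DISTINCT FROM NULL::double precision AS null_vs_null,   -- false
       1::double precision <> NULL::double precision                  AS plain_compare;  -- NULL

-- Number of rows the filtered UPDATE would touch.
SELECT count(*) AS rows_to_update
FROM routing dst
JOIN test src ON dst.id = src.id
WHERE dst.traveltime_min IS DISTINCT FROM src.traveltime_min;
```

If rows_to_update is close to the full row count, the filter will not help much and the cost is dominated by rewriting the table; if it is small, most of the work in the unfiltered UPDATE was wasted on writing identical values.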