Performantly update table with 2 million rows with Postgres/PostGIS
I have two tables:
properties (geo_point POINT, locality_id INTEGER, neighborhood_id INTEGER, id UUID)
places_temp (id INTEGER, poly GEOMETRY, placetype TEXT)
Note: all columns in places_temp are indexed.
properties has ~2 million rows and I would like to update locality_id and neighborhood_id for each row in properties with the id from places_temp where properties.geo_point is contained by a polygon in places_temp.poly.
Whatever I do, it just seems to hang for hours, during which I can't tell whether it is still working, whether the connection has been lost, etc.
Any thoughts on how to do this performantly?
My query:
-- drop indexes on locality_id and neighborhood_id to speed up update
DROP INDEX IF EXISTS idx_properties_locality_id;
DROP INDEX IF EXISTS idx_properties_neighborhood_id;

-- for each property find the locality and neighborhood
UPDATE properties
SET
    locality_id = (
        SELECT id
        FROM places_temp
        WHERE placetype = 'locality'
          -- check if geo_point is contained by polygon. geo_point is stored as
          -- SRID 26910 so must be transformed first
          AND st_intersects (st_transform (geo_point, 4326), poly)
        LIMIT 1),
    neighborhood_id = (
        SELECT id
        FROM places_temp
        WHERE placetype = 'neighbourhood'
          -- check if geo_point is contained by polygon. geo_point is stored as
          -- SRID 26910 so must be transformed first
          AND st_intersects (st_transform (geo_point, 4326), poly)
        LIMIT 1);

-- Add indexes back after update
CREATE INDEX IF NOT EXISTS idx_properties_locality_id ON properties (locality_id);
CREATE INDEX IF NOT EXISTS idx_properties_neighborhood_id ON properties (neighborhood_id);
CREATE INDEX properties_point_idx ON properties USING gist (geo_point);
CREATE INDEX places_temp_poly_idx ON places_temp USING gist (poly);
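-- A subquery in FROM cannot reference the UPDATE target (p), so the join to
-- properties is done inside the subquery and matched back on id.
-- DISTINCT ON keeps one locality per property when polygons overlap.
UPDATE properties p
SET locality_id = x.place_id
FROM ( SELECT DISTINCT ON (pr.id)
              pr.id AS property_id
            , t.id  AS place_id
       FROM properties pr
       JOIN places_temp t
         ON t.placetype = 'locality'
        AND st_intersects (st_transform (pr.geo_point, 4326), t.poly)
       ORDER BY pr.id, t.id
     ) x
WHERE x.property_id = p.id
;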
And similarly for the other field (you could combine them into one query).
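Before running the full update, it may help to confirm that the planner actually uses the GiST index on places_temp.poly. The sketch below only asks for the plan of the lookup on a 1,000-row sample and changes no data. It assumes geo_point is a PostGIS geometry with SRID 26910, as described in the question. Note that a plain index on geo_point cannot serve a condition written on st_transform (geo_point, 4326); the optional expression index at the end (properties_point_4326_idx is a made-up name) is one possible workaround, and it only matters if the planner chooses to drive the join from the polygons.

```sql
-- Show the plan for the point-in-polygon lookup on a small sample.
-- Plain EXPLAIN does not execute the query, so it returns immediately.
EXPLAIN
SELECT p.id, t.id
FROM (SELECT id, geo_point FROM properties LIMIT 1000) p
JOIN places_temp t
  ON t.placetype = 'locality'
 AND st_intersects (st_transform (p.geo_point, 4326), t.poly);

-- Optional, hypothetical index name: index the transformed point so that the
-- expression used in the join condition itself is indexed.
CREATE INDEX IF NOT EXISTS properties_point_4326_idx
    ON properties USING gist (st_transform (geo_point, 4326));
```

If the plan shows an index scan on places_temp_poly_idx for each sampled row, the per-row lookup is indexed; a sequential scan over places_temp inside a nested loop would suggest the index is not being used.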
Try this:
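-- Set both columns in one pass. Each column takes the id of a matching
-- polygon of its own placetype; properties with no match are left unchanged.
UPDATE properties p
SET locality_id     = t.locality_id,
    neighborhood_id = t.neighborhood_id
FROM (
    SELECT pr.id AS property_id,
           min(pl.id) FILTER (WHERE pl.placetype = 'locality')      AS locality_id,
           min(pl.id) FILTER (WHERE pl.placetype = 'neighbourhood') AS neighborhood_id
    FROM properties pr
    JOIN places_temp pl
      ON pl.placetype IN ('locality', 'neighbourhood')
     AND st_intersects (st_transform (pr.geo_point, 4326), pl.poly)
    GROUP BY pr.id
) t
WHERE t.property_id = p.id;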
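Since a single statement over ~2 million rows gives no feedback until it finishes, one option is to run the update in batches so that each transaction is short and progress is visible. The following is a minimal sketch, not a tested recipe: the helper column places_checked and the batch size of 50000 are assumptions introduced here, and ADD COLUMN IF NOT EXISTS requires PostgreSQL 9.6 or later.

```sql
-- Assumption: a helper column (hypothetical name) marks rows already processed,
-- so properties that fall inside no polygon are not selected again.
ALTER TABLE properties
    ADD COLUMN IF NOT EXISTS places_checked boolean NOT NULL DEFAULT false;

-- Run this statement repeatedly, each run in its own transaction,
-- until it reports UPDATE 0.
WITH batch AS (
    SELECT id
    FROM properties
    WHERE NOT places_checked
    LIMIT 50000              -- arbitrary starting point; tune as needed
)
UPDATE properties p
SET locality_id = (
        SELECT t.id
        FROM places_temp t
        WHERE t.placetype = 'locality'
          AND st_intersects (st_transform (p.geo_point, 4326), t.poly)
        LIMIT 1),
    neighborhood_id = (
        SELECT t.id
        FROM places_temp t
        WHERE t.placetype = 'neighbourhood'
          AND st_intersects (st_transform (p.geo_point, 4326), t.poly)
        LIMIT 1),
    places_checked = true
FROM batch b
WHERE p.id = b.id;
```

Each run reports how many rows it touched, which gives a rough progress indicator. Once a run reports UPDATE 0, the helper column can be removed with ALTER TABLE properties DROP COLUMN places_checked;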