
Performantly update table with 2 million rows with Postgres/PostGIS

I have two tables:

  • properties (geo_point POINT, locality_id INTEGER, neighborhood_id INTEGER, id UUID)
  • places_temp (id INTEGER, poly GEOMETRY, placetype TEXT)

Note: all columns in places_temp are indexed.

properties has ~2 million rows and I would like to:

  • update locality_id and neighborhood_id for each row in properties with the id of the row in places_temp whose polygon (places_temp.poly) contains properties.geo_point

Whatever I do, it just seems to hang for hours, during which I can't tell whether it is still working, whether the connection has been lost, etc.

Any thoughts on how to do this performantly?

My query:

  -- drop indexes on locality_id and neighborhood_id to speed up update
  DROP INDEX IF EXISTS idx_properties_locality_id;
  DROP INDEX IF EXISTS idx_properties_neighborhood_id;
  -- for each property find the locality and neighborhood
  UPDATE
    properties
  SET
    locality_id = (
      SELECT
        id
      FROM
        places_temp
      WHERE
        placetype = 'locality'
        -- check if geo_point is contained by polygon. geo_point is stored as SRID 26910 so must be
        -- transformed first
        AND st_intersects (st_transform (geo_point, 4326), poly)
      LIMIT 1),
    neighborhood_id = (
      SELECT
        id
      FROM
        places_temp
      WHERE
        placetype = 'neighbourhood'
        -- check if geo_point is contained by polygon. geo_point is stored as SRID 26910 so must be
        -- transformed first
        AND st_intersects (st_transform (geo_point, 4326), poly)
      LIMIT 1);
  -- Add indexes back after update
  CREATE INDEX IF NOT EXISTS idx_properties_locality_id ON properties (locality_id);
  CREATE INDEX IF NOT EXISTS idx_properties_neighborhood_id ON properties (neighborhood_id);
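
Create spatial (GiST) indexes, then update with a join rather than a correlated subquery per row:

  CREATE INDEX properties_point_idx ON properties USING gist (geo_point);
  CREATE INDEX places_temp_poly_idx ON places_temp USING gist (poly);

  -- Join each property to a locality polygon that contains it. If several polygons
  -- match, one is picked arbitrarily; properties with no match are left unchanged.
  UPDATE properties p
  SET locality_id = t.id
  FROM places_temp t
  WHERE t.placetype = 'locality'
    AND st_intersects (st_transform (p.geo_point, 4326), t.poly);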

The same applies to the other field (you could combine the two updates into one query).
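
The update above calls st_transform on every one of the ~2 million points. A possible variation, sketched here for the other field, is to reproject the far fewer polygons once into the points' SRID and join on that instead. This is only a sketch under assumptions: the column poly_26910 and its index name are hypothetical, geo_point is assumed to be a PostGIS geometry carrying SRID 26910 (as the question states), and poly is assumed to have SRID 4326 set.

```sql
-- Hypothetical helper column: polygons reprojected once into SRID 26910.
ALTER TABLE places_temp ADD COLUMN poly_26910 geometry(Geometry, 26910);
UPDATE places_temp SET poly_26910 = ST_Transform(poly, 26910);

-- Index the reprojected polygons and refresh planner statistics.
CREATE INDEX places_temp_poly_26910_idx ON places_temp USING gist (poly_26910);
ANALYZE places_temp;

-- Same join-style update, now for neighborhood_id and without a per-row transform.
UPDATE properties p
SET neighborhood_id = t.id
FROM places_temp t
WHERE t.placetype = 'neighbourhood'
  AND ST_Intersects(p.geo_point, t.poly_26910);
```

Running the UPDATE under EXPLAIN first (which does not execute it) is a cheap way to check whether the planner intends to use the GiST index before committing to the full run.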

Try this:

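  -- One pass over both place types. min() picks the lowest matching id when
  -- a point falls inside several polygons of the same type.
  UPDATE properties p
  SET locality_id = t.locality_id,
      neighborhood_id = t.neighborhood_id
  FROM (
    SELECT pr.id AS property_id,
           min(pl.id) FILTER (WHERE pl.placetype = 'locality') AS locality_id,
           min(pl.id) FILTER (WHERE pl.placetype = 'neighbourhood') AS neighborhood_id
    FROM properties pr
    JOIN places_temp pl
      ON pl.placetype IN ('locality', 'neighbourhood')
     AND st_intersects (st_transform (pr.geo_point, 4326), pl.poly)
    GROUP BY pr.id
  ) t
  WHERE p.id = t.property_id;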
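
Since the question mentions the statement hanging for hours with no sign of progress, one option is to run the update in batches, so that each statement commits on its own and reports how many rows it changed. The following is a rough sketch, not a tuned solution. It assumes that a NULL locality_id means "not yet assigned", that statements run in autocommit mode (for example from psql), and that a batch size of 50,000 is acceptable; that number is arbitrary.

```sql
-- Assign localities to at most 50,000 still-unassigned properties.
-- Re-run this statement until it reports "UPDATE 0".
WITH batch AS (
    SELECT p.id AS property_id, t.id AS place_id
    FROM properties p
    JOIN places_temp t
      ON t.placetype = 'locality'
     AND ST_Intersects(ST_Transform(p.geo_point, 4326), t.poly)
    WHERE p.locality_id IS NULL
    LIMIT 50000
)
UPDATE properties p
SET locality_id = b.place_id
FROM batch b
WHERE p.id = b.property_id;
```

Properties that fall inside no locality polygon keep a NULL locality_id and are probed again by every batch, so this approach fits best when most points are expected to match. If a point matches several polygons, the batch may contain fewer than 50,000 distinct properties, and one of the matching ids is used.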

