
Using ST_MakePoint for dataset with over 1 billion rows

I have a global dataset in my Postgres database (9.2.4, with PostGIS 2.1.0SVN) with ~1.1 billion rows. My aim is to extract the relevant rows using a polygon. The following query has been running for a day:

UPDATE table SET geom = ST_SetSRID(ST_MakePoint(long,lat),4326) where lat !=666 ;

666 was the placeholder for missing values. The lat column has a btree index.

free -m gives the following stats for RAM:

total       used       free     shared    buffers     cached
Mem:         24104      23829        275          0          5      22738
-/+ buffers/cache:       1084      23020
Swap:        24574        309      24265

htop shows almost no CPU load and about 9% memory usage.

Is the query still running, or is it somehow on hold because of a lack of RAM?
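One way to answer this is to inspect the server from a second session. A minimal sketch for PostgreSQL 9.2 (where pg_stat_activity exposes a boolean waiting column; newer versions report wait_event instead), assuming the UPDATE text is unique enough to match on:

```sql
-- Check from another session whether the UPDATE backend is still
-- active or blocked waiting on a lock.
SELECT pid,
       state,
       waiting,
       now() - query_start AS runtime,
       query
FROM   pg_stat_activity
WHERE  query ILIKE '%ST_MakePoint%';
```

If state is 'active' and waiting is false, the query is still doing work; an UPDATE touching ~1.1 billion rows rewrites every matching row, so a runtime of a day is plausible rather than a sign of a hang.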

Any comment or hint is appreciated.

It's better to use CURSORs and process your dataset in batches of roughly 1000 rows.

Documentation about cursors is here: https://www.postgresqltutorial.com/plpgsql-cursor/
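A minimal sketch of the batching idea without an explicit cursor, assuming the table has a primary key column named id (the name, the batch size of 1000, and the geom IS NULL progress check are illustrative, not from the original answer). Run the statement repeatedly, each in its own transaction, until it reports 0 rows updated:

```sql
-- Hypothetical batched update: convert ~1000 rows per statement.
-- "geom IS NULL" marks the rows not yet processed, so reruns
-- pick up where the previous batch left off.
UPDATE my_table
SET    geom = ST_SetSRID(ST_MakePoint(long, lat), 4326)
WHERE  id IN (
    SELECT id
    FROM   my_table
    WHERE  lat != 666
      AND  geom IS NULL
    LIMIT  1000
);
```

Small batches keep each transaction short, so locks are held briefly and a failure only rolls back one batch instead of a day of work.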

If possible, never use UPDATE as a mass operation. For that purpose CTAS (create table as select), followed by dropping the old table and renaming the new one, is much, much better. First, create the new table with CTAS:

Create table "new_table" as
select column_1, column_2, ..., column_n,
       ST_SetSRID(ST_MakePoint(long, lat), 4326) as geom
from "old_table"
where lat != 666;

Check your results if necessary.

Now drop the old table and rename the new one:

drop table "old_table";
alter table "new_table" rename to "old_table";

Create all needed indexes, foreign keys, etc. for the renamed table.
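For example, since the stated goal is to filter points by a polygon, the new geometry column should get a spatial index (index names here are illustrative):

```sql
-- GiST index so polygon filters (&&, ST_Within, ST_Intersects)
-- can use an index scan instead of reading all ~1.1B rows.
CREATE INDEX old_table_geom_gist ON "old_table" USING GIST (geom);

-- Recreate the btree index the original table had on lat.
CREATE INDEX old_table_lat_idx ON "old_table" (lat);

-- Refresh planner statistics for the freshly built table.
ANALYZE "old_table";
```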

If there are foreign keys and other constraints on old_table, you can disable its triggers first:

alter table old_table disable trigger all;
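If you go that route, remember that DISABLE TRIGGER ALL also suspends foreign-key enforcement, so re-enable the triggers as soon as the swap is done:

```sql
-- Suspend trigger-based checks (including FK enforcement) for the swap...
ALTER TABLE old_table DISABLE TRIGGER ALL;
-- ...perform the drop/rename here...
-- ...then restore normal constraint enforcement.
ALTER TABLE old_table ENABLE TRIGGER ALL;
```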
