
How to update a large (1 million+ rows) postgres column of jsonb type values

I'm trying to update a specific array inside a jsonb column called params, and I'm having trouble with how long it takes. For example, the table has rows whose params value contains an owners array:

{
  "hidden": false,
  "org_id": "34692",
  "owners": [
    "tim@facebuk.com"
  ],
  "deleted": false
}

And here is another example:

{
  "hidden": false,
  "org_id": "34692",
  "owners": [
    "tim@google.com"
  ],
  "deleted": false
}

There are roughly a million of these rows (all with different email domains in owners). I have this query, which I want to execute across all of them:

UPDATE table
SET params = CASE
    WHEN params->>'owners' NOT LIKE '%google.com%'
    THEN jsonb_set(params, '{owners}',
                   concat('"', substr(md5(random()::text), 0, 25), '@googlefake.com"')::jsonb)
    ELSE params
END

I've tested with a dataset of 100 rows and it executes quickly, but at 1000 times that size the query seems to run forever, and I have no idea whether it will ever complete. I'm not sure how to speed this up or approach it in a better way. I did try an index, e.g. CREATE INDEX ON table((params->>'owners'));, to no avail. The query has now been running for over an hour, and there are many more rows like these.

Am I indexing incorrectly? I've also looked into a GIN index, but the @> operator won't help since each owners field differs.

Avoid unnecessary updates with a WHERE clause that filters out the rows that don't need to be modified. A plain expression index can't serve a '%...%' LIKE pattern, but a partial index whose predicate matches that WHERE clause may help; see the sketch below.
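
A minimal sketch of both ideas, assuming a table named mytable (standing in for the placeholder name table above, which is a reserved word and would need quoting). It also wraps the new value in to_jsonb(ARRAY[...]) so that owners stays a JSON array instead of being replaced by a plain string, which is what the original concat(...)::jsonb produced:

-- Partial index whose predicate matches the UPDATE's WHERE clause,
-- so the planner can locate only the rows that need changing
-- (mytable and params_owners_idx are assumed names):
CREATE INDEX params_owners_idx
    ON mytable ((params->>'owners'))
    WHERE params->>'owners' NOT LIKE '%google.com%';

-- Touch only the rows that actually change:
UPDATE mytable
SET params = jsonb_set(
        params,
        '{owners}',
        to_jsonb(ARRAY[substr(md5(random()::text), 0, 25) || '@googlefake.com'])
    )
WHERE params->>'owners' NOT LIKE '%google.com%';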

You may want to run VACUUM (FULL) once the update is done, since rewriting almost every row leaves the table full of dead tuples.
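
For example (again with the assumed name mytable; note that VACUUM (FULL) rewrites the whole table and holds an exclusive lock while it runs):

VACUUM (FULL) mytable;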
