I have a table with ~1.4 million rows. There are about 5 columns with general info on each row and a 6th column with ~1700 JSON key-value pairs.
I am building some summaries grouped by a column called ownership, selecting only rows where a specific key has a value. The query below runs in 14.5s:
SELECT ownership,
SUM (TO_NUMBER(jsonfield->>'firstvalue','9G999g999')) AS total
FROM
mytable
WHERE
jsonfield->>'firstvalue' IS NOT NULL
group by ownership
My queries will be much larger, and I know I'll need to select on many keys from the jsonfield. For example, if I add another key, the query time increases to 22.9s:
SELECT ownership,
SUM (TO_NUMBER(jsonfield->>'firstvalue','9G999g999')) AS total,
SUM (TO_NUMBER(jsonfield->>'secondvalue','9G999g999')) AS totaltwo
FROM
mytable
WHERE
jsonfield->>'firstvalue' IS NOT NULL
OR
jsonfield->>'secondvalue' IS NOT NULL
group by ownership
There may be instances where I'll need to query on several hundred potential keys in the jsonfield. Any suggestions on how to optimize my queries to speed things up?
Great answer below. As an FYI, I had to convert my json to jsonb like this before I could create the index. I first created a copy of the json column called jsonbsummary, which I then converted to jsonb:
ALTER TABLE mytable
ALTER COLUMN jsonbsummary
SET DATA TYPE jsonb
USING jsonbsummary::jsonb;
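The copy step mentioned above is not shown in the post. A minimal sketch of what it might look like, assuming the original json column is named jsonfield as in the earlier queries (the index name idx_jsonbsummary is hypothetical):

```sql
-- Assumed copy step, run BEFORE the ALTER ... SET DATA TYPE statement above.
-- Adds a second json column and copies the existing data into it.
ALTER TABLE mytable ADD COLUMN jsonbsummary json;
UPDATE mytable SET jsonbsummary = jsonfield;

-- After the column has been converted to jsonb, the GIN index can be created.
-- idx_jsonbsummary is an illustrative name, not one taken from the post.
CREATE INDEX idx_jsonbsummary ON mytable USING GIN (jsonbsummary);
```

On a table of this size the UPDATE rewrites every row, so it may be worth running it in a quiet period.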
As an additional FYI: the queries with grouping that originally took 22+ seconds now run in 200ms with the GIN index! See below:
SELECT ownership,
SUM (TO_NUMBER(jsonbsummary->>'firstvalue','9G999g999')) AS total,
SUM (TO_NUMBER(jsonbsummary->>'secondvalue','9G999g999')) AS totaltwo
FROM
mytable
WHERE
jsonbsummary ?| array['firstvalue','secondvalue']
group by ownership
You need a GIN index on the JSONB column.
CREATE INDEX idx_json ON mytable USING GIN (jsoncolumn);
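As a side note, the statement above relies on the default GIN operator class for jsonb, jsonb_ops, which is the one that supports the key-existence operators. A sketch of the same index with the operator class spelled out (run it instead of the statement above, not in addition to it):

```sql
-- Same index as above with the default operator class written explicitly.
-- jsonb_ops supports the key-existence operators ?, ?| and ?&.
CREATE INDEX idx_json ON mytable USING GIN (jsoncolumn jsonb_ops);

-- Not suitable for this use case: jsonb_path_ops produces a smaller index
-- but does not support the key-existence operators.
-- CREATE INDEX idx_json_path ON mytable USING GIN (jsoncolumn jsonb_path_ops);
```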
To check for the existence of keys, you need to use the ?| operator, which can make use of that index:
select ...
from mytable
where jsoncolumn ?| array['firstvalue', 'secondvalue'];
That is the equivalent of your OR condition. If you want to find rows that contain all of those keys, use the ?& operator instead.
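For illustration, here is a sketch of the ?& variant applied to the aggregate query from the question, wrapped in EXPLAIN so the plan can be inspected. It assumes the jsonb column is called jsoncolumn, as in the examples above; substitute your own column name.

```sql
-- Sums both values, restricted to rows that contain BOTH keys (AND semantics).
-- EXPLAIN (ANALYZE, BUFFERS) executes the query and prints the actual plan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT ownership,
       SUM(TO_NUMBER(jsoncolumn->>'firstvalue', '9G999G999')) AS total,
       SUM(TO_NUMBER(jsoncolumn->>'secondvalue', '9G999G999')) AS totaltwo
FROM mytable
WHERE jsoncolumn ?& array['firstvalue', 'secondvalue']
GROUP BY ownership;
```

A Bitmap Index Scan on idx_json in the output indicates that the GIN index is being used. If most rows contain the requested keys, the planner may still choose a sequential scan, since the index only helps when it filters out a meaningful share of the table.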