
Why is this query taking so long on JSONB Gin index field? Can I fix it so it actually uses the index?

We recently changed one of our tables from storing a single entry in a column to storing a JSONB column in the format ["key1","key2","key3"]. Although we built a GIN index on the JSONB field, the queries we run against it are EXTREMELY slow (in the range of 50 minutes in the explain plan). I am trying to find a way to optimize the query so that it actually uses the index. I have pasted the query and its explain plan below. The indexed fields are visit.visitor, launch.campaign_key, launch.launch_key, visit.store_key, and the visit.stops JSONB field (GIN index). We are using PostgreSQL 9.4.

explain (analyze on)
select count(subselect.visitors) as visitors,
       subselect.campaign as campaign
from (
    select distinct visit.visitor as visitors,
                    launch.campaign_key as campaign
    from visit
    join launch on (jsonb_exists(visit.stops, launch.launch_key))
    where visit.store_key = 'ahBzfmdlYXJsYXVuY2gtaHVi'
      and launch.state = 'PRODUCTION'
) as subselect
group by subselect.campaign

Explain results:

HashAggregate  (cost=63873548.47..63873550.47 rows=200 width=88) (actual time=248617.348..248617.365 rows=58 loops=1)
  Group Key: launch.campaign_key
  ->  HashAggregate  (cost=63519670.22..63661221.52 rows=14155130 width=88) (actual time=248587.320..248616.558 rows=1938 loops=1)
    Group Key: visit.visitor, launch.campaign_key
    ->  HashAggregate  (cost=63307343.27..63448894.57 rows=14155130 width=88) (actual time=248553.278..248584.868 rows=1938 loops=1)
          Group Key: visit.visitor, launch.campaign_key
          ->  Nested Loop  (cost=4903.09..56997885.96 rows=1261891461 width=88) (actual time=180648.410..248550.249 rows=2085 loops=1)
                Join Filter: jsonb_exists(visit.stops, (launch.launch_key)::text)
                Rows Removed by Join Filter: 624114512
                ->  Bitmap Heap Scan on launch  (cost=3213.19..126084.38 rows=169389 width=123) (actual time=32.082..317.561 rows=166121 loops=1)
                      Recheck Cond: ((state)::text = 'PRODUCTION'::text)
                      Heap Blocks: exact=56635
                      ->  Bitmap Index Scan on launch_state_idx  (cost=0.00..3170.85 rows=169389 width=0) (actual time=21.172..21.172 rows=166121 loops=1)
                            Index Cond: ((state)::text = 'PRODUCTION'::text)
                ->  Materialize  (cost=1689.89..86736.04 rows=22349 width=117) (actual time=0.000..0.487 rows=3757 loops=166121)
                      ->  Bitmap Heap Scan on visit  (cost=1689.89..86624.29 rows=22349 width=117) (actual time=1.324..14.381 rows=3757 loops=1)
                            Recheck Cond: ((store_key)::text = 'ahBzfmdlYXJsYXVuY2gtaHVicg8LEgVTdG9yZRinzbKcDQw'::text)
                            Heap Blocks: exact=3672
                            ->  Bitmap Index Scan on visit_store_key_idx  (cost=0.00..1684.31 rows=22349 width=0) (actual time=0.780..0.780 rows=3757 loops=1)
                                  Index Cond: ((store_key)::text = 'ahBzfmdlYXJsYXVuY2gtaHVicg8LEgVTdG9yZRinzbKcDQw'::text)
Planning time: 0.232 ms
Execution time: 248708.088 ms

I should mention that the index on stops was built with CREATE INDEX ON visit USING GIN (stops)

I'm wondering whether building it as CREATE INDEX ON visit USING GIN ((stops->'value')) instead would resolve the issue?

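The wrapper function jsonb_exists() prevents the use of the GIN index on visit.stops, because PostgreSQL matches an index to the operators of its operator class, not to the functions that implement those operators. Instead of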

from visit 
join launch on (jsonb_exists(visit.stops, launch.launch_key))

try

from visit 
join launch on visit.stops ? launch.launch_key::text
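
As a rough sketch of the whole change (not run against your data), the two literal checks below show what the ? operator tests on an array shaped like the one in the question, and the last statement is the original query with only the join condition swapped. The store key is copied from the question as posted. Whether the planner then picks the GIN index or still prefers visit_store_key_idx depends on your statistics, so compare the new plan with the old one.

```sql
-- ? tests whether a string is a top-level array element (or object key);
-- it is the operator form of jsonb_exists().
SELECT '["key1","key2","key3"]'::jsonb ? 'key2';    -- true

-- -> 'value' looks up an object key, so on a top-level array it yields NULL.
-- An index on (stops->'value') would therefore have nothing useful to index
-- if stops really has the array shape shown in the question.
SELECT '["key1","key2","key3"]'::jsonb -> 'value';  -- NULL

-- Original query with the function call replaced by the ? operator.
EXPLAIN (ANALYZE)
SELECT count(subselect.visitors) AS visitors,
       subselect.campaign        AS campaign
FROM (
    SELECT DISTINCT visit.visitor       AS visitors,
                    launch.campaign_key AS campaign
    FROM visit
    JOIN launch ON visit.stops ? launch.launch_key::text
    WHERE visit.store_key = 'ahBzfmdlYXJsYXVuY2gtaHVi'
      AND launch.state = 'PRODUCTION'
) AS subselect
GROUP BY subselect.campaign;
```

If the index is used, the plan should no longer show jsonb_exists as a Join Filter over every launch/visit pair; instead you would expect an index condition on visit.stops inside the nested loop.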
