Like search performance with GIN INDEX on JSONB column

I am relatively new to Postgres.

I have a JSONB column that contains data like the sample below.

{
    "book_data": {
        "author": "abcd",
        "title": "This is a literature book",
        "year": "2021",
        "price":2000,
        "noofpages":100,
        "subject": "Language"
    }
}

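Users can search by the values of author, year, price, subject, and title. When searching by title, they can search on the full title ("This is a literature book") or on part of it (for example "literature"). We have a GIN index on the column. Queries perform well when we do an exact match on the values.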

select * FROM book_ms.book_data a
WHERE a.book_details  @@ '$.book_data.author == "abcd" 
&& $.book_data.title == "This is a literature book"'

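But a partial match like the one below takes a long time. Only the title will ever be searched partially; the other fields will always be exact matches. Is there any way to improve the speed of the like search?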

select * FROM book_ms.book_data a
WHERE a.book_details  @@ '$.book_data.author == "abcd" 
&& $.book_data.title like_regex "litreature"'

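Do I have to look at a text index for this? If so, is there an equivalent of the section groups used in Oracle Text indexes?

Below is the EXPLAIN (ANALYZE, BUFFERS) output for the like match: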

"Bitmap Heap Scan on book_data a  (cost=71.57..1784.58 rows=461 width=935) (actual time=63.726..600.494 rows=3430 loops=1)"
"  Recheck Cond: (book_details @@ '($.""book_data"".""author"" == ""abcd"" && $.""book_data"".""title"" like_regex ""litreature"")'::jsonpath)"
"  Rows Removed by Index Recheck: 198778"
"  Heap Blocks: exact=30016"
"  Buffers: shared hit=30241"
"  ->  Bitmap Index Scan on search_ult_text_ndx_3  (cost=0.00..71.45 rows=461 width=0) (actual time=57.008..57.009 rows=202208 loops=1)"
"        Index Cond: (book_details @@ '($.""book_data"".""author"" == ""abcd"" && $.""book_data"".""title"" like_regex ""litreature"")'::jsonpath)"
"        Buffers: shared hit=225"
"Planning Time: 0.216 ms"
"Execution Time: 601.197 ms"
"Note: This is not an Approved plan.  No usable Approved plan was found."
"SQL Hash: -403339798, Plan Hash: -176285915"

Below is the output for the exact match:

"Bitmap Heap Scan on book_data a  (cost=135.57..1848.58 rows=461 width=935) (actual time=23.597..23.707 rows=25 loops=1)"
"  Recheck Cond: (book_details @@ '($.""book_data"".""author"" == ""abcd"" && $.""book_data"".""title"" == ""This is a literature book"")'::jsonpath)"
"  Heap Blocks: exact=24"
"  Buffers: shared hit=137 read=17"
"  I/O Timings: read=21.189"
"  ->  Bitmap Index Scan on search_ult_text_ndx_3  (cost=0.00..135.45 rows=461 width=0) (actual time=23.572..23.572 rows=25 loops=1)"
"        Index Cond: (book_details @@ '($.""book_data"".""author"" == ""abcd"" && $.""book_data"".""title"" == ""This is a literature book"")'::jsonpath)"
"        Buffers: shared hit=113 read=17"
"        I/O Timings: read=21.189"
"Planning Time: 0.207 ms"
"Execution Time: 23.732 ms"
"Note: This is not an Approved plan.  No usable Approved plan was found."
"SQL Hash: -403339798, Plan Hash: -176285915"

No index can speed up this kind of WHERE condition.

If the path is always {book_data,title}, you can write:

WHERE book_details -> 'book_data' ->> 'title' LIKE '%litreature%'

Then a trigram index on book_details -> 'book_data' ->> 'title' can be used:

CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX ON book_ms.book_data USING gin
   ((book_details -> 'book_data' ->> 'title') gin_trgm_ops);
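
Since the question only needs partial matching on the title, the full query could be split into two conditions: the exact author match stays in the jsonpath predicate, and the partial title match becomes a plain LIKE on the extracted text so that a trigram index is usable. The following is a sketch under those assumptions, not a tested plan for your data; the index name book_title_trgm_idx is made up for illustration, and whether the planner actually picks the trigram index depends on your statistics and data distribution.

```sql
-- Assumption: pg_trgm is available and you may create indexes on this table.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical index name; the indexed expression must match the one
-- used in the WHERE clause exactly for the planner to consider it.
CREATE INDEX IF NOT EXISTS book_title_trgm_idx
    ON book_ms.book_data USING gin
    ((book_details -> 'book_data' ->> 'title') gin_trgm_ops);

-- Exact match on author stays in jsonpath (can use the existing GIN index);
-- partial match on title is a LIKE on the extracted text (can use the trigram index).
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM book_ms.book_data a
WHERE a.book_details @@ '$.book_data.author == "abcd"'::jsonpath
  AND a.book_details -> 'book_data' ->> 'title' LIKE '%litreature%';
```

If the plan still shows a large "Rows Removed by Index Recheck" figure, the trigram index is probably not being used for the title condition. Note that pg_trgm GIN indexes also support ILIKE and the regular expression operators ~ and ~*, so a case-insensitive search would not need a different index.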
