Postgres full text search over json field

I cannot understand why the index is used but I get an empty result set.

https://www.db-fiddle.com/f/n9SyXK6GY3va2CZ41jNGQ5/2

I have a table:

create table content
(
    id bigserial not null constraint content_pk primary key,
    created_at timestamp with time zone not null,
    form json not null
);

The form field stores data in this format:

{
  "title_text": "Test test",
  "content": {
    "blocks": [
      {
        "key": "6j131",
        "text": "Lorem ipsum dolor sit amet,"
      },
      {
        "key": "6nml9",
        "text": "In tincidunt tincidunt porttitor."
      }
    ]
  }
}

I tried to create an index to search by the value of title_text and by the concatenation of all content->blocks[]->text nodes.

My queries:

(function based on a sample by https://www.facebook.com/afiskon, thank you)

CREATE OR REPLACE FUNCTION make_tsvector(title TEXT, content json)
  RETURNS tsvector AS
'
BEGIN
    RETURN (setweight(to_tsvector(''simple'', title), ''A'')
    || setweight(to_tsvector(''simple'', STRING_AGG(content ->> ''text'', '' '')), ''B''));
END
'
    LANGUAGE 'plpgsql' IMMUTABLE;

(create index query)

DROP INDEX IF EXISTS idx_content__form__title_text_and_block_text;
CREATE INDEX IF NOT EXISTS idx_content__form__title_text_and_block_text
  ON content
    USING GIST (make_tsvector(
                            content.form ->> 'title_text',
                            content.form -> 'content' -> 'blocks'
                    ));

(and a check of my query with EXPLAIN)

EXPLAIN
  SELECT c.id, c.form ->> 'title_text'
  FROM content c,
     json_array_elements(c.form -> 'content' -> 'blocks') block
  WHERE make_tsvector(
                  c.form ->> 'title_text',
                  c.form -> 'content' -> 'blocks'
          ) @@ to_tsquery('ipsum')
  GROUP BY c.id;

and I see that the index is used (!)

HashAggregate  (cost=15.12..15.15 rows=2 width=40)
Group Key: c.id
->  Nested Loop  (cost=4.41..14.62 rows=200 width=64)
    ->  Bitmap Heap Scan on content c  (cost=4.41..10.62 rows=2 width=64)
          Recheck Cond: (make_tsvector((form ->> 'title_text'::text), ((form -> 'content'::text) -> 'blocks'::text)) @@ to_tsquery('ipsum'::text))
          ->  Bitmap Index Scan on idx_content__form__title_text_and_block_text  (cost=0.00..4.40 rows=2 width=0)
                Index Cond: (make_tsvector((form ->> 'title_text'::text), ((form -> 'content'::text) -> 'blocks'::text)) @@ to_tsquery('ipsum'::text))
    ->  Function Scan on json_array_elements block  (cost=0.01..1.01 rows=100 width=0)

but if I run this query, I get an empty result.

Is the problem the STRING_AGG call in the index build function?

Take a closer look at this snippet of your code:

make_tsvector(
  c.form ->> 'title_text',
  c.form -> 'content' -> 'blocks'
)

You're not selecting what you think you are.

c.form -> 'content' -> 'blocks'

This returns a JSON array, not the individual elements. In your function, on the other hand, you have this (escaped quotes removed for clarity):

content ->> 'text'

The JSON you are passing in isn't an object; it's an array of objects. The lookup therefore fails because the path is wrong: ->> 'text' looks for a key named text, an array has no such key, and the result is NULL.

The planner reports that your index is being used because both the index and your query use the same invalid expression. Since they match, the index is used. That doesn't mean the index holds useful information, though.
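One way to see this directly is to evaluate the pieces by hand. The following is a minimal check, assuming the content table and the make_tsvector function from the question already exist; the column aliases are made up for illustration.

```sql
-- ->> 'text' is an object-key lookup. Applied to a JSON array it finds no
-- such key, so this is expected to come back NULL, not the block texts.
SELECT '[{"key": "6j131", "text": "Lorem ipsum"}]'::json ->> 'text' AS text_from_array;

-- NULL then propagates through string_agg, to_tsvector, setweight and ||,
-- so the indexed expression should be NULL for every row.
SELECT c.id,
       make_tsvector(c.form ->> 'title_text',
                     c.form -> 'content' -> 'blocks') IS NULL AS vector_is_null
FROM content c;
```

If vector_is_null is true for every row, the index contains nothing that can match, which is consistent with an index scan that returns no rows.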

Find a way to iterate through the array, either in the function or in the query that calls the function, and it should start working.
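A minimal sketch of the first option, iterating inside the function, could look like the following. It keeps the function name, parameter names, and index name from the question; the coalesce calls and the explicit 'simple' configuration in to_tsquery are my own additions, and it assumes content -> blocks is always a JSON array or missing.

```sql
CREATE OR REPLACE FUNCTION make_tsvector(title TEXT, content json)
  RETURNS tsvector AS
$$
BEGIN
    RETURN setweight(to_tsvector('simple', coalesce(title, '')), 'A')
        || setweight(
               to_tsvector('simple', coalesce(
                   -- Expand the array into one row per block, then
                   -- concatenate the "text" value of each block.
                   (SELECT string_agg(elem ->> 'text', ' ')
                    FROM json_array_elements(content) AS elem),
                   '')),
               'B');
END
$$ LANGUAGE plpgsql IMMUTABLE;

-- The index stored the results of the old function body, so rebuild it.
REINDEX INDEX idx_content__form__title_text_and_block_text;

-- The extra join on json_array_elements and the GROUP BY are no longer needed.
SELECT c.id, c.form ->> 'title_text'
FROM content c
WHERE make_tsvector(
          c.form ->> 'title_text',
          c.form -> 'content' -> 'blocks'
      ) @@ to_tsquery('simple', 'ipsum');
```

The WHERE expression has to stay identical to the indexed expression for the planner to consider the index. Note that json_array_elements raises an error if it is handed a JSON value that is not an array, so rows with a differently shaped form would need extra handling.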

For @GFB, who has already forgotten this nightmare, and for those still looking for an answer to "how to search in a JSON array", especially in Draft.js output:

CREATE TABLE IF NOT EXISTS content
(
    id bigserial not null constraint content_pk primary key,
    created_at timestamp with time zone not null,
    form json not null
);

INSERT INTO content (created_at, form) 
VALUES 
('2021-06-25', '{"blocks": [{"key": "6j131","text": "Lorem ipsum dolor sit amet,"},{"key": "6nml9","text": "In tincidunt tincidunt porttitor."}]}'),
('2021-06-25', '{"blocks": [{"key": "6j131","text": "hello world"},{"key": "6nml9","text": "hello Dolly"}]}')
;


SELECT c.*
FROM content c
LEFT JOIN LATERAL json_array_elements(c.form->'blocks') blocks ON TRUE
WHERE blocks->>'text' ILIKE '%hello%'
GROUP BY id;


SELECT c.*
FROM content c, json_array_elements(c.form->'blocks') blocks
WHERE blocks->>'text' ILIKE '%hello%'
GROUP BY id;

You can try this solution here: http://sqlfiddle.com/#!17/6ca7c/21

PS: you can read more about CROSS JOIN LATERAL in this thread.
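The queries in this answer use ILIKE, which is substring matching rather than full text search. As a hedged variant, assuming the same content table and sample rows as in this answer, the per-block text can be run through to_tsvector instead. The EXISTS form is my own choice; it avoids the GROUP BY because each row of content is returned at most once.

```sql
SELECT c.*
FROM content c
WHERE EXISTS (
    -- One row per element of the blocks array.
    SELECT 1
    FROM json_array_elements(c.form -> 'blocks') AS block
    -- Matches whole lexemes, so 'hello' matches "hello world"
    -- but would not match a word that merely contains "hello".
    WHERE to_tsvector('simple', block ->> 'text') @@ to_tsquery('simple', 'hello')
);
```

This sketch does not use an index, so it scans every row. For larger tables an expression index over a function that aggregates the block texts is the more likely fit.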
