Proper full text search index for Postgres

Question

I'm creating a multi-column full text search index, and currently I have this running:

CREATE INDEX products_search_document ON products
USING gin(to_tsvector('english', style_number || ' ' || brand || ' ' || style_description || ' ' || color));

This works great for queries like this one:

SELECT * FROM "products"
WHERE (to_tsvector('english', style_number||' '||brand||' '||style_description||' '||color)
      @@ to_tsquery('english', 'G2000'))

I'd like to use prefix matching now, though, so that my query would look like this:

SELECT * FROM "products"
WHERE (to_tsvector('english', style_number||' '||brand||' '||style_description||' '||color)
      @@ to_tsquery('english', 'G2000:*'))

When I run this on my Heroku Postgres instance, I get a Seq Scan on products instead of an index scan.

What other index would I need in order to use the prefix matcher in Postgres?

Answer 1

Strangely, I dropped the index and recreated it... and that fixed the problem.
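
For reference, a minimal sketch of what dropping and recreating the index could look like, reusing the index definition from the question. The trailing ANALYZE is an assumption added here, not something this answer mentions; it refreshes planner statistics, which may or may not be related to why recreating the index helped.

```sql
-- Drop the existing expression index from the question.
DROP INDEX products_search_document;

-- Recreate it with the same definition as in the question.
CREATE INDEX products_search_document ON products
USING gin (to_tsvector('english', style_number || ' ' || brand || ' ' || style_description || ' ' || color));

-- Assumption (not part of the original answer): refresh planner statistics.
ANALYZE products;
```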

Answer 2

Have you tried doing:

set enable_seqscan=off;

and then running your query to see if it uses the index? I don't see why it wouldn't. My suspicion is that the planner thinks that particular search is not selective enough, and so estimates that a sequential scan is cheaper than a full text index scan.
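
One way to run that check, as a sketch: the query is the prefix query from the question, and wrapping it in EXPLAIN ANALYZE and resetting the setting afterwards are additions made here for illustration. Note that EXPLAIN ANALYZE actually executes the query.

```sql
-- Session-local experiment: discourage sequential scans so the plan shows
-- whether the GIN index is usable for the prefix query at all.
SET enable_seqscan = off;

EXPLAIN ANALYZE
SELECT * FROM products
WHERE to_tsvector('english', style_number || ' ' || brand || ' ' || style_description || ' ' || color)
      @@ to_tsquery('english', 'G2000:*');

-- Restore the default planner behaviour afterwards.
RESET enable_seqscan;
```

If the plan switches to the index with sequential scans discouraged, the index is usable and the planner was simply estimating the sequential scan to be cheaper.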

That said, I think that for prefix queries (where you don't want stemming equivalence to kick in, e.g. postgraduate and postgres being considered equivalent), a btree text_pattern_ops, gist (gist_trgm_ops), or gin index on just the concatenated values, or even just on style_number if that is all you will be prefix-matching, would be more efficient than full text. (I think spgist might be good too, but I haven't done any metrics on that.) Your query would not use tsvector; it would just use

style_number LIKE 'G5000%'

style_number ILIKE 'G5000%'

and your index would be just on style_number or on the concatenated values.
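
A minimal sketch of the btree variant, assuming style_number is a text column; the index name products_style_number_prefix is hypothetical. The text_pattern_ops operator class lets a btree index serve left-anchored LIKE patterns regardless of the database locale.

```sql
-- Hypothetical index name; assumes style_number is of type text.
CREATE INDEX products_style_number_prefix
ON products (style_number text_pattern_ops);

-- Left-anchored, case-sensitive prefix match that this index can serve.
SELECT * FROM products
WHERE style_number LIKE 'G5000%';
```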

If you need case insensitivity, then use gist (gist_trgm_ops) as covered here: http://www.postgresonline.com/journal/archives/212-PostgreSQL-9.1-Trigrams-teaching-LIKE-and-ILIKE-new-tricks.html
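
A sketch of the trigram variant, assuming PostgreSQL 9.1 or later and permission to install the pg_trgm extension; the index name products_style_number_trgm is hypothetical.

```sql
-- pg_trgm provides the trigram operator classes (gist_trgm_ops, gin_trgm_ops).
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical index name; a trigram GiST index on the single column.
CREATE INDEX products_style_number_trgm
ON products USING gist (style_number gist_trgm_ops);

-- Case-insensitive prefix match that the trigram index can support.
SELECT * FROM products
WHERE style_number ILIKE 'g5000%';
```

The same column could instead be indexed with gin (style_number gin_trgm_ops); which of the two performs better depends on the data and the write load, so it is worth measuring both.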
