Full text search for Ruby on Rails app using Postgres and pg_search gem

I posted this question on pg_search's Google group here:

https://groups.google.com/forum/?fromgroups#!topic/casecommons-dev/3tbCthkDHg0

There were no responses, so I am posting it here on Stack Overflow. My question is: should I create GIN indexes when using the pg_search gem in the following circumstance?

My searches are limited to using pg_search_scope for searching within a single model.

Here is a specific example:

class Scenario < ActiveRecord::Base
  ...
  include PgSearch
  pg_search_scope :search, :against => [:name, :compute_ngls],
    :using => { :tsearch => {:dictionary => "english"} }

  def self.text_search(query)
    if query.present?
      search(sanitize(query))
    else
      scoped
    end
  end
  ...
end

The call to the text_search method is as follows:

  scenarios = scenarios.text_search(params[:sSearch])

I only have regular btree indexes on certain columns, e.g. :name. I do not have GIN or GiST indexes. My question is: should I explicitly create these indexes? If yes, then which kind and on which columns? Can you please give me the syntax for creating these indexes?

Whether to create an index is not something one can tell from looking at your Ruby code, and it may not be possible to tell from your db schema either. The answer depends on how selective your queries are, how much data is being indexed, and how large the tables are page-wise. This is because PostgreSQL tables support physical-order (sequential) scans, which are often faster than index-driven lookups when a significant portion of the table is retrieved.

GIN and GiST indexes will help you with full text search, whereas btree indexes are not particularly helpful. For a GIN or GiST index to be useful, though, the table has to be of significant size and each query has to pull only a small portion of it.

When in doubt, my preference is to wait until there is a performance reason to create the index rather than creating it up front.

EDIT

More recent heavy GIN experience with full text searching has led me to reverse my advice above. I now believe that, specifically for full text indexes, you are far better off creating the GIN index first and dropping it if it interferes with performance, rather than the other way around.

Note that GIN has major write overhead, so it is definitely not free. However, FTS indexes are almost always selective enough to be useful if FTS is a major use case in your application.
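
As for the syntax asked about in the question, here is a minimal sketch of one way to build the CREATE INDEX statement for a GIN expression index over the two searched columns. Several things here are assumptions rather than anything confirmed above: the table name scenarios (the Rails default for the Scenario model), the helper name gin_fts_index_sql (hypothetical, not part of pg_search or Rails), and the shape of the tsvector expression. PostgreSQL only uses an expression index when the query contains a matching expression, so it is worth comparing this against the SQL pg_search actually generates and checking the plan with EXPLAIN.

```ruby
# Sketch: build the SQL for a GIN expression index over the searched columns.
# gin_fts_index_sql is a hypothetical helper, not part of pg_search or Rails.
def gin_fts_index_sql(table, columns, dictionary: "english")
  index_name = "index_#{table}_on_#{columns.join('_')}_fts"
  # One to_tsvector call per column, concatenated with ||. This shape is an
  # assumption; it must match the expression used in the search query.
  document = columns
    .map { |column| "to_tsvector('#{dictionary}', coalesce(#{column}::text, ''))" }
    .join(" || ")
  "CREATE INDEX #{index_name} ON #{table} USING gin ((#{document}))"
end

# Assumed table name: "scenarios", the Rails default for the Scenario model.
sql = gin_fts_index_sql("scenarios", %w[name compute_ngls])
puts sql

# Inside a Rails migration this string could be passed to `execute`, e.g.
#   def up;   execute gin_fts_index_sql("scenarios", %w[name compute_ngls]); end
#   def down; execute "DROP INDEX index_scenarios_on_name_compute_ngls_fts"; end
```

If the index turns out to hurt write performance more than it helps searches, the DROP INDEX statement in the comment above undoes it.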
