
Full text search for Ruby on Rails app using Postgres and pg_search gem

I posted this question to the pg_search Google group here:

https://groups.google.com/forum/?fromgroups#!topic/casecommons-dev/3tbCthkDHg0

It got no responses, so I am posting it here on Stack Overflow. My question is: should I create GIN indexes when using the pg_search gem in the following situation?

My searches are limited to using pg_search_scope for searching within a single model.

Here is a specific example:

class Scenario < ActiveRecord::Base
  ...
  include PgSearch
  pg_search_scope :search, :against => [:name, :compute_ngls],
    :using => { :tsearch => { :dictionary => "english" } }

  def self.text_search(query)
    if query.present?
      search(sanitize(query))
    else
      scoped
    end
  end
  ...
end

The call to the text_search method is as follows:

  scenarios = scenarios.text_search(params[:sSearch])

I only have regular btree indexes on certain columns, :name for instance. I do not have GIN or GiST indexes. Should I explicitly create these indexes? If yes, which kind and on which columns? Can you please give me the syntax for creating them?

Whether to create an index is not something one can tell from your Ruby code, and it may not be apparent from your database schema either. The answer depends on how selective your queries are, how much data is being indexed, and how many pages the tables occupy. This is because PostgreSQL can scan a table in physical order, which is often faster than index-driven lookups when a significant portion of the table is retrieved.

GIN and GiST indexes will help with full text search, while btree indexes are not particularly helpful. For them to be useful, though, the indexed tables must be of significant size and each query must pull only a small portion of the table.

When in doubt, my preference is to wait until there is a performance reason to create the index rather than creating it up front.

EDIT

More recent, heavy experience with GIN and full text searching has led me to reverse the advice above. I now believe that, specifically for full text indexes, you are far better off creating the GIN index first and dropping it if it interferes with performance, rather than the other way around.

Note that GIN has major write overhead, so it is definitely not free. However, FTS indexes are almost always selective enough to be useful if full text search is a major use case in your application.
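
As for the syntax the question asks about, here is a minimal sketch of one way to build the CREATE INDEX statement for a GIN expression index over the two searched columns. The helper name gin_fts_index_sql and the generated index name are hypothetical, not part of pg_search or Rails. It assumes the query side wraps each column as to_tsvector('english', coalesce(column::text, '')) and joins the vectors with ||. PostgreSQL only uses an expression index when the query expression matches it, so check the query plan with EXPLAIN before relying on it.

```ruby
# Hypothetical helper: builds SQL for a GIN full text index over several columns.
# Assumption: searches use to_tsvector(dictionary, coalesce(col::text, ''))
# per column, concatenated with ||. Confirm with EXPLAIN that the planner
# actually picks the index for your queries.
def gin_fts_index_sql(table, columns, dictionary: "english", name: nil)
  # Default index name is illustrative only; pass name: to override it.
  name ||= "index_#{table}_on_#{columns.join('_and_')}_fts"

  # One tsvector per column, so a NULL column does not blank out the document.
  document = columns.map do |column|
    "to_tsvector('#{dictionary}', coalesce(#{column}::text, ''))"
  end.join(" || ")

  # Expression indexes need the extra pair of parentheses around the expression.
  "CREATE INDEX #{name} ON #{table} USING gin ((#{document}))"
end

puts gin_fts_index_sql(:scenarios, [:name, :compute_ngls])
```

In a Rails migration, the resulting string could be passed to execute in the up method, with a matching DROP INDEX in down. Building the index on a large table takes time and, as noted above, slows writes afterwards, so it is worth measuring both sides.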
