Geospatial and full text search for Rails app hosted on Heroku

I'm planning out a Rails app that will be hosted on Heroku and will need both geospatial and full text search capabilities.

I know that Heroku offers add-ons like WebSolr and IndexTank that sound like they can do the job, but I was wondering if this could be done in MySQL and/or PostgreSQL without having to pay for any add-ons?

Depending on the scale of your application, you should be able to handle both FULLTEXT and SPATIAL indexes in MySQL with ease. Once your application gets massive (i.e. hundreds of millions of rows, high concurrency, and thousands of requests per second), you might need to move to another solution for either FULLTEXT or SPATIAL queries. But I wouldn't recommend optimizing for that early on, since it can be very hard to do properly. For the foreseeable future MySQL should suffice.

You can read about spatial indexes in MySQL here. You can read about fulltext indexes in MySQL here. Finally, I would recommend taking the steps outlined here to make your schema.rb file and rake tasks work with these two index types.

I have only used MySQL for both, but my understanding is that PostgreSQL has a good geo-spatial index solution as well.

If you have a database at Heroku, you can use Postgres's support for Full Text Search: http://www.postgresql.org/docs/8.3/static/textsearch.html . The oldest servers Heroku runs (for shared databases) are on 8.3 and 8.4. The newest are on 9.0.

A blog post noticing this little fact can be seen here: https://tenderlovemaking.com/2009/10/17/full-text-search-on-heroku.html

Apparently, that "texticle" (heh. cute.) addon works...pretty well. It will even create the right indexes for you, as I understand it.

Here's the underlying story: Postgres full text search is pretty fast and fuss-free (although the Rails integration may not be great), but it does not offer the bells and whistles of Solr or IndexTank. Make sure you read about how to properly set up GIN and/or GiST indexes, and use the tsvector/tsquery types.

The short version:

  • Create an (in this case, expression-based) index: CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector('english', body)); Here "body" is the field being indexed.
  • Use the @@ operator: SELECT * FROM ... WHERE to_tsvector('english', pgweb.body) @@ to_tsquery('hello & world') LIMIT 30;
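The to_tsquery('hello & world') argument above has its own syntax, so raw user input like "hello world" can't be handed over as-is. Here is a minimal sketch of one way to prepare it in Ruby. The helper name build_tsquery is made up for illustration; it is not part of Rails, texticle, or PostgreSQL, and AND-ing all terms together is just one possible choice.

```ruby
# Hypothetical helper (not part of Rails, texticle, or PostgreSQL):
# turn free-form user input into an AND-ed to_tsquery string.
def build_tsquery(input)
  # Keep only runs of letters and digits, so tsquery operators such as
  # & | ! ( ) : * in the raw input cannot cause a syntax error.
  terms = input.to_s.scan(/[[:alnum:]]+/)
  # Drop duplicate terms and join the rest with the tsquery AND operator.
  terms.uniq.join(" & ")
end

puts build_tsquery("hello world") # prints: hello & world
```

The result should still be passed to the database as a bind parameter rather than interpolated into the SQL string. An empty result means there is nothing to search for, so the caller would probably want to skip the query in that case.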

The hard part may be mapping things back into application land; the blog post cited above is trying to do that.

The dedicated databases can also be provisioned with PostGIS, which is a very powerful and fully featured system for indexing and querying geographical data. OpenStreetMap uses the built-in PostgreSQL geometry types extensively, and many people combine those with PostGIS to great effect.

Both of these (full text search, PostGIS) take advantage of the extensible data type and indexing infrastructure in Postgres, so you should expect them to perform well for many, many records (spend a little time carefully reviewing the situation if things look busted). You can also take advantage of the fact that these features work in combination with transactions and structured data. For example:

A table such as CREATE TABLE products (pk bigserial, price numeric, quantity integer, description text); can just as easily be used with full text search. Any text field will do, and the search can be combined with regular attributes (price and quantity in this case).
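As a rough sketch of what combining the two looks like from Ruby, the query below filters the products table by a full text match on description and by price and quantity in the same WHERE clause. The constant and method names are hypothetical, and the 'english' configuration and the LIMIT are assumptions carried over from the pgweb example.

```ruby
# Hypothetical query for the products table; names are illustrative only.
# The single-quoted heredoc keeps $1 and $2 as literal bind placeholders.
PRODUCT_SEARCH_SQL = <<~'SQL'
  SELECT pk, price, quantity, description
  FROM products
  WHERE to_tsvector('english', description) @@ to_tsquery('english', $1)
    AND price <= $2
    AND quantity > 0
  ORDER BY price
  LIMIT 30
SQL

# Returns the SQL text and its bind values as a pair, so user input
# never gets interpolated into the SQL string itself.
def product_search(tsquery, max_price)
  [PRODUCT_SEARCH_SQL, [tsquery, max_price]]
end

sql, binds = product_search("hello & world", 25)
puts sql
puts binds.inspect
```

With the pg gem, the pair could be run with something like conn.exec_params(sql, binds). For the text match to use an index, an expression index on to_tsvector('english', description) would be needed, analogous to the pgweb_idx one.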

I'd use Thinking Sphinx, a full text search library that is also deployable on Heroku.

It has geo search built-in: http://freelancing-god.github.com/ts/en/geosearching.html
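For intuition about what a "within N km of this point" search computes, here is a small haversine great-circle distance sketch in plain Ruby. It is only an illustration, not how Sphinx or PostGIS implement geo queries; the function names and the 6371 km mean Earth radius are assumptions. As far as I recall, the geosearching guide linked above expects latitude and longitude in radians, so a degrees-to-radians helper like the one below may be needed there too.

```ruby
EARTH_RADIUS_KM = 6371.0 # mean Earth radius; an approximation

# Convert an angle in degrees to radians.
def to_radians(degrees)
  degrees * Math::PI / 180.0
end

# Great-circle distance in kilometres between two points given in degrees.
def haversine_km(lat1, lng1, lat2, lng2)
  phi1 = to_radians(lat1)
  phi2 = to_radians(lat2)
  dphi = to_radians(lat2 - lat1)
  dlambda = to_radians(lng2 - lng1)
  a = Math.sin(dphi / 2)**2 +
      Math.cos(phi1) * Math.cos(phi2) * Math.sin(dlambda / 2)**2
  # Clamp against floating-point drift so sqrt and asin stay in range.
  a = [[a, 0.0].max, 1.0].min
  2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a))
end

puts haversine_km(0, 0, 0, 90).round(1)
```

A brute-force filter over every row with a function like this does not scale, which is why a spatial index (PostGIS, Sphinx, or MySQL SPATIAL) is the usual choice once the table grows.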

EDIT:

Sphinx is almost ready for Heroku, see here: http://flying-sphinx.com/

For full text search via PostgreSQL I recommend pg_search; I am using it myself on Heroku at the moment. I have not used texticle, but from what I can see pg_search has had more development activity lately and is built upon texticle. Note that pg_search will not add indexes for you; you have to do that yourself.

I cannot find the thread now, but I saw that Heroku offered an option for PostgreSQL geo search, although it was in beta.

My advice, if you are not able to find a PostgreSQL solution, is to host your own instance of SOLR (on an EC2 instance) and use the sunspot solr gem to integrate it with Rails.

I have implemented my own solution and have used WebSolr as well. Basically, what WebSolr gives you is a hosted SOLR instance, hassle free. Is it worth the money? In my opinion, no. For integration they use the sunspot solr client as well, so the question is just whether you want to pay somebody $20/$40/... to host SOLR for you. I know you also get backups, maintenance, etc., but call me cheap: I prefer my own instance. Also, WebSolr is locked to the 1.4.x version of SOLR.

IndexTank is now free up to 100k documents on Heroku; we just haven't updated the documentation. This may not be enough for your needs, but I thought I'd let you know just in case.
