Geospatial and full text search for Rails app hosted on Heroku

I'm planning out a Rails app that will be hosted on Heroku and will need both geospatial and full text search capabilities.

I know that Heroku offers add-ons like WebSolr and IndexTank that sound like they can do the job, but I was wondering if this could be done in MySQL and/or PostgreSQL without having to pay for any add-ons?

Depending on the scale of your application, you should be able to get by with FULLTEXT and SPATIAL indexes in MySQL with ease. Once your application gets massive, i.e. hundreds of millions of rows with high concurrency and many thousands of requests per second, you might need to move to another solution for either FULLTEXT or SPATIAL queries. But I wouldn't recommend optimizing for that early on, since it can be very hard to do properly. For the foreseeable future MySQL should suffice.

You can read about spatial indexes in MySQL here. You can read about fulltext indexes in MySQL here. Finally, I would recommend taking the steps outlined here to make your schema.rb file and rake tasks work with these two index types.

I have only used MySQL for both, but my understanding is that PostgreSQL has a good geospatial index solution as well.

If you have a database at Heroku, you can use Postgres's support for full text search: http://www.postgresql.org/docs/8.3/static/textsearch.html. The oldest servers Heroku runs (for shared databases) are on 8.3 and 8.4. The newest are on 9.0.

A blog post pointing out this little fact can be seen here: https://tenderlovemaking.com/2009/10/17/full-text-search-on-heroku.html

Apparently, that "texticle" (heh. cute.) add-on works...pretty well. It will even create the right indexes for you, as I understand it.

Here's the underlying story: Postgres full text search is pretty fast and fuss-free (although Rails integration may not be great), but it does not offer the bells and whistles of Solr or IndexTank. Make sure you read about how to properly set up GIN and/or GiST indexes, and use the tsvector/tsquery types.

The short version:

  • Create an (in this case, expression-based) index: CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector('english', body)); In this case "body" is the field being indexed.
  • Use the @@ operator: SELECT * FROM ... WHERE to_tsvector('english', pgweb.body) @@ to_tsquery('hello & world') LIMIT 30

The hard part may be mapping things back into application land; the blog post cited above is trying to do that.
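One small piece of that mapping is turning whatever the user typed into the 'hello & world' form that to_tsquery expects. The following is only a minimal sketch: to_tsquery_string is a hypothetical helper name, not something provided by Rails or texticle, and it assumes you want every term to be required (AND semantics).

```ruby
# Hypothetical helper (not part of Rails or texticle): turn free-form user
# input into an AND-ed string that can be handed to to_tsquery.
def to_tsquery_string(input)
  # Keep only runs of letters and digits, so tsquery operators typed by the
  # user (&, |, !, parentheses) never reach the database as syntax.
  terms = input.to_s.downcase.scan(/[[:alnum:]]+/)
  # Drop repeated terms and require all of the remaining ones.
  terms.uniq.join(" & ")
end

puts to_tsquery_string("Hello, world!")
```

The resulting string should still be passed to the database as a bind parameter rather than interpolated into SQL. Postgres also ships plainto_tsquery, which does a similar normalization on the server side and may make a helper like this unnecessary.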

The dedicated databases can also be provisioned with PostGIS, which is a very powerful and fully featured system for indexing and querying geographical data. OpenStreetMap uses the PostgreSQL geometry types (built-in) extensively, and many people combine that with PostGIS to great effect.

Both of these (full text search, PostGIS) take advantage of the extensible data type and indexing infrastructure in Postgres, so you should expect them to perform well for many, many records (spend a little time carefully reviewing the situation if things look busted). You might also take advantage of the fact that you can use these features in combination with transactions and structured data. For example:

CREATE TABLE products (pk bigserial, price numeric, quantity integer, description text); can just as easily be used with full text search...any text field will do, and the search can be combined with regular attributes (price and quantity in this case).
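To make that concrete, here is a rough sketch of a query against the products table above that mixes a full text match with ordinary attribute filters. product_search_sql is a hypothetical helper, and the particular filters (a price ceiling, in-stock only) are assumptions chosen for illustration.

```ruby
# Hypothetical helper: build a query that combines a full text match on
# products.description with plain attribute filters on price and quantity.
# Returns the SQL text and the bind values separately.
def product_search_sql(query, max_price)
  sql = <<~SQL
    SELECT pk, price, quantity, description
    FROM products
    WHERE to_tsvector('english', description) @@ plainto_tsquery('english', $1)
      AND price <= $2
      AND quantity > 0
    ORDER BY price
    LIMIT 30
  SQL
  [sql, [query, max_price]]
end

sql, binds = product_search_sql("red widget", 25)
puts sql
puts binds.inspect
```

With the pg gem, something like conn.exec_params(sql, binds) should run it, assuming conn is an open connection. For the text match to use an index, you would want an expression index on to_tsvector('english', description), along the lines of the pgweb example earlier.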

I'd use Thinking Sphinx, a full text search engine that is also deployable on Heroku.

It has geo search built-in: http://freelancing-god.github.com/ts/en/geosearching.html
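For a feel of what a geo search computes, here is a small sketch of a great-circle (haversine) distance in plain Ruby. It is not Thinking Sphinx's API; the function names and the Earth radius constant are assumptions for illustration. To my understanding Sphinx works with latitude and longitude in radians, so the degree-to-radian conversion is the part you are most likely to need; check the linked docs for the details.

```ruby
# Mean Earth radius in kilometres (an approximation; assumed for this sketch).
EARTH_RADIUS_KM = 6371.0

# Convert degrees to radians.
def to_radians(degrees)
  degrees * Math::PI / 180.0
end

# Great-circle distance in kilometres between two points given in degrees,
# using the haversine formula.
def haversine_km(lat1, lng1, lat2, lng2)
  phi1 = to_radians(lat1)
  phi2 = to_radians(lat2)
  dphi = phi2 - phi1
  dlambda = to_radians(lng2 - lng1)
  a = Math.sin(dphi / 2)**2 +
      Math.cos(phi1) * Math.cos(phi2) * Math.sin(dlambda / 2)**2
  # Clamp to 1.0 so rounding error cannot push asin out of its domain.
  2 * EARTH_RADIUS_KM * Math.asin([1.0, Math.sqrt(a)].min)
end

puts haversine_km(0.0, 0.0, 0.0, 1.0).round(2)
```

A dedicated engine or PostGIS does this with an index instead of computing distances row by row, which is what makes it practical on large tables.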

EDIT:

Sphinx is almost ready for Heroku, see here: http://flying-sphinx.com/

For full text search via Postgres I recommend pg_search; I am using it myself on Heroku at the moment. I have not used texticle, but from what I can see pg_search has had more development activity lately, and it was built upon texticle (it will not add indexes for you; you have to do that yourself).

I cannot find the thread now, but I saw that Heroku offered an option for Postgres geo search, although it was in beta.

My advice, if you are not able to find a Postgres solution, is to host your own instance of Solr (on an EC2 instance) and use the sunspot Solr gem to integrate it with Rails.

I have implemented my own solution and used WebSolr as well. Basically, what WebSolr gives you is your own Solr instance, hassle free. Is it worth the money? In my opinion, no. It uses the sunspot Solr client for integration as well, so the only question is whether you want to pay somebody $20/$40/... to host Solr for you. I know you also get backups, maintenance, etc., but call me cheap: I prefer my own instance. Also, WebSolr is locked to the 1.4.x version of Solr.

IndexTank is now free for up to 100k documents on Heroku; we just haven't updated the documentation. This may not be enough for your needs, but I thought I'd let you know just in case.
