Inverted Indices Data Store on Google App Engine

Google App Engine (GAE) provides a way to do Full Text Search (FTS) and to store and retrieve documents. The default document ranking is based on a time offset. Is there a way to do Lucene-style inverted-index lookup and ranking on GAE? If not, what are some other options for doing this?

Use case: FTS and intelligent ranking of results (at least based on search query frequency) for a bunch of HTML pages.

Both the GAE Datastore and the GAE Search API can do query-by-index:

  1. Datastore is a NoSQL datastore with user-defined indexes and limited queries. It's a database: fast, distributed and transactional. Queries are, however, quite restricted. They can only span one Entity kind, so no JOINs. Inequality filters can apply to only one property per query, so no geo-point search is possible (a bounding-box search needs ranges on both latitude and longitude). Also, string matching is exact, so no sub-string search, regex search or LIKE search is possible.

  2. Search API is more like Lucene: you store documents and build indexes from parts of the documents. It supports full-text search and geo-point search (e.g. finding geo-points within a certain distance of a given geo-point).
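To make the "Lucene-style" part of the question concrete, here is a minimal pure-Python sketch of an inverted index with frequency-based ranking. It does not use any GAE API; `InvertedIndex`, `tokenize` and the TF-IDF-like weighting are my own illustrative choices, not what the Search API or Lucene actually does internally. It also assumes the HTML has already been reduced to plain text.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Hypothetical in-memory index: term -> {doc_id: term frequency}."""

    def __init__(self):
        self.postings = defaultdict(dict)
        self.doc_count = 0

    def add(self, doc_id, text):
        # Record how often each term occurs in this document.
        for term, freq in Counter(tokenize(text)).items():
            self.postings[term][doc_id] = freq
        self.doc_count += 1

    def search(self, query):
        """Return doc ids ranked by a simple TF * IDF score, best first."""
        scores = defaultdict(float)
        for term in set(tokenize(query)):
            docs = self.postings.get(term, {})
            if not docs:
                continue
            # Terms that appear in fewer documents weigh more.
            idf = math.log(1 + self.doc_count / len(docs))
            for doc_id, freq in docs.items():
                scores[doc_id] += freq * idf
        # Sort by descending score, breaking ties by doc id.
        return sorted(scores, key=lambda d: (-scores[d], d))


index = InvertedIndex()
index.add("a.html", "app engine search search search")
index.add("b.html", "app engine datastore search")
index.add("c.html", "datastore transactions")
print(index.search("datastore search"))
```

On the Datastore you would have to persist the postings yourself (for example one entity per term), whereas the Search API maintains its own index, which is why it is the closer fit for this kind of lookup.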

If you gave us a more specific use case, we might be able to help you decide which one to use.

