
Approximate search in Google App Engine

I am currently working on a solution for searching brand names. So far we have about 10M different brands, and we are using the Google Cloud Search API. We currently index the 3-grams of each brand name; when a user query arrives, we extract its 3-grams in the same way and search for documents containing all of them.
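To make the setup concrete, here is a minimal sketch of the 3-gram extraction and the "all 3-grams must match" query described above. The function names, the `grams` field name, and the exact query-string shape are assumptions for illustration, not the code actually in use.

```python
def trigrams(text, n=3):
    """Return the distinct character n-grams of text, in first-seen order."""
    normalized = text.lower().strip()
    seen = []
    for i in range(len(normalized) - n + 1):
        gram = normalized[i:i + n]
        if gram not in seen:
            seen.append(gram)
    return seen


def all_grams_query(user_query, field="grams"):
    """Build a query string that requires every 3-gram of the user query.

    `field` is a hypothetical name for the indexed field holding the 3-grams.
    """
    return " AND ".join(f'{field}:"{gram}"' for gram in trigrams(user_query))


print(trigrams("Nike"))
print(all_grams_query("Nike"))
```

The same `trigrams` function would be applied both at indexing time and at query time, so that both sides are normalized identically.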

What we would like to do is find not only documents that contain all of the 3-grams but also documents that contain at least one, and sort the results by the number of matches. Would it be possible to do that using the Google Cloud Search API? Or should I be looking into something like Elastic Search?
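The ranking I have in mind can be sketched in plain Python as follows. This is only an in-memory illustration of the desired behaviour on a tiny list (it would not scale to 10M brands), and the function names are made up for the example.

```python
def trigram_set(text):
    """Return the set of character 3-grams of text (empty if text is shorter than 3)."""
    s = text.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}


def rank_by_shared_trigrams(query, brands):
    """Return (brand, shared_count) pairs for brands sharing at least one 3-gram.

    Results are sorted by the number of shared 3-grams, highest first;
    ties are broken alphabetically by brand name.
    """
    query_grams = trigram_set(query)
    scored = []
    for brand in brands:
        shared = len(query_grams & trigram_set(brand))
        if shared >= 1:
            scored.append((brand, shared))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


print(rank_by_shared_trigrams("adidas", ["adidas", "addidas", "nike", "adios"]))
```

Here "nike" is dropped because it shares no 3-gram with the query, while the other names are kept and ordered by overlap.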

Best.

For anyone in a similar situation: we ended up using Elastic Search, and it has proven to be a lot more flexible than Google Full Text Search.

And even though searching for a limited number of N-grams was not possible, Elastic allows edit distance queries, which helped us find misspellings and similar words. That was essential in our use case.
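As a rough illustration of an edit distance query, the sketch below builds the JSON body of an Elasticsearch `match` query with the `fuzziness` option. The field name `name` and the helper function are assumptions for the example; our actual mapping and queries are not shown here.

```python
import json


def fuzzy_brand_query(text, field="name", fuzziness="AUTO"):
    """Build an Elasticsearch request body for a fuzzy (edit distance) match.

    `field` is a hypothetical name for the field holding the brand name.
    With fuzziness "AUTO", the allowed edit distance depends on term length.
    """
    return {
        "query": {
            "match": {
                field: {
                    "query": text,
                    "fuzziness": fuzziness,
                }
            }
        }
    }


# The resulting body would be sent to the index's _search endpoint.
print(json.dumps(fuzzy_brand_query("addidas"), indent=2))
```

A misspelled query such as "addidas" can then still match a document whose name is "adidas", since the two differ by a single edit.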

We also noticed a great improvement in search speed and especially in indexing.
