
Elasticsearch autocomplete using nGram

I'm fairly new to Elasticsearch and have a question about implementing an autocomplete feature using nGrams. From what I have read online, an nGram implementation is more flexible than the built-in completion suggesters: it allows matching from the middle of a term, highlighting, and so on.

So I have the following field mapping for one of my index types:

"suggest_keywords": {
    "type": "string",
    "analyzer": "nGram_analyzer",
    "search_analyzer": "whitespace_analyzer"
},

nGram analyzer config:

"nGram_analyzer": {
    "filter": [
        "lowercase",
        "asciifolding",
        "nGram_filter"
    ],
    "type": "custom",
    "tokenizer": "whitespace"
}
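
To see roughly what this analyzer puts into the index, the chain can be approximated in plain Python. This is only a sketch under stated assumptions: the question does not show the definition of nGram_filter, so min_gram=2 and max_gram=20 are assumed values, asciifolding is approximated with Unicode decomposition, and ngram_analyze is a hypothetical helper name rather than anything from Elasticsearch.

```python
import unicodedata


def ngram_analyze(text, min_gram=2, max_gram=20):
    """Approximate whitespace tokenizer -> lowercase -> asciifolding -> nGram filter.

    min_gram and max_gram are assumed values; the original nGram_filter
    definition is not shown in the question.
    """
    tokens = []
    for word in text.split():  # whitespace tokenizer
        word = word.lower()  # lowercase filter
        # Rough stand-in for asciifolding: drop combining accents.
        word = "".join(
            ch for ch in unicodedata.normalize("NFKD", word)
            if not unicodedata.combining(ch)
        )
        # nGram filter: every substring whose length is in [min_gram, max_gram].
        for size in range(min_gram, min(max_gram, len(word)) + 1):
            for start in range(len(word) - size + 1):
                tokens.append(word[start:start + size])
    return tokens


tokens = ngram_analyze("Seafood")
print("food" in tokens)  # True: "food" is a substring of "seafood"
print("fish" in tokens)  # False: "fish" is not a substring of "seafood"
```

Because "food" is one of the tokens indexed for "Seafood", a match query for "food" hits any document whose array contains "Seafood". The match is decided per document, which is why the hit carries the whole array.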

The following is sample data for this field:

"suggest_keywords": [
        "Wholesale",
        "Fish",
        "Seafood",
        "Fishmongers",
        "Markets"
],

When I run the following query, it returns the whole array. Since I only need the few

{
    "query": {
        "match":{
            "suggest_keywords" : "food"
        }
    }
}

I tried using highlighting to extract the individual terms, but the highlighted terms are reported separately for each document in the search result. I also tried aggregations, but could not write a query that combines highlighting and aggregations. Is it possible to do so?

{
   "query": {
      "match": {
         "suggest_keywords": "nge"
      }
   },
   "highlight": {
      "fields": {
         "suggest_keywords": {}
      }
   }
}
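
One option that needs no index changes is to post-process the highlight section of the response on the client. The sketch below is an illustration under assumptions, not something from the original thread: the response dict is a hand-written sample in the usual shape of a search response with the default <em> highlight tags, and collect_highlighted is a hypothetical helper.

```python
import re
from collections import Counter

# Matches the default highlighter tags <em> and </em>.
EM_TAG = re.compile(r"</?em>")


def collect_highlighted(response, field="suggest_keywords"):
    """Count each distinct highlighted fragment across all hits."""
    counts = Counter()
    for hit in response.get("hits", {}).get("hits", []):
        for fragment in hit.get("highlight", {}).get(field, []):
            # Strip the tags so the same keyword from different hits is merged.
            counts[EM_TAG.sub("", fragment)] += 1
    return counts


# Hand-written sample response (hypothetical data, for illustration only).
response = {
    "hits": {
        "hits": [
            {"_id": "1", "highlight": {"suggest_keywords": ["<em>Fishmongers</em>"]}},
            {
                "_id": "2",
                "highlight": {
                    "suggest_keywords": ["<em>Fishmongers</em>", "<em>Oranges</em>"]
                },
            },
        ]
    }
}

for keyword, count in collect_highlighted(response).most_common():
    print(keyword, count)
```

This only merges the fragments of the hits that were actually returned, so the counts cover the current result page rather than the whole index.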

Or is there a better way to search within an nGram-analyzed array? Or should I index all these keywords into a different type?

Thanks!

You had better index each item of the array as a separate document, so that a search returns only the matching items.

Instead of using:

POST /doctype
{
    "suggest_keywords": [
        "Wholesale",
        "Fish",
        "Seafood",
        "Fishmongers",
        "Markets"
    ]
}

use separate documents and index each one individually:

doc1:

POST /doctype
{
   "suggest_keywords": "Wholesale"
}

doc2:

POST /doctype
{
   "suggest_keywords": "Fish"
}

and so on...

The search result will then contain each matching keyword as a separate document.
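
Splitting an existing array into one document per keyword can be scripted. The following is a minimal sketch, assuming the documents are sent through the bulk API with the target index (and type, on older versions) given in the request URL; bulk_body is a hypothetical helper name.

```python
import json


def bulk_body(keywords, field="suggest_keywords"):
    """Build an NDJSON bulk body with one document per keyword."""
    lines = []
    for keyword in keywords:
        # Action line; the target index is assumed to come from the URL.
        lines.append(json.dumps({"index": {}}))
        # Document source: a single keyword instead of the whole array.
        lines.append(json.dumps({field: keyword}))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


keywords = ["Wholesale", "Fish", "Seafood", "Fishmongers", "Markets"]
print(bulk_body(keywords), end="")
```

Each keyword becomes its own hit, so a match query on suggest_keywords returns only the keywords that matched.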
