Fuzzy match words in any order in Elasticsearch

I need to match documents based on a single field: the product name, which essentially contains every possible filter value. I know this is not the most reliable solution, but it is the only field I have to work with.

I need to be able to send a search query and have its words matched against the name field in any order (the name must contain all words from the search query). At this point a simple match_phrase_prefix works pretty well, but it lacks fuzziness. We also need to let users make some typos and still get relevant results.

My question is: is there any way to get a match_phrase_prefix-like query, but with fuzziness?

I tried some nested bool queries with match, but the results are nowhere near as good as with match_phrase_prefix.

Examples of what I tried:

Pretty good results, but no fuzziness:

{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase_prefix": {
            "name.standard": {
              "query": "brand thing model",
              "slop": 10
            }
          }
        }
      ]
    }
  }
}

Fuzziness, but very limited matches:

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name.standard": {
              "query": "thing",
              "fuzziness": "AUTO",
              "prefix_length": 3
            }
          }
        },
        {
          "match": {
            "name.standard": {
              "query": "brand",
              "fuzziness": "AUTO",
              "prefix_length": 3
            }
          }
        }
      ]
    }
  }
}

If I use should instead of must in the query above, I get more results, but they are far less relevant than the ones from the first query.

The above can be achieved with a simple match query:

{
  "query": {
    "match": {
      "name.standard": {
        "query": "brand thing model",
        "operator": "and",   // all three tokens must be present, in any order
        "fuzziness": "AUTO"  // or any other value you prefer
      }
    }
  }
}
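
If the search text comes from user input, the request body above can be generated rather than written by hand. The following is only a minimal sketch using the Python standard library. The helper names fuzzy_all_terms_query and fuzzy_prefix_query are hypothetical, and sending the body to the cluster is left out. The second helper assumes the match_bool_prefix query is available (as far as I know it was added in Elasticsearch 7.2). It treats the last word as a prefix, which is closer to match_phrase_prefix, and applies fuzziness only to the words before it.

```python
import json


def fuzzy_all_terms_query(field, text, fuzziness="AUTO", prefix_length=0):
    """Build a match query body: every term required, any order, typos tolerated."""
    return {
        "query": {
            "match": {
                field: {
                    "query": text,
                    "operator": "and",               # every analyzed term must match
                    "fuzziness": fuzziness,          # allowed edit distance per term
                    "prefix_length": prefix_length,  # leading characters that must match exactly
                }
            }
        }
    }


def fuzzy_prefix_query(field, text, fuzziness="AUTO"):
    """Variant with match_bool_prefix: the last term is matched as a prefix."""
    return {
        "query": {
            "match_bool_prefix": {
                field: {
                    "query": text,
                    "operator": "and",       # every term (and the final prefix) must match
                    "fuzziness": fuzziness,  # applies to all terms except the final prefix term
                }
            }
        }
    }


if __name__ == "__main__":
    # Field name and search text are taken from the question.
    print(json.dumps(fuzzy_all_terms_query("name.standard", "brand thing model"), indent=2))
    print(json.dumps(fuzzy_prefix_query("name.standard", "brand thing mod"), indent=2))
```

Note that prefix_length defaults to 0 here. With "prefix_length": 3 as in the question, the first three characters of each word must match exactly, so a typo near the start of a word will not be tolerated, which may be one reason the matches looked so limited.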
