Elastic Search match phrase query -> output not predictable

Sample doc

{
  "id": 5,
  "title": "Quick Brown fox jumps over the lazy dog",
  "genre": [
    "fiction"
  ]
}

Mapping

{
  "movies" : {
    "mappings" : {
      "properties" : {
        "genre" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "id" : {
          "type" : "long"
        },
        "title" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

Query 1: returns the document shared earlier

{
 "query": {
   "match_phrase": {
     "title": {
       "query": "fox quick over", "slop": 3
     }
   }
 } 
}

Query 2: no results

{
 "query": {
   "match_phrase": {
     "title": {
       "query": "over fox quick", "slop": 3
     }
   }
 } 
}

I was expecting a result from query 2 rather than from query 1.

Slop

The number of times you need to move a term to make the query and the document match.

Switching the order of two adjacent words takes two edits (steps).

Below is the movement of the words:

Query 1:

          Pos 1       Pos 2       Pos 3       Pos 4       Pos 5       Pos 6       Pos 7       Pos 8
---------------------------------------------------------------------------------------------------
Doc:      quick       brown       fox         jumps       over        the         lazy        dog
---------------------------------------------------------------------------------------------------
Query:                            fox         quick       over
Slop 1:                           fox|quick               over
Slop 2:               quick       fox                     over
Slop 3:   quick                   fox                     over

Total steps: 3

Query 2:

          Pos 1       Pos 2       Pos 3       Pos 4       Pos 5       Pos 6       Pos 7       Pos 8
---------------------------------------------------------------------------------------------------
Doc:      quick       brown       fox         jumps       over        the         lazy        dog
---------------------------------------------------------------------------------------------------
Query:                over        fox         quick
Slop 1:               over        fox|quick
Slop 2:               quick|over  fox
Slop 3:   quick       over        fox
Slop 4:   quick                   over|fox
Slop 5:   quick                   fox         over
Slop 6:   quick                   fox                     over

Total steps: 6
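The two movement tables can be cross-checked with a small calculation. The sketch below is a simplified model, not Lucene's actual matching code: it assumes every query term occurs exactly once in the document, and that analysis amounts to lowercasing and splitting on whitespace (which is enough for this title). For each term it takes the document position minus the query position; the spread between the largest and smallest of these offsets is the slop needed. The function name `required_slop` is made up for illustration.

```python
def required_slop(document: str, phrase: str) -> int:
    """Estimate the smallest slop at which `phrase` matches `document`.

    Simplified model: every phrase term must occur exactly once in the document.
    """
    doc_tokens = document.lower().split()
    phrase_tokens = phrase.lower().split()

    offsets = []
    for query_pos, term in enumerate(phrase_tokens):
        if doc_tokens.count(term) != 1:
            raise ValueError(f"term {term!r} must occur exactly once in the document")
        # How far this term sits from where the phrase would place it.
        offsets.append(doc_tokens.index(term) - query_pos)

    # The terms line up as an exact phrase only when all offsets are equal,
    # so the spread of the offsets is the number of moves required.
    return max(offsets) - min(offsets)


title = "Quick Brown fox jumps over the lazy dog"
print(required_slop(title, "fox quick over"))  # 3
print(required_slop(title, "over fox quick"))  # 6
print(required_slop(title, "brown quick"))     # 2 (two adjacent words swapped)
```

Under these assumptions the results agree with the totals in the tables: 3 steps for query 1 and 6 steps for query 2.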

I reproduced the issue with the mapping you provided and was able to troubleshoot it with the help of the Explain API and this article on slop in match_phrase queries.

Your second query returns a result once it is given a slop of at least 6, as my search confirmed.

Search query (note the slop of 6)

{
 "query": {
   "match_phrase": {
     "title": {
       "query": "over fox quick", "slop": 6
     }
   }
 } 
}

Similarly, your first query needs a slop of at least 3 to return the search result.
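To check a case like this yourself, the Explain API can be asked whether a given document matches a given query. The sketch below only builds the request and sends nothing. The index name `movies` comes from the mapping in the question; the document `_id` of `"5"` is an assumption (the question shows only an `id` field in the source), and the helper name `build_explain_request` is made up for illustration.

```python
import json


def build_explain_request(index: str, doc_id: str, phrase: str, slop: int):
    """Return (method, path, body) for an Explain API call; nothing is sent."""
    body = {
        "query": {
            "match_phrase": {
                "title": {"query": phrase, "slop": slop}
            }
        }
    }
    return "GET", f"/{index}/_explain/{doc_id}", body


# Assumed _id "5"; replace it with the real _id of the indexed document.
method, path, body = build_explain_request("movies", "5", "over fox quick", 6)
print(method, path)
print(json.dumps(body, indent=2))
```

Sending the printed request from any HTTP client or console should give a response whose `matched` field says whether the document matches at that slop, so the slop can be raised or lowered until the boundary is found.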

Basically, the slop value is a configurable limit on how far the terms may deviate from their positions in the phrase.

Example: your doc contains Quick Brown fox jumps over the lazy dog, which gives the following tokens:

Quick
Brown
fox
jumps
over
the
lazy 
dog

If you search for fox quick over as a phrase, the terms all need to come together in that order, so you need to rearrange the tokens listed above.

The minimum number of moves required is 3, as follows:

fox and over need no change, as they are already in the right positions relative to each other; quick needs 3 moves to reach its correct position.

Using the same method, you can work out why your second query needs a slop of 6.
