Elasticsearch multi_match gets wrong result

I am sending a query to Elasticsearch to find all segments that have a field matching the query. We are implementing a "free search" in which the user can type any text, and we build a query that searches for this text across all of the segment's fields. Every segment in which one or more fields contain this text should be returned.

For example:

I would like to get all the segments with the name "tony lopez". Each segment has a "first_name" field and a "last_name" field.

The query our service builds:

  "multi_match" : {
    "query": "tony lopez",
    "type": "best_fields",
    "fields": [],
    "operator": "OR"
  }

The result from Elasticsearch for this query includes a segment whose "first_name" field is "tony" and whose "last_name" field is "lopez", but also a segment whose "first_name" field is "joe" and whose "last_name" field is "tony".
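A minimal way to reproduce this, as a sketch only: it assumes a throwaway index named my_index (a hypothetical name) and two made-up segments shaped like the ones described above, and it names the two fields explicitly instead of leaving the list empty. Because the operator is OR, a document only needs to contain one of the two terms in one of the fields, so both segments come back as hits:

```json
# Assumption: my_index is a scratch index and these two documents are invented sample data
POST my_index/_bulk?refresh=true
{ "index": { "_id": "1" } }
{ "first_name": "tony", "last_name": "lopez" }
{ "index": { "_id": "2" } }
{ "first_name": "joe", "last_name": "tony" }

# Same query as above, with the fields listed explicitly for illustration
GET my_index/_search
{
  "query": {
    "multi_match": {
      "query": "tony lopez",
      "type": "best_fields",
      "fields": ["first_name", "last_name"],
      "operator": "OR"
    }
  }
}
```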

For this type of query, I would like to receive only the segments whose name is "tony" (first_name) "lopez" (last_name).

How can I fix this issue?

I hope I'm not jumping to conclusions too soon, but if you want only documents with tony as the first name and lopez as the last name, use this:

GET my_index/_search
{
  "query": { 
   "bool": {
     "must": [
       {
         "match": {
           "first": "tony"
         }
       },
       {
         "match": {
           "last": "lopez"
         }
       }
     ]
   }
  }
}

But if one of your indexed documents contains, for example, tony s as the first name, the query above will return it too.

Why? firstname is a text datatype:

A field to index full-text values, such as the body of an email or the description of a product. These fields are analyzed, that is they are passed through an analyzer to convert the string into a list of individual terms before being indexed.

More details

If you run this query via Kibana:

POST my_index/_analyze
{
  "field": "first", 
  "text": ["tony s"]
}

You will see that tony s is analyzed as two tokens, tony and s.

passed through an analyzer to convert the string into a list of individual terms (tony as a term and s as a term).

That is why the above query returns tony s in the results: it matches tony.

If you want only exact matches on tony and lopez, then you should use this query:

GET my_index/_search
{
  "query": { 
   "bool": {
     "must": [
       {
         "term": {
           "first.keyword": {
             "value": "tony"
           }
         }
       },
       {
         "term": {
           "last.keyword": {
             "value": "lopez"
           }
         }
       }
     ]
   }
  }
}

Read about the keyword datatype.
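The .keyword sub-fields only exist if the mapping defines them. Elasticsearch's default dynamic mapping for string values creates a text field with a keyword sub-field, so, assuming first and last were mapped that way, the mapping is equivalent to the sketch below (Elasticsearch 7 or later syntax; my_index and the field names are taken from the queries in this answer, and your real mapping may differ). Keep in mind that a keyword field is not analyzed, so the term query is case-sensitive: a document indexed with Tony will not match the value tony.

```json
# Assumption: an explicit version of the default text + keyword multi-field mapping
PUT my_index
{
  "mappings": {
    "properties": {
      "first": {
        "type": "text",
        "fields": {
          "keyword": { "type": "keyword", "ignore_above": 256 }
        }
      },
      "last": {
        "type": "text",
        "fields": {
          "keyword": { "type": "keyword", "ignore_above": 256 }
        }
      }
    }
  }
}
```

To check the difference, the _analyze request from earlier can be pointed at the sub-field; it should then report the single token tony s instead of two tokens:

```json
POST my_index/_analyze
{
  "field": "first.keyword",
  "text": ["tony s"]
}
```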

UPDATE

Try this query. It is not perfect: it has the same issue as my tony s example, and if you have a document with first name lopez and last name tony, it will find that too.

GET my_index/_search
{
  "query": { 
   "multi_match": {
     "query": "tony lopez",
     "fields": [],
     "type": "cross_fields",
     "operator": "AND",
     "analyzer": "standard"
   }
  }
}

The cross_fields type is particularly useful with structured documents where multiple fields should match. For instance, when querying the first_name and last_name fields for "Will Smith", the best match is likely to have "Will" in one field and "Smith" in the other.

See the cross_fields documentation.
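With an empty fields list, the query should fall back to the index's default fields rather than the two name fields specifically. A variant that names them explicitly, assuming the first_name and last_name fields from the question and the my_index name used in this answer, might look like this:

```json
# Assumption: the segment documents use the first_name / last_name fields from the question
GET my_index/_search
{
  "query": {
    "multi_match": {
      "query": "tony lopez",
      "type": "cross_fields",
      "fields": ["first_name", "last_name"],
      "operator": "and"
    }
  }
}
```

With the and operator, every term must appear in at least one of the listed fields, so a segment with joe and tony no longer matches. A segment with first name lopez and last name tony still does; if the field order matters, the bool query with one match clause per field is the stricter choice.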

Hope it helps.
