
Elasticsearch retrieve matched field in multi_match

Given a data structure where multiple fields are searched over, how can I retrieve the one that matched?

Example data:

person {
    "id": 123,
    "name": ["Bill", "William"],
    "surname": "Smith"
}

And the query is something like:

GET _search
{
  "query": {
    "multi_match" : {
      "query":    "Will", 
      "fields": [ "name", "surname" ] 
    }
  }
}

Is there a way to get ES to return

hits[
    type: person, 
    id: 123, 
    matched_name: "William"
]

What I need is to go over the two (or more) names, Bill and William, and return the one that best matches the query Will.

I'm aware of highlighting, and perhaps there's a way to use "content" : {"type" : "plain"} to return the matched field without highlighting it.

The closest solution I know of is to use named queries inside a bool query:

GET mynames/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match" : {
            "query": "Bill", 
            "fields": ["name"],
            "_name" : "name"
          }
        },
        {
          "multi_match" : {
            "query": "Bill", 
            "fields": ["surname"],
            "_name" : "surname"
          }
        }
      ]
    } 
  }
}

This will give the following result:

{
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "mynames",
        "_type": "_doc",
        "_id": "123",
        "_score": 0.2876821,
        "_source": {
          "id": 123,
          "name": [
            "Bill",
            "William"
          ],
          "surname": "Smith"
        },
        "matched_queries": [
          "name"  // <== the "name" part matched
        ]
      }
    ]
  }
}
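On the client side, the matched_queries array can be read straight off each hit. The following is a minimal Python sketch, assuming the response has already been parsed into a dict shaped like the result above; the helper name matched_fields is hypothetical and not part of any Elasticsearch client.

```python
def matched_fields(response):
    """Map each hit's _id to the names of the named queries that matched it."""
    return {
        hit["_id"]: hit.get("matched_queries", [])
        for hit in response["hits"]["hits"]
    }


# Trimmed copy of the response shown above.
response = {
    "hits": {
        "hits": [
            {
                "_id": "123",
                "_source": {"id": 123, "name": ["Bill", "William"], "surname": "Smith"},
                "matched_queries": ["name"],
            }
        ]
    }
}

print(matched_fields(response))  # {'123': ['name']}
```

Note that this reports which field matched, not which of the values inside that field (Bill or William) matched.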

This is not an exact drop-in replacement for a multi_match query, since multi_match does some magic behind the scenes, but it should be possible to obtain the same behavior by combining bool with other queries such as multi_match, match, dis_max, function_score, etc.
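As one possible combination, the default best_fields type of multi_match is documented as building a match query per field and wrapping them in a dis_max query, so a dis_max of named match clauses should come close to the original scoring. Here is a rough Python sketch that only builds the request body as a dict; the helper name named_dis_max is made up for illustration, and the body would still have to be sent with whatever client you use.

```python
import json


def named_dis_max(text, fields):
    """Build a dis_max query with one named match clause per field.

    Each clause is named after its field, so the field names show up
    in matched_queries on every hit.
    """
    return {
        "query": {
            "dis_max": {
                "queries": [
                    {"match": {field: {"query": text, "_name": field}}}
                    for field in fields
                ]
            }
        }
    }


body = named_dis_max("Bill", ["name", "surname"])
print(json.dumps(body, indent=2))
```

Other multi_match types (for example phrase_prefix or cross_fields) are not covered by this sketch and would need different clauses.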

Hope that helps!

This can be accomplished with the highlight functionality. A field with multiple entries can be searched, and then only the matching strings are returned.

GET mynames/_search
{
  "fields": ["_id", "surname"], 
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "Will",
            "type": "phrase_prefix", 
            "fields": [
              "name",
              "surname"
            ]
          }
        }
      ]
    }
  },
  "highlight": {
    "order": "score",
    "pre_tags": [""],
    "post_tags": [""], 
    "fields": {
      "name": {"fragment_size": 150, "number_of_fragments": 3}
    }
  }
}

returns

"hits": [
      {
        "_id": "123",
        "fields": {
          "surname": [
            "Smith"
          ]
        },
        "highlight": {
          "name": [
            "William"
          ]
        }
      },
      ...

The matched value is listed under highlight. Note that pre_tags and post_tags are set to empty strings, so the fragment comes back without any markup, which isn't needed here.
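To pull the matched value out of such a response in client code, something like the following Python sketch could be used. It assumes the response is already a parsed dict and that the highlighted field is called name, like the field searched in the query; the helper name highlighted_values is hypothetical.

```python
def highlighted_values(response, field):
    """Map each hit's _id to the highlight fragments returned for one field."""
    return {
        hit["_id"]: hit.get("highlight", {}).get(field, [])
        for hit in response["hits"]["hits"]
    }


# Trimmed response in the same shape as the one shown above.
response = {
    "hits": {
        "hits": [
            {
                "_id": "123",
                "fields": {"surname": ["Smith"]},
                "highlight": {"name": ["William"]},
            }
        ]
    }
}

print(highlighted_values(response, "name"))  # {'123': ['William']}
```

Hits without a highlight entry for that field map to an empty list, so callers can tell "document matched on another field" apart from "value matched".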

