
Elasticsearch 6.2.4: how to order aggregation results by _score?

My query already returns the results I want, ordered by _score descending. Now I need to extract three fields from each document and keep only the distinct combinations. In SQL terms, I want something like:

select distinct field1, field2, field3 from table A;

I have tried two approaches:

1) Use collapse to remove the repeated values

GET index/_search
{
  "collapse" : {
        "field" : "field1.keyword"
    }
  ...
}

The problem is that this keeps only the distinct values of field1 and ignores the values of field2 and field3. For example, suppose we have two records like:

[1, "a", "b"], [1, "c", "d"] 

With this method we get only one record, since both have the same value of field1. I want the distinct combinations of all three fields. We can use inner_hits to get the distinct values of the second field, but according to https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-collapse.html the second level of collapsing doesn't allow inner_hits. That means collapse cannot be used to get distinct values across more than two fields.

2) Use aggregations:

GET index/_search
{
  "aggs": {
    "field1": {
      "terms": {
        "field": "field1.keyword"
      },
      "aggs": {
        "field2": {
          "terms": {
            "field": "field2.keyword",
            "missing": ""
          },
          "aggs": {
            "field3": {
              "terms": {
                "field": "field3.keyword",
                "missing": ""
              }
            }
          }
        }
      }
    }
  },
  ...
}

This returns the distinct values of [field1, field2, field3], but the order of the documents changes. The buckets are ordered by doc_count rather than by _score, and the results contain no _score information.

So how can we get the distinct combined values without changing the current order (we have customized the document order in the "query" part)? Or how can we order aggregation results by _score?

Thanks!

Below is a sample query showing how to aggregate on _score:

POST <your_index_name>/_search
{
  "query": {
    "match": {
      "<yourfield>": "<yourquery>"
    }
  },
  "aggs": {
    "myaggs": {
      "terms": {
        "script": "_score"
      }
    }
  }
}

Applied to your aggregation, the query would take the following form:

POST <your_index_name>/_search
{  
   "size":0,
   "query":{  
      "match":{  
         "field1": "search non-search"
      }
   },
   "aggs":{  
      "myaggs":{  
         "terms":{  
            "field":"field1.keyword",
            "order": {
              "_term": "asc"
            }
         },
         "aggs":{  
            "myotheraggs":{  
               "terms":{  
                  "field":"field2.keyword",
                  "order": {
                    "_term": "asc"
                  }
               },
               "aggs":{  
                  "myotheraggs2":{  
                     "terms":{  
                       "field":"field3.keyword",
                       "order": {
                          "_term": "asc"
                       }
                     },
                     "aggs":{  
                        "myscoreaggs":{  
                           "terms":{  
                              "script":"_score",
                              "order": {
                                  "_term": "desc"
                              }
                           }
                        }
                     }
                  }
               }
            }
         }
      }

   }
}

The query above returns buckets nested as field1, field2, field3, score. The three field levels are sorted in ascending lexicographical order, while the _score buckets are sorted in descending order.

For example, sample data would come back sorted like this:

field1|field2|field3|score
--------------------------------
non-search|lucene|graphdb|1
search|lucene|elasticsearch|2
search|lucene|elasticsearch|1
search|lucene|solr|2
search|lucene|solr|1
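
The rows above are not returned directly: the response nests one bucket list per aggregation level. As a minimal sketch, assuming the aggregation names from the query above (myaggs, myotheraggs, myotheraggs2, myscoreaggs) and a hand-built, hypothetical response dict, the buckets can be flattened on the client and then reordered by the best score of each combination. The helper names flatten_buckets and order_by_best_score are illustrative only.

```python
from typing import Any, Dict, List, Tuple

Row = Tuple[str, str, str, float]


def flatten_buckets(response: Dict[str, Any]) -> List[Row]:
    """Walk the four nested terms aggregations and emit one row per score bucket."""
    rows: List[Row] = []
    for b1 in response["aggregations"]["myaggs"]["buckets"]:
        for b2 in b1["myotheraggs"]["buckets"]:
            for b3 in b2["myotheraggs2"]["buckets"]:
                for bs in b3["myscoreaggs"]["buckets"]:
                    # The score key may arrive as a string or a number; float() accepts both.
                    rows.append((b1["key"], b2["key"], b3["key"], float(bs["key"])))
    return rows


def order_by_best_score(rows: List[Row]) -> List[Tuple[Tuple[str, str, str], float]]:
    """Keep the highest score per (field1, field2, field3) and sort by it, descending."""
    best: Dict[Tuple[str, str, str], float] = {}
    for f1, f2, f3, score in rows:
        key = (f1, f2, f3)
        best[key] = max(score, best.get(key, score))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Hypothetical response fragment, trimmed to the keys the helpers read.
sample_response = {
    "aggregations": {"myaggs": {"buckets": [
        {"key": "non-search", "myotheraggs": {"buckets": [
            {"key": "lucene", "myotheraggs2": {"buckets": [
                {"key": "graphdb", "myscoreaggs": {"buckets": [{"key": "1.0"}]}},
            ]}},
        ]}},
        {"key": "search", "myotheraggs": {"buckets": [
            {"key": "lucene", "myotheraggs2": {"buckets": [
                {"key": "elasticsearch",
                 "myscoreaggs": {"buckets": [{"key": "2.0"}, {"key": "1.0"}]}},
            ]}},
        ]}},
    ]}}
}

rows = flatten_buckets(sample_response)
ranked = order_by_best_score(rows)
```

Note that each terms level returns only its top buckets (10 by default), so for large result sets the "size" of every level would need to be raised for this to see all combinations.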

Updated answer (after discussion in chat)

POST someindex/_search
{  
   "size":0,
   "aggs":{  
      "myagg":{  
         "terms":{  
            "script":{  
               "source":"doc['field1'].value + params.param + doc['field2'].value + params.param + doc['field3'].value",
               "lang":"painless",
               "params":{  
                  "param":", "
               }
            },
            "order":{  
               "_term":"asc"
            }
         }
      }
   }
}
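
The query above sorts the combined keys alphabetically. To order them by relevance instead, one option (a sketch, not necessarily what was agreed in chat) is to add a max sub-aggregation on _score and order the terms buckets by it; the top_hits documentation shows the same pattern. The Python helper below only builds the request body. Its name, the sub-aggregation name max_score, and the params key sep are assumptions made for illustration, and a "query" section has to be supplied, since without one every document scores the same.

```python
import json
from typing import Any, Dict, List, Optional


def build_distinct_by_score_body(
    fields: List[str],
    query: Optional[Dict[str, Any]] = None,
    separator: str = ", ",
    size: int = 10,
) -> Dict[str, Any]:
    """Build a search body: one bucket per distinct field combination,
    ordered by the highest _score found in each bucket."""
    # Painless expression that concatenates the field values with the separator.
    source = " + params.sep + ".join(f"doc['{name}'].value" for name in fields)
    body: Dict[str, Any] = {
        "size": 0,
        "aggs": {
            "myagg": {
                "terms": {
                    "script": {
                        "source": source,
                        "lang": "painless",
                        "params": {"sep": separator},
                    },
                    "size": size,
                    # Order buckets by the sub-aggregation defined below.
                    "order": {"max_score": "desc"},
                },
                "aggs": {"max_score": {"max": {"script": "_score"}}},
            }
        },
    }
    if query is not None:
        body["query"] = query
    return body


body = build_distinct_by_score_body(
    ["field1.keyword", "field2.keyword", "field3.keyword"],
    query={"match": {"field1": "search non-search"}},
)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after POST someindex/_search. Each returned bucket then carries max_score.value, so the score is available in the aggregation results.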

Let me know if that helps.
