Elasticsearch 6.2.4: how to order aggregation results by _score?

My query already returns the results I want, ordered by _score descending. Now I need to extract three fields from each document. In SQL terms, I want something like:

select distinct field1, field2, field3 from table A;

Here is what I have tried:

1) Use collapse to remove the repeated values

GET index/_search
{
  "collapse" : {
        "field" : "field1.keyword"
    }
  ...
}

But the problem is that this keeps only the distinct values of field1 and ignores the values of field2 and field3. For example, suppose we have two records like:

[1, "a", "b"], [1, "c", "d"] 

With this method we get only one record, since both have the same value of field1. I want the distinct combinations of all three fields. We can use inner_hits to get the distinct values of the second field, but according to https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-collapse.html, the second level of collapsing doesn't allow inner_hits. That means collapse cannot be used to get distinct values across more than two fields.

2) Use aggregations:

GET index/_search
{
  "aggs": {
    "field1": {
      "terms": {
        "field": "field1.keyword"
      },
      "aggs": {
        "field2": {
          "terms": {
            "field": "field2.keyword",
            "missing": ""
          },
          "aggs": {
            "field3": {
              "terms": {
                "field": "field3.keyword",
                "missing": ""
              }
            }
          }
        }
      }
    }
  },
  ...
}

It returns the distinct values of [field1, field2, field3], but the order of the documents is changed. The buckets are ordered by doc_count rather than _score, and the results carry no _score information.

So how can we get the distinct combined values without changing the current order (we have customized the document order in the "query" part)? Or how can we order aggregation results by _score?

Thanks!
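To make the requirement concrete, here is a minimal client-side sketch in Python of "distinct combinations, original order preserved". It is not part of the original question: the function name `distinct_by_fields` and the sample hits are hypothetical, and it assumes the hits have already been fetched in _score order (so it only covers the hits actually retrieved, not the whole index).

```python
def distinct_by_fields(hits, fields=("field1", "field2", "field3")):
    """Keep the first hit for each distinct field combination, preserving hit order."""
    seen = set()
    rows = []
    for hit in hits:
        source = hit.get("_source", {})
        # Assumption: a missing field is treated as "", like the "missing": "" setting above.
        key = tuple(source.get(name, "") for name in fields)
        if key not in seen:
            seen.add(key)
            rows.append({"key": key, "_score": hit.get("_score")})
    return rows


# Hypothetical hits, already sorted by _score descending as the search would return them.
sample_hits = [
    {"_score": 3.0, "_source": {"field1": "1", "field2": "a", "field3": "b"}},
    {"_score": 2.5, "_source": {"field1": "1", "field2": "c", "field3": "d"}},
    {"_score": 2.0, "_source": {"field1": "1", "field2": "a", "field3": "b"}},
]
distinct_rows = distinct_by_fields(sample_hits)
```

Because the first occurrence of each combination wins, every kept row carries the highest _score seen for that combination.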

Below is a sample query showing how to use _score in an aggregation.

POST <your_index_name>/_search
{
  "query": {
    "match": {
      "<yourfield>": "<yourquery>"
    }
  },
  "aggs": {
    "myaggs": {
      "terms": {
        "script": "_score"
      }
    }
  }
}

Hence your aggregation query above would take the following form:

POST <your_index_name>/_search
{  
   "size":0,
   "query":{  
      "match":{  
         "field1": "search non-search"
      }
   },
   "aggs":{  
      "myaggs":{  
         "terms":{  
            "field":"field1.keyword",
            "order": {
              "_term": "asc"
            }
         },
         "aggs":{  
            "myotheraggs":{  
               "terms":{  
                  "field":"field2.keyword",
                  "order": {
                    "_term": "asc"
                  }
               },
               "aggs":{  
                  "myotheraggs2":{  
                     "terms":{  
                       "field":"field3.keyword",
                       "order": {
                          "_term": "asc"
                       }
                     },
                     "aggs":{  
                        "myscoreaggs":{  
                           "terms":{  
                              "script":"_score",
                              "order": {
                                  "_term": "desc"
                              }
                           }
                        }
                     }
                  }
               }
            }
         }
      }

   }
}

So basically, the above query returns results in the order field1, field2, field3, score, where field1, field2, and field3 are sorted in ascending lexicographical order and _score is sorted in descending order.

For example, the sorted sample data would look like this:

field1|field2|field3|score
--------------------------------
non-search|lucene|graphdb|1
search|lucene|elasticsearch|2
search|lucene|elasticsearch|1
search|lucene|solr|2
search|lucene|solr|1
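The nested buckets can be flattened into rows like the table above on the client side. The following Python sketch assumes the response uses the aggregation names from the query above (`myaggs`, `myotheraggs`, `myotheraggs2`, `myscoreaggs`) and that the score bucket keys come back as numbers. The `aggregations` dictionary is a hand-written stand-in for a real response, not actual output.

```python
def flatten_buckets(aggregations):
    """Walk the four nested terms aggregations and emit (field1, field2, field3, score) rows."""
    rows = []
    for b1 in aggregations["myaggs"]["buckets"]:
        for b2 in b1["myotheraggs"]["buckets"]:
            for b3 in b2["myotheraggs2"]["buckets"]:
                for bs in b3["myscoreaggs"]["buckets"]:
                    rows.append((b1["key"], b2["key"], b3["key"], bs["key"]))
    return rows


def score_buckets(*scores):
    # Helper for the hypothetical sample: one bucket per score value.
    return {"buckets": [{"key": s, "doc_count": 1} for s in scores]}


# Hypothetical response fragment mirroring the sample table.
aggregations = {"myaggs": {"buckets": [
    {"key": "non-search", "myotheraggs": {"buckets": [
        {"key": "lucene", "myotheraggs2": {"buckets": [
            {"key": "graphdb", "myscoreaggs": score_buckets(1.0)},
        ]}},
    ]}},
    {"key": "search", "myotheraggs": {"buckets": [
        {"key": "lucene", "myotheraggs2": {"buckets": [
            {"key": "elasticsearch", "myscoreaggs": score_buckets(2.0, 1.0)},
            {"key": "solr", "myscoreaggs": score_buckets(2.0, 1.0)},
        ]}},
    ]}},
]}}

rows = flatten_buckets(aggregations)
# Re-sort by score, highest first; sorted() is stable, so ties keep the key order.
by_score = sorted(rows, key=lambda row: row[3], reverse=True)
```

Sorting the flattened rows afterwards is one way to get a global _score ordering, since the nested aggregation itself only orders scores within each field combination.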

Updated answer after chat:

POST someindex/_search
{  
   "size":0,
   "aggs":{  
      "myagg":{  
         "terms":{  
            "script":{  
               "source":"doc['field1'].value + params.param + doc['field2'].value + params.param + doc['field3'].value",
               "lang":"painless",
               "params":{  
                  "param":", "
               }
            },
            "order":{  
               "_term":"asc"
            }
         }
      }
   }
}

Let me know if that helps.
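The combined-key aggregation above orders buckets by key, not by score. One possible variation, which is not from the original answer, is to add a `max` sub-aggregation on `_score` and order the terms buckets by it. The Python sketch below only builds the request body. The function name `build_distinct_by_score_body`, the aggregation names `combined` and `max_score`, and the use of the `.keyword` sub-fields are assumptions for illustration; a query must be present, otherwise every document scores the same.

```python
import json


def build_distinct_by_score_body(query, fields, separator=", ", size=10):
    """Build a search body: one bucket per field combination, best-scoring buckets first."""
    # Painless expression joining the field values with the separator parameter.
    source = " + params.sep + ".join("doc['{}'].value".format(f) for f in fields)
    return {
        "size": 0,
        "query": query,
        "aggs": {
            "combined": {
                "terms": {
                    "script": {
                        "source": source,
                        "lang": "painless",
                        "params": {"sep": separator},
                    },
                    "size": size,
                    # Order buckets by the metric sub-aggregation defined below.
                    "order": {"max_score": "desc"},
                },
                "aggs": {
                    # Highest _score among the documents in each bucket.
                    "max_score": {"max": {"script": "_score"}}
                },
            }
        },
    }


body = build_distinct_by_score_body(
    query={"match": {"field1": "search non-search"}},
    fields=["field1.keyword", "field2.keyword", "field3.keyword"],
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request. Each bucket would then carry its best score in `max_score.value`, assuming all three fields have a value in every matching document.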
