

Elasticsearch: Filter first document for each unique id

I am writing an Elasticsearch query for the following scenario:

field1    field2
2015      20
2015      14
2014      39
2013      76
2013      2
2013      55

For each unique field1 I want to take the maximum field2, and then sum those maxima across all field1 values. E.g. in this case I want value = 20 + 39 + 76 = 135.

What Elasticsearch query would return this value?

I don't think it's possible on Elasticsearch 1.x with a single query. In 2.0 we'll probably have a feature for this called reducers (see: https://github.com/elastic/elasticsearch/issues/8110).

You could get the first part of your task (the max of field2 grouped by field1) like this:

DELETE /test_index

PUT /test_index
{
    "settings": {
        "number_of_shards": 1
    }
}

POST /test_index/_bulk
{"index":{"_index":"test_index","_type":"doc","_id":1}}
{"field1":2015,"field2":20}
{"index":{"_index":"test_index","_type":"doc","_id":2}}
{"field1":2015,"field2":14}
{"index":{"_index":"test_index","_type":"doc","_id":3}}
{"field1":2014,"field2":39}
{"index":{"_index":"test_index","_type":"doc","_id":4}}
{"field1":2013,"field2":76}
{"index":{"_index":"test_index","_type":"doc","_id":5}}
{"field1":2013,"field2":2}
{"index":{"_index":"test_index","_type":"doc","_id":6}}
{"field1":2013,"field2":55}

POST /test_index/_search
{
  "size": 0,
  "aggs": {
    "field1_group": {
      "terms": {
        "field": "field1",
        "size": 0,
        "order": {
          "maksior": "asc"
        }
      },
      "aggs": {
        "maksior": {
          "max": {
            "field": "field2"
          }
        }
      }
    }
  }
}

which will give you:

{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 6,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "field1_group": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": 2015,
               "doc_count": 2,
               "maksior": {
                  "value": 20
               }
            },
            {
               "key": 2014,
               "doc_count": 1,
               "maksior": {
                  "value": 39
               }
            },
            {
               "key": 2013,
               "doc_count": 3,
               "maksior": {
                  "value": 76
               }
            }
         ]
      }
   }
}

Then you could iterate over the buckets and sum their maksior values on the client side.
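As a rough illustration of that client-side step, here is a minimal Python sketch. It assumes the search response has already been parsed into a dict (for example by whatever HTTP or Elasticsearch client you use); the helper name sum_bucket_maxima is hypothetical, and the sample response is just a trimmed copy of the output above.

```python
def sum_bucket_maxima(response, terms_agg="field1_group", max_agg="maksior"):
    """Sum the per-bucket max values of a terms aggregation response.

    `terms_agg` and `max_agg` default to the aggregation names used in the
    query above; this helper is illustrative, not part of any client library.
    """
    buckets = response["aggregations"][terms_agg]["buckets"]
    total = 0
    for bucket in buckets:
        value = bucket[max_agg]["value"]
        # Assumption: a bucket with no field2 values may report a null max,
        # which arrives as None after JSON parsing, so skip it.
        if value is not None:
            total += value
    return total


# Trimmed copy of the response shown above (only the parts the helper reads).
response = {
    "aggregations": {
        "field1_group": {
            "buckets": [
                {"key": 2015, "doc_count": 2, "maksior": {"value": 20}},
                {"key": 2014, "doc_count": 1, "maksior": {"value": 39}},
                {"key": 2013, "doc_count": 3, "maksior": {"value": 76}},
            ]
        }
    }
}

total = sum_bucket_maxima(response)
print(total)  # 20 + 39 + 76
```

Because the terms aggregation was run with "size": 0 (all buckets returned on 1.x), the sum covers every distinct field1; with a limited bucket count the client-side total would only cover the buckets that came back.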
