(Elasticsearch) How to get the last element of a nested field for all documents, then perform a sub-aggregation

Question

I have an index named socialmedia and I am trying to build a query against a field named eng (some unnecessary fields are omitted):

"id" : "1",
"eng": 
[
{
  "soc_mm_score" : "3",
  "date_updated" : "1520969306",
},
{
  "soc_mm_score" : "1",
  "date_updated" : "1520972191",
},
{
  "soc_mm_score" : "4",
  "date_updated" : "1520937222",
}
]

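The index holds many documents, and each one has an eng nested field that in turn contains many "sub-objects".

My main goal: what Elasticsearch query should I write to filter these nested objects as follows?

Step 1
For each document, get the nested object with the highest date_updated value.

Step 2
Once I have those nested objects, run a sum aggregation so that I can add up the soc_mm_score values of those "latest nested objects".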
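To make the target result concrete, here is a minimal client-side sketch of the two steps on plain Python dictionaries. It is only an illustration of the intended computation, not an Elasticsearch query. The function name `sum_latest_scores` and the second sample document are hypothetical, and the sketch assumes both fields hold integer values stored as strings, as in the sample above.

```python
def sum_latest_scores(docs, nested_field="eng",
                      date_key="date_updated", score_key="soc_mm_score"):
    """Sum score_key over the newest nested object of each document."""
    total = 0
    for doc in docs:
        entries = doc.get(nested_field) or []
        if not entries:
            continue  # document has no nested objects, nothing to add
        # Step 1: pick the nested object with the highest date_updated.
        # The sample stores the value as a string, so compare as integers.
        latest = max(entries, key=lambda e: int(e[date_key]))
        # Step 2: add that object's score to the running total.
        total += int(latest[score_key])
    return total


docs = [
    {"id": "1", "eng": [
        {"soc_mm_score": "3", "date_updated": "1520969306"},
        {"soc_mm_score": "1", "date_updated": "1520972191"},
        {"soc_mm_score": "4", "date_updated": "1520937222"},
    ]},
    # Hypothetical second document, added only for illustration.
    {"id": "2", "eng": [
        {"soc_mm_score": "5", "date_updated": "1520000000"},
    ]},
]

print(sum_latest_scores(docs))
```

For document 1 the newest nested object is the one with date_updated 1520972191, so its soc_mm_score of 1 is what enters the sum.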

I tried this query, but it does not seem to work.

Attempt #1 (I am using the elasticsearch-php API, so please bear with the array format of my query)

'aggs' => [
    'ENG' => [
        'nested' => [
            'path' => 'eng'
        ],
        'aggs' => [
            'FILTER' => [
                'filter' => [
                    'bool' => [
                        'must' => [
                            [
                                // I'm thinking of using max aggregation here
                            ]
                        ]
                    ]
                ]
            ],
            'LATEST' => [
                'top_hits' => [
                    'size' => 1,
                    'sort' => [
                        'eng.date_updated' => [
                            'order' => 'desc'
                        ]
                    ]
                ]
            ]
        ]
    ]
]

Pros: it returns the correct nested object.
Cons: I cannot perform any further aggregation on it.

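Sample output

Then I tried adding a sub-aggregation.

And this is the output

Is there any other way to do this?

To recap my ideal steps:

1. Access my eng nested field.
2. Get the "latest" (most recent) element of that eng nested field, meaning the element with the highest date_updated value.
3. After getting those "most recent" nested elements, run a sub-aggregation on their sibling nested fields, for example the sum of soc_like_count or soc_share_count over the latest elements of every eng field.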

Answer 1

Worked out an answer!

"aggs":{
        "LATEST": {
            "scripted_metric": {
                "init_script" : """
                  state.te = []; 
                  state.g = 0;
                  state.d = 0;
                  state.a = 0;
                """, 
                "map_script" : """
                  if(state.d != doc['_id'].value){
                      state.d = doc['_id'].value;
                      state.te.add(state.a);
                      state.g = 0;
                      state.a = 0;
                  } 
                  if(state.g < doc['eng.date_updated'].value){ 
                    state.g = doc['eng.date_updated'].value; 
                    state.a = doc['eng.soc_te_score'].value;
                  }
                  """,
                "combine_script" : """
                    state.te.add(state.a);
                    double count = 0; 
                    for (t in state.te) { 
                      count += t 
                    }

                    return count
                  """,
                "reduce_script" : """
                    double count = 0; 
                    for (a in states) { 
                      count += a 
                    }

                    return count
                """
            }
        }
      }
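The four scripts above can be read as a fold over each shard followed by a merge. The Python sketch below only emulates that control flow outside Elasticsearch so the logic is easier to follow. The function names and the sample tuples are hypothetical, the values are plain integers rather than indexed fields, and the sketch assumes that all nested objects of one parent document are visited one after another, which the script also relies on.

```python
def run_shard(nested_docs):
    """Emulate init_script, map_script and combine_script for one shard.

    nested_docs is an iterable of (doc_id, date_updated, score) tuples.
    """
    # init_script
    state = {"te": [], "g": 0, "d": 0, "a": 0}

    # map_script, called once per nested object
    for doc_id, date_updated, score in nested_docs:
        if state["d"] != doc_id:
            # New parent document: store the previous score and reset.
            state["d"] = doc_id
            state["te"].append(state["a"])
            state["g"] = 0
            state["a"] = 0
        if state["g"] < date_updated:
            # Newer element found: remember its date and its score.
            state["g"] = date_updated
            state["a"] = score

    # combine_script: flush the last document, then sum for this shard
    state["te"].append(state["a"])
    return float(sum(state["te"]))


def reduce_shards(shard_results):
    """Emulate reduce_script: add up the per-shard sums."""
    return float(sum(shard_results))


# Hypothetical data split across two shards.
shard_1 = [("1", 1520969306, 3), ("1", 1520972191, 1),
           ("1", 1520937222, 4), ("2", 1520000000, 5)]
shard_2 = [("3", 1520000001, 2), ("3", 1520000002, 7)]

total = reduce_shards([run_shard(shard_1), run_shard(shard_2)])
print(total)
```

Note that the first document on each shard pushes the initial 0 into the list, which is harmless for a sum but would matter for an average or a count.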

Answer 2

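Metric aggregations do not support sub-aggregations, and top_hits is a metric aggregation.

One solution is to compute the sum yourself after you get the results back from Elasticsearch.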
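As a rough sketch of that client-side approach, the Python function below sums the score from a parsed search response. It assumes the request wrapped the question's nested ENG aggregation and its LATEST top_hits (size 1, sorted by eng.date_updated descending) in a per-document terms aggregation, here given the hypothetical name BY_DOC, and that each top hit's _source holds the nested object. The function name and the sample response are made up for illustration.

```python
def sum_latest_from_response(response, score_key="soc_mm_score"):
    """Add up score_key from the single top hit of every per-document bucket."""
    total = 0
    # BY_DOC is a hypothetical terms aggregation with one bucket per document.
    for bucket in response["aggregations"]["BY_DOC"]["buckets"]:
        hits = bucket["ENG"]["LATEST"]["hits"]["hits"]
        if hits:
            # The sample data stores the score as a string.
            total += int(hits[0]["_source"][score_key])
    return total


# Hand-written response fragment in the assumed shape, not real output.
sample_response = {
    "aggregations": {
        "BY_DOC": {
            "buckets": [
                {"key": "1", "ENG": {"LATEST": {"hits": {"hits": [
                    {"_source": {"soc_mm_score": "1",
                                 "date_updated": "1520972191"}}
                ]}}}},
                {"key": "2", "ENG": {"LATEST": {"hits": {"hits": [
                    {"_source": {"soc_mm_score": "5",
                                 "date_updated": "1520000000"}}
                ]}}}},
            ]
        }
    }
}

print(sum_latest_from_response(sample_response))
```

A terms aggregation only returns a limited number of buckets per request, so for an index with many documents this approach would need paging, for example with a composite aggregation.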

I put together something that may help, but you will have to adapt it to your own needs.

Assuming your mapping is:

{
  "my_index": {
    "mappings": {
      "doc": {
        "properties": {
          "eng": {
            "type": "nested",
            "properties": {
              "date_updated": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "soc_like_count": {
                "type": "long"
              },
              "soc_mm_score": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          },
          "id": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

Query

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "ENG": {
      "nested": {
        "path": "eng"
      },
      "aggs": {
        "sum_soc_top_hits_by_date": {
          "scripted_metric": {
            "init_script": "params._agg.map = new HashMap();params._agg.results = new HashMap();params._agg.size = 1;params._agg.date_arr = null",
            "map_script": "params._agg.map[doc['eng.date_updated.keyword'].value] = doc['eng.soc_like_count'].value;params._agg.date_arr = new ArrayList(params._agg.map.keySet());Collections.sort(params._agg.date_arr, Collections.reverseOrder())",
            "combine_script": "params._agg.size = params._agg.size > params._agg.date_arr.length - 1 ?  params._agg.date_arr.length : params._agg.size;double soc= 0; for (t in params._agg.date_arr.subList(0,params._agg.size)) { params._agg.results[t] = params._agg.map[t];soc += params._agg.map[t]}params._agg.results.total = soc; return params._agg.results",
            "reduce_script": "return params._aggs"
          }
        }
      }
    }
  }
}

Change params._agg.size = 1 in init_script to change the number of top hits that are summed.

Answer 1: score 4, accepted, 2019-11-18 06:38:04

Answer 2: score 0, 2019-11-11 09:14:25
