How do I access the date_histogram key field in a sub-aggregation in Elasticsearch?

Question

I want to filter the buckets produced by a date_histogram aggregation, where the filter depends on the key of each date_histogram bucket.

Suppose I have the following data:

{
   "entryTime": "",
   "soldTime": ""
}

The Elasticsearch query looks something like this:

{
  "aggs": {
    "date": {
      "date_histogram": {
        "field": "entryTime",
        "interval": "month",
        "keyed": true
      },
      "aggs": {
        "filter_try": {
          "filter": {
            "bool": {
              "must": [
                {
                  "range": {
                    "entryTime": {
                      "lte": 1588840533000
                    }
                  }
                },
                {
                  "bool": {
                    "should": [
                      {
                        "bool": {
                          "must": [
                            {
                              "exists": {
                                "field": "soldTime"
                              }
                            },
                            {
                              "range": {
                                "soldTime": {
                                  "gt": 1588840533000
                                }
                              }
                            }
                          ]
                        }
                      },
                      {
                        "bool": {
                          "must_not": [
                            {
                              "exists": {
                                "field": "soldTime"
                              }
                            }
                          ]
                        }
                      }
                    ]
                  }
                }
              ]
            }
          }
        }
      }
    }
  }
}

In that bool query, I want both range clauses to use the date that the date_histogram aggregation generates for each bucket, instead of the hardcoded epoch time.

A solution that accesses the key through a script is fine too.

To clarify, this is the boolean condition; I want to replace DATE in it with the date_histogram bucket key:

# (entryTime < DATE) 
# AND 
# (
#    (soldTime != null AND soldTime > DATE) 
#          OR 
#      (soldTime == NULL)
#  )
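The condition can be checked offline with a minimal Python sketch. It uses `<=` for entryTime to match the `lte` in the query, and it runs against the question's ten sample documents (id mapped to entryTime and soldTime). The names `in_stock_at`, `DOCS`, and `ids_in_stock` are illustrative and not part of Elasticsearch.

```python
from typing import Dict, List, Optional, Tuple


def in_stock_at(entry_time: int, sold_time: Optional[int], date: int) -> bool:
    """True if the item had entered by `date` and was unsold at `date`."""
    entered = entry_time <= date
    unsold = sold_time is None or sold_time > date
    return entered and unsold


# Sample documents from the question: id -> (entryTime, soldTime or None),
# all values in epoch milliseconds.
DOCS: Dict[int, Tuple[int, Optional[int]]] = {
    1: (1577869200000, 1578646800000),
    2: (1578214800000, None),
    3: (1578560400000, 1579942800000),
    4: (1579683600000, 1581325200000),
    5: (1580893200000, None),
    6: (1582189200000, 1582362000000),
    7: (1582621200000, 1584349200000),
    8: (1583053200000, 1583830800000),
    9: (1584262800000, None),
    10: (1585472400000, None),
}


def ids_in_stock(date: int) -> List[int]:
    """Ids of the sample documents that satisfy the condition for `date`."""
    return sorted(i for i, (e, s) in DOCS.items() if in_stock_at(e, s, date))


print(ids_in_stock(1580515199000))  # end of January 2020 -> [2, 4]
print(ids_in_stock(1583020799000))  # end of February 2020 -> [2, 5, 7]
```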

Consider the following 10 documents:

"hits" : [
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1577869200000",
          "soldTime" : "1578646800000"
        }
      },
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1578214800000"
        }
      },
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1578560400000",
          "soldTime" : "1579942800000"
        }
      },
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1579683600000",
          "soldTime" : "1581325200000"
        }
      },
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1580893200000"
        }
      },
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 1.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1582189200000",
          "soldTime" : "1582362000000"
        }
      },
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 1.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1582621200000",
          "soldTime" : "1584349200000"
        }
      },
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "8",
        "_score" : 1.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1583053200000",
          "soldTime" : "1583830800000"
        }
      },
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "9",
        "_score" : 1.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1584262800000"
        }
      },
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "10",
        "_score" : 1.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1585472400000"
        }
      }
    ]

The end of January 2020 as an epoch timestamp in milliseconds is 1580515199000.

If I apply the bool query above with that value, I get the following output:

"hits" : [
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 3.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1579683600000",
          "soldTime" : "1581325200000"
        }
      },
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1578214800000"
        }
      }
    ]

Document 4 satisfies (soldTime != null AND soldTime > DATE), and document 2 satisfies the (soldTime == null) branch of the OR.

For the same bool query, if I use the end of February 2020 (1583020799000), I get the following hits:

"hits" : [
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 3.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1582621200000",
          "soldTime" : "1584349200000"
        }
      },
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1578214800000"
        }
      },
      {
        "_index" : "vi_test",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1580893200000"
        }
      }
    ]

ID 7: entered in February but sold in March, so it was in stock at the end of February 2020.
ID 2: entered in January and not sold yet, so it is in stock.
ID 5: entered in February and not sold yet, so it is in stock.

I need the same data for the end of each month across a whole year, to plot the trend.
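One possible way to get a count per month end, without needing the parent bucket key, is to compute each month-end timestamp on the client and send one named filter per month in a `filters` aggregation. The Python sketch below only builds the request body. The helper names (`month_end_millis`, `in_stock_filter`, `monthly_stock_request`) and the aggregation name `stock_at_month_end` are made up for illustration, and the sketch assumes that entryTime and soldTime are mapped as date fields and that month boundaries are taken in UTC.

```python
import calendar
import json
from datetime import datetime, timezone


def month_end_millis(year: int, month: int) -> int:
    """Epoch milliseconds of the last second of the given month, in UTC."""
    last_day = calendar.monthrange(year, month)[1]
    end = datetime(year, month, last_day, 23, 59, 59, tzinfo=timezone.utc)
    return int(end.timestamp()) * 1000


def in_stock_filter(date_ms: int) -> dict:
    """The bool query from the question with DATE filled in.

    The extra `exists` clause next to the soldTime range is left out,
    because a range clause only matches documents that have the field.
    """
    return {
        "bool": {
            "must": [
                {"range": {"entryTime": {"lte": date_ms}}},
                {
                    "bool": {
                        "should": [
                            {"range": {"soldTime": {"gt": date_ms}}},
                            {"bool": {"must_not": [{"exists": {"field": "soldTime"}}]}},
                        ]
                    }
                },
            ]
        }
    }


def monthly_stock_request(year: int) -> dict:
    """Search body with one named filter bucket per month end of `year`."""
    filters = {
        f"{year}-{month:02d}": in_stock_filter(month_end_millis(year, month))
        for month in range(1, 13)
    }
    return {
        "size": 0,
        "aggs": {"stock_at_month_end": {"filters": {"filters": filters}}},
    }


body = monthly_stock_request(2020)
print(month_end_millis(2020, 1))  # 1580515199000, as in the question
print(json.dumps(body, indent=2))
```

Sent to the index's `_search` endpoint, a body like this should return one bucket per month key, each with its own doc_count, which can be plotted directly as the trend.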

Thank you.

Answer 1

I couldn't find a way to do this with normal queries, because the parent aggregation key is not available in a sub-aggregation. I have written a script for this that selects documents where soldTime is either null or does not fall in the same month as entryTime.

Query:

{
  "query": {
    "script": {
      "script": """
         ZonedDateTime entry;
         ZonedDateTime sold;
         if(doc['entryTime'].size()>0)
         {
           entry= doc['entryTime'].value;
         }
         if(doc['soldTime'].size()>0) 
         {
           sold = doc['soldTime'].value;
         }
         // Keep the document if it is unsold, or sold in a different calendar month than it entered
         if(sold==null || (entry.getMonthValue()!=sold.getMonthValue() || entry.getYear()!=sold.getYear()))
         {
           return true;
         }
         return false;
"""
    }
  },
  "size": 10,
  "aggs": {
    "monthly_trend": {
      "date_histogram": {
        "field": "entryTime",
        "interval": "month"
      },
      "aggs": {
        "docs": {
          "top_hits": {
            "size": 10
          }
        }
      }
    }
  }
}

Result:

    "hits" : [
      {
        "_index" : "index22",
        "_type" : "_doc",
        "_id" : "55Kv83EB8a54AbXfngYU",
        "_score" : 1.0,
        "_source" : {
          "deaerId" : "4",
          "entryTime" : "1578214800000"
        }
      }
    ]
  },
  "aggregations" : {
    "monthly_trend" : {
      "buckets" : [
        {
          "key_as_string" : "2020-01-01T00:00:00.000Z",
          "key" : 1577836800000,
          "doc_count" : 1,
          "docs" : {
            "hits" : {
              "total" : {
                "value" : 1,
                "relation" : "eq"
              },
              "max_score" : 1.0,
              "hits" : [
                {
                  "_index" : "index22",
                  "_type" : "_doc",
                  "_id" : "55Kv83EB8a54AbXfngYU",
                  "_score" : 1.0,
                  "_source" : {
                    "deaerId" : "4",
                    "entryTime" : "1578214800000"
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
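The selection rule in the script can be mirrored in plain Python to check it against individual documents. This is an offline sketch, not the Painless script itself: the function names are illustrative, and it assumes the timestamps are epoch milliseconds interpreted in UTC. The three calls use documents 1, 2, and 4 from the question.

```python
from datetime import datetime, timezone
from typing import Optional


def to_utc(epoch_ms: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)


def selected_by_script(entry_ms: int, sold_ms: Optional[int]) -> bool:
    """True if soldTime is missing or falls in a different month than entryTime."""
    if sold_ms is None:
        return True
    entry, sold = to_utc(entry_ms), to_utc(sold_ms)
    return (entry.year, entry.month) != (sold.year, sold.month)


# ID 1: entered and sold in January 2020
print(selected_by_script(1577869200000, 1578646800000))  # False
# ID 2: never sold
print(selected_by_script(1578214800000, None))  # True
# ID 4: entered in January, sold in February
print(selected_by_script(1579683600000, 1581325200000))  # True
```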

Solution 1
0 · Accepted · 2020-05-08 06:21:31
