Elasticsearch query: aggregate by a specified time range per day
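Hi, I need to write a specific query that aggregates data by work shift, that is, by selected time ranges within each day, across several days. The problem is that I do not want to list every range explicitly in the date_range aggregation; I only want to specify the from -> to time range that applies to each aggregated day. Is there a simple way to do this?
I have this kind of query: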
{
  "_source": false,
  "size": 10000,
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "streamId": [
              "ENRG_0054"
            ]
          }
        },
        {
          "range": {
            "timestamp": {
              "gte": "2021-02-01T00:00:00Z",
              "lte": "2021-02-10T01:00:00Z"
            }
          }
        }
      ]
    }
  },
  "sort": [
    {
      "timestamp": {
        "order": "asc"
      }
    },
    {
      "_score": {
        "order": "asc"
      }
    }
  ],
  "aggs": {
    "streamId": {
      "terms": {
        "field": "streamId",
        "size": 10000
      },
      "aggs": {
        "days": {
          "date_histogram": {
            "field": "timestamp",
            "interval": "1d"
          },
          "aggs": {
            "shifts": {
              "date_range": {
                "field": "timestamp",
                "format": "HH:mm",
                "ranges": [
                  {
                    "key": "MORNING",
                    "from": "06:00",
                    "to": "14:00"
                  },
                  {
                    "key": "AFTERNOON",
                    "from": "14:00",
                    "to": "22:00"
                  }
                ],
                "keyed": true
              },
              "aggs": {
                "MAX": {
                  "max": {
                    "field": "@floatMessage.value.value"
                  }
                },
                "MIN": {
                  "min": {
                    "field": "@floatMessage.value.value"
                  }
                },
                "DIFF": {
                  "bucket_script": {
                    "buckets_path": {
                      "min": "MIN",
                      "max": "MAX"
                    },
                    "script": {
                      "source": "return (params.max-params.min)"
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
But in the result I get null values, because the time ranges are not specified together with a date.
"aggregations": {
  "streamId": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "ENRG_0054",
        "doc_count": 13343,
        "days": {
          "buckets": [
            {
              "key_as_string": "2021-02-01T00:00:00.000Z",
              "key": 1612137600000,
              "doc_count": 2763,
              "shifts": {
                "buckets": {
                  "MORNING": {
                    "from": 2.16E7,
                    "from_as_string": "06:00",
                    "to": 5.04E7,
                    "to_as_string": "14:00",
                    "doc_count": 0,
                    "MIN": {
                      "value": null
                    },
                    "MAX": {
                      "value": null
                    }
                  },
                  "AFTERNOON": {
                    "from": 5.04E7,
                    "from_as_string": "14:00",
                    "to": 7.92E7,
                    "to_as_string": "22:00",
                    "doc_count": 0,
                    "MIN": {
                      "value": null
                    },
                    "MAX": {
                      "value": null
                    }
                  }
                }
              }
            },
Example document:
{
  "streamId": "ENRG_0054",
  "created": "2021-02-01T00:19:42.905Z",
  "extra": {},
  "location": null,
  "model": "floatMessage",
  "id": "6017491eb112b21488f6c843",
  "value": {
    "unit": "°C",
    "value": 18.94,
    "messageProcessed": "2021-02-01T00:19:41.595Z"
  },
  "timestamp": "2021-02-01T00:19:39.161Z",
  "tags": []
}
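When I generate all the date_range entries for the whole requested timestamp range, the result is fine. Is that the only way to get the desired result, or can anyone suggest how to update the query to meet my requirement? Thanks.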
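For reference, the workaround described above (generating every range explicitly) could be sketched as follows. This is only an illustration under assumptions: the function name `explicit_shift_ranges`, the `<day>_<shift>` key naming, and the choice of ISO-8601 UTC bounds are hypothetical and not taken from the original query, and the aggregation would need to drop the "format": "HH:mm" setting so that full datetimes are accepted.

```python
from datetime import date, timedelta

# Shift boundaries taken from the question; everything else is hypothetical.
SHIFTS = {"MORNING": ("06:00", "14:00"), "AFTERNOON": ("14:00", "22:00")}


def explicit_shift_ranges(first_day: date, last_day: date) -> list:
    """Build one date_range entry per shift per day (both days inclusive)."""
    ranges = []
    day = first_day
    while day <= last_day:
        for name, (start, end) in SHIFTS.items():
            ranges.append({
                "key": f"{day.isoformat()}_{name}",        # assumed key naming
                "from": f"{day.isoformat()}T{start}:00Z",  # full datetime, UTC
                "to": f"{day.isoformat()}T{end}:00Z",
            })
        day += timedelta(days=1)
    return ranges


ranges = explicit_shift_ranges(date(2021, 2, 1), date(2021, 2, 10))
print(len(ranges))  # 10 days x 2 shifts
print(ranges[0])
```

The list grows linearly with the number of days, which is exactly the verbosity the question is trying to avoid.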
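The reason you are not seeing any documents in your date_range buckets has to do with datetime vs. date inference, similar to a case I discussed here a while ago.
In short, the date_range aggregation looks confusing when it works with time-only values (HH:mm) instead of full datetime values (MM-dd-yyyy HH:mm), because:
- if no year is provided, it defaults to 1970
- if no month is provided, it defaults to January
- if no day is provided, it defaults to the 1st of the month (and the month defaults to Jan if none was provided)
You see, if you added just the year component:
"date_range": {
  "field": "timestamp",
  "format": "HH:mm yyyy",      <---
  "ranges": [
    {
      "key": "MORNING",
      "from": "06:00 2021",    <---
      "to": "14:00 2021"       <---
    }
  ],
  "keyed": true
}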
Elasticsearch would return:
"MORNING" : {
  "from" : 2.16E7,
  "from_as_string" : "06:00 1970",   <--- 🥴
  "to" : 5.04E7,
  "to_as_string" : "14:00 1970",     <--- 🥴
  ...
}
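Adding the month would fix this particular point-in-time issue, but it would of course introduce a new problem: you could only aggregate over one month of one concrete year.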
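The numbers in the responses shown earlier can be checked by hand. The sketch below is a plain Python illustration, not Elasticsearch code: it assumes a time-only value is anchored to 1970-01-01 UTC, as described above, and the helper name `time_only_to_epoch_millis` is made up for this example.

```python
from datetime import datetime, timezone


def time_only_to_epoch_millis(hhmm: str) -> int:
    """Interpret an HH:mm string as a time on 1970-01-01 UTC (epoch millis)."""
    parsed = datetime.strptime(hhmm, "%H:%M")
    anchored = datetime(1970, 1, 1, parsed.hour, parsed.minute, tzinfo=timezone.utc)
    return int(anchored.timestamp() * 1000)


morning_from = time_only_to_epoch_millis("06:00")
morning_to = time_only_to_epoch_millis("14:00")

# A document from the MORNING shift of 2021-02-01, expressed in epoch millis.
sample_doc = int(datetime(2021, 2, 1, 7, 0, tzinfo=timezone.utc).timestamp() * 1000)

print(morning_from, morning_to)                 # 21600000 50400000
print(morning_from <= sample_doc < morning_to)  # False
```

The bounds match the 2.16E7 and 5.04E7 values in the response, and a 2021 timestamp is far outside that window, which is why every shift bucket reports doc_count 0.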
Instead, consider introducing a dedicated date field called time:
{
  "mappings": {
    "properties": {
      "streamId": {
        "type": "keyword"
      },
      ...
      "time": {
        "type": "date",        <---
        "format": "HH:mm:ss.SSSz"
      }
    }
  }
}
Populate this field in each document (for example through an _update_by_query call):
{
  "streamId": "ENRG_0054",
  ...
  "timestamp": "2021-02-01T00:19:39.161Z",
  "time": "00:19:39.161Z",     <---
  "tags": []
}
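If the field is filled in on the client side before indexing, deriving it takes a single string split. A minimal sketch, assuming timestamps always arrive as ISO-8601 strings with a "T" separator, as in the example document; the helper names `time_part` and `with_time_field` are hypothetical.

```python
def time_part(timestamp: str) -> str:
    """Return everything after the 'T' of an ISO-8601 timestamp string."""
    _date_part, separator, time_of_day = timestamp.partition("T")
    if not separator or not time_of_day:
        raise ValueError(f"not an ISO-8601 datetime: {timestamp!r}")
    return time_of_day


def with_time_field(doc: dict) -> dict:
    """Return a copy of the document with a time-only 'time' field added."""
    enriched = dict(doc)  # shallow copy, the input document is left untouched
    enriched["time"] = time_part(doc["timestamp"])
    return enriched


doc = {"streamId": "ENRG_0054", "timestamp": "2021-02-01T00:19:39.161Z", "tags": []}
print(with_time_field(doc)["time"])  # 00:19:39.161Z
```

Documents that are already indexed would still need to be updated inside Elasticsearch, which is where an _update_by_query call comes in.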
Then aggregate the days on timestamp and the shifts on the time field:
"days": {
  "date_histogram": {
    "field": "timestamp",      <---
    "interval": "1d"
  },
  "aggs": {
    "shifts": {
      "date_range": {
        "field": "time",       <---
        "format": "HH:mm",
        "ranges": [
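That's all there is to it!
P.S. Behind the scenes, the time values will automatically be assigned to 1970, but that is fine because you are only interested in the time values.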
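Putting the pieces together, the days/shifts part of the request could be assembled as sketched below. This only recombines the shift ranges and metric sub-aggregations from the question with the field change suggested here. The function name `build_shift_aggs` is hypothetical, the time field is assumed to be mapped and populated as described, and the query has not been run against a cluster.

```python
import json


def build_shift_aggs(value_field: str = "@floatMessage.value.value") -> dict:
    """Days are bucketed on 'timestamp', shifts on the time-only 'time' field."""
    shift_metrics = {
        "MAX": {"max": {"field": value_field}},
        "MIN": {"min": {"field": value_field}},
        "DIFF": {
            "bucket_script": {
                "buckets_path": {"min": "MIN", "max": "MAX"},
                "script": {"source": "return (params.max-params.min)"},
            }
        },
    }
    return {
        "days": {
            # "interval" is kept as in the question; newer Elasticsearch
            # versions use "calendar_interval" instead.
            "date_histogram": {"field": "timestamp", "interval": "1d"},
            "aggs": {
                "shifts": {
                    "date_range": {
                        "field": "time",
                        "format": "HH:mm",
                        "ranges": [
                            {"key": "MORNING", "from": "06:00", "to": "14:00"},
                            {"key": "AFTERNOON", "from": "14:00", "to": "22:00"},
                        ],
                        "keyed": True,
                    },
                    "aggs": shift_metrics,
                }
            },
        }
    }


aggs = build_shift_aggs()
print(json.dumps(aggs, indent=2))
```

The resulting dictionary can be placed under the "aggs" key of the streamId terms aggregation from the question.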