How to put a size on a date_histogram aggregation
I'm executing a query in Elasticsearch. I need the number of hits for my attribute "end_date_ut" (type Date, format dateOptionalTime) for each month represented in the index. For that, I'm using a date_histogram aggregation.
My query is just below:
GET inc/_search
{
  "size": 0,
  "aggs": {
    "appli": {
      "date_histogram": {
        "field": "end_date_ut",
        "interval": "month"
      }
    }
  }
}
And here is part of the result:
"hits": {
  "total": 517478,
  "max_score": 0,
  "hits": []
},
"aggregations": {
  "appli": {
    "buckets": [
      {
        "key_as_string": "2009-08-01T00:00:00.000Z",
        "key": 1249084800000,
        "doc_count": 0
      },
      {
        "key_as_string": "2009-09-01T00:00:00.000Z",
        "key": 1251763200000,
        "doc_count": 1
      },
      {
        "key_as_string": "2009-10-01T00:00:00.000Z",
        "key": 1254355200000,
        "doc_count": 2362
      },
      {
        "key_as_string": "2009-11-01T00:00:00.000Z",
        "key": 1257033600000,
        "doc_count": 5336
      },
      {
        "key_as_string": "2009-12-01T00:00:00.000Z",
        "key": 1259625600000,
        "doc_count": 7536
      },
      {
        "key_as_string": "2010-01-01T00:00:00.000Z",
        "key": 1262304000000,
        "doc_count": 8864
      }
The problem is that I get too many buckets (results). With a terms aggregation I have no problem because I can set a size, but with a date_histogram aggregation I can't find a way to limit the number of buckets in the query result.
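You can use a bucket_sort pipeline aggregation inside the date_histogram to limit the number of buckets returned. For example, to keep only the two per-minute buckets with the most documents:
{
  "size": 0,
  "aggs": {
    "by_minute": {
      "date_histogram": {
        "field": "createTime",
        "interval": "1m",
        "order": {
          "_count": "desc"
        }
      },
      "aggs": {
        "top2": {
          "bucket_sort": {
            "sort": [],
            "size": 2
          }
        }
      }
    }
  }
}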
The response then contains only two buckets:
{
  "took": 28,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 999999,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "by_minute": {
      "buckets": [
        {
          "key_as_string": "2019-12-21T16:13:00.000Z",
          "key": 1576944780000,
          "doc_count": 6374
        },
        {
          "key_as_string": "2019-12-21T16:10:00.000Z",
          "key": 1576944600000,
          "doc_count": 6327
        }
      ]
    }
  }
}
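The same idea can be applied to the monthly query from the question. This is a sketch rather than a tested request: it assumes your Elasticsearch version provides the bucket_sort pipeline aggregation, and the sub-aggregation name latest_months and the limit of 12 are made up for illustration. It sorts the monthly buckets by key, newest first, and keeps only the first 12. The body would be sent to the same inc/_search endpoint:

```json
{
  "size": 0,
  "aggs": {
    "appli": {
      "date_histogram": {
        "field": "end_date_ut",
        "interval": "month"
      },
      "aggs": {
        "latest_months": {
          "bucket_sort": {
            "sort": [
              { "_key": { "order": "desc" } }
            ],
            "size": 12
          }
        }
      }
    }
  }
}
```

Note that bucket_sort only trims the buckets in the response; the histogram is still computed over all matching documents. It also accepts a from parameter to skip buckets, so from and size together can page through the months. On recent Elasticsearch versions, the interval parameter is replaced by calendar_interval.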
I suggest using min_doc_count to only include buckets that have data, i.e. buckets with 0 documents will not come back in the response.
GET inc/_search
{
  "size": 0,
  "aggs": {
    "appli": {
      "date_histogram": {
        "field": "end_date_ut",
        "interval": "month",
        "min_doc_count": 1
      }
    }
  }
}
If possible, you can also add a range query to restrict the time interval over which the aggregation runs.
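As a sketch of that last suggestion, the request body below combines a range query on end_date_ut with the date_histogram and min_doc_count. The date bounds are arbitrary values chosen for illustration; replace them with the period you actually care about.

```json
{
  "size": 0,
  "query": {
    "range": {
      "end_date_ut": {
        "gte": "2009-09-01",
        "lt": "2010-01-01"
      }
    }
  },
  "aggs": {
    "appli": {
      "date_histogram": {
        "field": "end_date_ut",
        "interval": "month",
        "min_doc_count": 1
      }
    }
  }
}
```

Because the aggregation only sees documents matched by the query, the number of monthly buckets is bounded by the width of the range.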