Elasticsearch composite aggregation returns different doc_count values depending on where date_histogram is placed
I am trying to query my data set with a composite aggregation. Here are my two queries.
Query 1:
curl -X POST "localhost:9200/index1-202103/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "filter": [
        { "range": { "date": { "gte": "20210330", "lte": "20210330" } } },
        { "term": { "userid": "16114" } },
        { "exists": { "field": "opens" } },
        { "exists": { "field": "tags" } }
      ]
    }
  },
  "aggs": {
    "my_buckets": {
      "composite": {
        "sources": [
          { "from_domain_wise": { "terms": { "field": "domain" } } },
          { "msp_wise": { "terms": { "field": "msp" } } },
          { "fromaddress_wise": { "terms": { "field": "fromaddress" } } },
          { "tag_wise": { "terms": { "field": "tags" } } },
          { "rate_over_time": { "date_histogram": { "field": "opens.time", "interval": "1h" } } }
        ]
      }
    }
  }
}'
Query 2:
curl -X POST "localhost:9200/index1-202103/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "filter": [
        { "range": { "date": { "gte": "20210330", "lte": "20210330" } } },
        { "term": { "userid": "16114" } },
        { "exists": { "field": "opens" } },
        { "exists": { "field": "tags" } }
      ]
    }
  },
  "aggs": {
    "my_buckets": {
      "composite": {
        "sources": [
          { "from_domain_wise": { "terms": { "field": "domain" } } },
          { "msp_wise": { "terms": { "field": "msp" } } },
          { "fromaddress_wise": { "terms": { "field": "fromaddress" } } },
          { "tag_wise": { "terms": { "field": "tags" } } }
        ]
      },
      "aggs": {
        "rate_over_time": {
          "date_histogram": { "field": "opens.time", "interval": "1h" }
        }
      }
    }
  }
}'
Both queries return date histogram output, but with different counts. When I checked, I found that Query 1 counts every opens.time value (format: 2021-03-30 15:15:45), including values that fall in the same hour, whereas Query 2 counts a document only once when several of its opens.time values fall in the same hour.
For example, if a doc contains
opens: [{ "time": "2021-03-30 15:20:25" }, { "time": "2021-03-30 15:45:30" }]
then Query 1 returns doc_count as 2, whereas Query 2 returns doc_count as 1.
Can anyone please explain why the two queries behave differently even though they have the same goal? I want Query 1 to return the result that Query 2 gives.
PS: Elasticsearch version is 7.10.
Both queries do have the same goal, but notice where you apply the date_histogram: in the first query it's used as a composite value source (inside "sources"), in the second as a sub-aggregation of the composite aggregation.
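To make the difference concrete, here is a toy Python model of the two counting behaviours the question reports. It is only a sketch of the observed numbers, not how Elasticsearch is implemented: the function names `count_per_value` and `count_per_document` are made up for illustration, and the document is reduced to just its opens list.

```python
from collections import Counter
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def hour_key(ts: str) -> str:
    """Round a timestamp down to the start of its hour, like a 1h bucket key."""
    return datetime.strptime(ts, FMT).replace(minute=0, second=0).strftime(FMT)


def count_per_value(docs):
    """Hypothetical model of Query 1 as reported: every opens.time value counts."""
    counts = Counter()
    for doc in docs:
        for entry in doc["opens"]:
            counts[hour_key(entry["time"])] += 1
    return counts


def count_per_document(docs):
    """Hypothetical model of Query 2 as reported: a doc counts once per hour bucket."""
    counts = Counter()
    for doc in docs:
        # The set removes repeated hour keys that come from the same document.
        for key in {hour_key(entry["time"]) for entry in doc["opens"]}:
            counts[key] += 1
    return counts


# The example document from the question: two opens in the same hour.
docs = [
    {"opens": [{"time": "2021-03-30 15:20:25"}, {"time": "2021-03-30 15:45:30"}]}
]

per_value = count_per_value(docs)
per_document = count_per_document(docs)

print(per_value["2021-03-30 15:00:00"])     # 2
print(per_document["2021-03-30 15:00:00"])  # 1
```

The only difference between the two functions is the set used to de-duplicate hour keys within one document, which is the gap between the doc_count of 2 and the doc_count of 1 described in the question.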