How to perform a complex query on aggregated fields in Elasticsearch
I am trying to figure out how to perform a complex query in Elasticsearch. Let's say I have a table of aggregated data that I got from the following query:
{
  "aggs": {
    "3": {
      "terms": {
        "field": "ColumnA",
        "order": {
          "_key": "desc"
        },
        "size": 50
      },
      "aggs": {
        "4": {
          "terms": {
            "field": "ColumnB",
            "order": {
              "_key": "desc"
            },
            "size": 50
          },
          "aggs": {
            "5": {
              "terms": {
                "field": "ColumnC",
                "order": {
                  "_key": "desc"
                },
                "size": 50
              },
              "aggs": {
                "sum_of_views": {
                  "sum": {
                    "field": "views"
                  }
                },
                "sum_of_costs": {
                  "sum": {
                    "field": "cost"
                  }
                },
                "sum_of_clicks": {
                  "sum": {
                    "field": "clicks"
                  }
                },
                "sum_of_earned": {
                  "sum": {
                    "field": "earned"
                  }
                },
                "sum_of_adv_earned": {
                  "sum": {
                    "field": "adv_earned"
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "size": 0,
  "_source": {
    "excludes": []
  },
  "stored_fields": [
    "*"
  ],
  "script_fields": {},
  "docvalue_fields": [
    {
      "field": "hour",
      "format": "date_time"
    }
  ],
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "match_all": {}
        },
        {
          "range": {
            "hour": {
              "format": "strict_date_optional_time",
              "gte": "2019-08-08T06:29:34.723Z",
              "lte": "2020-08-08T06:29:34.724Z"
            }
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  }
}
Now, for example, suppose I want to get only the records that satisfy the following condition:
(sum_of_clicks / sum_of_views) * (sum_of_earned2 / sum_of_earned1) < 0.5
What query should I use?
I think the below should help. My understanding is that you want to first group by ColumnA, ColumnB and ColumnC, calculate the sums of the clicks, views, earned1 and earned2 fields, and then apply the custom aggregation logic you are looking for.
I've been able to come up with the query below, which makes use of the Bucket Selector Aggregation.
POST <your_index_name>/_search
{
  "size": 0,
  "aggs": {
    "3": {
      "terms": {
        "field": "ColumnA",
        "order": {
          "_key": "desc"
        },
        "size": 50
      },
      "aggs": {
        "4": {
          "terms": {
            "field": "ColumnB",
            "order": {
              "_key": "desc"
            },
            "size": 50
          },
          "aggs": {
            "5": {
              "terms": {
                "field": "ColumnC",
                "order": {
                  "_key": "desc"
                },
                "size": 50
              },
              "aggs": {
                "sum_views": {
                  "sum": {
                    "field": "views"
                  }
                },
                "sum_clicks": {
                  "sum": {
                    "field": "clicks"
                  }
                },
                "sum_earned1": {
                  "sum": {
                    "field": "earned1"
                  }
                },
                "sum_earned2": {
                  "sum": {
                    "field": "earned2"
                  }
                },
                "custom_sum_bucket_filter": {
                  "bucket_selector": {
                    "buckets_path": {
                      "sum_of_views": "sum_views",
                      "sum_of_clicks": "sum_clicks",
                      "sum_of_earned1": "sum_earned1",
                      "sum_of_earned2": "sum_earned2"
                    },
                    "script": "(params.sum_of_clicks/params.sum_of_views) * (params.sum_of_earned2/params.sum_of_earned1) < 0.5"
                  }
                }
              }
            },
            "min_bucket_selector": {
              "bucket_selector": {
                "buckets_path": {
                  "valid_docs_count": "5._bucket_count"
                },
                "script": {
                  "source": "params.valid_docs_count >= 1"
                }
              }
            }
          }
        },
        "min_bucket_selector": {
          "bucket_selector": {
            "buckets_path": {
              "valid_docs_count": "4._bucket_count"
            },
            "script": {
              "source": "params.valid_docs_count >= 1"
            }
          }
        }
      }
    }
  }
}
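To read the result of a query shaped like the one above, the nested buckets can be flattened back into table rows on the client. The following is only a sketch in Python: it assumes the usual terms-aggregation response layout (a "buckets" list under each aggregation name, and a "value" under each sum), and the sample dict is hand-written for illustration, not real output.

```python
def flatten_buckets(aggregations, levels=("3", "4", "5"),
                    metrics=("sum_views", "sum_clicks", "sum_earned1", "sum_earned2")):
    """Yield one flat row per innermost bucket of the nested terms aggregations."""
    def walk(node, depth, keys):
        for bucket in node[levels[depth]]["buckets"]:
            path = keys + [bucket["key"]]
            if depth + 1 < len(levels):
                # Descend into the next terms level (for example "4" inside "3").
                yield from walk(bucket, depth + 1, path)
            else:
                # Innermost level: append the metric values to the key path.
                yield tuple(path) + tuple(bucket[m]["value"] for m in metrics)
    yield from walk(aggregations, 0, [])


# Hypothetical response fragment, written by hand for illustration only.
sample = {
    "3": {"buckets": [
        {"key": "a1", "doc_count": 3, "4": {"buckets": [
            {"key": "b1", "doc_count": 3, "5": {"buckets": [
                {"key": "c1", "doc_count": 3,
                 "sum_views": {"value": 100.0}, "sum_clicks": {"value": 10.0},
                 "sum_earned1": {"value": 4.0}, "sum_earned2": {"value": 2.0}},
            ]}},
        ]}},
    ]}
}

rows = list(flatten_buckets(sample))
```

In practice the argument would be the "aggregations" object of the search response; each row then holds the ColumnA, ColumnB and ColumnC keys followed by the four sums.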
Note that to get the exact result you are looking for, I've had to add filter conditions on the buckets at levels 4 and 5. The two min_bucket_selector aggregations drop any parent bucket whose 5 (or 4) sub-aggregation is left with no buckets.
The aggregations I've made use of are the Terms, Sum and Bucket Selector aggregations.
To see why I've added the additional empty-bucket filters, you can just remove them and see what results you get.
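The effect of those extra filters can also be pictured without a cluster. The sketch below is a plain-Python simulation of the intended behaviour, not Elasticsearch code: the tree layout, the select helper and the sample numbers are all made up for illustration, and the condition is the one stated in the question.

```python
def ratio_ok(sums):
    # The question's condition, applied to the sums of one innermost bucket.
    return (sums["clicks"] / sums["views"]) * (sums["earned2"] / sums["earned1"]) < 0.5


def select(tree, prune_empty):
    """tree maps ColumnA -> ColumnB -> ColumnC -> sums.

    Innermost buckets failing ratio_ok are always dropped. Parents left with
    no children are dropped only when prune_empty is True, which plays the
    role of the min_bucket_selector filters.
    """
    out = {}
    for a, b_buckets in tree.items():
        kept_b = {}
        for b, c_buckets in b_buckets.items():
            kept_c = {c: s for c, s in c_buckets.items() if ratio_ok(s)}
            if kept_c or not prune_empty:
                kept_b[b] = kept_c
        if kept_b or not prune_empty:
            out[a] = kept_b
    return out


# Hypothetical sums: the a1 row gives 0.1 * 0.5 = 0.05, the a2 row gives 1.0.
tree = {
    "a1": {"b1": {"c1": {"views": 100, "clicks": 10, "earned1": 4, "earned2": 2}}},
    "a2": {"b2": {"c2": {"views": 10, "clicks": 10, "earned1": 1, "earned2": 1}}},
}

without_pruning = select(tree, prune_empty=False)  # a2 and b2 remain, but empty
with_pruning = select(tree, prune_empty=True)      # only the a1 branch remains
```

Without pruning, the a2 branch survives as a parent with no children, which is the kind of leftover you would expect to see after removing the filters from the query.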
Note that for the sake of simplicity I have ignored the query part as well as the cost field. Please feel free to add them and test it.
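As one possible way of adding the query part back, the request body can be assembled in Python and serialized to JSON. This is a sketch under assumptions: build_body is a hypothetical helper, the range filter on hour is copied from the question's query, the script uses the condition as written in the question, and sending the body to the cluster is left out.

```python
import json


def build_body(levels, metrics, script, gte, lte, size=50):
    """Assemble the search body: nested terms levels, sums, bucket selectors.

    levels is a list of (aggregation name, field) pairs, outermost first.
    """
    # Innermost aggregations: one sum per metric plus the bucket_selector.
    aggs = {f"sum_{m}": {"sum": {"field": m}} for m in metrics}
    aggs["custom_sum_bucket_filter"] = {
        "bucket_selector": {
            "buckets_path": {f"sum_of_{m}": f"sum_{m}" for m in metrics},
            "script": script,
        }
    }
    # Wrap the terms aggregations from the innermost level outwards.
    for i in range(len(levels) - 1, -1, -1):
        name, field = levels[i]
        level = {
            name: {
                "terms": {"field": field, "order": {"_key": "desc"}, "size": size},
                "aggs": aggs,
            }
        }
        if i > 0:
            # Sibling filter that drops parent buckets with no sub-buckets left.
            level["min_bucket_selector"] = {
                "bucket_selector": {
                    "buckets_path": {"valid_docs_count": f"{name}._bucket_count"},
                    "script": {"source": "params.valid_docs_count >= 1"},
                }
            }
        aggs = level
    return {
        "size": 0,
        "query": {"bool": {"filter": [{"range": {"hour": {
            "format": "strict_date_optional_time", "gte": gte, "lte": lte}}}]}},
        "aggs": aggs,
    }


body = build_body(
    levels=[("3", "ColumnA"), ("4", "ColumnB"), ("5", "ColumnC")],
    metrics=["views", "clicks", "earned1", "earned2"],
    script="(params.sum_of_clicks/params.sum_of_views) * "
           "(params.sum_of_earned2/params.sum_of_earned1) < 0.5",
    gte="2019-08-08T06:29:34.723Z",
    lte="2020-08-08T06:29:34.724Z",
)
payload = json.dumps(body, indent=2)  # JSON text for POST <your_index_name>/_search
```

Adding the cost field would then only mean extending the metrics list, with the script left unchanged unless the condition should use it.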