Elasticsearch: Filter by conditions in the grouped documents
I need to filter aggregation results by a condition: at least one of the grouped documents must have a field with a certain content. My data is a kind of trace of events that occurred in different processes; a single process has many traces.
Example of my data:
proc_id event timestamp
1 ON 1000
1 EV1 1001
2 ON 1002
1 OFF 1003
3 ON 1004
2 EV2 1005
3 EV1 1006
3 EV_END 1007
2 EV_END 1008
For example, I need to group by proc_id, but only for the proc_ids that have at least one EV_END event. Querying only the EV_END traces is not a solution, because later I need to process things (like times and number of events) using all the traces of each proc_id.
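To make the intended result concrete, here is a minimal Python sketch of the grouping semantics applied to the sample data above, outside Elasticsearch. The names `traces` and `groups_with_event` are made up for illustration and are not part of any Elasticsearch API.

```python
from collections import defaultdict

# Sample traces from the question: (proc_id, event, timestamp).
traces = [
    (1, "ON", 1000), (1, "EV1", 1001), (2, "ON", 1002),
    (1, "OFF", 1003), (3, "ON", 1004), (2, "EV2", 1005),
    (3, "EV1", 1006), (3, "EV_END", 1007), (2, "EV_END", 1008),
]


def groups_with_event(rows, required_event):
    """Group rows by proc_id, keeping only groups that contain required_event."""
    groups = defaultdict(list)
    for proc_id, event, timestamp in rows:
        groups[proc_id].append((event, timestamp))
    # Keep a group only if at least one of its traces has the required event.
    return {
        proc_id: events
        for proc_id, events in groups.items()
        if any(event == required_event for event, _ in events)
    }


kept = groups_with_event(traces, "EV_END")
for proc_id, events in sorted(kept.items()):
    print(proc_id, len(events), events)
```

With this data, proc_id 1 is dropped because it never emits EV_END, while proc_ids 2 and 3 are kept together with all three of their traces.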
I saw that since version 2.x there are bucket_selectors and scripts, but I don't understand how to apply them here.
A pseudo query for what I want to do:
curl -XPOST 'localhost:9200/proc/_search?pretty' -d '
{
  "query": { "match_all": {} },
  "aggs": {
    "group_by_proc_id": {
      "terms": {
        "field": "proc_id",
        **ONLY if proc has at least one trace with event == 'EV_END'**
      }
    }
  }
}'
I think you could use a filter aggregation to get the proc_ids where an EV_END event is present.
{
  "query": {
    "match_all": {}
  },
  "size": 0,
  "aggs": {
    "EV_END": {
      "filter": {
        "term": {
          "event": "EV_END"
        }
      },
      "aggs": {
        "proc_group": {
          "terms": {
            "field": "proc_id",
            "size": 10
          }
        }
      }
    }
  }
}
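Note that the filter aggregation above only sees the EV_END documents, so each proc_id bucket counts just those traces. If the buckets should cover all traces of a process, one possible alternative is to group first and then drop buckets with a bucket_selector, as the question hints. The Python sketch below only builds such a request body; the aggregation names (`end_events`, `has_end_event`) and the helper `build_query` are made up for illustration. It assumes `event` is indexed as an exact value (not analyzed), and the script uses the `params.` form of recent versions; on 2.x the script would, as far as I know, be written as `end_count > 0`.

```python
import json


def build_query(required_event="EV_END", max_groups=10):
    """Build a search body that keeps only proc_id buckets containing required_event."""
    return {
        "size": 0,
        "aggs": {
            "group_by_proc_id": {
                "terms": {"field": "proc_id", "size": max_groups},
                "aggs": {
                    # Counts only the matching traces inside each proc_id bucket.
                    "end_events": {
                        "filter": {"term": {"event": required_event}}
                    },
                    # Drops buckets whose matching-trace count is zero.
                    "has_end_event": {
                        "bucket_selector": {
                            "buckets_path": {"end_count": "end_events>_count"},
                            "script": "params.end_count > 0",
                        }
                    },
                },
            }
        },
    }


body = build_query()
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the same `_search` request used in the question. Other sub-aggregations (for example min and max of `timestamp`) can sit next to `end_events` and will be computed over all traces of each remaining proc_id. Since the selector works on the buckets that `terms` returns, `size` most likely limits the groups before the filtering happens, so it may need to be larger than the number of groups expected back.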