Elastic Search aggregation filter array for condition
My data looks like this:
[
  {
    "name": "Scott",
    "origin": "London",
    "travel": [
      {
        "active": false,
        "city": "Berlin",
        "visited": "2020-02-01"
      },
      {
        "active": true,
        "city": "Prague",
        "visited": "2020-02-15"
      }
    ]
  },
  {
    "name": "Lilly",
    "origin": "London",
    "travel": [
      {
        "active": true,
        "city": "Scotland",
        "visited": "2020-02-01"
      }
    ]
  }
]
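I want to run an aggregation in which each top-level origin is a bucket, followed by a nested aggregation that shows how many people are currently visiting each city. So I only care what the city is if active is true.
With a filter, it searches the travel array and returns the full array (both objects) as soon as one of them has active set to true. I don't want to include cities where active is false.
Desired output: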
{
  "aggregations": {
    "origin": {
      "buckets": [
        {
          "key": "London",
          "buckets": [
            {
              "key": "travel",
              "doc_count": 2555,
              "buckets": [
                {
                  "key": "Scotland",
                  "doc_count": 1
                },
                {
                  "key": "Prague",
                  "doc_count": 1
                }
              ]
            }
          ]
        }
      ]
    }
  }
}
Above, only 2 entries are counted under the travel aggregation, because only two travel objects have active set to true.
Currently, my aggregation is set up like this:
{
  "from": 0,
  "aggs": {
    "origin": {
      "terms": {
        "field": "origin"
      },
      "aggs": {
        "travel": {
          "filter": {
            "term": {
              "travel.active": true
            }
          },
          "aggs": {
            "city": {
              "terms": {
                "field": "city"
              }
            }
          }
        }
      }
    }
  }
}
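I have my top-level aggregation on origin, then a nested aggregation on the travel array. There I have a filter travel.active = true, followed by another nested aggregation that creates a bucket for each city.
In my aggregation, Berlin still shows up as a city even though I am filtering on active = true.
My guess is that the document gets through because active: true holds for one of the objects in the array.
How do I filter out active: false entirely from the aggregation?
You will have to use a nested aggregation. See the official documentation for reference.
Here is a sample for your query:
Mapping: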
PUT /city_index
{
  "mappings": {
    "properties": {
      "name": { "type": "keyword" },
      "origin": { "type": "keyword" },
      "travel": {
        "type": "nested",
        "properties": {
          "active": {
            "type": "boolean"
          },
          "city": {
            "type": "keyword"
          },
          "visited": {
            "type": "date"
          }
        }
      }
    }
  }
}
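The nested type is what makes the difference here. With the default object mapping, an array of objects is flattened into one multi-valued field per sub-key, so the link between a city and its own active flag is lost. The following is a minimal Python sketch of that behaviour, not Elasticsearch code; the helper name flatten_object_array is made up for illustration.

```python
from collections import defaultdict

# Scott's document from the question.
scott = {
    "name": "Scott",
    "origin": "London",
    "travel": [
        {"active": False, "city": "Berlin", "visited": "2020-02-01"},
        {"active": True, "city": "Prague", "visited": "2020-02-15"},
    ],
}


def flatten_object_array(doc, field):
    """Mimic the default object mapping: one multi-valued field per sub-key."""
    flat = defaultdict(list)
    for entry in doc[field]:
        for key, value in entry.items():
            flat[f"{field}.{key}"].append(value)
    return dict(flat)


flat = flatten_object_array(scott, "travel")

# Flattened view: the filter only asks "does any value equal true?",
# and a terms aggregation then sees every city of the matching document.
matches_filter = True in flat["travel.active"]
cities_flattened = flat["travel.city"] if matches_filter else []

# Nested view: every travel entry is checked on its own.
cities_nested = [t["city"] for t in scott["travel"] if t["active"]]

print(cities_flattened)
print(cities_nested)
```

In the flattened view Berlin rides along with Prague, which matches what the question observed; in the nested view only Prague remains.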
Insert:
PUT /city_index/_doc/1
{
  "name": "Scott",
  "origin": "London",
  "travel": [
    {
      "active": false,
      "city": "Berlin",
      "visited": "2020-02-01"
    },
    {
      "active": true,
      "city": "Prague",
      "visited": "2020-02-15"
    }
  ]
}
PUT /city_index/_doc/2
{
  "name": "Lilly",
  "origin": "London",
  "travel": [
    {
      "active": true,
      "city": "Scotland",
      "visited": "2020-02-01"
    }
  ]
}
Query:
GET /city_index/_search
{
  "size": 0,
  "aggs": {
    "origin": {
      "terms": {
        "field": "origin"
      },
      "aggs": {
        "city": {
          "nested": {
            "path": "travel"
          },
          "aggs": {
            "travel": {
              "filter": {
                "term": {
                  "travel.active": true
                }
              },
              "aggs": {
                "city": {
                  "terms": {
                    "field": "travel.city"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
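To make the four aggregation levels in the query above easier to follow, here is a rough Python sketch that applies the same steps (terms on origin, nested on travel, filter on travel.active, terms on travel.city) to the two sample documents. It only imitates the logic in plain Python; the function name aggregate and the result keys are chosen for illustration and are not part of any Elasticsearch API.

```python
from collections import Counter, defaultdict

docs = [
    {
        "name": "Scott",
        "origin": "London",
        "travel": [
            {"active": False, "city": "Berlin", "visited": "2020-02-01"},
            {"active": True, "city": "Prague", "visited": "2020-02-15"},
        ],
    },
    {
        "name": "Lilly",
        "origin": "London",
        "travel": [
            {"active": True, "city": "Scotland", "visited": "2020-02-01"},
        ],
    },
]


def aggregate(documents):
    """Imitate terms(origin) > nested(travel) > filter(active) > terms(city)."""
    by_origin = defaultdict(list)
    for doc in documents:  # terms aggregation on origin
        by_origin[doc["origin"]].append(doc)

    result = {}
    for origin, members in by_origin.items():
        # nested aggregation: step into every travel entry
        entries = [t for d in members for t in d["travel"]]
        # filter aggregation: keep only active entries
        active = [t for t in entries if t["active"]]
        # terms aggregation on travel.city
        cities = dict(Counter(t["city"] for t in active))
        result[origin] = {
            "doc_count": len(members),
            "nested_doc_count": len(entries),
            "active_doc_count": len(active),
            "cities": cities,
        }
    return result


summary = aggregate(docs)
print(summary)
```

The three counts per origin correspond to the doc_count values reported at the origin, nested, and filter levels of the response.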
Output:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "origin": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "London",
          "doc_count": 2,
          "city": {
            "doc_count": 3,
            "travel": {
              "doc_count": 2,
              "city": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                  {
                    "key": "Prague",
                    "doc_count": 1
                  },
                  {
                    "key": "Scotland",
                    "doc_count": 1
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}
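The city buckets sit three levels deep in this response (city > travel > city), following the aggregation names used in the query. As a small sketch of how a client might pull them out, the Python below walks a trimmed copy of the response; the helper name city_counts is hypothetical, and the dictionary keys simply mirror the aggregation names above.

```python
def city_counts(response):
    """Return {origin: {city: doc_count}} from the aggregation response."""
    counts = {}
    for origin_bucket in response["aggregations"]["origin"]["buckets"]:
        # nested agg "city" > filter agg "travel" > terms agg "city"
        city_buckets = origin_bucket["city"]["travel"]["city"]["buckets"]
        counts[origin_bucket["key"]] = {
            bucket["key"]: bucket["doc_count"] for bucket in city_buckets
        }
    return counts


# Trimmed copy of the response shown above (aggregations part only).
response = {
    "aggregations": {
        "origin": {
            "buckets": [
                {
                    "key": "London",
                    "doc_count": 2,
                    "city": {
                        "doc_count": 3,
                        "travel": {
                            "doc_count": 2,
                            "city": {
                                "buckets": [
                                    {"key": "Prague", "doc_count": 1},
                                    {"key": "Scotland", "doc_count": 1},
                                ]
                            },
                        },
                    },
                }
            ]
        }
    }
}

counts = city_counts(response)
print(counts)
```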
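@karthick's suggestion is good, but I would also add a filter to the query, so that fewer documents reach the aggregation phase. The query only removes documents that have no active travel entry at all, so the filter aggregation inside the nested aggregation is still needed to keep Berlin out of the buckets.
GET idx_travel/_search
{
  "size": 0,
  "query": {
    "nested": {
      "path": "travel",
      "query": {
        "term": {
          "travel.active": {
            "value": true
          }
        }
      }
    }
  },
  "aggs": {
    "origin": {
      "terms": {
        "field": "origin"
      },
      "aggs": {
        "city": {
          "nested": {
            "path": "travel"
          },
          "aggs": {
            "travel": {
              "filter": {
                "term": {
                  "travel.active": true
                }
              },
              "aggs": {
                "city": {
                  "terms": {
                    "field": "travel.city"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}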
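A nested query in the query section selects whole documents, not individual travel entries. The Python sketch below imitates that behaviour to show what such a query narrows down and what it leaves in place. The third document ("Max", with a single inactive trip to Rome) is invented for illustration and is not part of the original data, and the function name nested_query_active is likewise hypothetical.

```python
from collections import Counter

docs = [
    {
        "name": "Scott",
        "origin": "London",
        "travel": [
            {"active": False, "city": "Berlin", "visited": "2020-02-01"},
            {"active": True, "city": "Prague", "visited": "2020-02-15"},
        ],
    },
    {
        "name": "Lilly",
        "origin": "London",
        "travel": [
            {"active": True, "city": "Scotland", "visited": "2020-02-01"},
        ],
    },
    {
        # Hypothetical extra document: no active travel entry at all.
        "name": "Max",
        "origin": "London",
        "travel": [
            {"active": False, "city": "Rome", "visited": "2020-01-10"},
        ],
    },
]


def nested_query_active(documents):
    """Keep a document if at least one of its travel entries is active."""
    return [d for d in documents if any(t["active"] for t in d["travel"])]


matched = nested_query_active(docs)
matched_names = [d["name"] for d in matched]

# Cities counted over all entries of the matched documents.
all_cities = Counter(t["city"] for d in matched for t in d["travel"])
# Cities counted over active entries only (what a per-entry filter adds).
active_cities = Counter(
    t["city"] for d in matched for t in d["travel"] if t["active"]
)

print(matched_names)
print(all_cities)
print(active_cities)
```

The query drops Max entirely, which saves work during aggregation, but Scott's document still carries its Berlin entry. Only the per-entry filter removes it.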