簡體   English   中英

條件的彈性搜索聚合過濾器數組

[英]Elastic Search aggregation filter array for condition

我的數據如下所示:

[
    {
        "name": "Scott",
        "origin": "London",
        "travel": [
            {
                "active": false,
                "city": "Berlin",
                "visited": "2020-02-01"
            },
            {
                "active": true,
                "city": "Prague",
                "visited": "2020-02-15"
            }
        ]
    },
    {
        "name": "Lilly",
        "origin": "London",
        "travel": [
            {
                "active": true,
                "city": "Scotland",
                "visited": "2020-02-01"
            }
        ]
    }
]

我想執行一個聚合,其中每個頂級原點都是一個桶,然后是一個嵌套聚合以查看當前有多少人正在訪問每個城市。 因此,如果activetrue ,我只關心城市是什么。

使用過濾器,它將搜索visited的數組並返回完整的數組(兩個對象)(如果有一個將active設置為 true)。 我不想包括active為假的城市。

所需的 output:

{
  "aggregations": {
    "origin": {
      "buckets": [
        {
          "key": "London",
          "buckets": [
            {
              "key": "travel",
              "doc_count": 2555,
              "buckets": [
                {
                  "key": "Scotland",
                  "doc_count": 1
                },
                {
                  "key": "Prague",
                  "doc_count": 1
                }
              ]
            }
          ]
        }
      ]
    }
  }
}

上面我只有 2 個 under travel 聚合計數,因為只有兩個 travel 對象將 active 設置為 true。

目前,我的聚合設置如下:

{
  "from": 0,
  "aggs": {
    "origin": {
      "terms": {
        "field": "origin"
      },
      "aggs": {
        "travel": {
          "filter": {
            "term": {
              "travel.active": true
            }
          },
          "aggs": {
            "city": {
              "terms": {
                "field": "city"
              }
            }
          }
        }
      }
    }
  }
}

我在origin上有我的頂級聚合,然后在travel數組上有一個嵌套的聚合。 這里我有一個過濾器travel.active = true ,然后是另一個嵌套的聚合來為每個城市創建桶。

在我的聚合中,即使我正在過濾 active = true,它仍然將Berlin作為一個城市。

我的猜測是因為它允許它作為active: true對於數組中的一個對象是 true。

如何從聚合中完全過濾掉active: false

您將不得不使用“嵌套聚合”。 官方文檔鏈接供參考

以下是您的查詢示例:

映射:

PUT /city_index
{
  "mappings": {
    "properties": {
      "name" : { "type" : "keyword" },
      "origin" : { "type" : "keyword" },
      "travel": { 
        "type": "nested",
        "properties": {
          "active": {
            "type": "boolean"
          },
          "city": {
            "type": "keyword"
          },
          "visited" : {
            "type":"date"
          }
        }
      }
    }
  }
}

插入:

PUT /city_index/_doc/1
{
  "name": "Scott", 
  "origin" : "London",
  "travel": [
    {
      "active": false,
      "city": "Berlin",
      "visited" : "2020-02-01"
    },
    {
      "active": true,
      "city": "Prague",
      "visited": "2020-02-15"
    }
  ]
}

PUT /city_index/_doc/2
{
  "name": "Lilly",
  "origin": "London",
  "travel": [
    {
      "active": true,
      "city": "Scotland",
      "visited": "2020-02-01"
    }
  ]
}

詢問:

GET /city_index/_search
{
  "size": 0,
  "aggs": {
    "origin": {
      "terms": {
        "field": "origin"
      },
      "aggs": {
        "city": {
          "nested": {
            "path": "travel"
          },
          "aggs": {
            "travel": {
              "filter": {
                "term": {
                  "travel.active": true
                }
              },
              "aggs": {
                "city": {
                  "terms": {
                    "field": "travel.city"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Output:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "origin": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "London",
          "doc_count": 2,
          "city": {
            "doc_count": 3,
            "travel": {
              "doc_count": 2,
              "city": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                  {
                    "key": "Prague",
                    "doc_count": 1
                  },
                  {
                    "key": "Scotland",
                    "doc_count": 1
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

@karthick 的建議很好,但我在查詢中添加了過濾器。 這樣您在聚合階段將擁有更少量的值。

GET idx_travel/_search
{
  "size": 0,
  "query": {
    "nested": {
      "path": "travel",
      "query": {
        "term": {
          "travel.active": {
            "value": true
          }
        }
      }
    }
  },
  "aggs": {
    "origin": {
      "terms": {
        "field": "origin"
      },
      "aggs": {
        "city": {
          "nested": {
            "path": "travel"
          },
          "aggs": {
            "city": {
              "terms": {
                "field": "travel.city"
              }
            }
          }
        }
      }
    }
  }
}

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM