
Aggs multiple buckets with nested documents in Elasticsearch

I'm currently working on an Elasticsearch project. I want to aggregate data from our existing documents.

The (simplified) structure is as follows:

{
  "products" : {
    "mappings" : {
      "product" : {
        "properties" : {
          "created" : {
            "type" : "date",
            "format" : "yyyy-MM-dd HH:mm:ss"
          },
          "description" : {
            "type" : "text"
          },
          "facets" : {
            "type" : "nested",
            "properties" : {
              "facet_id" : {
                "type" : "long"
              },
              "name_slug" : {
                "type" : "keyword"
              },
              "value_slug" : {
                "type" : "keyword"
              }
            }
          }
        }
      }
    }
  }
}

What I want to achieve with one query:

  1. Select the unique facet_name values

  2. Under each facet_name, list all corresponding facet_values

Something like this:

- facet_name
-- facet_sub_value (counter?)
-- facet_sub_value (counter?)
-- facet_sub_value (counter?)
- facet_name
-- facet_sub_value (counter?)
-- facet_sub_value (counter?)
-- facet_sub_value (counter?)

Can you point me in the right direction? I've looked at the aggs query, but the documentation is not clear enough for me to work out how to do this.

You'll be using nested terms aggregations. Since the facet names & values are under the same path, you can try this:

GET products/_search
{
  "size": 0,
  "aggs": {
    "by_facet_names_parent": {
      "nested": {
        "path": "facets"
      },
      "aggs": {
        "by_facet_names_nested": {
          "terms": {
            "field": "facets.name_slug",
            "size": 10
          },
          "aggs": {
            "by_facet_subvalues": {
              "terms": {
                "field": "facets.value_slug",
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}
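If you build this request from application code rather than the console, the same body can be assembled as a plain dict. This is only a sketch: the helper name build_facet_aggs and its parameters are made up for illustration, and sending the body to the cluster is left to whatever client you use. Keep in mind that a terms aggregation only returns the top "size" buckets, so raise the size if you have more than 10 facet names or values.

```python
import json


def build_facet_aggs(path="facets", name_field="name_slug",
                     value_field="value_slug", size=10):
    """Return the search body for the nested facet name/value aggregation.

    Hypothetical helper: the defaults mirror the field names in the mapping.
    """
    return {
        "size": 0,  # skip the hits, we only want the aggregations
        "aggs": {
            "by_facet_names_parent": {
                # Step into the nested "facets" documents first
                "nested": {"path": path},
                "aggs": {
                    "by_facet_names_nested": {
                        # One bucket per unique facet name
                        "terms": {"field": f"{path}.{name_field}", "size": size},
                        "aggs": {
                            "by_facet_subvalues": {
                                # Inside each name bucket, one bucket per value
                                "terms": {"field": f"{path}.{value_field}", "size": size}
                            }
                        },
                    }
                },
            }
        },
    }


body = build_facet_aggs(size=50)
print(json.dumps(body, indent=2))
```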

Your response should look something along these lines:

{
  "took": 26,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 30,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "by_facet_names_parent": {
      "doc_count": 90,
      "by_facet_names_nested": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 80,
        "buckets": [
          {
            "key": "0JDcya7Y7Y",     <-------- your facet name keyword
            "doc_count": 4,
            "by_facet_subvalues": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "3q4E9R6h5k",    <-------- one of the facet values + its count
                  "doc_count": 3
                },
                {
                  "key": "1q4E9R6h5k",   <-------- another facet value & count
                  "doc_count": 1
                }
              ]
            }
          },
          {
            "key": "0RyRKWugU1",
            "doc_count": 1,
            "by_facet_subvalues": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "Af7qeCsXz6",
                  "doc_count": 1
                }
              ]
            }
          }
          .....
        ]
      }
    }
  }
}
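To get from that response to the facet_name / facet_sub_value (counter) layout from the question, you can walk the two bucket levels. A minimal sketch, assuming the response has already been parsed into a Python dict and that the aggregation names match the query above; the function name facet_tree is just for illustration, and the sample below is a trimmed copy of the response containing only the keys the function reads.

```python
def facet_tree(response):
    """Map each facet name to a {facet value: nested doc count} dict."""
    names = response["aggregations"]["by_facet_names_parent"]["by_facet_names_nested"]
    return {
        name_bucket["key"]: {
            value_bucket["key"]: value_bucket["doc_count"]
            for value_bucket in name_bucket["by_facet_subvalues"]["buckets"]
        }
        for name_bucket in names["buckets"]
    }


# Trimmed copy of the response shown above
sample = {
    "aggregations": {
        "by_facet_names_parent": {
            "doc_count": 90,
            "by_facet_names_nested": {
                "buckets": [
                    {
                        "key": "0JDcya7Y7Y",
                        "doc_count": 4,
                        "by_facet_subvalues": {
                            "buckets": [
                                {"key": "3q4E9R6h5k", "doc_count": 3},
                                {"key": "1q4E9R6h5k", "doc_count": 1},
                            ]
                        },
                    },
                    {
                        "key": "0RyRKWugU1",
                        "doc_count": 1,
                        "by_facet_subvalues": {
                            "buckets": [{"key": "Af7qeCsXz6", "doc_count": 1}]
                        },
                    },
                ]
            },
        }
    }
}

tree = facet_tree(sample)
for name, values in tree.items():
    print(f"- {name}")
    for value, count in values.items():
        print(f"-- {value} ({count})")
```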

Notice that the doc_count reported by the nested aggregation (90 above) can be greater than the number of your actual product docs (hits.total is 30). This is because nested aggregations treat the nested subdocuments as separate documents within the parent documents, so every entry in a product's facets array is counted on its own. This takes some time to digest, but it'll make sense once you've played around with them long enough.
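The counting can be imitated in plain Python to see where the larger numbers come from. This is a toy model, not how Elasticsearch works internally, and the two products and their facet slugs are invented for the example: each entry in "facets" stands in for one nested subdocument.

```python
from collections import Counter

# Hypothetical products; every entry in "facets" plays the role of one
# nested subdocument.
products = [
    {"description": "red cotton shirt", "facets": [
        {"name_slug": "color", "value_slug": "red"},
        {"name_slug": "material", "value_slug": "cotton"},
    ]},
    {"description": "blue cotton shirt", "facets": [
        {"name_slug": "color", "value_slug": "blue"},
        {"name_slug": "material", "value_slug": "cotton"},
    ]},
]

# The nested aggregation works on the flattened facet entries, not on products.
nested_docs = [facet for product in products for facet in product["facets"]]
counts = Counter((f["name_slug"], f["value_slug"]) for f in nested_docs)

print(len(products))                   # 2 top-level product documents
print(len(nested_docs))                # 4 nested facet documents
print(counts[("material", "cotton")])  # 2
```

Two products yield four nested facet documents, which is the same effect as 30 hits producing a nested doc_count of 90 in the response.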
