
How to do sub-aggregations on nested objects with Elastic?

I have the following schema to represent products that can have multiple variants (e.g. sizes for a t-shirt):

{
    "mappings": {
        "properties": {
            "id": {"type": "keyword"},
            "name": {"type": "text"},
            "variants": {
                "type": "nested",
                "properties": {
                    "inventory": {"type": "long"},
                    "customizations": {"type": "object"},
                    "customizations.name": {"type": "keyword"},
                    "customizations.value": {"type": "keyword"}
                }
            }
        }
    }
}

With this mapping I can then index product data that looks like:

{
    "id": "prod-1",
    "name": "Shirt Design 1",
    "variants": [
        {"inventory": 78, "customizations": [{"name": "size", "value": "L"}, {"name": "color", "value": "blue"}]},
        {"inventory": 78, "customizations": [{"name": "size", "value": "M"}, {"name": "color", "value": "blue"}]},
        {"inventory": 89, "customizations": [{"name": "size", "value": "S"}, {"name": "color", "value": "blue"}]}
    ]
}
{
    "id": "prod-2",
    "name": "Shirt Design 2",
    "variants": [
        {"inventory": 78, "customizations": [{"name": "size", "value": "L"}, {"name": "color", "value": "green"}]},
        {"inventory": 78, "customizations": [{"name": "size", "value": "M"}, {"name": "color", "value": "green"}]}
    ]
}

When filtering or querying this index, I want to show facets based on the customizations that make up each product. The customizations are user-submitted and therefore outside my control, but the idea is to display filters like:

☐ Size:
    - S (1)
    - M (2)
    - L (2)
☐ Color:
    - blue (1)
    - green (1)

For now I can correctly bucket by customization name with the following query:

{
    "size": 0,
    "aggs": {
        "skus": {
            "nested": {
                "path": "variants"
            },
            "aggs": {
                "customization_names": {
                    "terms": {
                        "field": "variants.customizations.name"
                    }
                }
            }
        }
    }
}

Which gives me the following buckets:

"buckets": [
        {
            "doc_count": 2,
            "key": "color"
        },
        {
            "doc_count": 2,
            "key": "size"
        }
    ],

Trying to do a sub-aggregation to get the list of actual customizations underneath is where I'm stuck. I've tried:

{
    "size": 0,
    "aggs": {
        "skus": {
            "nested": {
                "path": "variants"
            },
            "aggs": {
                "customization_names": {
                    "terms": {
                        "field": "variants.customizations.name"
                    },
                    "aggs": {
                        "sub": {
                            "reverse_nested": {},
                            "aggs": {
                                "customization_values": {
                                    "terms": {
                                        "field": "variants.customizations.value"
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

However, this doesn't return any sub-buckets:

"buckets": [
                    {
                        "doc_count": 4,
                        "key": "color",
                        "sub": {
                            "doc_count": 2,
                            "customization_values": {
                                "buckets": [],
                                "doc_count_error_upper_bound": 0,
                                "sum_other_doc_count": 0
                            }
                        }
                    },
                    {
                        "doc_count": 4,
                        "key": "size",
                        "sub": {
                            "doc_count": 2,
                            "customization_values": {
                                "buckets": [],
                                "doc_count_error_upper_bound": 0,
                                "sum_other_doc_count": 0
                            }
                        }
                    }
                ],

If I don't use reverse_nested, then instead of empty sub-buckets I get every possible value in there, so I get red and blue as part of the size sub-bucket, for example.

I initially had the customizations as a map of key => value, but couldn't make it work that way either. The format of "customizations" is flexible, though, so I am free to change it.

The only way I have found so far to solve this is to add a field to customizations that holds a JSON-like string representation of name + value:

// mapping:
"customizations.facet_code": {"type": "keyword"}
// data:
"customizations": [{"name": "size", "value": "M", "facet_code": "{name:size,value:M}"}],

I can then bucket on facet_code, and my app deserializes it to regroup the values under their names. I would prefer to figure out how to do it "properly" if at all possible.
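For reference, the application side of this workaround might look roughly like the Python sketch below. It assumes the facet code is stored as real JSON (the `{name:size,value:M}` form shown above would not parse as JSON), and `encode_facet_code` and `group_facet_buckets` are hypothetical helper names, not part of any Elasticsearch client. The bucket list is hand-written for illustration.

```python
import json
from collections import defaultdict


def encode_facet_code(name, value):
    # Sorted keys and compact separators so equal pairs always
    # serialize to the same keyword string.
    return json.dumps({"name": name, "value": value},
                      sort_keys=True, separators=(",", ":"))


def group_facet_buckets(buckets):
    # Regroup terms buckets keyed by facet_code into {name: {value: count}}.
    facets = defaultdict(dict)
    for bucket in buckets:
        pair = json.loads(bucket["key"])
        facets[pair["name"]][pair["value"]] = bucket["doc_count"]
    return dict(facets)


code = encode_facet_code("size", "M")
print(code)  # {"name":"size","value":"M"}

# Illustrative buckets, shaped like a terms aggregation on facet_code.
buckets = [
    {"key": code, "doc_count": 2},
    {"key": encode_facet_code("color", "blue"), "doc_count": 3},
]
grouped = group_facet_buckets(buckets)
print(grouped)
```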

The "proper" way to do this is to map customizations as type nested too, instead of object. That is to say:

{
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "name": {
        "type": "text"
      },
      "variants": {
        "type": "nested",
        "properties": {
          "inventory": {
            "type": "long"
          },
          "customizations": {
            "type": "nested",       <-- This
            "properties": {
              "name": {
                "type": "keyword"
              },
              "value": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}
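For context on why the original `object` mapping mixes values: Elasticsearch flattens an array of objects into one multi-valued field per key, so within a variant the link between each name and its value is lost. The Python sketch below only imitates that flattening for one variant from the question; `flatten_object_array` is an illustrative helper, not an Elasticsearch API.

```python
def flatten_object_array(prefix, items):
    # Collect every value of each key into a single list, the way an
    # array of plain objects ends up as multi-valued fields.
    flat = {}
    for item in items:
        for key, val in item.items():
            flat.setdefault(f"{prefix}.{key}", []).append(val)
    return flat


customizations = [
    {"name": "size", "value": "L"},
    {"name": "color", "value": "blue"},
]
flat = flatten_object_array("customizations", customizations)
print(flat)
# {'customizations.name': ['size', 'color'], 'customizations.value': ['L', 'blue']}
```

After flattening, nothing records that "L" belongs to "size" rather than "color", which is why every value appears under every name. Mapping customizations as nested keeps each name/value pair as its own hidden document.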

The query would then be:

{
  "size": 0,
  "aggs": {
    "skus": {
      "nested": {
        "path": "variants.customizations"
      },
      "aggs": {
        "customization_names": {
          "terms": {
            "field": "variants.customizations.name"
          },
          "aggs": {
            "customization_values": {
              "terms": {
                "field": "variants.customizations.value"
              }
            }
          }
        }
      }
    }
  }
}

yielding all your required facets:

{
  ...
  "aggregations":{
    "skus":{
      "doc_count":10,
      "customization_names":{
        "doc_count_error_upper_bound":0,
        "sum_other_doc_count":0,
        "buckets":[
          {
            "key":"color",
            "doc_count":5,
            "customization_values":{
              "doc_count_error_upper_bound":0,
              "sum_other_doc_count":0,
              "buckets":[
                {
                  "key":"blue",
                  "doc_count":3
                },
                {
                  "key":"green",
                  "doc_count":2
                }
              ]
            }
          },
          {
            "key":"size",
            "doc_count":5,
            "customization_values":{
              "doc_count_error_upper_bound":0,
              "sum_other_doc_count":0,
              "buckets":[
                {
                  "key":"L",
                  "doc_count":2
                },
                {
                  "key":"M",
                  "doc_count":2
                },
                {
                  "key":"S",
                  "doc_count":1
                }
              ]
            }
          }
        ]
      }
    }
  }
}
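On the application side, a response of this shape can be folded into the facet structure from the question. The sketch below is in Python and assumes the response has already been parsed into a dict; `to_facets` is a hypothetical helper name, and the sample is a trimmed copy of the response above.

```python
def to_facets(response):
    # Build {customization name: {value: doc_count}} from the nested
    # "skus" -> "customization_names" -> "customization_values" buckets.
    names = response["aggregations"]["skus"]["customization_names"]["buckets"]
    facets = {}
    for name_bucket in names:
        values = name_bucket["customization_values"]["buckets"]
        facets[name_bucket["key"]] = {v["key"]: v["doc_count"] for v in values}
    return facets


sample = {
    "aggregations": {"skus": {"doc_count": 10, "customization_names": {"buckets": [
        {"key": "color", "doc_count": 5, "customization_values": {"buckets": [
            {"key": "blue", "doc_count": 3},
            {"key": "green", "doc_count": 2},
        ]}},
        {"key": "size", "doc_count": 5, "customization_values": {"buckets": [
            {"key": "L", "doc_count": 2},
            {"key": "M", "doc_count": 2},
            {"key": "S", "doc_count": 1},
        ]}},
    ]}}}
}

facets = to_facets(sample)
for name, values in facets.items():
    print(name, values)
```

Note that these doc_count values count nested customization entries (one per variant), not products. If the facets should show product counts, as in the S (1), M (2), L (2) example in the question, adding a reverse_nested sub-aggregation under customization_values and reading its doc_count is the usual approach; that variation is not part of the answer above.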
