Elasticsearch: terms aggregation on a nested field with duplicated values

I have a problem with a nested aggregation in Elasticsearch. I have a mapping with a nested field:

POST my_index/my_type/_mapping
{
    "properties": {
        "name": {
            "type": "keyword"
        },
        "nested_fields": {
            "type": "nested",
            "properties": {
                "key": {
                    "type": "keyword"
                },
                "value": {
                    "type": "keyword"
                }
            }
        }
    }
}

Then I add one document to the index:

POST my_index/my_type
{
    "name": "object1",
    "nested_fields": [
        {
            "key": "key1",
            "value": "value1"
        },
        {
            "key": "key1",
            "value": "value2"
        }
    ]
}

As you can see, the nested array has two items with the same key field but different value fields. Then I want to run this query:

GET /my_index/my_type/_search
{
    "query": {
        "nested": {
            "path": "nested_fields",
            "query": {
                "bool": {
                    "must": [
                        {
                            "term": {
                                "nested_fields.key": {
                                    "value": "key1"
                                }
                            }
                        },
                        {
                            "terms": {
                                "nested_fields.value": [
                                    "value1",
                                    "value2"
                                ]
                            }
                        }
                    ]
                }
            }
        }
    },
    "aggs": {
        "agg_nested_fields": {
            "nested": {
                "path": "nested_fields"
            },
            "aggs": {
                "agg_nested_fields_key": {
                    "terms": {
                        "field": "nested_fields.key",
                        "size": 10
                    }
                }
            }
        }
    }
}

I want to find all documents that have at least one object in the nested_fields array whose key property equals key1 and whose value is one of the provided values (value1 or value2). Then I want to group the matching documents by nested_fields.key. But I get this response:

{
    "took": 13,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.87546873,
        "hits": [
            {
                "_index": "my_index",
                "_type": "my_type",
                "_id": "AVuLNXxiryKmA7VEwOfV",
                "_score": 0.87546873,
                "_source": {
                    "name": "object1",
                    "nested_fields": [
                        {
                            "key": "key1",
                            "value": "value1"
                        },
                        {
                            "key": "key1",
                            "value": "value2"
                        }
                    ]
                }
            }
        ]
    },
    "aggregations": {
        "agg_nested_fields": {
            "doc_count": 2,
            "agg_nested_fields_key": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {
                        "key": "key1",
                        "doc_count": 2
                    }
                ]
            }
        }
    }
}

As the response shows, there is one hit (which is correct), but the document is counted twice in the aggregation (see doc_count: 2), because it has two items with the key 'key1' in the nested_fields array. How can I get the correct count in the aggregation?
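Editor's note: the double count can be reproduced without a cluster. The sketch below is a plain-Python model, not Elasticsearch code; the function names nested_terms_counts and root_terms_counts are hypothetical and only illustrate the difference between counting nested objects (what a terms aggregation under a nested aggregation reports) and counting root documents.

```python
from collections import Counter

# The single document from the question.
doc = {
    "name": "object1",
    "nested_fields": [
        {"key": "key1", "value": "value1"},
        {"key": "key1", "value": "value2"},
    ],
}


def nested_terms_counts(docs, path, field):
    """Count one per nested object, the way a terms agg under a nested agg does."""
    counts = Counter()
    for d in docs:
        for obj in d.get(path, []):
            counts[obj[field]] += 1
    return dict(counts)


def root_terms_counts(docs, path, field):
    """Count one per root document for each distinct term it contains."""
    counts = Counter()
    for d in docs:
        # A set removes duplicate terms inside the same root document.
        for term in {obj[field] for obj in d.get(path, [])}:
            counts[term] += 1
    return dict(counts)


nested = nested_terms_counts([doc], "nested_fields", "key")
root = root_terms_counts([doc], "nested_fields", "key")
print(nested, root)
```

Each nested object is indexed as its own hidden document, so the first function sees key1 twice while the second sees it once per root document.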

You have to use a reverse_nested aggregation inside the nested aggregation to get the count of root documents for each bucket.

{
    "query": {
        "nested": {
            "path": "nested_fields",
            "query": {
                "bool": {
                    "must": [{
                            "term": {
                                "nested_fields.key": {
                                    "value": "key1"
                                }
                            }
                        },
                        {
                            "terms": {
                                "nested_fields.value": [
                                    "value1",
                                    "value2"
                                ]
                            }
                        }
                    ]
                }
            }
        }
    },
    "aggs": {
        "agg_nested_fields": {
            "nested": {
                "path": "nested_fields"
            },
            "aggs": {
                "agg_nested_fields_key": {
                    "terms": {
                        "field": "nested_fields.key",
                        "size": 10
                    },
                    "aggs": {
                        "back_to_root": {
                            "reverse_nested": {}
                        }
                    }
                }
            }
        }
    }
}
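With this query, each terms bucket carries a back_to_root sub-aggregation whose doc_count is the number of root documents, while the bucket's own doc_count still counts nested objects. A minimal sketch of reading those counts from a parsed response follows. The helper name root_counts_per_key is hypothetical, and sample_response is a hand-written excerpt that mirrors the expected response shape for the single document in the question, not captured output.

```python
def root_counts_per_key(response,
                        nested_agg="agg_nested_fields",
                        terms_agg="agg_nested_fields_key",
                        root_agg="back_to_root"):
    """Map each terms bucket key to the root-document count from reverse_nested."""
    buckets = response["aggregations"][nested_agg][terms_agg]["buckets"]
    return {bucket["key"]: bucket[root_agg]["doc_count"] for bucket in buckets}


# Hand-written excerpt (assumption): only the "aggregations" part is shown.
sample_response = {
    "aggregations": {
        "agg_nested_fields": {
            "doc_count": 2,
            "agg_nested_fields_key": {
                "buckets": [
                    {
                        "key": "key1",
                        "doc_count": 2,                  # nested objects
                        "back_to_root": {"doc_count": 1}  # root documents
                    }
                ]
            }
        }
    }
}

print(root_counts_per_key(sample_response))
```

The aggregation names passed as defaults match the ones used in the query above; change them if your aggregations are named differently.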
