Elasticsearch - Terms Aggregation nested field

I have the following problem. I have a nested field ("list") with 2 properties (fieldB and fieldC). This is what a document looks like:

{
  "fieldA": "1",
  "list": [
    {"fieldB": "ABC", "fieldC": "DEF"},
    {"fieldB": "ABC", "fieldC": "GHI"},
    {"fieldB": "UVW", "fieldC": "XYZ"},
    ...
  ]
},

I want to get a distinct list of all possible fieldC values for "ABC" (fieldB) across all documents. So far I've tried this in Java (Java REST Client):

SearchRequest searchRequest = new SearchRequest("abc*");

// Select documents that have at least one nested element with fieldB == "ABC"
QueryBuilder matchQueryBuilder = QueryBuilders.boolQuery()
        .must(QueryBuilders.nestedQuery("list",
                QueryBuilders.matchQuery("list.fieldB.keyword", "ABC"), ScoreMode.None));

// Nested aggregation with a terms sub-aggregation
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(matchQueryBuilder)
        .aggregation(AggregationBuilders.nested("listAgg", "list")
                .subAggregation(AggregationBuilders.terms("fieldBAgg")
                        .field("list.fieldB.keyword")));

searchRequest.source(sourceBuilder);

SearchResponse searchResponse = null;
try {
    searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
} catch (IOException e) {
    e.printStackTrace();
}

Nested list = searchResponse.getAggregations().get("listAgg");
Terms fieldBs = list.getAggregations().get("fieldBAgg");

With that query I get all documents which include "ABC" in fieldB, and I get all fieldC values. But I just want the fieldC values where fieldB is "ABC".

So in that example I get DEF, GHI and XYZ. But I just want DEF and GHI. Does anybody have an idea how to solve this?

The nested constraint in the query part only selects the documents that have at least one nested element satisfying the constraint. It does not restrict which nested elements of those documents are aggregated. You also need to add that same constraint in the aggregation part, otherwise you're going to aggregate all nested elements of all the selected documents, which is what you're seeing. Proceed like this instead:

// 1. terms aggregation on the desired nested field (fieldC)
TermsAggregationBuilder nestedField = AggregationBuilders.terms("fieldBAgg")
        .field("list.fieldC.keyword");

// 2. filter aggregation on the desired nested field value (fieldB == "ABC")
QueryBuilder onlyBQuery = QueryBuilders.termQuery("list.fieldB.keyword", "ABC");
FilterAggregationBuilder onlyBFilter = AggregationBuilders.filter("onlyFieldB", onlyBQuery)
        .subAggregation(nestedField);

// 3. parent nested aggregation on the "list" path
NestedAggregationBuilder nested = AggregationBuilders.nested("listAgg", "list")
        .subAggregation(onlyBFilter);

// 4. main query/aggregation
sourceBuilder.query(matchQueryBuilder).aggregation(nested);
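
To see why the extra filter matters, here is a minimal plain-Java sketch that imitates the semantics only. It does not call Elasticsearch; the class NestedAggSketch, its methods, and the second sample document are hypothetical and exist just for illustration. The first document is the one from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class: imitates query/aggregation semantics in memory.
class NestedAggSketch {

    /** One element of the nested "list" field. */
    static final class Item {
        final String fieldB;
        final String fieldC;

        Item(String fieldB, String fieldC) {
            this.fieldB = fieldB;
            this.fieldC = fieldC;
        }
    }

    /** Query step: keep documents that have at least one nested item with the given fieldB. */
    static List<List<Item>> nestedQuery(List<List<Item>> docs, String fieldB) {
        List<List<Item>> hits = new ArrayList<>();
        for (List<Item> doc : docs) {
            for (Item item : doc) {
                if (item.fieldB.equals(fieldB)) {
                    hits.add(doc);
                    break;
                }
            }
        }
        return hits;
    }

    /**
     * Aggregation step: count fieldC values over the nested items of the given documents.
     * Pass null as fieldB to skip the filter (the behaviour described in the question).
     */
    static Map<String, Long> termsOnFieldC(List<List<Item>> docs, String fieldB) {
        Map<String, Long> buckets = new TreeMap<>();
        for (List<Item> doc : docs) {
            for (Item item : doc) {
                if (fieldB == null || item.fieldB.equals(fieldB)) {
                    buckets.merge(item.fieldC, 1L, Long::sum);
                }
            }
        }
        return buckets;
    }

    /** Document 1 is taken from the question; document 2 is made up and has no "ABC" item. */
    static List<List<Item>> sampleDocs() {
        List<Item> doc1 = Arrays.asList(
                new Item("ABC", "DEF"), new Item("ABC", "GHI"), new Item("UVW", "XYZ"));
        List<Item> doc2 = Arrays.asList(new Item("UVW", "JKL"));
        return Arrays.asList(doc1, doc2);
    }

    public static void main(String[] args) {
        List<List<Item>> hits = nestedQuery(sampleDocs(), "ABC");
        System.out.println(termsOnFieldC(hits, null));  // prints {DEF=1, GHI=1, XYZ=1}
        System.out.println(termsOnFieldC(hits, "ABC")); // prints {DEF=1, GHI=1}
    }
}
```

The query step drops the second document entirely, but XYZ still shows up in the unfiltered buckets because it belongs to a selected document. Only the filter applied per nested item removes it, which is the role the filter aggregation plays inside the nested aggregation.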
