
How to optimize an Elasticsearch query?

I'm trying to run an Elasticsearch query with the Java High Level REST Client. The main goal is to group the results. Here is the data:

    "hits" : [
  {
    "_index" : "my_index",
    "_type" : "object",
    "_id" : "X4sSPmYB62YwufswHQbx",
    "_score" : 1.0,
    "_source" : {
      "objId" : "1",
      "stepId" : "step_one",
      "status" : "RUNNING",
      "timestamp" : 1515974400
    }
  },
  {
    "_index" : "my_index",
    "_type" : "object",
    "_id" : "15QRP2YB62YwufswAApl",
    "_score" : 1.0,
    "_source" : {
      "objId" : "1",
      "stepId" : "step_one",
      "status" : "DONE",
      "timestamp" : 1516406400
    }
  },
  {
    "_index" : "my_index",
    "_type" : "object",
    "_id" : "QpMOP2YB62YwufswrfYn",
    "_score" : 1.0,
    "_source" : {
      "objId" : "1",
      "stepId" : "step_two",
      "status" : "RUNNING",
      "timestamp" : 1516492800
    }
  },
  {
    "_index" : "my_index",
    "_type" : "object",
    "_id" : "VZMPP2YB62YwufswJ_r0",
    "_score" : 1.0,
    "_source" : {
      "objId" : "1",
      "stepId" : "step_two",
      "status" : "DONE",
      "timestamp" : 1517356800
    }
  },
  {
    "_index" : "my_index",
    "_type" : "object",
    "_id" : "XZMPP2YB62YwufswQfrc",
    "_score" : 1.0,
    "_source" : {
      "objId" : "2",
      "stepId" : "step_one",
      "status" : "DONE",
      "timestamp" : 1517788800
    }
  }
]

For example, for objId = 1 I expect to retrieve something like:

    {
      "objId" : "1",
      "stepId" : "step_one",
      "status" : "DONE",
      "timestamp" : 1516406400
    },
    {
      "objId" : "1",
      "stepId" : "step_two",
      "status" : "DONE",
      "timestamp" : 1517356800
    }

Currently I have this Java method:

    private List<MyObject> search(String objId) {
        // Match every document whose objId equals the given value.
        MatchPhraseQueryBuilder queryBuilder = QueryBuilders.matchPhraseQuery("objId", objId);
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.query(queryBuilder);
        searchSourceBuilder.size(1000); // return at most 1000 hits

        SearchRequest searchRequest = new SearchRequest("my_index");
        searchRequest.types("object");
        searchRequest.source(searchSourceBuilder);

        try {
            SearchResponse searchResponse = restHighLevelClient.search(searchRequest);

            // Convert each raw hit into a MyObject.
            return Arrays.stream(searchResponse.getHits().getHits())
                    .map(this::toMyObject)
                    .collect(toList());
        } catch (IOException ex) {
            // Pass the exception as the last argument so the stack trace is logged.
            log.error("Error retrieving records from Elasticsearch", ex);
        }

        return new ArrayList<>();
    }

But this method only returns the list of all objects that match objId.

My question is: is it possible to find objects by objId value, then group them by stepId, and finally keep only the entry with the latest timestamp in each group?

Here is an answer to my question that I found:

    private List<MyObject> search(String objId) {
        try {
            SearchResponse searchResponse = esRestClient.search(new SearchRequest("my_index")
                    .source(new SearchSourceBuilder()
                            .query(QueryBuilders.matchPhraseQuery("objId", objId))
                            // size(0): skip the plain hits, only the aggregation result is needed
                            .size(0)
                            .aggregation(
                                    // One bucket per distinct stepId (uses the keyword sub-field)
                                    AggregationBuilders.terms("by_stepId").field("stepId.keyword")
                                            // Within each bucket keep only the newest document
                                            .subAggregation(AggregationBuilders.topHits("by_timestamp")
                                                    .sort("timestamp", SortOrder.DESC)
                                                    .size(1)
                                            )
                            )
                    )
                    .types("object")
            );
            Terms terms = searchResponse.getAggregations().get("by_stepId");
            // Walk the buckets, unwrap each top_hits result and map its single hit to a MyObject.
            return terms.getBuckets().stream()
                    .map(MultiBucketsAggregation.Bucket::getAggregations)
                    .flatMap(buckets -> buckets.asList().stream())
                    .map(aggregations -> (ParsedTopHits) aggregations)
                    .flatMap(topHits -> Arrays.stream(topHits.getHits().getHits()))
                    .map(this::toMyObject)
                    .collect(toList());
        } catch (IOException ex) {
            // Pass the exception as the last argument so the stack trace is logged.
            log.error("Error retrieving records from Elasticsearch", ex);
        }
        return new ArrayList<>();
    }
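
As a cross-check on what the aggregation should return, the same "group by stepId, keep the latest timestamp" logic can be sketched in plain Java over documents already held in memory. This is only an illustration and not part of the original answer: `LatestPerStep`, `Step`, and `sample()` are hypothetical names, with `Step` standing in for `MyObject` (whose fields are not shown) and `sample()` reproducing the five documents from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

public class LatestPerStep {

    // Hypothetical stand-in for MyObject; field names follow the _source shown in the question.
    public static final class Step {
        public final String objId;
        public final String stepId;
        public final String status;
        public final long timestamp;

        public Step(String objId, String stepId, String status, long timestamp) {
            this.objId = objId;
            this.stepId = stepId;
            this.status = status;
            this.timestamp = timestamp;
        }
    }

    // Filter by objId, group by stepId, and keep the entry with the largest timestamp per group.
    public static List<Step> latestPerStep(List<Step> hits, String objId) {
        Map<String, Step> latest = hits.stream()
                .filter(s -> s.objId.equals(objId))
                .collect(Collectors.toMap(
                        s -> s.stepId,
                        s -> s,
                        // On a stepId collision keep whichever entry is newer.
                        BinaryOperator.maxBy(Comparator.comparingLong((Step s) -> s.timestamp)),
                        // TreeMap gives a stable, stepId-sorted result order.
                        TreeMap::new));
        return new ArrayList<>(latest.values());
    }

    // The five documents from the question.
    public static List<Step> sample() {
        return Arrays.asList(
                new Step("1", "step_one", "RUNNING", 1515974400L),
                new Step("1", "step_one", "DONE", 1516406400L),
                new Step("1", "step_two", "RUNNING", 1516492800L),
                new Step("1", "step_two", "DONE", 1517356800L),
                new Step("2", "step_one", "DONE", 1517788800L));
    }

    public static void main(String[] args) {
        for (Step s : latestPerStep(sample(), "1")) {
            System.out.println(s.stepId + " " + s.status + " " + s.timestamp);
        }
    }
}
```

Run against the sample data with objId "1", this yields the step_one and step_two entries with status DONE, which is the result asked for in the question. For large indices the server-side aggregation is likely the better fit, since it avoids pulling every hit to the client.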
