How to optimize an Elasticsearch query?
I am trying to run an Elasticsearch query using the Java High Level REST Client. The main goal is to group the results. Here is the data:
"hits" : [
  {
    "_index" : "my_index",
    "_type" : "object",
    "_id" : "X4sSPmYB62YwufswHQbx",
    "_score" : 1.0,
    "_source" : {
      "objId" : "1",
      "stepId" : "step_one",
      "status" : "RUNNING",
      "timestamp" : 1515974400
    }
  },
  {
    "_index" : "my_index",
    "_type" : "object",
    "_id" : "15QRP2YB62YwufswAApl",
    "_score" : 1.0,
    "_source" : {
      "objId" : "1",
      "stepId" : "step_one",
      "status" : "DONE",
      "timestamp" : 1516406400
    }
  },
  {
    "_index" : "my_index",
    "_type" : "object",
    "_id" : "QpMOP2YB62YwufswrfYn",
    "_score" : 1.0,
    "_source" : {
      "objId" : "1",
      "stepId" : "step_two",
      "status" : "RUNNING",
      "timestamp" : 1516492800
    }
  },
  {
    "_index" : "my_index",
    "_type" : "object",
    "_id" : "VZMPP2YB62YwufswJ_r0",
    "_score" : 1.0,
    "_source" : {
      "objId" : "1",
      "stepId" : "step_two",
      "status" : "DONE",
      "timestamp" : 1517356800
    }
  },
  {
    "_index" : "my_index",
    "_type" : "object",
    "_id" : "XZMPP2YB62YwufswQfrc",
    "_score" : 1.0,
    "_source" : {
      "objId" : "2",
      "stepId" : "step_one",
      "status" : "DONE",
      "timestamp" : 1517788800
    }
  }
]
For example, for objId = 1 I would like to retrieve the following:
{
  "objId" : "1",
  "stepId" : "step_one",
  "status" : "DONE",
  "timestamp" : 1516406400
},
{
  "objId" : "1",
  "stepId" : "step_two",
  "status" : "DONE",
  "timestamp" : 1517356800
}
Currently I have this Java method:
private List<MyObject> search(String objId) {
    // Match every document whose objId equals the given value.
    MatchPhraseQueryBuilder queryBuilder = QueryBuilders.matchPhraseQuery("objId", objId);
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(queryBuilder);
    searchSourceBuilder.size(1000);
    SearchRequest searchRequest = new SearchRequest("my_index");
    searchRequest.types("object");
    searchRequest.source(searchSourceBuilder);
    try {
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest);
        return Arrays.stream(searchResponse.getHits().getHits())
                .map(this::toMyObject)
                .collect(toList());
    } catch (IOException ex) {
        // Pass the exception as the last argument so the stack trace is logged.
        log.error("Error retrieving records from elasticsearch.", ex);
    }
    return new ArrayList<>();
}
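But this method only returns the list of objects matching the objId.
My question is: is it possible to find objects by objId value, then group them by stepId, and finally keep only the result with the latest timestamp in each group?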
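To make the desired result concrete, here is a minimal in-memory sketch of the same logic (filter by objId, group by stepId, keep the entry with the highest timestamp per group). It does not talk to Elasticsearch at all; the class name LatestPerStepSketch, the MyObject fields, and the sampleData helper are assumptions made for illustration, based on the _source fields shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class LatestPerStepSketch {

    // Hypothetical value class mirroring the _source fields of the sample documents.
    static final class MyObject {
        final String objId;
        final String stepId;
        final String status;
        final long timestamp;

        MyObject(String objId, String stepId, String status, long timestamp) {
            this.objId = objId;
            this.stepId = stepId;
            this.status = status;
            this.timestamp = timestamp;
        }
    }

    // The five sample documents from the question.
    static List<MyObject> sampleData() {
        return Arrays.asList(
                new MyObject("1", "step_one", "RUNNING", 1515974400L),
                new MyObject("1", "step_one", "DONE", 1516406400L),
                new MyObject("1", "step_two", "RUNNING", 1516492800L),
                new MyObject("1", "step_two", "DONE", 1517356800L),
                new MyObject("2", "step_one", "DONE", 1517788800L));
    }

    // Filter by objId, group by stepId, keep the latest entry of each group.
    static List<MyObject> latestPerStep(List<MyObject> all, String objId) {
        Map<String, MyObject> latest = all.stream()
                .filter(o -> o.objId.equals(objId))
                .collect(Collectors.toMap(
                        o -> o.stepId,
                        Function.identity(),
                        // On a key collision, keep the object with the larger timestamp.
                        BinaryOperator.maxBy(Comparator.comparingLong((MyObject o) -> o.timestamp)),
                        // LinkedHashMap keeps the steps in first-seen order.
                        LinkedHashMap::new));
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        for (MyObject o : latestPerStep(sampleData(), "1")) {
            System.out.println(o.stepId + " " + o.status + " " + o.timestamp);
        }
    }
}
```

Doing this client-side means fetching every matching document first, which is why pushing the grouping into the query itself is the more attractive option.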
Here is the answer I found to the problem:
private List<MyObject> search(String objId) {
    try {
        SearchResponse searchResponse = esRestClient.search(new SearchRequest("my_index")
                .source(new SearchSourceBuilder()
                        .query(QueryBuilders.matchPhraseQuery("objId", objId))
                        // No plain hits are needed; the results come from the aggregation.
                        .size(0)
                        .aggregation(
                                // One bucket per distinct stepId.
                                AggregationBuilders.terms("by_stepId").field("stepId.keyword")
                                        // Within each bucket, keep only the most recent document.
                                        .subAggregation(AggregationBuilders.topHits("by_timestamp")
                                                .sort("timestamp", SortOrder.DESC)
                                                .size(1)
                                        )
                        )
                )
                .types("object")
        );
        Terms terms = searchResponse.getAggregations().get("by_stepId");
        return terms.getBuckets().stream()
                .map(MultiBucketsAggregation.Bucket::getAggregations)
                .flatMap(aggs -> aggs.asList().stream())
                .map(agg -> (ParsedTopHits) agg)
                .flatMap(topHits -> Arrays.stream(topHits.getHits().getHits()))
                .map(this::toMyObject)
                .collect(toList());
    } catch (IOException ex) {
        // Pass the exception as the last argument so the stack trace is logged.
        log.error("Error retrieving records from elasticsearch.", ex);
    }
    return new ArrayList<>();
}
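Both methods rely on a toMyObject helper that is not shown. As a rough sketch only, such a mapper might look like the following, assuming MyObject simply carries the four _source fields and that the mapper receives the document source as a Map (with the real client this would presumably be obtained via SearchHit.getSourceAsMap()). The class name ToMyObjectSketch and the MyObject layout are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class ToMyObjectSketch {

    // Hypothetical value class; field names follow the _source shown in the question.
    static final class MyObject {
        final String objId;
        final String stepId;
        final String status;
        final long timestamp;

        MyObject(String objId, String stepId, String status, long timestamp) {
            this.objId = objId;
            this.stepId = stepId;
            this.status = status;
            this.timestamp = timestamp;
        }
    }

    // Hypothetical mapper from a document's _source map to MyObject.
    static MyObject toMyObject(Map<String, Object> source) {
        // A JSON number may arrive as Integer or Long, so convert through Number.
        long timestamp = ((Number) source.get("timestamp")).longValue();
        return new MyObject(
                (String) source.get("objId"),
                (String) source.get("stepId"),
                (String) source.get("status"),
                timestamp);
    }

    public static void main(String[] args) {
        Map<String, Object> source = new HashMap<>();
        source.put("objId", "1");
        source.put("stepId", "step_one");
        source.put("status", "DONE");
        source.put("timestamp", 1516406400); // stored as Integer here on purpose
        MyObject o = toMyObject(source);
        System.out.println(o.stepId + " " + o.status + " " + o.timestamp);
    }
}
```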