ElasticSearch aggregate documents by field into array
I'm trying to write the following logic as an Elasticsearch query from Java.
ES contains the following documents:
{"request" : 1, "store":"ebay", "status" : "retrieved" , "lastdate": "2012/12/20 17:00", "retrieved_by" : "John"}
{"request" : 1, "store":"ebay", "status" : "stored" , "lastdate": "2012/12/20 18:00", "stored_by" : "Alex"}
{"request" : 1, "store":"ebay", "status" : "bought" , "lastdate": "2012/12/20 19:00", "bought_by" : "Arik"}
{"request" : 2, "store":"aliexpress", "status" : "retrieved" , "lastdate": "2012/12/20 17:00"}
{"request" : 2, "store":"aliexpress","status" : "stored" , "lastdate": "2012/12/20 18:00"}
{"request" : 2, "store":"aliexpress","status" : "bought" , "lastdate": "2012/12/20 19:00"}
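I'm trying to write a query that takes a store name as input and returns that store's documents, grouped into arrays by request id (the "request" field).
In other words, I'm trying to:
1. Filter by a term on a specific field ("store").
2. Aggregate the results into arrays based on a specific field ("request").
For example, for the input "ebay":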
{
"1" : [
{"request" : 1, "store":"ebay", "status" : "retrieved" , "lastdate": "2012/12/20 17:00", "retrieved_by" : "John"}
{"request" : 1, "store":"ebay", "status" : "stored" , "lastdate": "2012/12/20 18:00", "stored_by" : "Alex"}
{"request" : 1, "store":"ebay", "status" : "bought" , "lastdate": "2012/12/20 19:00", "bought_by" : "Arik"}
],
".." : [...]
}
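It doesn't matter whether the key in the result is the request (I'll accept any key). The important part is that all records with the same request value are aggregated into one array and sorted by lastdate within that array.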
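To make the target shape concrete, here is a minimal client-side sketch in plain Java (no Elasticsearch involved) of the same filter, group, and sort steps. The class name GroupByRequest and the doc helper are hypothetical, and the sample documents are trimmed to four fields; it only illustrates the expected result, not how Elasticsearch computes it.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupByRequest {
    // Same pattern as the "lastdate" values in the sample documents.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm");

    /** Hypothetical helper: builds one trimmed-down sample document. */
    public static Map<String, Object> doc(int request, String store, String status, String lastdate) {
        Map<String, Object> d = new LinkedHashMap<>();
        d.put("request", request);
        d.put("store", store);
        d.put("status", status);
        d.put("lastdate", lastdate);
        return d;
    }

    /** Keeps one store's documents, groups them by "request", sorted by "lastdate" (ascending). */
    public static Map<Integer, List<Map<String, Object>>> groupByRequest(
            List<Map<String, Object>> docs, String store) {
        Comparator<Map<String, Object>> byLastDate = Comparator.comparing(
                d -> LocalDateTime.parse((String) d.get("lastdate"), FORMAT));
        return docs.stream()
                .filter(d -> store.equals(d.get("store")))   // step 1: filter by store
                .sorted(byLastDate)                           // order is preserved inside each group
                .collect(Collectors.groupingBy(               // step 2: one list per request
                        d -> (Integer) d.get("request"),
                        TreeMap::new,
                        Collectors.toList()));
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = List.of(
                doc(1, "ebay", "bought", "2012/12/20 19:00"),
                doc(1, "ebay", "retrieved", "2012/12/20 17:00"),
                doc(2, "aliexpress", "stored", "2012/12/20 18:00"),
                doc(1, "ebay", "stored", "2012/12/20 18:00"));
        System.out.println(groupByRequest(docs, "ebay"));
    }
}
```

Doing this on the client means pulling every matching document over the network, which is why pushing the grouping into an aggregation is usually preferable.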
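My end goal is to build this query with the Java QueryBuilder, so I'm first trying to write it in Elasticsearch's native query DSL to understand what to use in the QueryBuilder.
Set up a basic mapping: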
PUT stores
{
"mappings": {
"properties": {
"lastdate": {
"type": "date",
"format": "yyyy/MM/dd HH:mm"
}
}
}
}
Index a few documents:
POST _bulk
{"index":{"_index":"stores","_type":"_doc"}}
{"request":1,"store":"ebay","status":"retrieved","lastdate":"2012/12/20 17:00","retrieved_by":"John"}
{"index":{"_index":"stores","_type":"_doc"}}
{"request":1,"store":"ebay","status":"stored","lastdate":"2012/12/20 18:00","stored_by":"Alex"}
{"index":{"_index":"stores","_type":"_doc"}}
{"request":1,"store":"ebay","status":"bought","lastdate":"2012/12/20 19:00","bought_by":"Arik"}
{"index":{"_index":"stores","_type":"_doc"}}
{"request":2,"store":"aliexpress","status":"retrieved","lastdate":"2012/12/20 17:00"}
{"index":{"_index":"stores","_type":"_doc"}}
{"request":2,"store":"aliexpress","status":"stored","lastdate":"2012/12/20 18:00"}
{"index":{"_index":"stores","_type":"_doc"}}
{"request":2,"store":"aliexpress","status":"bought","lastdate":"2012/12/20 19:00"}
Filter in the query, then aggregate by the request field and use a sorted top_hits:
GET stores/_search
{
"size": 0,
"query": {
"term": {
"store": {
"value": "ebay"
}
}
},
"aggs": {
"by_req": {
"terms": {
"field": "request"
},
"aggs": {
"hits": {
"top_hits": {
"sort": [
{
"lastdate": {
"order": "desc"
}
}
]
}
}
}
}
}
}
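Translating this to the Java DSL shouldn't be too hard.
@joe posted the correct answer in the ES DSL (thanks again).
My goal was to run the query from Java. In case anyone else needs the Java DSL code, I'm adding it here: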
// Filter on the store, then bucket by request and keep each bucket's hits sorted by lastdate.
QueryBuilder storeQuery = QueryBuilders.boolQuery().filter(QueryBuilders.termsQuery("store", "ebay"));
AggregationBuilder subAgg = AggregationBuilders.topHits("hits").sort("lastdate", SortOrder.ASC);
AggregationBuilder mainAgg = AggregationBuilders.terms("by_req").field("request").subAggregation(subAgg);
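Once the search returns, the documents sit a few levels deep in the response (aggregations, then by_req, then buckets, then each bucket's hits.hits.hits, then _source). The following is a rough sketch of flattening that into one array per request. It assumes the response JSON has already been parsed into nested Maps and Lists by whatever JSON library is in use; the class name TopHitsFlattener is hypothetical, and the tiny response in main is hand-built for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TopHitsFlattener {
    /**
     * Walks aggregations -> by_req -> buckets[] -> hits -> hits -> hits[] -> _source
     * and returns one list of source documents per bucket key.
     */
    @SuppressWarnings("unchecked")
    public static Map<String, List<Map<String, Object>>> flatten(Map<String, Object> response) {
        Map<String, Object> aggs = (Map<String, Object>) response.get("aggregations");
        Map<String, Object> byReq = (Map<String, Object>) aggs.get("by_req");
        List<Map<String, Object>> buckets = (List<Map<String, Object>>) byReq.get("buckets");

        Map<String, List<Map<String, Object>>> result = new LinkedHashMap<>();
        for (Map<String, Object> bucket : buckets) {
            // "hits" is the name given to the top_hits sub-aggregation in the query.
            Map<String, Object> topHits = (Map<String, Object>) bucket.get("hits");
            Map<String, Object> envelope = (Map<String, Object>) topHits.get("hits");
            List<Map<String, Object>> hits = (List<Map<String, Object>>) envelope.get("hits");

            List<Map<String, Object>> sources = new ArrayList<>();
            for (Map<String, Object> hit : hits) {
                sources.add((Map<String, Object>) hit.get("_source"));
            }
            // The bucket key is the value of the "request" field.
            result.put(String.valueOf(bucket.get("key")), sources);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for a parsed response, not real server output.
        Map<String, Object> source = Map.of("request", 1, "status", "retrieved");
        Map<String, Object> hit = Map.of("_source", source);
        Map<String, Object> bucket = Map.of(
                "key", 1,
                "hits", Map.of("hits", Map.of("hits", List.of(hit))));
        Map<String, Object> response = Map.of(
                "aggregations", Map.of("by_req", Map.of("buckets", List.of(bucket))));
        System.out.println(flatten(response));
    }
}
```

One thing to keep in mind: top_hits returns only three hits per bucket by default and the terms aggregation only ten buckets, so with more data than in the sample both need an explicit size to get complete arrays.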