
Filtering, sorting and paginating by sub-aggregations in ElasticSearch 6

I have a collection of documents, where each document lists the rooms available at a given hotel on a given day, together with their rates for that day:

{
    "hotel_id": 2016021519381313,
    "day": "20200530",
    "rooms": [
        {
            "room_id": "00d70230ca0142a6874358919336e53f",
            "rate": 87
        },
        {
            "room_id": "675a5ec187274a45ae7a5fdc20f72201",
            "rate": 53
        }
    ]
}

The mapping is:

{
    "properties": {
        "day": {
            "type": "keyword"
        },
        "hotel_id": {
            "type": "long"
        },
        "rooms": {
            "type": "nested",
            "properties": {
                "rate": {
                    "type": "long"
                },
                "room_id": {
                    "type": "keyword"
                }
            }
        }
    }
}

I am trying to figure out how to write a query that returns the rooms available for a set of days whose total cost is less than a given amount, ordered by total cost in ascending order and paginated.

So far I have found a way to get the rooms available for the set of days along with their total cost: filter by the days and group by hotel and room ID, requiring that the minimum document count in the room aggregation equals the number of days I am looking for.

{
    "size" : 0,
    "query": {
        "bool": { 
            "must": [
                {
                    "terms" : {
                        "day" : ["20200423", "20200424", "20200425"]
                    }
                }
            ]
        }
    },
    "aggs" : {
        "hotel" : {
            "terms" : { 
                "field" : "hotel_id"
            },
            "aggs" : {
                "rooms" : {
                    "nested" : {
                        "path" : "rooms"
                    },
                    "aggs" : {
                        "rooms" : {
                            "terms" : {
                                "field" : "rooms.room_id",
                                "min_doc_count" : 3
                            },
                            "aggs" : {
                                "sum_price" : {
                                    "sum" : { "field" : "rooms.rate" }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

Now I am interested in ordering the result buckets in descending order at the "hotel" level, based on the value of the sub-aggregation under "rooms", and also in filtering out the buckets that do not contain enough documents or whose "sum_price" is bigger than a given budget. But I cannot work out how to do it.

I have looked at "bucket_sort", but I cannot find a way to sort based on a sub-aggregation. I have also looked at "bucket_selector", but it gives me empty buckets when they do not fit the predicate. I am probably not using them correctly in my case.
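For illustration, the empty buckets described here are most likely "hotel" buckets whose room buckets were all removed by the inner "bucket_selector". A pattern that may help is a second "bucket_selector" at the hotel level that reads the special "_bucket_count" path of the inner terms aggregation. The sketch below is untested against this index. The aggregation names "within_budget" and "has_rooms" and the budget of 100 are assumptions made up for the example.

```json
{
    "size": 0,
    "query": {
        "terms": { "day": ["20200423", "20200424", "20200425"] }
    },
    "aggs": {
        "hotel": {
            "terms": { "field": "hotel_id" },
            "aggs": {
                "rooms": {
                    "nested": { "path": "rooms" },
                    "aggs": {
                        "rooms": {
                            "terms": {
                                "field": "rooms.room_id",
                                "min_doc_count": 3
                            },
                            "aggs": {
                                "sum_price": {
                                    "sum": { "field": "rooms.rate" }
                                },
                                "within_budget": {
                                    "bucket_selector": {
                                        "buckets_path": { "total": "sum_price" },
                                        "script": "params.total < 100"
                                    }
                                }
                            }
                        }
                    }
                },
                "has_rooms": {
                    "bucket_selector": {
                        "buckets_path": {
                            "room_count": "rooms>rooms._bucket_count"
                        },
                        "script": "params.room_count > 0"
                    }
                }
            }
        }
    }
}
```

In the path "rooms>rooms._bucket_count", the first "rooms" is the nested aggregation and the second is the terms aggregation on "rooms.room_id". The hotel-level selector should then drop every hotel that has no room left after the budget filter.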

What would be the right way to accomplish this?

Here is the query without pagination:

{
   "size":0,
   "query":{
      "bool":{
         "must":[
            {
               "terms":{
                  "day":[
                     "20200530",
                     "20200531",
                     "20200601"
                  ]
               }
            }
         ]
      }
   },
   "aggs":{
      "rooms":{
         "nested":{
            "path":"rooms"
         },
         "aggs":{
            "rooms":{
               "terms":{
                  "field":"rooms.room_id",
                  "min_doc_count":3,
                  "order":{
                     "sum_price":"asc"
                  }
               },
               "aggs":{
                  "sum_price":{
                     "sum":{
                        "field":"rooms.rate"
                     }
                  },
                  "max_price":{
                     "bucket_selector":{
                        "buckets_path":{
                           "var1":"sum_price"
                        },
                        "script":"params.var1 < 100"
                     }
                  }
               }
            }
         }
      }
   }
}

Please note that the following values should be changed to get the desired results:

  • day
  • min_doc_count
  • script in max_price
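The query above leaves pagination out. One possible way to add it is a "bucket_sort" pipeline aggregation next to the "bucket_selector", assuming Elasticsearch 6.1 or later, where "bucket_sort" is available. This is a sketch rather than a tested solution. The aggregation name "page", the "from" and "size" values and the terms "size" of 1000 are assumptions chosen for the example. It also assumes that the selector is applied before the sort, which is worth verifying on your version.

```json
{
   "size": 0,
   "query": {
      "bool": {
         "must": [
            {
               "terms": {
                  "day": ["20200530", "20200531", "20200601"]
               }
            }
         ]
      }
   },
   "aggs": {
      "rooms": {
         "nested": {
            "path": "rooms"
         },
         "aggs": {
            "rooms": {
               "terms": {
                  "field": "rooms.room_id",
                  "min_doc_count": 3,
                  "size": 1000,
                  "order": {
                     "sum_price": "asc"
                  }
               },
               "aggs": {
                  "sum_price": {
                     "sum": {
                        "field": "rooms.rate"
                     }
                  },
                  "max_price": {
                     "bucket_selector": {
                        "buckets_path": {
                           "var1": "sum_price"
                        },
                        "script": "params.var1 < 100"
                     }
                  },
                  "page": {
                     "bucket_sort": {
                        "sort": [
                           { "sum_price": { "order": "asc" } }
                        ],
                        "from": 0,
                        "size": 10
                     }
                  }
               }
            }
         }
      }
   }
}
```

Here "from" is the offset of the first bucket to return and "size" is the page length, so the second page of ten rooms would use "from": 10. Since "bucket_sort" only sees the buckets that the terms aggregation already returned, the terms "size" has to be at least "from" plus "size" of the deepest page requested, which makes deep pages increasingly expensive.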
