MongoDB Running sum over date range per ID
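I have the following JSON file (small sample) containing purchase records for 4 different customers on different dates. I need to use MongoDB/NoSQL to determine which customers made at least 8 new purchases in total over 3 consecutive days.
In this case, customer ABC made 32 purchases in total from 2020-05-01 to 2020-05-03 (3 consecutive days). Customer GHI also made 20 purchases from 2020-07-28 to 2020-07-30 (3 consecutive days). So my output should contain only customers ABC and GHI. What would the code be to get this output? Thanks a lot!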
{"cust_id":"ABC", "date":"2020-05-01", "new_purchase":2},
{"cust_id":"ABC", "date":"2020-05-02", "new_purchase":16},
{"cust_id":"ABC", "date":"2020-05-03", "new_purchase":14},
{"cust_id":"ABC", "date":"2020-05-04", "new_purchase":0},
{"cust_id":"ABC", "date":"2020-05-05", "new_purchase":5},
{"cust_id":"DEF", "date":"2020-05-11", "new_purchase":3},
{"cust_id":"DEF", "date":"2020-05-12", "new_purchase":0},
{"cust_id":"DEF", "date":"2020-05-13", "new_purchase":0},
{"cust_id":"DEF", "date":"2020-05-14", "new_purchase":0},
{"cust_id":"DEF", "date":"2020-05-15", "new_purchase":1},
{"cust_id":"GHI", "date":"2020-07-28", "new_purchase":0},
{"cust_id":"GHI", "date":"2020-07-29", "new_purchase":3},
{"cust_id":"GHI", "date":"2020-07-30", "new_purchase":17},
{"cust_id":"GHI", "date":"2020-07-31", "new_purchase":0},
{"cust_id":"GHI", "date":"2020-08-01", "new_purchase":1},
{"cust_id":"JKL", "date":"2020-06-04", "new_purchase":7},
{"cust_id":"JKL", "date":"2020-06-05", "new_purchase":0},
{"cust_id":"JKL", "date":"2020-06-06", "new_purchase":0},
{"cust_id":"JKL", "date":"2020-06-07", "new_purchase":0},
{"cust_id":"JKL", "date":"2020-06-08", "new_purchase":0},
{"cust_id":"JKL", "date":"2020-06-08", "new_purchase":2}
I'm assuming the days are not repeated for a customer, which seems to be the case provided the last date in your sample is a typo.
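The idea is:
1. Group the documents by customer, year and month, collecting each day and its purchase count into an array.
2. For each day, check whether day-1 and day+1 also exist. If they do, add the three days as an array to sets, like this: sets: [[1,2,3], [2,3,4], [3,4,5]]
3. Once sets is $unwind-ed, find the sum of purchases for the days in each set.
4. Keep only the sets with total >= 8 (at least 8).
5. A final {$group:{_id:"$id"}}-style stage can then reduce the result to one row per customer.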
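To make the steps above concrete, here is a minimal plain-JavaScript sketch that mirrors them in memory on the sample data from the question. It does not talk to MongoDB; the names `rows` and `qualifyingCustomers` are made up for this illustration and are not part of the answer.

```javascript
// Sample data from the question, written as [cust_id, date, new_purchase].
const rows = [
  ["ABC", "2020-05-01", 2], ["ABC", "2020-05-02", 16], ["ABC", "2020-05-03", 14],
  ["ABC", "2020-05-04", 0], ["ABC", "2020-05-05", 5],
  ["DEF", "2020-05-11", 3], ["DEF", "2020-05-12", 0], ["DEF", "2020-05-13", 0],
  ["DEF", "2020-05-14", 0], ["DEF", "2020-05-15", 1],
  ["GHI", "2020-07-28", 0], ["GHI", "2020-07-29", 3], ["GHI", "2020-07-30", 17],
  ["GHI", "2020-07-31", 0], ["GHI", "2020-08-01", 1],
  ["JKL", "2020-06-04", 7], ["JKL", "2020-06-05", 0], ["JKL", "2020-06-06", 0],
  ["JKL", "2020-06-07", 0], ["JKL", "2020-06-08", 0], ["JKL", "2020-06-08", 2],
].map(([cust_id, date, new_purchase]) => ({ cust_id, date, new_purchase }));

// Hypothetical helper: returns the sorted list of customers with a qualifying set.
function qualifyingCustomers(docs, threshold = 8) {
  // Group by customer + year + month, collecting { day, buys } entries.
  const groups = new Map();
  for (const { cust_id, date, new_purchase } of docs) {
    const [year, month, day] = date.split("-");
    const key = `${cust_id}|${year}|${month}`;
    if (!groups.has(key)) groups.set(key, { cust_id, daysAndBuys: [] });
    groups.get(key).daysAndBuys.push({ day: Number(day), buys: new_purchase });
  }

  const winners = new Set();
  for (const { cust_id, daysAndBuys } of groups.values()) {
    const days = new Set(daysAndBuys.map((d) => d.day));
    for (const { day } of daysAndBuys) {
      // A day forms a set only if both day-1 and day+1 exist in the group.
      if (!days.has(day - 1) || !days.has(day + 1)) continue;
      // Sum the buys of every entry whose day is in [day-1, day, day+1].
      const total = daysAndBuys
        .filter((d) => Math.abs(d.day - day) <= 1)
        .reduce((sum, d) => sum + d.buys, 0);
      // Apply the threshold and keep each customer once.
      if (total >= threshold) winners.add(cust_id);
    }
  }
  return [...winners].sort();
}

console.log(qualifyingCustomers(rows)); // [ 'ABC', 'GHI' ]
```

JKL does not qualify here: its best three-day set sums to 7, and the duplicated 2020-06-08 entries only contribute to the 06, 07, 08 set, which sums to 2.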
The full pipeline:
[
  {
    "$group": {
      "_id": {
        "cust_id": "$cust_id",
        "year": { "$arrayElemAt": [ { "$split": [ "$date", "-" ] }, 0 ] },
        "month": { "$arrayElemAt": [ { "$split": [ "$date", "-" ] }, 1 ] }
      },
      "daysAndBuys": {
        "$push": {
          "day": {
            "$convert": {
              "input": { "$arrayElemAt": [ { "$split": [ "$date", "-" ] }, 2 ] },
              "to": "int"
            }
          },
          "buys": "$new_purchase"
        }
      }
    }
  },
  {
    "$addFields": {
      "sets": {
        "$map": {
          "input": "$daysAndBuys",
          "as": "d",
          "in": {
            "$cond": [
              {
                "$and": [
                  { "$in": [ { "$add": [ "$$d.day", 1 ] }, "$daysAndBuys.day" ] },
                  { "$in": [ { "$subtract": [ "$$d.day", 1 ] }, "$daysAndBuys.day" ] }
                ]
              },
              [
                "$$d.day",
                { "$add": [ "$$d.day", 1 ] },
                { "$subtract": [ "$$d.day", 1 ] }
              ],
              ""
            ]
          }
        }
      }
    }
  },
  {
    "$addFields": {
      "sets": {
        "$filter": {
          "input": "$sets",
          "as": "s",
          "cond": { "$ne": [ "$$s", "" ] }
        }
      }
    }
  },
  { "$unwind": "$sets" },
  {
    "$addFields": {
      "daysAndBuys": {
        "$filter": {
          "input": "$daysAndBuys",
          "as": "data",
          "cond": { "$in": [ "$$data.day", "$sets" ] }
        }
      }
    }
  },
  {
    "$project": {
      "daysAndBuys": 1,
      "total": { "$sum": "$daysAndBuys.buys" }
    }
  },
  {
    "$match": {
      "total": { "$gte": 8 }
    }
  }
]
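One thing worth noting about the pipeline above: because it groups by year and month and compares day-of-month integers, a three-day run that crosses a month boundary (for example 07-31 to 08-02) lands in two different groups and would not be detected. The following is a minimal plain-JavaScript sketch of the same check done on real calendar dates, for the documents of a single customer. It is an illustration under that assumption, not a drop-in replacement for the pipeline; `hasThreeDayRun`, `DAY_MS` and the `crossing` sample are hypothetical names and data.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns true if some three consecutive calendar days, all present in docs,
// have a combined new_purchase of at least `threshold`.
// `docs` is assumed to hold the records of one customer only.
function hasThreeDayRun(docs, threshold = 8) {
  // Sum purchases per calendar day, keyed by whole days since the Unix epoch (UTC).
  const perDay = new Map();
  for (const { date, new_purchase } of docs) {
    const dayNo = Math.round(Date.parse(date + "T00:00:00Z") / DAY_MS);
    perDay.set(dayNo, (perDay.get(dayNo) || 0) + new_purchase);
  }
  // Treat each day as the middle of a candidate three-day run.
  for (const dayNo of perDay.keys()) {
    if (perDay.has(dayNo - 1) && perDay.has(dayNo + 1)) {
      const total =
        perDay.get(dayNo - 1) + perDay.get(dayNo) + perDay.get(dayNo + 1);
      if (total >= threshold) return true;
    }
  }
  return false;
}

// Hypothetical customer whose run crosses the July/August boundary.
const crossing = [
  { date: "2020-07-31", new_purchase: 3 },
  { date: "2020-08-01", new_purchase: 4 },
  { date: "2020-08-02", new_purchase: 2 },
];
console.log(hasThreeDayRun(crossing)); // true (3 + 4 + 2 = 9)
```

Keying days by an absolute day number rather than the day of the month is what lets the neighbour check work across month and year boundaries.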