
MongoDB aggregation of multiple values

I am trying analytics with MongoDB, but I am very new to it. Although I got it working for one query, I don't think it is an efficient one. Here is an example of my dataset:

{
_id: ObjectId("54442882fa2e117a55f3458b"),
analytic_num: 185,
createdAt: ISODate("2014-10-19T21:09:22.167Z"),
updatedAt: ISODate("2014-10-19T21:09:22.167Z"),
rawBrowser: "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:33.0) Gecko/20100101 Firefox/33.0",
gender: "male",
eventId: "accepted",
type: "member",
relationshipStatus: "Single",
ageRange: "18-25",
uid: "53f79ae6f158168161f04d27",
cid: "54370fa7498a776e1c065120",
education: ["high_school", "professional_degree"],
interestedIn: ["female"],
__v: 0
}

Here is the query I am trying:

db.analytics.aggregate([{
$match: {
    createdAt: {
        $gte: new Date(2014, 9, 15),
        $lt: new Date(2014, 9, 28)
    }
}
}, {
$project: {
    _id: 0,
    minute: {
        $minute: "$createdAt"
    },
    hour: {
        $hour: "$createdAt"
    }
}
}, {
$group: {
    _id: {
        minute: "$minute",
        hour: "$hour"
    },
    hits: {
        $sum: 1
    }
}
}]);

Here is the result I am getting:

{ "_id" : { "minute" : 33, "hour" : 21 }, "hits" : 1 }
{ "_id" : { "minute" : 29, "hour" : 21 }, "hits" : 6 }
{ "_id" : { "minute" : 6, "hour" : 22 }, "hits" : 2 }
{ "_id" : { "minute" : 9, "hour" : 21 }, "hits" : 1 }

Everything is fine, but I only get the hits for every minute of every hour. That is fine if I only want the hits.

But if I need to find out the hits by type, gender, or ageRange, I need to change the $match query. It is not efficient to run this query once per attribute by changing $match each time.

How can I get all the hits for type, gender, and ageRange in one query? I want a result like this:

{ "_id" : { "minute" : 33, "hour" : 21 }, "hits" : 30, "member" : 2, "single" : 12, "male" : 12 }
{ "_id" : { "minute" : 34, "hour" : 21 }, "hits" : 50, "member" : 22, "single" : 12, "male" : 12 }

Please help.

Thanks.

You are looking for the $cond operator. It evaluates a condition and returns one of two values depending on whether the condition is true or false. In this case, it decides whether or not to add an increment to a $sum operation:

db.analytics.aggregate([
    { "$match": {
        "createdAt": {
            "$gte": new Date(2014, 9, 15),
            "$lt": new Date(2014, 9, 28)
        }
    }}, 
    { "$group": {
        "_id": {
            "minute": { "$minute": "$createdAt" },
            "hour": { "$hour": "$createdAt" }
        },
        "hits": { "$sum": 1 },
        "member": {
            "$sum": {
                "$cond": [
                    { "$eq": [ "$type", "member" ] },
                    1,
                    0
                ]
            }
        },
        "single": {
            "$sum": {
                "$cond": [
                    // $eq is case-sensitive; the sample document stores "Single"
                    { "$eq": [ "$relationshipStatus", "Single" ] },
                    1,
                    0
                ]
            }
        },
        "male": {
            "$sum": {
                "$cond": [
                    { "$eq": [ "$gender", "male" ] },
                    1,
                    0
                ]
            }
        }
    }}
]);

So the $cond operator is basically a "ternary", or if..then..else construct: it evaluates a condition and returns one value when the condition is true and another when it is false. Here it determines the value that is passed to $sum.
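Since every counter has the same shape, the $group stage can be generated instead of typed out by hand. The following is a minimal sketch in plain JavaScript. The helper name buildGroupStage and the shape of its counters argument are assumptions made for illustration and are not part of MongoDB or of the query above. It only builds the stage object, which would then go into the array passed to db.analytics.aggregate().

```javascript
// Hypothetical helper (not a MongoDB API): builds a $group stage with one
// conditional counter per entry in `counters`.
// `counters` maps an output name to a [field, value] pair.
function buildGroupStage(counters) {
    const group = {
        _id: {
            minute: { $minute: "$createdAt" },
            hour: { $hour: "$createdAt" }
        },
        hits: { $sum: 1 }
    };
    for (const [name, [field, value]] of Object.entries(counters)) {
        // Adds 1 when the document's field equals the value, otherwise 0.
        group[name] = {
            $sum: { $cond: [ { $eq: [ "$" + field, value ] }, 1, 0 ] }
        };
    }
    return { $group: group };
}

// The values are assumed to match the stored documents exactly,
// because $eq compares strings case-sensitively.
const stage = buildGroupStage({
    member: ["type", "member"],
    single: ["relationshipStatus", "Single"],
    male: ["gender", "male"],
    age18to25: ["ageRange", "18-25"]
});

console.log(JSON.stringify(stage, null, 2));
```

Adding another breakdown, such as a second ageRange bucket, then only needs one more entry in the map rather than another hand-written $cond block.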

Be careful with date aggregation operators. Grouping on $minute and $hour alone gives you every minute of every hour, aggregated across all days within that range, which may be what you want here. Usually, though, people want discrete time periods within each day, even when grouping by the minute.
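To make the difference concrete, here is a small sketch in plain JavaScript (no database needed) that counts a few made-up timestamps two ways: by hour and minute only, as the pipeline above does, and by calendar day plus hour and minute. The countBy helper and the sample timestamps are assumptions for illustration only. In an actual pipeline, the second behaviour would correspond to adding date parts such as $year and $dayOfYear to the _id of the $group stage.

```javascript
// Made-up sample timestamps: two on 19 Oct and one on 20 Oct, all at 21:09 UTC.
const stamps = [
    new Date("2014-10-19T21:09:22.167Z"),
    new Date("2014-10-19T21:09:48.000Z"),
    new Date("2014-10-20T21:09:05.000Z")
];

// Hypothetical helper: counts items per key produced by keyFn.
function countBy(items, keyFn) {
    const counts = {};
    for (const item of items) {
        const key = keyFn(item);
        counts[key] = (counts[key] || 0) + 1;
    }
    return counts;
}

// Hour and minute only: the same minute on different days shares one bucket.
const acrossDays = countBy(stamps,
    d => d.getUTCHours() + ":" + d.getUTCMinutes());

// Day plus hour and minute: each day keeps its own bucket.
const perDay = countBy(stamps,
    d => d.toISOString().slice(0, 10) + " " + d.getUTCHours() + ":" + d.getUTCMinutes());

console.log(acrossDays); // one bucket "21:9" holding all 3 hits
console.log(perDay);     // "2014-10-19 21:9" holds 2 hits, "2014-10-20 21:9" holds 1
```

The UTC getters are used here because the date operators in the pipeline work on the stored UTC value.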

Be careful what you ask for; you just might get it.


 