
Count and Aggregate in MongoDB

I have a MongoDB collection whose documents are structured as follows:

{
"_id" : "mongo",
"log" : [
    {
        "ts" : ISODate("2011-02-10T01:20:49Z"),
        "visitorId" : "25850661"
    },
    {
        "ts" : ISODate("2014-11-01T14:35:05Z"),
        "visitorId" : NumberLong(278571823)
    },
    {
        "ts" : ISODate("2014-11-01T14:37:56Z"),
        "visitorId" : NumberLong(0)
    },
    {
        "ts" : ISODate("2014-11-04T06:23:48Z"),
        "visitorId" : NumberLong(225200092)
    },
    {
        "ts" : ISODate("2014-11-04T06:25:44Z"),
        "visitorId" : NumberLong(225200092)
    }
],
"uts" : ISODate("2014-11-04T06:25:43.740Z")
}

"mongo" is a search term and "ts" indicates the timestamp when it was searched on website.

"uts" indicates the last time it was searched.

So the search term "mongo" was searched 5 times on our website.

I need to get the top 50 most-searched terms from the past 3 months.

I am no expert in MongoDB aggregation, but I was trying something like this to at least get the data for the past 3 months:

db.collection.aggregate({$group:{_id:"$_id",count:{$sum:1}}},{$match:{"log.ts":{"$gte":new Date("2014-09-01")}}})

It gave me this error:

exception: sharded pipeline failed on shard DSink9: { errmsg: "exception: aggregation result exceeds maximum document size (16MB)", code: 16389 }

Can anyone please help me?

UPDATE

I was able to write a query, but it gives me a syntax error:

db.collection.aggregate(
{$unwind:"$log"},
{$project:{log:"$log.ts"}},
{$match:{log:{"$gte" : new Date("2014-09-01"),"$lt" : new Date("2014-11-04")}}},
{$project:{_id:{val:{"$_id"}}}},
{$group:{_id:"$_id",sum:{$sum:1}}})

You are exceeding the maximum document size in a result, but generally that is an indication that you are "doing it wrong", particularly given your example of counting searches for the term "mongo" between two dates:

db.collection.aggregate([
   // Always match first, it reduces the workload and can use an index here only.
   { "$match": { 
       "_id": "mongo" 
       "log.ts": {
           "$gte": new Date("2014-09-01"), "$lt": new Date("2014-11-04")
       }
   }},

   // Unwind the array to de-normalize as documents
   { "$unwind": "$log" },

   // Get the count within the range, so match first to "filter"
   { "$match": { 
       "log.ts": {
           "$gte": new Date("2014-09-01"), "$lt": new Date("2014-11-04")
       }
   }},

   // Group the count on `_id`
   { "$group": {
       "_id": "$_id",
       "count": { "$sum": 1 }
   }}
]);
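The pipeline above counts searches for a single term. To get the top 50 terms overall, you would drop the `"_id": "mongo"` condition from the first `$match` and append `{ "$sort": { "count": -1 } }` and `{ "$limit": 50 }` stages. As a sketch of what that pipeline computes, here is a plain-JavaScript simulation over hypothetical sample data shaped like the documents in the question (the function name and sample values are mine, not from the original post):

```javascript
// Plain-JS simulation of: $unwind "$log", $match on the date range,
// $group/$sum by _id, then $sort by count descending and $limit.
function topSearchedTerms(docs, from, to, limit) {
  const counts = {};
  for (const doc of docs) {
    for (const entry of doc.log) {                      // $unwind: "$log"
      if (entry.ts >= from && entry.ts < to) {          // $match on "log.ts"
        counts[doc._id] = (counts[doc._id] || 0) + 1;   // $group + $sum: 1
      }
    }
  }
  return Object.entries(counts)
    .map(([term, count]) => ({ _id: term, count }))
    .sort((a, b) => b.count - a.count)                  // $sort: { count: -1 }
    .slice(0, limit);                                   // $limit
}

// Hypothetical sample data shaped like the question's documents.
const docs = [
  { _id: "mongo", log: [
    { ts: new Date("2014-10-01T00:00:00Z"), visitorId: 278571823 },
    { ts: new Date("2014-10-20T00:00:00Z"), visitorId: 225200092 },
    { ts: new Date("2011-02-10T01:20:49Z"), visitorId: "25850661" } // outside range
  ]},
  { _id: "sql", log: [
    { ts: new Date("2014-10-15T00:00:00Z"), visitorId: 1 }
  ]}
];

const top = topSearchedTerms(docs, new Date("2014-09-01"), new Date("2014-11-04"), 50);
// top → [ { _id: "mongo", count: 2 }, { _id: "sql", count: 1 } ]
```

The real aggregation performs the same counting server-side, which is why matching on the date range *before* `$unwind` and `$group` matters: it lets the server discard out-of-range documents early instead of accumulating everything.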

Your aggregation result exceeds MongoDB's maximum document size. You can use the `allowDiskUse` option to prevent this, passed as the second argument: `db.collection.aggregate(pipeline, { allowDiskUse: true })`. As of MongoDB shell version 2.6 this will not throw an exception; see the documentation for aggregate. You can also optimize your query to decrease the size of the pipeline result; for that, look at this question: aggregation result.
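Both attempts above hard-code the date strings. The "past 3 months" cutoff can instead be computed in JavaScript before building the pipeline; a minimal sketch (the helper name `monthsAgo` is mine, not from the original post):

```javascript
// Compute a Date N months before `now`, for use as the $gte bound.
// Date.setMonth handles year rollover (January minus 3 months lands in
// October of the previous year); note that day-of-month overflow (e.g.
// March 31 minus 1 month) rolls forward into the next month.
function monthsAgo(months, now = new Date()) {
  const d = new Date(now.getTime());
  d.setMonth(d.getMonth() - months);
  return d;
}

// e.g. use monthsAgo(3) as the value of "$gte" in the $match stage.
```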
