I have a collection in MongoDB, "text_failed", which holds every number to which I failed to send an SMS, the time of the failure, and some other information.
A document in this collection looks like this:
{
_id(ObjectId): xxxxxx2af8....
failTime(String): 2015-05-15 01:15:48
telNum(String): 95634xxxxx
//some other information
}
I need to fetch the top 500 numbers that failed most often over a one-month period. A number can occur any number of times during that month (e.g. one number failed 143 times, another 46).
The problem is that the number of failures during this period exceeded 7M. It's difficult to process this much data with the following code, which doesn't use aggregation:
DBCollection collection = mongoDB.getCollection("text_failed");
BasicDBObject query = new BasicDBObject();
query.put("failTime", new BasicDBObject("$gt", "2015-05-15 00:00:00").append("$lt", "2015-06-15 00:00:00"));
BasicDBObject field = new BasicDBObject();
field.put("telNum", 1);
DBCursor cursor = collection.find(query, field);
HashMap<String, Integer> hm = new HashMap<String, Integer>();
System.out.println(cursor);
while (cursor.hasNext()) {
    DBObject object = cursor.next();
    // Read the number once instead of calling get()/toString() repeatedly.
    String telNum = object.get("telNum").toString();
    // Increment the failure count for this number, starting at 1.
    if (hm.containsKey(telNum)) {
        hm.put(telNum, hm.get(telNum) + 1);
    } else {
        hm.put(telNum, 1);
    }
}
This fetches 7M+ documents for me. I need only the top 500 numbers. The result should look something like this:
{
telNum: xxxxx54654 //the number which failed
count: 129 //number of times it failed
}
I tried aggregation myself but didn't get the desired results. Can this be accomplished with aggregation, or is there a more efficient way to do it?
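You could try the following aggregation pipeline. It filters by failTime, counts the failures per telNum, sorts by that count in descending order and keeps the first 500. Adjust the failTime bounds to the month you need. In each result document, _id holds the number and count holds the number of times it failed: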
db.getCollection("text_failed").aggregate([
    {
        "$match": {
            "failTime": { "$gt": "2015-05-01 00:00:00", "$lt": "2015-06-01 00:00:00" }
        }
    },
    {
        "$group": {
            "_id": "$telNum",
            "count": { "$sum": 1 }
        }
    },
    {
        "$sort": { "count": -1 }
    },
    {
        "$limit": 500
    }
])
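To make the four stages concrete, here is a minimal plain-Java sketch of what the pipeline computes, run against a small in-memory list instead of the collection. It is only an illustration for checking the logic on sample data, not a replacement for running the aggregation on the server. The names TopFailedNumbers, Failure and topFailed are hypothetical and do not come from the MongoDB driver. The date filter relies on "yyyy-MM-dd HH:mm:ss" strings sorting in chronological order when compared as text, which is also why the string comparison in $match works.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TopFailedNumbers {

    /** Hypothetical stand-in for one "text_failed" document. */
    public static final class Failure {
        public final String failTime;
        public final String telNum;

        public Failure(String failTime, String telNum) {
            this.failTime = failTime;
            this.telNum = telNum;
        }
    }

    /** Mirrors $match, $group, $sort and $limit on an in-memory list. */
    public static List<Map.Entry<String, Integer>> topFailed(
            List<Failure> docs, String from, String to, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (Failure doc : docs) {
            // $match: exclusive bounds, compared as text like $gt / $lt on strings.
            if (doc.failTime.compareTo(from) > 0 && doc.failTime.compareTo(to) < 0) {
                // $group with { "$sum": 1 }: one more failure for this number.
                counts.merge(doc.telNum, 1, Integer::sum);
            }
        }
        // $sort: highest count first (order among equal counts is unspecified).
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        // $limit: keep at most `limit` entries.
        return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<Failure> docs = Arrays.asList(
                new Failure("2015-05-10 10:00:00", "111"),
                new Failure("2015-05-11 10:00:00", "111"),
                new Failure("2015-05-12 10:00:00", "111"),
                new Failure("2015-05-20 10:00:00", "222"),
                new Failure("2015-05-21 10:00:00", "222"),
                new Failure("2015-05-22 10:00:00", "333"),
                new Failure("2015-06-02 10:00:00", "111")); // outside the range
        for (Map.Entry<String, Integer> e
                : topFailed(docs, "2015-05-01 00:00:00", "2015-06-01 00:00:00", 2)) {
            System.out.println(e.getKey() + " failed " + e.getValue() + " times");
        }
    }
}
```

The difference from the loop in the question is where the work happens: with the aggregation, the counting, sorting and truncation run on the server, so only 500 small documents travel to the client instead of 7M+.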