MongoDB get total count as result using aggregate is very slow

Question

I am using MongoDB 3.2.0 with an aggregate query to count the distinct "userId" values for a given "itemId". My collection has more than 20 million documents. The documents in the collection look like this:

{
    itemId : ObjectId('59c0a50f6ca8a1545bf1d206'),
    regionId : ObjectId('59c11af56ca8a1545bb32665'),
    userId : ObjectId('59c3cd626ca8a12e70866b0c')
},
{
    itemId : ObjectId('59c0a50f6ca8a1545bf1d206'),
    regionId : ObjectId('59c11af56ca8a1545bb32665'),
    userId : ObjectId('59c3cd626ca8a12e70865678')
}

Using "itemId" as the selector, I compute the total number of distinct "userId" values in the collection. These are the indexes I have created on the collection:

db.items.ensureIndex({"itemId" : 1})
db.items.ensureIndex({"userId" : 1})

My aggregate query is:

db.items.aggregate([
    { $match: { itemId: { $in: [ ObjectId('59c0a50f6ca8a1545bf1d206'), ObjectId('59c0a50f6ca8a1545bf1d207') ] } } },
    { $group: { _id: "$userId" } },
    { $group: { _id: null, count: { $sum: 1 } } }
])

I have also set "allowDiskUse" to true.
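For reference, here is a minimal sketch of how the pipeline above might be run through the Node.js driver with "allowDiskUse" enabled. The function names buildDistinctUserCountPipeline and countDistinctUsers are hypothetical helpers introduced only for illustration, and `collection` is assumed to be a Collection object obtained from the official `mongodb` driver; none of this is the exact code from my application.

```javascript
// Sketch only: `collection` is assumed to be a Collection from the official
// `mongodb` Node.js driver. Helper names here are hypothetical.

// Build the same three-stage pipeline as above for an arbitrary list of itemIds.
function buildDistinctUserCountPipeline(itemIds) {
  return [
    // Keep only documents whose itemId is in the given list.
    { $match: { itemId: { $in: itemIds } } },
    // Collapse to one document per distinct userId.
    { $group: { _id: '$userId' } },
    // Count those per-user documents.
    { $group: { _id: null, count: { $sum: 1 } } }
  ];
}

// Run the pipeline and return the distinct-user count (0 if nothing matched).
async function countDistinctUsers(collection, itemIds) {
  const pipeline = buildDistinctUserCountPipeline(itemIds);
  const docs = await collection
    .aggregate(pipeline, { allowDiskUse: true })
    .toArray();
  return docs.length > 0 ? docs[0].count : 0;
}
```

Because the final $group emits a single small document, the result stays far below the document size limit regardless of how many distinct userId values there are.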

The query takes more than 20 seconds to return the result. Is there any other way I can improve the execution speed?

I am running this through the native MongoDB driver for Node.js. Using a distinct query fails with "Exceeding with 16 MB Limit", so I went with an aggregate query instead.

In total, the result contains 600 000 unique userId values (ObjectIds). The total number of documents in the collection is 8 397 727.

Answer 1

You can try this to get the distinct userId values, filtered by itemId:

db.collectionName.distinct('userId', 
  {itemId: {$in: [ObjectId('59c0a50f6ca8a1545bf1d206'), ObjectId('59c0a50f6ca8a1545bf1d207')]}}
).length
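Since the question reports that distinct runs into the 16 MB limit, another thing that may be worth testing is a compound index on both fields, so that the $match and the first $group can potentially be answered from the index alone. This is a sketch under that assumption, not something measured against this data set; whether the planner actually uses the index this way can depend on the server version, so the second command asks for the plan instead of assuming it. The collection name items is taken from the question.

```javascript
// mongo shell sketch. Compound index covering the filter field and the
// grouped field (an assumption to test, not a guaranteed fix).
db.items.createIndex({ itemId: 1, userId: 1 })

// Ask for the query plan instead of running the aggregation, to check
// which index the $match stage uses.
db.items.aggregate(
  [
    { $match: { itemId: { $in: [ ObjectId('59c0a50f6ca8a1545bf1d206'), ObjectId('59c0a50f6ca8a1545bf1d207') ] } } },
    { $group: { _id: "$userId" } },
    { $group: { _id: null, count: { $sum: 1 } } }
  ],
  { explain: true }
)
```

If the plan still shows documents being fetched for every match, the extra index is probably not helping and can be dropped again.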

solution1
0 2017-10-09 10:44:06
