
MongoDB 'count()' is very slow. How do we refine it or work around it?

I am currently using MongoDB with millions of records, and I have discovered something pretty annoying.

When I use the 'count()' function on a query that matches a small number of documents, it's very fast. However, when the query matches hundreds of thousands or even millions of documents, the entire system becomes very slow.

I made sure that I have indexed the required fields.

Has anybody encountered the same thing? What did you do to improve it?

For now there is no other optimization than creating a proper index:

// Index the field you filter on, then count with that filter.
// (ensureIndex was later deprecated in favor of createIndex.)
db.users.ensureIndex({name: 1});
db.users.find({name: "Andrei"}).count();

If you need counters, I suggest precalculating them whenever possible, using the atomic $inc operation instead of count({}) at all.
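A minimal sketch of that counter pattern (the counters collection and the user_count key are hypothetical names, not from the answer itself):

// Bump the precomputed counter atomically whenever a new user is inserted.
db.counters.update({_id: "user_count"}, {$inc: {count: 1}}, {upsert: true});

// Reading the counter is a single-document lookup instead of a full count({}) scan.
db.counters.findOne({_id: "user_count"}).count;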

That said, the MongoDB developers are working hard on this, and count({}) improvements are planned for MongoDB 2.1 according to the JIRA ticket.

You can make sure that the count is answered from the index alone, without any disk access.

Let's say you want to count records with name : "Andrei"

You ensure an index on name (as you've done) and run:

db.users.find({name: "Andrei"}, {_id: 0, name: 1}).count()

You can check that this is the fastest way to count (short of precomputing) by verifying that

db.users.find({name: "Andrei"}, {_id: 0, name: 1}).explain()

displays an indexOnly field set to true.

This trick ensures that your query retrieves records only from RAM (the index) and never from disk.
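Note that on newer MongoDB versions the legacy indexOnly flag no longer appears in the explain output; a rough equivalent check (a sketch, not part of the original answer) is that executionStats reports zero documents examined for a covered query:

// A covered query should report totalDocsExamined: 0,
// meaning the count was answered from the index alone.
db.users.find({name: "Andrei"}, {_id: 0, name: 1}).explain("executionStats").executionStats.totalDocsExamined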

For me, the solution was to change the index to a sparse one. Whether it helps depends on your specific situation, so just give it a try if you can (see the index-size check after the timings below).

// Sparse index on the same field used in the $exists query below.
db.Account.createIndex( { "dateChecked": 1 }, { sparse: true } )

db.Account.find({
    "dateChecked" : { $exists : true }
}).count()

318 thousand records in the collection:

  • 0.31 sec - with sparse index
  • 0.79 sec - with non-sparse index
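If you want to see why the sparse variant wins, one simple check (a sketch, assuming the Account collection above) is to compare index sizes; a sparse index only contains entries for documents where dateChecked actually exists, so there is less to scan:

// Compare on-disk index sizes; the sparse index should be noticeably smaller
// when many documents lack the dateChecked field.
db.Account.stats().indexSizes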

You are pretty much out of luck for now; count() in MongoDB is awful and won't be getting better in the near future. See: https://jira.mongodb.org/browse/SERVER-1752

From experience, you should pretty much never use it unless it's a one-time thing, something that occurs very rarely, or your database is pretty small.

As @Andrew Orsich stated, use counters whenever possible (the downside of counters is the global write lock, but they are still better than count()).

Adding my observations based on MongoDB 4.4, the latest version at the time. My collection is 0.80 TB in size.

I created an index on UserObject.CountryID for my collection and ran this query:

// Count the matching documents with a $match + $group pipeline.
db.users.aggregate([
    { $match : { "UserObject.CountryID" : 3 } },
    { $group : { _id : "Count", count : { $sum : 1 } } }
])

In total it took:

  • 6,800 ms to fetch a count of around 13 million (1.3 crore) records from the 0.80 TB collection.
  • 16,274 ms to fetch a count of around 35 million (3.5 crore) records from the 0.80 TB collection.
  • 41,615 ms to fetch a count of around 42 million (4.2 crore) records from the 0.80 TB collection.
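For comparison (a general note, not part of the measurements above), MongoDB 4.0+ also provides shell helpers for the same two cases: countDocuments() runs a filtered aggregation much like the pipeline above, while estimatedDocumentCount() returns the collection-level count from metadata almost instantly but accepts no filter.

// Filtered count; internally this becomes a $match + $group aggregation and can use the index.
db.users.countDocuments({ "UserObject.CountryID" : 3 })

// Whole-collection estimate from collection metadata; fast, but no query filter allowed.
db.users.estimatedDocumentCount()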
