MongoDB: find execution time for count() command on millions of records in a collection?

I am trying to measure how long a count() takes on a collection that holds millions of test records, in the following scenario:

1) From the first mongo shell I insert millions of records into the collection with this code:

for (var i = 0; i < 10000000; ++i){ 
  db.unicorns.insert({name: 'sampleName', gender: 'm', weight: '440' });
}

2) From a second mongo shell I run count() on that collection (important: while the insert is still running in the first shell):

db.unicorns.count()

I looked into it, but found that explain() and stats() cannot be applied to the count() command.

I need to find out how long count() takes while inserts are still running against the collection (something like a live scenario).

Is there any other good approach for doing this?

MongoDB has a built-in profiler that you can enable via:

db.setProfilingLevel(2)

Instead of '2' you can choose any level from the list below:

  • 0 - the profiler is off, does not collect any data. mongod always writes operations longer than the slowOpThresholdMs threshold to its log.
  • 1 - collects profiling data for slow operations only. By default slow operations are those slower than 100 milliseconds. You can modify the threshold for “slow” operations with the slowOpThresholdMs runtime option or the setParameter command. See the Specify the Threshold for Slow Operations section for more information.
  • 2 - collects profiling data for all database operations.

You can then see the results of your queries by checking the system.profile collection of that database.
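As a rough sketch of how the profiler could be used for this particular question, the helper below turns profiling on, runs one count(), turns profiling off again, and reads the newest matching entry from system.profile. The function name profiledCountMillis is hypothetical, and the filter assumes that a profiled count is stored as { op: "command", command: { count: "<collection>" }, millis: <n> }; the exact document shape can differ between MongoDB versions, so check one entry by hand first. Note that it resets the profiling level to 0 rather than restoring whatever level was set before.

```javascript
// Sketch only: the query shape below assumes profile entries for count look
// like { op: "command", command: { count: "<collection>" }, millis: <n> }.
function profiledCountMillis(db, collectionName) {
  db.setProfilingLevel(2);                      // record every operation
  try {
    db.getCollection(collectionName).count();   // the operation being measured
  } finally {
    db.setProfilingLevel(0);                    // always switch the profiler back off
  }
  var cursor = db.getCollection("system.profile")
    .find({ op: "command", "command.count": collectionName })
    .sort({ ts: -1 })                           // newest entry first
    .limit(1);
  // millis is the server-side duration recorded by the profiler, or null if
  // no matching entry was found.
  return cursor.hasNext() ? cursor.next().millis : null;
}
```

From the second shell, while the insert loop is still running in the first one, this could be called as profiledCountMillis(db, "unicorns").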

EDIT:

If you want to test performance you could use the following snippets of code that can be executed from the mongo console:

> for (var i = 0; i < 10000000; ++i) { db.countTest.insert({a: i % 10}) }
> db.countTest.ensureIndex({a:1})
> db.countTest.count({a: 1})
> db.countTest.count()
> db.countTest.find().count()

And my conclusions are as follows:

  1. counting with a query on the indexed field a (an index apart from the one on _id) returned the count on the 10 million record collection in around 170ms
  2. counting by _id (count without any query) returned the count in less than a millisecond
  3. counting by _id with a cursor (note that .find() acts as a cursor over the collection) returned the count in less than a millisecond

So a count that has to go through a secondary index is slower than a count with no query. If you count by _id (no query) it is effectively instant; if you filter on an indexed field, the time grows with the number of index entries that have to be scanned.

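The easier way would be:

function timeCount(database, collection) {
  // Look the collection up by name; db.collection would address a
  // collection that is literally called "collection".
  var coll = db.getSiblingDB(database).getCollection(collection);
  var start = new Date().getTime();
  coll.count();
  print("msecs taken: " + (new Date().getTime() - start));
}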

Now you can call the function with

 timeCount("yourDB","unicorns")

You can put the function into a js file and load it via the --shell parameter, or you can put it into your ~/.mongorc.js and call it with any db and collection.
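Since a single measurement taken while inserts are running can be noisy, one possible extension is to time count() several times and look at the spread. The helper below is a hypothetical addition, not part of the answer above: sampleCountTimes and its result fields are made-up names, and it measures wall-clock time in the shell, so the figures include the network round trip.

```javascript
// Hypothetical helper: times coll.count() `runs` times and summarises the
// samples. `coll` is any object with a count() method, for example
// db.getSiblingDB("yourDB").getCollection("unicorns") in the mongo shell.
function sampleCountTimes(coll, runs) {
  if (runs < 1) {
    throw new Error("runs must be at least 1");
  }
  var samples = [];
  for (var i = 0; i < runs; i++) {
    var start = new Date().getTime();
    coll.count();
    samples.push(new Date().getTime() - start);  // elapsed milliseconds
  }
  var total = 0;
  for (var j = 0; j < samples.length; j++) {
    total += samples[j];
  }
  return {
    runs: samples.length,
    minMs: Math.min.apply(null, samples),
    maxMs: Math.max.apply(null, samples),
    avgMs: total / samples.length
  };
}
```

In the shell, printjson(sampleCountTimes(db.getSiblingDB("yourDB").getCollection("unicorns"), 20)) would print the summary for 20 runs.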
