execution time of mongodb find query

Question

I am trying to benchmark mongodb performance and I am having problems understanding how mongodb executes queries, specifically how long they take to complete.

If I run the following code:

import datetime

from pymongo import MongoClient

# Connect to the database
client = MongoClient("mongodb://.../testrecords")
db = client.testrecords

start = datetime.datetime.now()

result = db.threads.find( {"$and": [{ "location" : "JC018" }, {"timestamp": "2018-03-22T23:05:15+00:00"}  ] } ).explain()

endtime = datetime.datetime.now()
print ("duration: " + str(endtime-start))
print(result)

I receive the following output: duration: 0:00:00.531754. I also get the result of the explain() call, which includes 'executionTimeMillis': 249.

This makes sense as the time taken by mongodb to execute the query is less than the roundtrip time.

However if I use the following loop to run the same query 10,000 times, the execution duration is consistently recorded as between 200 and 300 milliseconds. (Note that I have removed the explain() call.) I fail to see how running the query 10,000 times can result in no meaningful increase in execution time.

for i in range(10000):
    result = db.threads.find( {"$and": [{ "location" : "JC018" }, {"timestamp": "2018-03-22T23:05:15+00:00"}  ] } )

However, if I run the loop with the explain() function it does appear to take approximately n * 250ms to execute the loop.

for i in range(n):
    result = db.threads.find( {"$and": [{ "location" : "JC018" }, {"timestamp": "2018-03-22T23:05:15+00:00"}  ] } ).explain()

Can anyone explain the lack of a time difference in executing the query once and executing it 10,000 times and why adding the explain() function to the loop appears to result in the expected execution time?

I thought that there may be some kind of caching going on but I am only using PyMongo on the client side and cannot find any mention of this in the documentation.

Thanks

Answer 1

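So after more research I discovered that find() doesn't fetch the results from the database by itself. It returns a Cursor object, which is a reference to the result set and can be iterated over. PyMongo only sends the query when the cursor is first iterated, and the server then returns the documents in batches (the first batch holds 101 documents by default) rather than all at once. That is why the loop without explain() finishes so quickly: it only creates 10,000 cursors and never reads from them. explain(), on the other hand, makes a round trip to the server on every call, so that loop takes roughly n * 250 ms.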

So to actually fetch all of the results from the database one would use the following code:

results = []

for doc in db.threads.find( { "timestamp": { "$gt": "2018-02-20T20:08:00+00:00", "$lt": "2018-02-20T22:54:42.3+00:00"} } ):
    results.append(doc)
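
For benchmarking, the timer has to cover the iteration over the cursor and not only the find() call. Below is a minimal sketch of that idea. The names time_query and fake_find are hypothetical and not part of PyMongo. fake_find is only a stand-in for a lazy cursor so the example runs without a database.

```python
import time


def time_query(run_query):
    """Time a query, including fetching every document it returns.

    run_query is any zero-argument callable that returns an iterable,
    for example a lambda wrapping db.threads.find(...).
    """
    start = time.perf_counter()
    docs = list(run_query())  # iterating is what forces the fetch
    elapsed = time.perf_counter() - start
    return elapsed, len(docs)


def fake_find(n_docs, delay=0.001):
    """Hypothetical stand-in for a lazy cursor: no work until iterated."""
    for i in range(n_docs):
        time.sleep(delay)  # simulated cost of fetching one document
        yield {"_id": i}


# Creating the "cursor" does no work yet, just like calling find().
start = time.perf_counter()
cursor = fake_find(20)
create_time = time.perf_counter() - start

# Consuming it is where the time is actually spent.
fetch_time, count = time_query(lambda: fake_find(20))
print("create:", create_time, "fetch:", fetch_time, "docs:", count)
```

With a real collection the call would look something like time_query(lambda: db.threads.find({"location": "JC018"})), assuming db is set up as in the question.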
