
In-memory storage engine is not faster than WiredTiger

I'm running a query that returns a lot of data. It looks up 916 documents, each of them having a large data field (around 5MB). The query looks like this:

db.collection.find(
    {'name': somename, 'currency': mycurrency,
     'valuation_date': {'$in': [list_of_250_datetime_datetime]}},
    {'data_column': True}  # set to true or false in the test results below
).limit(x)
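
For reference, a harness along these lines would reproduce this kind of timing (a minimal sketch: pymongo is assumed, and the database name, filter values, and 'data_column' field are hypothetical stand-ins for the real ones):

import time
import datetime
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')
coll = client['testdb']['collection']  # hypothetical database/collection names

somename = 'somename'       # stand-in filter values, not from the question
mycurrency = 'EUR'
list_of_250_datetime_datetime = [
    datetime.datetime(2019, 1, 1) + datetime.timedelta(days=i)
    for i in range(250)]

def timed_find(projection, limit=0):
    # projection={'data_column': False} excludes the large field;
    # projection=None returns the full document including it.
    # limit=0 means no limit in pymongo.
    filt = {'name': somename, 'currency': mycurrency,
            'valuation_date': {'$in': list_of_250_datetime_datetime}}
    start = time.time()
    docs = list(coll.find(filt, projection).limit(limit))
    print('Query completed in %s seconds for %s items'
          % (time.time() - start, len(docs)))

timed_find({'data_column': False})  # metadata only
timed_find(None)                    # full JSON including the ~5MB field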

I have been trying to optimize the query and found out that most of the time is spent loading (or transmitting) that large data item, rather than looking it up in the 5GB database. So I assume the query is properly optimized and the indexes are used correctly, which is also confirmed by the profiler.

So I assumed that loading the data from disk would take most of the time, but it seems that when I use the in-memory storage engine, things actually slow down. How is this possible? And what else can I do to speed things up?

In-memory storage engine:

================ Starting test using mongodb://localhost:27018/ ================
Looking up 100 values excluding data column...
++++++++++ Query completed in 0.0130000114441 seconds ++++++++++ 
Looking up 100 values, full json with data...
++++++++++ Query completed in 2.85100007057 seconds ++++++++++ 
Looking up all values, excluding data column...
++++++++++ Query completed in 0.0999999046326 seconds for 916 items ++++++++++ 
Looking up all values, full json with data...
++++++++++ Query completed in 29.2250001431 seconds for 916 items ++++++++++ 

WiredTiger:

================ Starting test using mongodb://localhost:27017/ ================
Looking up 100 values excluding data column...
++++++++++ Query completed in 0.0120000839233 seconds ++++++++++ 
Looking up 100 values, full json with data...
++++++++++ Query completed in 2.97799992561 seconds ++++++++++ 
Looking up all values, excluding data column...
++++++++++ Query completed in 0.0700001716614 seconds for 916 items ++++++++++ 
Looking up all values, full json with data...
++++++++++ Query completed in 23.8389999866 seconds for 916 items ++++++++++ 

It's not faster because MongoDB with WiredTiger caches the data in memory anyway: you have enough RAM for the data you are querying to fit in the cache, so there is no penalty (except for first access, of course) for reading from disk. If you started with a cold cache you would see a significant difference between WiredTiger and in-memory, but not once the data has been accessed and loaded into the cache.
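
You can check this yourself: serverStatus exposes the WiredTiger cache counters, so a quick look before and after running the query (a sketch using pymongo; the field names are as reported by serverStatus) shows whether the working set is already cached:

from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')
cache = client.admin.command('serverStatus')['wiredTiger']['cache']
# Compare occupancy to the configured maximum; run the query and check again.
print('bytes currently in the cache: %s' % cache['bytes currently in the cache'])
print('maximum bytes configured: %s' % cache['maximum bytes configured'])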

My immediate suspicion would be the network: if this is 5MiB each over 916 documents, you are getting (916 * 5 * 8)/23.84 = 1.537Gb/s for WiredTiger, or (916 * 5 * 8)/29.23 = 1.254Gb/s for the in-memory run; the 100-value versions are in similar ranges. You didn't mention whether this is Windows or Linux, or anything else about the environment other than that this is using localhost, so it's hard to say what might speed things up, but I suspect the transfer of the data is your bottleneck at the moment.
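
As a sanity check on that arithmetic (916 documents at roughly 5 MB each, with the times taken from the test output above):

docs = 916
megabits = docs * 5 * 8   # ~5 MB per document, 8 bits per byte

for label, seconds in [('WiredTiger', 23.84), ('in-memory', 29.23)]:
    print('%s: %.3f Gb/s' % (label, megabits / seconds / 1000.0))

# WiredTiger: 1.537 Gb/s
# in-memory: 1.254 Gb/s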

As far as I know, WiredTiger persists data on disk (indexes, user data, the replication oplog, etc.), while the in-memory storage engine avoids that disk I/O because it does not persist any data to disk. Also note this from the documentation:

In-memory storage engine requires that all its data (including oplog if mongod is part of replica set, etc.) fit into the specified --inMemorySizeGB command-line option or storage.inMemory.engineConfig.inMemorySizeGB setting. See Memory Use.

So, if the data set you are querying is too large, it can eventually take longer than with WiredTiger.
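
For completeness, that size limit is set when starting mongod with the in-memory engine (MongoDB Enterprise only); for example (the 8 GB figure, port, and path here are illustrative, not from the question):

mongod --port 27018 --dbpath /data/inmem --storageEngine inMemory --inMemorySizeGB 8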
