
MongoDB vs MySQL Performance - Simple Query

I am comparing MongoDB with MySQL and have imported the MySQL data into a MongoDB collection (>500,000 records). The collection looks like this:

{
    "_id" : ObjectId(""),
    "idSequence" : ,
    "TestNumber" : ,
    "TestName" : "",
    "S1" : ,
    "S2" : ,
    "Slottxt" : "",
    "DUT" : ,
    "DUTtxt" : "",
    "DUTver" : "",
    "Voltage" : ,
    "Temperature" : ,
    "Rate" : ,
    "ParamX" : "",
    "ParamY" : "",
    "Result" : ,
    "TimeStart" : new Date(""),
    "TimeStop" : new Date(""),
    "Operator" : "",
    "ErrorNumber" : ,
    "ErrorText" : "",
    "Comments" : "",
    "Pos" : ,
    "SVNURL" : "",
    "SVNRev" : ,
    "Valid" : 
}

When comparing the queries (which both return 15 records):

mysql -> SELECT TestNumber FROM db WHERE Valid=0 AND DUT=68 GROUP BY TestNumber

with

mongodb -> db.results.distinct("TestNumber", {Valid:0, DUT:68}).sort()

The results are equivalent, but the query takes in the region of 17 seconds in MongoDB, compared with 0.03 seconds in MySQL.

I appreciate that it is difficult to compare the two database architectures, and I further appreciate that one of the skills of a MongoDB admin is to organise the data structure accordingly (so it is not a fair test to just import the MySQL structure). Ref: MySQL vs MongoDB 1000 reads

But the difference in response time is too great to be a tuning issue. My (default) MongoDB log file reads:

Wed Mar 05 04:56:36.415 [conn4089] command NTV_Results.$cmd command: { distinct: "results", key: "TestNumber", query: { Valid: 0.0, DUT: 68.0 } } ntoreturn:1 keyUpdates:0 numYields: 6 locks(micros) r:21764672 reslen:250 16525ms

I have also tried the query:

db.results.group( {
               key: { "TestNumber": 1 },
               cond: {"Valid": 0, "DUT": 68 },
               reduce: function ( curr, result ) { },
               initial: { }
            } )

This gives similar results (17 seconds). Any clues as to what I am doing wrong? Both services are running on the same octo-core i7 3770 desktop PC with Windows 7 and 16 GB RAM.

There can be many reasons for slow performance, many of which involve too much detail to go into here. But I can offer you a "starter pack", as it were.

Creating indexes on your Valid and DUT fields is going to improve results for these and other queries. In this case, consider a compound index, created with the ensureIndex command:

db.collection.ensureIndex({ "Valid": 1, "DUT": 1})
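A minimal mongo shell sketch of building that index and then checking how the server plans the filter. The function name indexAndExplain is hypothetical, and the collection name results is taken from the question. The createIndex fallback is there because newer shells use that name in place of ensureIndex; the exact layout of the explain() output depends on the server version.

```javascript
// Sketch: build the compound index, then return the query plan for inspection.
// `db` is the mongo shell's database handle; "results" is the collection
// name used in the question. indexAndExplain is a hypothetical helper name.
function indexAndExplain(db) {
  var coll = db.results;
  var spec = { Valid: 1, DUT: 1 };

  // Newer shells name the helper createIndex; older ones only have ensureIndex.
  if (typeof coll.createIndex === "function") {
    coll.createIndex(spec);
  } else {
    coll.ensureIndex(spec);
  }

  // explain() reports how the server would run the filter. The output layout
  // varies by server version, so inspect it rather than relying on one field.
  return coll.find({ Valid: 0, DUT: 68 }).explain();
}
```

In the shell this could be run as printjson(indexAndExplain(db)). A full collection scan typically shows up as BasicCursor in older explain output or COLLSCAN in newer output, while an index-backed plan mentions BtreeCursor or IXSCAN.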

Also, using aggregate is recommended for these types of operations:

db.collection.aggregate([
    {$match: { "Valid": 0, "DUT": 68 }},
    {$group: { _id: "$TestNumber" }}
])

This should be equivalent to the SQL you are referring to.

There is a SQL to Aggregation Mapping Chart that may give you some assistance with the thinking. It is also worth familiarizing yourself with the different aggregation operators in order to write effective queries.
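To illustrate why a $match followed by a $group on TestNumber yields the same values as distinct with a query filter, here is a small in-memory sketch in plain JavaScript. The sample documents and the helper names matchStage and groupStage are invented for illustration, and no MongoDB server is involved.

```javascript
// Hypothetical sample documents (invented for illustration only).
var docs = [
  { TestNumber: 3, Valid: 0, DUT: 68 },
  { TestNumber: 1, Valid: 0, DUT: 68 },
  { TestNumber: 3, Valid: 0, DUT: 68 },
  { TestNumber: 2, Valid: 1, DUT: 68 },
  { TestNumber: 5, Valid: 0, DUT: 12 }
];

// Mimics {$match: {Valid: 0, DUT: 68}}: keep only the matching documents.
function matchStage(input) {
  return input.filter(function (d) {
    return d.Valid === 0 && d.DUT === 68;
  });
}

// Mimics {$group: {_id: "$TestNumber"}}: one output document per distinct key.
function groupStage(input) {
  var seen = {};
  var out = [];
  input.forEach(function (d) {
    if (!Object.prototype.hasOwnProperty.call(seen, d.TestNumber)) {
      seen[d.TestNumber] = true;
      out.push({ _id: d.TestNumber });
    }
  });
  return out;
}

// Run the two stages in order, then sort numerically like the .sort() call
// on the distinct() result in the question.
var grouped = groupStage(matchStage(docs));
var testNumbers = grouped
  .map(function (g) { return g._id; })
  .sort(function (a, b) { return a - b; });

console.log(testNumbers); // [ 1, 3 ]
```

On the server, the order of $group output is not guaranteed, so appending a {$sort: {_id: 1}} stage to the pipeline would be the way to reproduce the sorted result.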

I have spent many years writing very complex SQL for advanced tasks, and I find the aggregation framework a breath of fresh air for many kinds of problems.

Worth your time to learn.

Also worth noting: your "default" MongoDB log file reports those operations because they are considered "slow queries", and so they are brought to your attention by default. You can see more or less information, as you require, by tuning the database profiler to meet your needs.
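As a rough sketch of that tuning in the mongo shell, the two helpers below wrap the shell's db.setProfilingLevel call and a query on the system.profile collection. The helper names enableSlowOpProfiling and recentProfiledOps and the threshold passed in are assumptions for illustration; the default slow-operation threshold is 100 ms.

```javascript
// Sketch: `db` is the mongo shell's database handle.
// enableSlowOpProfiling and recentProfiledOps are hypothetical helper names.

// Level 1 records only operations slower than slowMs milliseconds.
// (Level 0 turns the profiler off; level 2 records every operation.)
function enableSlowOpProfiling(db, slowMs) {
  return db.setProfilingLevel(1, slowMs);
}

// Profiled operations are stored in the system.profile collection of the
// same database; return a cursor over the most recent ones, newest first.
function recentProfiledOps(db, howMany) {
  return db.system.profile.find().sort({ ts: -1 }).limit(howMany);
}
```

For example, enableSlowOpProfiling(db, 50) followed later by recentProfiledOps(db, 5) would list the five most recent operations that took longer than 50 ms.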

