
MongoDB $in query performance issue

Thanks in advance for your help.

I have a collection with about 700 million records that look something like this:

db.flight_availabillity.findOne()
{
    "_id" : ObjectId("5226465fc3b53d4f249c19fc"),
    "flight_id" : 9803,
    "arrival" : 1384819200,
    "duration" : 1,
    "capacity" : 1,
    "rooms" : 1,
    "min_price" : 163,
    "min_price_packaged" : 50,
    "rates_has_wifi" : 1,
    "rates_has_baby_cot" : 1,
    "rates_has_pets_allow" : 1,
    "erank" : 0.25
}

When I run queries, I filter on only 4 fields, so I built an index that looks like this:

db.flight_availabillity.ensureIndex({"flight_id":1,"arrival":1,"duration":1,"capacity":1,"rooms":1})

The problem: when I send only 1 flight id, find({"flight_id":{$in:[236]}}), the results are blazing fast.

When I use several flight ids, find({"flight_id":{$in:[236,232,545,757]}}) (and I can have up to 1000 flight ids in a query), I get slower results.

Here is the explain output for one of them that took 3.5 seconds, but I have also had several that took 10 seconds:

db.flight_availabillity.find({"flight_id":{$in:[333,207731,33993,277,127,183345,169019,156473,92715,5046,2927,2473,2112,2024,281,264,185,125,95,80,208065,183074,31774,359,314,64010,56170,5107,4673,147,115571,214,101564,287,66356,128,194487,100,207984,66353]},"arrival":1384387200,"duration":1,"capacity":1,"rooms":1}).explain()
{
    "cursor" : "BtreeCursor flight_id_1_arrival_1_departure_1_capacity_1_rooms_1 multi",
    "isMultiKey" : false,
    "n" : 40,
    "nscannedObjects" : 240,
    "nscanned" : 358,
    "nscannedObjectsAllPlans" : 597,
    "nscannedAllPlans" : 715,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 4,
    ....
}

What am I missing? How should I query this to get fast results?

Thanks!

In some MongoDB versions $in does not use an index. MongoDB is also limited in its ability to use more than one index for the same query.

Your query filters on flight_id, arrival, duration, capacity and rooms. Try creating an index on arrival, duration, capacity and rooms only; that should give you a good index on the selective criteria, instead of one that includes every field.

The flight_id condition then becomes just a final filter, applied after the hard work has already been done by the selective criteria.
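
As a rough sketch of the suggestion above, the index spec and the filter could be built like this. The helper name buildAvailabilityFilter and the variable names are made up for illustration; the field names, collection name, and sample values come from the question. Whether this index is actually faster on the real data is not something the thread measures, so it would need to be checked with explain().

```javascript
// Index on the equality fields only, leaving flight_id out,
// as the answer suggests. Key order matters in a compound index.
const selectiveIndex = { arrival: 1, duration: 1, capacity: 1, rooms: 1 };

// Hypothetical helper (not part of MongoDB): builds the same filter
// shape as the query in the question.
function buildAvailabilityFilter(flightIds, arrival, duration, capacity, rooms) {
  return {
    flight_id: { $in: flightIds },
    arrival: arrival,
    duration: duration,
    capacity: capacity,
    rooms: rooms,
  };
}

const filter = buildAvailabilityFilter([236, 232, 545, 757], 1384387200, 1, 1, 1);

// In the mongo shell, where `db` is defined, these objects would be used as:
//   db.flight_availabillity.ensureIndex(selectiveIndex)
//   db.flight_availabillity.find(filter).explain()
```

Comparing the explain() output of the same query before and after adding the index is the simplest way to see whether the planner picks it and how many keys it scans.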

It also does not help that indexBounds was not pasted; it could give clues as to whether the index composition is optimal.
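
To make that kind of check concrete, here is a small sketch that summarizes an explain document in the older format shown in the question (fields n, nscanned, nscannedObjects, indexBounds). The function name summarizeExplain is hypothetical. The sample numbers are copied from the question; indexBounds was not included there, so the sketch reports it as missing rather than guessing at it.

```javascript
// Hypothetical helper: summarizes a legacy-format explain() document.
function summarizeExplain(explainDoc) {
  const returned = explainDoc.n;
  return {
    // Index keys examined per returned document; closer to 1 is better.
    keysPerResult: returned > 0 ? explainDoc.nscanned / returned : null,
    // Documents fetched per returned document.
    docsPerResult: returned > 0 ? explainDoc.nscannedObjects / returned : null,
    // Null when the field is absent from the pasted output.
    indexBounds: explainDoc.indexBounds !== undefined ? explainDoc.indexBounds : null,
  };
}

// Values taken from the explain output in the question.
const summary = summarizeExplain({ n: 40, nscanned: 358, nscannedObjects: 240 });
console.log(summary);
```

With the numbers from the question, roughly nine index keys and six documents are examined per returned document, which on its own does not look like it would account for multi-second latency.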
