
MongoDB $all and $in very slow even on indexed fields

I have a collection of about 80 million documents, each storing an array of tags in the tags field, e.g.:

{text: "blah blah blah...", tags: ["car", "auto", "automobile"]}

The tags field is indexed, so queries like this are almost instant:

 db.documents.find({tags:"car"})

However, the following queries are all very slow, taking several minutes to complete:

 db.documents.find({tags:{$all:["car","phone"]}})
 db.documents.find({tags:{$in:["car","auto"]}})

The problem persists even if the array only has a single item:

 db.documents.find({tags:{$all:["car"]}})  //very slow too

I thought $all and $in should be very fast because tags is indexed, but apparently that is not the case. Why?

It turns out this is a known bug in MongoDB which has not yet been fixed as of version 2.2.

MongoDB does not perform index intersection when searching for multiple entries using $all. Only the first item in the array is looked up using the index, and all documents matched by that lookup are then scanned to filter the results.

For example, in the query db.documents.find({tags:{$all:["car","phone"]}}) all documents containing the tag "car" need to be retrieved and scanned. Since the collection in question contains over a hundred thousand documents tagged with "car", the slowdown is not surprising.
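To make the cost concrete, here is a toy in-memory model of the behaviour described above. It is plain JavaScript, not MongoDB code: the `index` object, the `findAll` function and the document counts (1000 "car" documents, 10 of which also carry "phone") are all made up for illustration.

```javascript
// Toy model, not MongoDB internals: a multikey index maps each tag to document ids.
var docs = [];
for (var i = 0; i < 1000; i++) {
  var tags = ["car"];
  if (i % 100 === 0) tags.push("phone"); // 10 documents carry both tags
  docs.push({ _id: i, tags: tags });
}

var index = {};
docs.forEach(function (d) {
  d.tags.forEach(function (t) {
    (index[t] = index[t] || []).push(d._id);
  });
});

// Mimic the described behaviour: use the index for the first tag only,
// then check every candidate document for the remaining tags.
function findAll(wanted) {
  var candidates = index[wanted[0]] || [];
  var scanned = 0;
  var matches = candidates.filter(function (id) {
    scanned++;
    return wanted.every(function (t) {
      return docs[id].tags.indexOf(t) !== -1;
    });
  });
  return { matches: matches.length, scanned: scanned };
}

var carFirst = findAll(["car", "phone"]);
var phoneFirst = findAll(["phone", "car"]);
console.log(carFirst);   // { matches: 10, scanned: 1000 }
console.log(phoneFirst); // { matches: 10, scanned: 10 }
```

Both calls return the same 10 matches, but the number of documents examined depends entirely on which tag is used for the index lookup.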

Worse, MongoDB doesn't even perform the simple optimization of selecting the rarest item in the $all array for the index lookup. If there are 100000 documents tagged "car" and 10 documents tagged "phone", MongoDB will still need to scan 100000 documents to return results for {$all:["car", "phone"]}.
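If only the first item of the $all array really is used for the index lookup, one possible workaround is to reorder the array yourself so that the rarest tag comes first. The sketch below is untested against a real server. `buildAllQuery` and `countFn` are hypothetical names, and the counts are the ones from the example above; in the mongo shell, `countFn` could be something like `function (t) { return db.documents.count({tags: t}); }`.

```javascript
// Hypothetical helper: sort the tags by how many documents carry them,
// rarest first, and build the $all query from that order.
// countFn is an assumed callback that returns the document count for a tag.
function buildAllQuery(tags, countFn) {
  var counted = tags.map(function (t) {
    return { tag: t, n: countFn(t) };
  });
  counted.sort(function (a, b) { return a.n - b.n; });
  return { tags: { $all: counted.map(function (c) { return c.tag; }) } };
}

// Stand-in counts taken from the example above.
var sampleCounts = { car: 100000, phone: 10 };
var query = buildAllQuery(["car", "phone"], function (t) {
  return sampleCounts[t] || 0;
});
console.log(JSON.stringify(query)); // {"tags":{"$all":["phone","car"]}}
```

The extra count per tag is itself an indexed lookup, so it should be cheap compared with scanning 100000 documents, but that is worth checking on your own data.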

See also: https://jira.mongodb.org/browse/SERVER-1000

I just want to add that $in is fast. In fact, for a single criterion or keyword, $in is equivalent to $all, yet $in is fast and $all is slow.

So use $in.
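A minimal sketch of that advice, assuming the query is built in application code: `tagQuery` is a hypothetical helper that uses $in when there is exactly one tag (where, as noted, it matches the same documents as $all) and keeps $all otherwise, since the two operators mean different things for more than one tag.

```javascript
// Hypothetical helper: prefer $in for a single tag, keep $all for several.
function tagQuery(tags) {
  if (tags.length === 1) {
    return { tags: { $in: tags } };
  }
  return { tags: { $all: tags } };
}

var single = tagQuery(["car"]);
var several = tagQuery(["car", "phone"]);
console.log(JSON.stringify(single));  // {"tags":{"$in":["car"]}}
console.log(JSON.stringify(several)); // {"tags":{"$all":["car","phone"]}}
```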
