
MongoDB 'count' with a query is very slow

Hi everyone. I'm using MongoDB 2.4.6 on 64-bit Windows 2008.

I have a collection with two million records and need to support searching and paging from the client.

db.products.find({"catalogs":1205}).skip().limit() is very fast.

But calculating the total record count:

db.products.find({"catalogs":1205},{"_id":1}).count() is too slow.

>> 442312 records.

>>[log] Sat Sep 28 00:20:01.566 [conn10] command products.$cmd command: { count: "products", query: { catalogs: 1205.0 }, fields: { _id: 1.0 } } ntoreturn:1 keyUpdates:0 locks(micros) r:460681 reslen:48 460ms

This count command takes 460 ms, which is too slow. With a lot of requests, that would be terrible.

I created an index on the 'catalogs' field, and I can't use the $inc operator because the query could be very complex.

I googled this problem and found that this 'count' performance bug was already fixed in MongoDB 2.4.

From http://docs.mongodb.org/manual/release-notes/2.4-overview/:

Improvements to count provide dramatically faster count operations. Counting is now up to 20 times faster for low cardinality index based counts.

What can I do to improve count performance? Thanks.

Update with some more information:

> db.products.getIndexes()
[
    {
            "v" : 1,
            "key" : {
                    "_id" : 1
            },
            "ns" : "products.products",
            "name" : "_id_"
    },
    {
            "v" : 1,
            "key" : {
                    "catalogs" : 1,
                    "created" : -1
            },
            "ns" : "products.products",
            "name" : "catalogs_1_created_-1"
    }
]

The query and elapsed time:

>db.products.find({"catalogs":1205},{"_id":1}).limit(20)
>>Tue Oct 01 15:39:19.160 [conn2] query products.products query: { catalogs: 1205.0 } cursorid:277334670708253 ntoreturn:20 ntoskip:0 nscanned:21 keyUpdates:0 locks(micros) W:5045 r:1017 nreturned:20 reslen:704 1ms

The query explain output:

>db.products.find({"catalogs":1205},{"_id":1}).explain()

{
    "cursor" : "BtreeCursor catalogs_1_created_-1",
    "isMultiKey" : true,
    "n" : 451466,
    "nscannedObjects" : 451466,
    "nscanned" : 451466,
    "nscannedObjectsAllPlans" : 451466,
    "nscannedAllPlans" : 451466,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 2,
    "nChunkSkips" : 0,
    "millis" : 2969,
    "indexBounds" : {
            "catalogs" : [
                    [
                            1205,
                            1205
                    ]
            ],
            "created" : [
                    [
                            {
                                    "$maxElement" : 1
                            },
                            {
                                    "$minElement" : 1
                            }
                    ]
            ]
    },
    "server" : "WIN-O47CO6C2WXY:27017"

}

The reason this count query is not particularly fast is that it has to scan 451466 entries in the index to count them. In other words, your query is not very selective: a large share of the index entries satisfy it, and every one of them has to be visited.
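To make the difference in work concrete, here is a toy model in plain JavaScript. It is only a sketch: a sorted array stands in for the index, and `lowerBound`, `scanMatching` and the sample data are hypothetical names made up for illustration, not MongoDB internals. The 451466 figure is taken from the explain output in the question.

```javascript
// Toy model only: a sorted array stands in for the index on "catalogs".
// It is not MongoDB's B-tree implementation.

// Return the position of the first key >= target (binary search).
function lowerBound(sortedKeys, target) {
  let lo = 0;
  let hi = sortedKeys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sortedKeys[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Walk the matching index entries, stopping early once `limit` entries
// have been visited. Returns how many entries were scanned.
function scanMatching(sortedKeys, target, limit) {
  let scanned = 0;
  for (let i = lowerBound(sortedKeys, target); i < sortedKeys.length; i++) {
    if (sortedKeys[i] !== target || scanned >= limit) break;
    scanned++;
  }
  return scanned;
}

// Hypothetical data: some entries before and after the matching range.
const indexKeys = [];
for (let i = 0; i < 1000; i++) indexKeys.push(1000);   // sorts before the match
for (let i = 0; i < 451466; i++) indexKeys.push(1205); // matching entries
for (let i = 0; i < 1000; i++) indexKeys.push(3000);   // sorts after the match

const pageScan = scanMatching(indexKeys, 1205, 20);        // like find().limit(20)
const countScan = scanMatching(indexKeys, 1205, Infinity); // like count()
console.log(pageScan, countScan); // 20 451466
```

Finding the start of the range is cheap in both cases. The paged query stops after a handful of entries, while the count has to walk the whole matching range, which is why its cost grows with the number of matches rather than with the page size.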

count() iterates through all the results in the cursor before giving a count, which is why it's so slow. Use size() instead; it's pretty fast compared with count().
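Since the limited find in the question returns in about 1 ms while the count has to walk every match, one possible workaround (only if the client can live without the exact total) is to fetch one extra document per page and use it to decide whether a next page exists. The sketch below is plain JavaScript with an in-memory stand-in for the collection; `getPage` and `fetchPage` are hypothetical names, and `fetchPage(skip, limit)` is assumed to wrap something like db.products.find(query).skip(skip).limit(limit).

```javascript
// Hypothetical helper: `fetchPage(skip, limit)` is an assumed wrapper around
// a skip/limit query. It is not a MongoDB driver API.
function getPage(fetchPage, pageIndex, pageSize) {
  // Ask for one extra document so we can tell whether another page exists.
  const docs = fetchPage(pageIndex * pageSize, pageSize + 1);
  const hasNext = docs.length > pageSize;
  // Only hand back the documents that belong to this page.
  return { items: docs.slice(0, pageSize), hasNext };
}

// In-memory stand-in for the collection, for demonstration only.
const fakeDocs = Array.from({ length: 45 }, (_, i) => ({ _id: i, catalogs: 1205 }));
const fetchFromArray = (skip, limit) => fakeDocs.slice(skip, skip + limit);

const firstPage = getPage(fetchFromArray, 0, 20);
const lastPage = getPage(fetchFromArray, 2, 20);
console.log(firstPage.items.length, firstPage.hasNext); // 20 true
console.log(lastPage.items.length, lastPage.hasNext);   // 5 false
```

This trades the exact total for a cheap "is there more?" flag, so it fits a next/previous style pager but not a UI that must display the total number of records.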
