MongoDB index not helping a query with a multikey index

Question

I have a collection of documents with a multikey index defined. However, the query performs poorly for just 43K documents. Is ~215 ms for this query considered poor? Did I define the index correctly if nscanned is 43902 (which equals the total number of documents in the collection)?

Document:

{
    "_id": {
        "$oid": "50f7c95b31e4920008dc75dc"
    },
    "bank_accounts": [
        {
            "bank_id": {
                "$oid": "50f7c95a31e4920009b5fc5d"
            },
            "account_id": [
                "ff39089358c1e7bcb880d093e70eafdd",
                "adaec507c755d6e6cf2984a5a897f1e2"
            ]
        }
    ],
    "created_date": "2013,01,17,09,50,19,274089"
}

Index:

{ "bank_accounts.bank_id" : 1 , "bank_accounts.account_id" : 1}

Query:

db.visitor.find({ "bank_accounts.account_id" : "ff39089358c1e7bcb880d093e70eafdd" , "bank_accounts.bank_id" : ObjectId("50f7c95a31e4920009b5fc5d")}).explain()

Explain:

{
    "cursor" : "BtreeCursor bank_accounts.bank_id_1_bank_accounts.account_id_1",
    "isMultiKey" : true,
    "n" : 1,
    "nscannedObjects" : 43902,
    "nscanned" : 43902,
    "nscannedObjectsAllPlans" : 43902,
    "nscannedAllPlans" : 43902,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 213,
    "indexBounds" : {
        "bank_accounts.bank_id" : [
            [
                ObjectId("50f7c95a31e4920009b5fc5d"),
                ObjectId("50f7c95a31e4920009b5fc5d")
            ]
        ],
        "bank_accounts.account_id" : [
            [
                {
                    "$minElement" : 1
                },
                {
                    "$maxElement" : 1
                }
            ]
        ]
    },
    "server" : "Not_Important"
}

Answer 1

I see three factors in play.

First, for application purposes, make sure that $elemMatch isn't a more appropriate query for this use case: http://docs.mongodb.org/manual/reference/operator/elemMatch/ . It would be bad if the wrong results came back because different subdocuments each satisfied only part of the query.
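To make the difference concrete, here is a minimal sketch in plain JavaScript that models the two matching rules without a database. The function names and the sample `visitor` document are made up for illustration; the `$elemMatch` query in the comment is how I would expect the question's query to be written, not something taken from the question.

```javascript
// Plain-JavaScript model of the two matching rules; no MongoDB connection needed.
// In the shell, the $elemMatch form of the question's query would look like:
//   db.visitor.find({ bank_accounts: { $elemMatch: { bank_id: B, account_id: A } } })

// Dotted-field semantics: each condition may be satisfied by a different array element.
function matchesDotted(doc, bankId, accountId) {
  const hasBank = doc.bank_accounts.some((e) => e.bank_id === bankId);
  const hasAccount = doc.bank_accounts.some((e) => e.account_id.includes(accountId));
  return hasBank && hasAccount;
}

// $elemMatch semantics: a single array element must satisfy both conditions.
function matchesElemMatch(doc, bankId, accountId) {
  return doc.bank_accounts.some(
    (e) => e.bank_id === bankId && e.account_id.includes(accountId)
  );
}

// Hypothetical visitor with two banks; account "acc-2" belongs to bank "B2".
const visitor = {
  bank_accounts: [
    { bank_id: "B1", account_id: ["acc-1"] },
    { bank_id: "B2", account_id: ["acc-2"] },
  ],
};

console.log(matchesDotted(visitor, "B1", "acc-2"));    // true
console.log(matchesElemMatch(visitor, "B1", "acc-2")); // false
```

The dotted form reports a match for bank "B1" with account "acc-2" even though that account sits under a different bank, which is the kind of wrong result the first point warns about.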

Second, I imagine the high nscanned value can be accounted for by querying on each of the field values independently: .find({"bank_accounts.bank_id": X}) vs. .find({"bank_accounts.account_id": Y}). You may see that nscanned for the full query is about equal to nscanned of the largest subquery. If the index key were being evaluated fully as a range, this would not be expected, but...

Third, the { "bank_accounts.account_id" : [[{"$minElement" : 1},{"$maxElement" : 1}]] } clause of the explain plan shows that no range is being applied to this portion of the key.

I'm not really sure why, but I suspect it has something to do with the nature of account_id (an array within a subdocument within an array). 200ms seems about right for an nscanned that high.
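The second and third points can be illustrated with a toy model that treats the compound index as a list of (bank_id, account_id) keys. This is a plain-JavaScript sketch, not MongoDB's B-tree code. The data is invented, and it assumes that nearly every document shares the same bank_id, which the explain output hints at but the question does not confirm.

```javascript
// Toy model of a compound index: one [bankId, accountId] key per document.
// Keys are generated already sorted; all share one bankId (an assumption).
function buildKeys(docCount) {
  const keys = [];
  for (let i = 0; i < docCount; i++) {
    keys.push(["bank-1", "account-" + String(i).padStart(6, "0")]);
  }
  return keys;
}

// Bounds on bankId only: every key with that bankId falls inside the range
// and is visited; accountId is checked afterwards, key by key.
// (A real B-tree would seek to the range; the loop only counts visited keys.)
function scanFirstFieldOnly(keys, bankId, accountId) {
  let scanned = 0;
  let matched = 0;
  for (const [b, a] of keys) {
    if (b !== bankId) continue; // outside the bounds, never visited
    scanned++;
    if (a === accountId) matched++;
  }
  return { scanned, matched };
}

// Bounds on both fields: only keys equal to the (bankId, accountId) pair
// fall inside the range, so only those are counted as visited.
function scanBothFields(keys, bankId, accountId) {
  let scanned = 0;
  for (const [b, a] of keys) {
    if (b === bankId && a === accountId) scanned++;
  }
  return { scanned, matched: scanned };
}

const keys = buildKeys(43902);
console.log(scanFirstFieldOnly(keys, "bank-1", "account-000042")); // { scanned: 43902, matched: 1 }
console.log(scanBothFields(keys, "bank-1", "account-000042"));     // { scanned: 1, matched: 1 }
```

Under that assumption, a range on bank_id alone visits every key, which would explain nscanned matching the collection size while n stays at 1.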

A more performant document organization might be to denormalize the account_id -> bank_id relationship within the subdocument, and store:

{"bank_accounts": [
{
 "bank_id": X,
 "account_id": Y
},
{
 "bank_id": X,
 "account_id": Z
}
]}

instead of: {"bank_accounts": [{ "bank_id": X, "account_id": [Y, Z] }]}
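If existing documents need converting, the reshaping could look roughly like the sketch below. It is plain JavaScript operating on one document at a time; `flattenBankAccounts` is a hypothetical helper name, and how the result gets written back to the collection is left out.

```javascript
// Rewrite one document so each (bank_id, account_id) pair gets its own subdocument.
// Returns a new object; the input document is left untouched.
function flattenBankAccounts(doc) {
  const flattened = [];
  for (const entry of doc.bank_accounts) {
    // Accept either an array of account ids or a single id that is already flat.
    const ids = Array.isArray(entry.account_id) ? entry.account_id : [entry.account_id];
    for (const id of ids) {
      flattened.push({ bank_id: entry.bank_id, account_id: id });
    }
  }
  return Object.assign({}, doc, { bank_accounts: flattened });
}

const nestedDoc = { bank_accounts: [{ bank_id: "X", account_id: ["Y", "Z"] }] };
const flatDoc = flattenBankAccounts(nestedDoc);
console.log(JSON.stringify(flatDoc));
// {"bank_accounts":[{"bank_id":"X","account_id":"Y"},{"bank_id":"X","account_id":"Z"}]}
```

The trade-off is that bank_id is repeated once per account, in exchange for account_id no longer being an array nested inside an array.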

My tests below show that with this organization, the query optimizer gets back to work and applies a range to both keys:

> db.accounts.insert({"something": true, "blah": [{ a: "1", b: "2"} ] })
> db.accounts.ensureIndex({"blah.a": 1, "blah.b": 1})
> db.accounts.find({"blah.a": 1, "blah.b": "A RANGE"}).explain()
{
    "cursor" : "BtreeCursor blah.a_1_blah.b_1",
    "isMultiKey" : false,
    "n" : 0,
    "nscannedObjects" : 0,
    "nscanned" : 0,
    "nscannedObjectsAllPlans" : 0,
    "nscannedAllPlans" : 0,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "blah.a" : [
            [
                1,
                1
            ]
        ],
        "blah.b" : [
            [
                "A RANGE",
                "A RANGE"
            ]
        ]
    }
}
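A quick way to spot the problem in explain output of this older format is to look for fields whose bounds run from $minElement to $maxElement. The sketch below does that in plain JavaScript; `unboundedFields` is a hypothetical helper, and it assumes the explain result has been converted to plain JSON as printed in the question (in a live shell the bounds may be special key objects rather than plain ones).

```javascript
// Return the index fields whose bounds span the whole key space, i.e.
// [{ $minElement: 1 }, { $maxElement: 1 }] in the explain format shown above.
function unboundedFields(explain) {
  const result = [];
  for (const [field, ranges] of Object.entries(explain.indexBounds || {})) {
    const open = ranges.some(
      ([low, high]) =>
        low !== null && typeof low === "object" && "$minElement" in low &&
        high !== null && typeof high === "object" && "$maxElement" in high
    );
    if (open) result.push(field);
  }
  return result;
}

// Trimmed copy of the explain output from the question (ObjectIds shown as strings).
const questionExplain = {
  indexBounds: {
    "bank_accounts.bank_id": [["50f7c95a31e4920009b5fc5d", "50f7c95a31e4920009b5fc5d"]],
    "bank_accounts.account_id": [[{ $minElement: 1 }, { $maxElement: 1 }]],
  },
};

// Trimmed copy of the explain output from the test above.
const testExplain = {
  indexBounds: {
    "blah.a": [[1, 1]],
    "blah.b": [["A RANGE", "A RANGE"]],
  },
};

console.log(unboundedFields(questionExplain)); // [ 'bank_accounts.account_id' ]
console.log(unboundedFields(testExplain));     // []
```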

Answer 1: 8, accepted, 2013-02-19 21:52:18