Why is a single-field index faster than a compound index when querying two keys? (MongoDB, multikey)

Question

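To test query performance on my collection, I created 4 indexes for queries that involve two fields of the same document, one of which is an array (and so needs a multikey index). Two of the indexes are single-field and two are compound.

I was surprised that one of the single-field indexes performs better than the compound ones. I expected the compound index to give the best performance, since, as I understand it, it indexes both fields and should therefore speed up the query.

These are my indexes: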

{    "v" : 1, 
     "key" : { "_id" : 1 }, 
     "ns" : "bt_twitter.mallorca.mallorca", 
     "name" : "_id_"  
}, 
{    "v" : 1, 
     "key" : { "epoch_creation_date" :1 }, 
     "ns" : "bt_twitter.mallorca.mallorca", 
     "name" : "epoch_creation_date_1"  
}, 
{     "v" : 1, 
      "key" : { "related_hashtags" : 1 }, 
      "ns" : "bt_twitter.mallorca.mallorca", 
      "name" : "related_hashtags_1"  
},  
{     "v" : 1, 
      "key" : { "epoch_creation_date" : 1, "related_hashtags" : 1 }, 
      "ns" : "bt_twitter.mallorca.mallorca", 
      "name" : "epoch_creation_date_1_related_hashtags_1"  
}

These are my queries and their performance figures (the hint parameter shows which index each query uses):

Query 1:

active_collection.find(
    {'epoch_creation_date': {'$exists': True}}, 
    {"_id": 0, "related_hashtags":1}
).hint([("epoch_creation_date", ASCENDING)]).explain()

millis: 237

nscanned: 101226

Query 2:

active_collection.find(
    {'epoch_creation_date': {'$exists': True}}, 
    {"_id": 0, "related_hashtags": 1}
).hint([("related_hashtags", ASCENDING)]).explain()

millis: 1131

nscanned: 306715

Query 3:

active_collection.find(
     {'epoch_creation_date': {'$exists': True}},
     {"_id": 0, "related_hashtags": 1}
).hint([("epoch_creation_date", ASCENDING), ("related_hashtags", ASCENDING)]).explain()

millis: 935

nscanned: 306715

Query 4:

active_collection.find(
     {'epoch_creation_date': {'$exists': True}}, 
     {"_id": 0, "related_hashtags": 1}
).hint([("related_hashtags", ASCENDING),("epoch_creation_date", ASCENDING)]).explain()

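millis: 1165

nscanned: 306715

Query 1 scans fewer documents, which is probably why it is faster. Can somebody help me understand why it performs better than the queries that use a compound index? And when is a compound index a better choice than a single-field index?

I am reading the MongoDB documentation, but I find these concepts hard to digest.

Thanks in advance.

Updated question (for Sammaye and Philipp)

This is the full result of explain():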

"cursor" : "BtreeCursor epoch_creation_date_1",
"isMultiKey" : false,
"n" : 101226,
"nscannedObjects" : 101226,
"nscanned" : 101226,
"nscannedObjectsAllPlans" : 101226,
"nscannedAllPlans" : 101226,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 242,
"indexBounds" : {u'epoch_creation_date': [[{u'$minElement': 1}, {u'$maxElement': 1}]]},
"server" : "vmmongodb:27017"

for the following query:

active_collection.find(
    {'epoch_creation_date': {'$exists': True}}, 
    {"_id": 0, "related_hashtags":1}
).hint([("epoch_creation_date", ASCENDING)]).explain()

Answer 1

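You created a compound index (named epoch_creation_date_1_related_hashtags_1), but you don't use it in these hints. Instead you use the two single-field indexes you also created (related_hashtags_1 and epoch_creation_date_1), in different orders.

Of these two indexes, only epoch_creation_date_1 is useful, because you are not querying on two fields. You query on just one, which is 'epoch_creation_date': {'$exists': True}. The field filtering you do with {"_id": 0, "related_hashtags":1} is applied to the documents that query has already found. At that point, an index is no longer of any use. This means that no index on related_hashtags can improve the performance of this query. The compound index (when you actually use it) might be better than having no index at all, but it will not be as good as an index on epoch_creation_date alone.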
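To make the filter/projection distinction concrete, here is a minimal sketch of a query in which both fields appear in the filter, which is the kind of query where the compound index could help. The helper name build_two_field_query and the sample values are hypothetical and not taken from the question; the index name is the one from the index listing, and the sketch assumes a PyMongo version whose hint() accepts an index name.

```python
def build_two_field_query(min_epoch, hashtag):
    """Return (filter, projection, hint) for a query that filters on both fields.

    Hypothetical helper: the argument values used below are illustrative only.
    """
    # Both fields are part of the filter, so an index containing both
    # can be used to narrow down which documents are examined.
    query_filter = {
        "epoch_creation_date": {"$gte": min_epoch},
        "related_hashtags": hashtag,
    }
    # The projection only shapes the returned documents; it selects nothing.
    projection = {"_id": 0, "related_hashtags": 1}
    # Name of the compound index from the listing in the question.
    hint = "epoch_creation_date_1_related_hashtags_1"
    return query_filter, projection, hint


query_filter, projection, hint = build_two_field_query(1385000000, "mallorca")
# With a live collection this would be used as:
# active_collection.find(query_filter, projection).hint(hint).explain()
```

In the original queries only epoch_creation_date appears in the filter, so the second key of the compound index has nothing to match against.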

Answer 2

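OK, after reading the question more closely, I understand the problem. A multikey index writes one index entry PER array value. This means that if related_hashtags holds 3 values in every document, the index is effectively 3 times the size, with 3 times as many entries to scan (if my math adds up...).

nscanned is a counter of how many times a document had to be looked at (note that it is a counter, not the number of unique documents looked at). Because you are using the multikey index, you therefore have to scan roughly 3 times as many (of the same) documents as you would in the first query. In your numbers that ratio is 306715 / 101226, or about 3.03.

This is a known caveat with multikey indexes, and it is why you should be careful about throwing them around like this.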
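The "one entry per array value" effect can be illustrated with a small pure-Python model. This is only a sketch of the counting argument, not MongoDB's actual index structure: the function index_entries and the sample documents are made up for illustration, and the model ignores details such as empty arrays or repeated values inside one array.

```python
def index_entries(documents, field):
    """Return the (key, doc_id) entries a single-field index on `field` would hold.

    Simplified model: one entry per array element for array values,
    one entry per document otherwise.
    """
    entries = []
    for doc_id, doc in enumerate(documents):
        value = doc.get(field)
        keys = value if isinstance(value, list) else [value]
        for key in keys:
            entries.append((key, doc_id))
    return entries


# Illustrative data: 4 documents, each with 3 hashtags.
documents = [
    {"epoch_creation_date": 1385000000 + i, "related_hashtags": ["a", "b", "c"]}
    for i in range(4)
]

scalar_entries = index_entries(documents, "epoch_creation_date")
multikey_entries = index_entries(documents, "related_hashtags")
print(len(scalar_entries), len(multikey_entries))  # 4 12

# The same ratio, computed from the nscanned values reported in the question.
print(round(306715 / 101226, 2))  # 3.03
```

A full scan of the array-field index walks three entries per document in this model, which matches the roughly threefold nscanned seen in queries 2 to 4.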

I believe the reason the third query is so slow is that a multikey index cannot support an indexOnly cursor, so MongoDB cannot use a covered query there.
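As a rough illustration of that point, the conditions for a covered (indexOnly) query can be written down as a small check. This is a simplified sketch of the rules as described in this answer, not MongoDB's planner; the function may_be_covered is hypothetical, and it assumes that a multikey index can never cover a query.

```python
def may_be_covered(filter_fields, projection, index_keys, is_multikey):
    """Rough check of whether an index could answer a query on its own.

    Assumed, simplified rules: the index is not multikey, every filtered and
    every returned field is an index key, and _id is excluded unless indexed.
    """
    if is_multikey:
        return False
    keys = set(index_keys)
    # _id is returned by default unless the projection switches it off.
    if projection.get("_id", 1) and "_id" not in keys:
        return False
    returned = {field for field, include in projection.items() if include}
    return set(filter_fields) <= keys and returned <= keys


projection = {"_id": 0, "related_hashtags": 1}
compound = ["epoch_creation_date", "related_hashtags"]

# Query 3: the compound index is multikey because related_hashtags is an array.
q3 = may_be_covered(["epoch_creation_date"], projection, compound, is_multikey=True)
# Query 1: the single-field index does not contain the returned field.
q1 = may_be_covered(["epoch_creation_date"], projection,
                    ["epoch_creation_date"], is_multikey=False)
# Hypothetical case: same compound index if related_hashtags were not an array.
scalar_case = may_be_covered(["epoch_creation_date"], projection, compound,
                             is_multikey=False)
print(q3, q1, scalar_case)  # False False True
```

The False result for query 1 is consistent with the "indexOnly" : false line in the explain() output above.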

Answer 1: 2, accepted, 2013-11-29 12:43:00

Answer 2: 0, 2013-11-29 12:53:24
