Mongoose / MongoDB: Counting elements in an array

Question

I am trying to use Mongoose to count the occurrences of strings in an array across my collection. My "schema" looks like this:

var ThingSchema = new Schema({
  tokens: [ String ]
});

My goal is to get the top 10 "tokens" in the "Thing" collection, where each document can contain multiple values. For example:

var documentOne = {
    _id: ObjectId('50ff1299a6177ef9160007fa')
  , tokens: [ 'foo' ]
}

var documentTwo = {
    _id: ObjectId('50ff1299a6177ef9160007fb')
  , tokens: [ 'foo', 'bar' ]
}

var documentThree = {
    _id: ObjectId('50ff1299a6177ef9160007fc')
  , tokens: [ 'foo', 'bar', 'baz' ]
}

var documentFour = {
    _id: ObjectId('50ff1299a6177ef9160007fd')
  , tokens: [ 'foo', 'baz' ]
}

...would give me the result:

{ foo: 4, bar: 2, baz: 2 }

I have been considering MapReduce and Aggregate for this, but I am not sure which is the better option.

Answer 1

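Aha, I found a solution. MongoDB's aggregation framework lets us run a series of operations on a collection. Of particular note is $unwind, which breaks an array in a document into separate documents, one per element, so that they can be grouped and counted together.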
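To make the effect of $unwind followed by $group concrete, here is a rough in-memory sketch in plain JavaScript using the four sample documents from the question. It does not touch MongoDB; `unwindTokens` and `countTokens` are hypothetical helper names that only mimic what the two stages do.

```javascript
// The four sample documents from the question (ids omitted).
const docs = [
  { tokens: ['foo'] },
  { tokens: ['foo', 'bar'] },
  { tokens: ['foo', 'bar', 'baz'] },
  { tokens: ['foo', 'baz'] },
];

// Mimics $unwind: emit one record per array element.
function unwindTokens(documents) {
  return documents.flatMap((doc) => doc.tokens.map((token) => ({ token })));
}

// Mimics $group with { $sum: 1 }: count records per distinct token.
function countTokens(records) {
  const counts = {};
  for (const { token } of records) {
    counts[token] = (counts[token] || 0) + 1;
  }
  return counts;
}

const tokenCounts = countTokens(unwindTokens(docs));
console.log(tokenCounts); // { foo: 4, bar: 2, baz: 2 }
```

The real pipeline does the same work on the server, so the documents never have to be loaded into the application.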

Mongoose exposes this on the model very easily. Using the schema from the question, it looks like this:

Thing.aggregate([
    { $match: { /* Query can go here, if you want to filter results. */ } }
  , { $project: { tokens: 1 } } /* pass only the tokens field on to the next stage in the chain */
  , { $unwind: '$tokens' } /* emit one document per array element so the elements can be counted */
  , { $group: { /* execute 'grouping' */
          _id: { token: '$tokens' } /* group by the token value */
        , count: { $sum: 1 } /* add 1 for every occurrence */
      }
    }
], function(err, tokenCounts) {
  if (err) return console.error(err);
  console.log(tokenCounts);
  // e.g. [ { _id: { token: 'foo' }, count: 4 },
  //        { _id: { token: 'bar' }, count: 2 },
  //        { _id: { token: 'baz' }, count: 2 } ]  (order is not guaranteed)
});

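In preliminary tests on roughly 200,000 records, this was noticeably faster than MapReduce, so it will probably scale better, though that is based only on a cursory look. YMMV.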
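The question asked for the top 10 tokens, while the pipeline above only counts them. A minimal sketch of one way to get there is to append $sort and $limit stages. It assumes a Mongoose version where `aggregate()` returns an object whose `exec()` returns a promise; `topTokensPipeline` and `topTokens` are hypothetical names, not part of the original answer.

```javascript
// Count every token, then keep the 10 most frequent.
const topTokensPipeline = [
  { $unwind: '$tokens' },                               // one document per token
  { $group: { _id: '$tokens', count: { $sum: 1 } } },   // count per distinct token
  { $sort: { count: -1 } },                             // most frequent first
  { $limit: 10 },                                       // keep the top 10
];

// Model is assumed to be a Mongoose model such as Thing.
function topTokens(Model) {
  return Model.aggregate(topTokensPipeline).exec();
}
```

Usage would look something like `topTokens(Thing).then(console.log)`, which should print documents of the form `{ _id: 'foo', count: 4 }`.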

Solution 1
22 · Accepted · 2013-02-05 19:58:14
