
Mongoose / MongoDB: count elements in array

I'm trying to count the number of occurrences of each string in an array field across my collection using Mongoose. My "schema" looks like this:

var ThingSchema = new Schema({
  tokens: [ String ]
});

My objective is to get the top 10 "tokens" in the "Thing" collection, where each document can contain multiple values. For example:

var documentOne = {
    _id: ObjectId('50ff1299a6177ef9160007fa')
  , tokens: [ 'foo' ]
}

var documentTwo = {
    _id: ObjectId('50ff1299a6177ef9160007fb')
  , tokens: [ 'foo', 'bar' ]
}

var documentThree = {
    _id: ObjectId('50ff1299a6177ef9160007fc')
  , tokens: [ 'foo', 'bar', 'baz' ]
}

var documentFour = {
    _id: ObjectId('50ff1299a6177ef9160007fd')
  , tokens: [ 'foo', 'baz' ]
}

...would give me this result:

{ foo: 4, bar: 2, baz: 2 }

I'm considering either MapReduce or the aggregation framework for this, but I'm not certain which is the better option.

Aha, I've found the solution. MongoDB's aggregation framework allows us to run a series of stages on a collection. Of particular note is $unwind, which breaks an array in a document into separate documents, one per array element, so they can be grouped and counted en masse.
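To make the effect of $unwind followed by $group concrete, here is a minimal in-memory sketch in plain JavaScript. It does not touch MongoDB at all; the helper names unwindTokens and countByToken are made up for illustration, and the sample data mirrors the four documents from the question with the ids left out.

```javascript
// In-memory stand-ins for the four example documents (ids omitted).
const things = [
  { tokens: ['foo'] },
  { tokens: ['foo', 'bar'] },
  { tokens: ['foo', 'bar', 'baz'] },
  { tokens: ['foo', 'baz'] }
];

// Rough equivalent of { $unwind: '$tokens' }:
// one output document per array element.
function unwindTokens(docs) {
  return docs.flatMap(doc =>
    doc.tokens.map(token => ({ ...doc, tokens: token }))
  );
}

// Rough equivalent of { $group: { _id: '$tokens', count: { $sum: 1 } } }:
// add 1 to a running total for every unwound document.
function countByToken(unwound) {
  const counts = new Map();
  for (const doc of unwound) {
    counts.set(doc.tokens, (counts.get(doc.tokens) || 0) + 1);
  }
  return counts;
}

const unwound = unwindTokens(things);
const counts = countByToken(unwound);

console.log(unwound.length);             // 8
console.log(Object.fromEntries(counts)); // { foo: 4, bar: 2, baz: 2 }
```

The real pipeline does this work on the server, so the documents never have to be loaded into the application just to be counted.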

MongooseJS exposes this directly on a model. Using the example above, it looks as follows:

Thing.aggregate([
    { $match: { /* Query can go here, if you want to filter results. */ } }
  , { $project: { tokens: 1 } } /* pass only the tokens field on to the next stage */
  , { $unwind: '$tokens' } /* emit one document per array element */
  , { $group: { /* group the unwound documents */
          _id: { token: '$tokens' } /* using the token value as the _id */
        , count: { $sum: 1 } /* add 1 for every document in the group */
      }
    }
  , { $sort: { count: -1 } } /* most frequent tokens first */
  , { $limit: 10 } /* keep only the top 10 */
]).exec().then(function(topTokens) {
  console.log(topTokens);
  // [ { _id: { token: 'foo' }, count: 4 },
  //   { _id: { token: 'bar' }, count: 2 },
  //   { _id: { token: 'baz' }, count: 2 } ]
  // (the order of tokens with equal counts may vary)
}, function(err) {
  console.error(err);
});
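The $group stage returns an array of documents shaped like { _id: { token: 'foo' }, count: 4 }, not a flat token-to-count object. If the flat shape from the question is what you want, something along these lines could reshape it on the client. This is only a sketch: topTokens is a hypothetical helper, and the sample rows are written by hand rather than taken from a real query.

```javascript
// Hypothetical helper: `rows` is assumed to look like the $group output,
// i.e. [{ _id: { token: 'foo' }, count: 4 }, ...].
function topTokens(rows, limit) {
  const result = {};
  rows
    .slice()                           // copy so the caller's array is not reordered
    .sort((a, b) => b.count - a.count) // highest count first
    .slice(0, limit)                   // keep at most `limit` entries
    .forEach(row => { result[row._id.token] = row.count; });
  return result;
}

// Sample rows written by hand for illustration, not real query output.
const sampleRows = [
  { _id: { token: 'bar' }, count: 2 },
  { _id: { token: 'foo' }, count: 4 },
  { _id: { token: 'baz' }, count: 2 }
];

console.log(topTokens(sampleRows, 10));
```

Sorting and limiting in the helper keeps it usable even when the pipeline itself returns unsorted groups, at the cost of transferring every group to the client first.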

In preliminary tests across ~200,000 records it is noticeably faster than MapReduce, and thus likely scales better, but this is only after a cursory glance. YMMV.
