Accessing external parameters and the original document in MongoDB group aggregation
In a collection with the following general structure:
{_id: 'id1', clientId: 'cid1', clientName:'Jon', item: 'item1', dateOfPurchase: '...'},
{_id: 'id2', clientId: 'cid1', clientName:'Jon', item: 'item2', dateOfPurchase: '...'},
{_id: 'id3', clientId: 'cid2', clientName:'Doe', item: 'itemX', dateOfPurchase: '...'}
... etc
The objective is to group by clientId in order to calculate some simple statistics, e.g. the total number of occurrences per clientId.
One way to achieve this with the Collection.group method of the Node.js MongoDB Driver API is:
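db.collection.group(
  'clientId',              // key to group by
  {},                      // condition: no filter
  { count: 0 },            // initial value of each group's accumulator
  function(obj, prev) {    // reducer: called once per document
    prev.count++;
  },
  true                     // run as a command
);
The output of this for the sample data above would be similar to: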
{clientId: 'cid1', count: 2}
{clientId: 'cid2', count: 1}
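To make the roles of the initial document and the reducer concrete, here is a minimal in-memory sketch of the grouping semantics. It is not the driver's implementation: `groupInMemory` and the `purchases` array are hypothetical names used only for illustration, and the dates are left out.

```javascript
// Hypothetical sample data mirroring the structure above (dates omitted).
const purchases = [
  { _id: 'id1', clientId: 'cid1', clientName: 'Jon', item: 'item1' },
  { _id: 'id2', clientId: 'cid1', clientName: 'Jon', item: 'item2' },
  { _id: 'id3', clientId: 'cid2', clientName: 'Doe', item: 'itemX' }
];

// Illustrative stand-in for group(): one accumulator per distinct key value,
// seeded with a copy of `initial` and updated by `reduce(doc, accumulator)`.
function groupInMemory(docs, key, initial, reduce) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc[key])) {
      groups.set(doc[key], Object.assign({ [key]: doc[key] }, initial));
    }
    reduce(doc, groups.get(doc[key]));
  }
  return Array.from(groups.values());
}

const counts = groupInMemory(purchases, 'clientId', { count: 0 }, function (obj, prev) {
  prev.count++;
});
console.log(counts);
```

The reducer only ever sees the current document and the accumulator, which is why both questions below are about getting extra information in (external values) or out (extra fields).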
Question 1: What is the best way to pass external values to the reducer function? For example, I may want to calculate separate counts for purchases made before and after a specific date, and pass that date in as a parameter. I know that mapReduce offers the scope option for this purpose; I'm wondering whether there is a way to do the same with the group function. I could use the iterator object, but that feels hacky.
Question 2: Is there a way to access the original document from inside the finalize function in order to include some extra data in the results? That is, to project extra fields from the original documents, such as clientName:
{clientId: 'cid1', count: 2, clientName: 'Jon'}
{clientId: 'cid2', count: 1, clientName: 'Doe'}
Clarifications for Question 2:
a) I could add the extra field inside the reducer function, but it feels redundant to include code that does not need to run on every iteration.
b) I could use an aggregation pipeline to achieve something like this, but I'm wondering whether it can be done with Collection.group.
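For comparison, alternative (b) might look like the sketch below. It assumes the aggregation framework's `$group` stage with the `$sum` and `$first` accumulators; `pipeline` is just an illustrative variable name. `$first` takes the client name from whichever document of the group is seen first, which is acceptable here only because every document of a client is assumed to carry the same name.

```javascript
// Group by clientId, count the documents, and carry clientName along.
const pipeline = [
  {
    $group: {
      _id: '$clientId',
      count: { $sum: 1 },
      clientName: { $first: '$clientName' }
    }
  },
  // Rename _id back to clientId to match the desired output shape.
  { $project: { _id: 0, clientId: '$_id', count: 1, clientName: 1 } }
];

// With a connected driver this would be run roughly as:
//   db.collection.aggregate(pipeline).toArray(callback);
console.log(JSON.stringify(pipeline, null, 2));
```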
While digging through the documentation I found an answer to Question 1: use the Code class for the reducer function. The Code constructor takes a second argument that works exactly like scope in mapReduce, e.g.:
const Code = require('mongodb').Code;

// myLimit is not defined here; it is supplied through the Code scope below
const myFunction = function(obj, prev) {
  if (prev.count < myLimit)
    prev.count++;
};

db.collection.group(
  'clientId',
  {},
  { count: 0 },
  new Code(myFunction, { myLimit: 5 }),
  true
);
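Applied to the before/after-date case from Question 1, the same idea could look like the sketch below. The names `countByDate`, `cutoff`, `before`, `after` and `bindScope` are hypothetical, and `dateOfPurchase` is assumed to hold a `Date`. `bindScope` is only a local emulation of how scope variables become visible to the function, so the reducer can be tried without a server; with the driver, the function would instead be wrapped as `new Code(countByDate, { cutoff: someDate })` in the same way as above.

```javascript
// Reducer that relies on a free variable `cutoff`, to be supplied via scope.
const countByDate = function (obj, prev) {
  if (obj.dateOfPurchase < cutoff) {
    prev.before++;
  } else {
    prev.after++;
  }
};

// Local emulation only: rebuild the function from its source text with the
// scope entries in lexical reach.
function bindScope(fn, scope) {
  const names = Object.keys(scope);
  const values = names.map(function (name) { return scope[name]; });
  return new Function(...names, 'return (' + fn.toString() + ');')(...values);
}

const reducer = bindScope(countByDate, { cutoff: new Date('2020-01-01') });
const acc = { before: 0, after: 0 }; // plays the role of the initial document
[new Date('2019-06-01'), new Date('2020-03-15'), new Date('2021-01-01')]
  .forEach(function (d) { reducer({ dateOfPurchase: d }, acc); });
console.log(acc);
```

In the real call, the initial document would then be `{ before: 0, after: 0 }` rather than `{ count: 0 }`.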