Accessing external parameters and document in mongodb group aggregation

In a collection with the following general structure:

{_id: 'id1', clientId: 'cid1', clientName:'Jon', item: 'item1', dateOfPurchase: '...'},
{_id: 'id2', clientId: 'cid1', clientName:'Jon', item: 'item2', dateOfPurchase: '...'},
{_id: 'id3', clientId: 'cid2', clientName:'Doe', item: 'itemX', dateOfPurchase: '...'}
... etc

The objective is to group by clientId and calculate some simple statistics, e.g. the total number of occurrences per clientId.

One way to achieve this using the Node.js MongoDB Driver API Collection.group method is:

db.collection.group(
    'clientId',
    {},
    { count: 0 },
    function(obj, prev) {
        prev.count++;
    },
    true
);

For the sample data above, the output would be similar to:

{clientId: 'cid1', count: 2}
{clientId: 'cid2', count: 1}
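
To make the grouping semantics concrete, here is a minimal local sketch in plain JavaScript that mimics how the reducer accumulates into a per-key object. The helper name emulateGroup is hypothetical and this is not how the driver or the server implements group; it only reproduces the result for the sample data.

```javascript
// Local emulation of the grouping semantics; NOT the driver's implementation.
// `emulateGroup` is a hypothetical helper introduced only for illustration.
function emulateGroup(docs, key, initial, reduce) {
    const groups = new Map();
    for (const doc of docs) {
        const k = doc[key];
        if (!groups.has(k)) {
            // Each group starts from its own shallow copy of the initial accumulator.
            groups.set(k, Object.assign({ [key]: k }, initial));
        }
        // The reducer mutates the accumulator in place, as in the question.
        reduce(doc, groups.get(k));
    }
    return Array.from(groups.values());
}

const sampleDocs = [
    { _id: 'id1', clientId: 'cid1', clientName: 'Jon', item: 'item1' },
    { _id: 'id2', clientId: 'cid1', clientName: 'Jon', item: 'item2' },
    { _id: 'id3', clientId: 'cid2', clientName: 'Doe', item: 'itemX' }
];

const grouped = emulateGroup(sampleDocs, 'clientId', { count: 0 }, function (obj, prev) {
    prev.count++;
});

console.log(grouped);
```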

Question 1: what is the best way to pass some external values to the reducer function? For example, I may want to calculate different counts for purchases made before/after a specific date, and I want to pass this date as a parameter. I know that with mapReduce I can use the scope option for this purpose. I'm wondering if there's a way to do this with the group function. I could use the iterator object, but it feels hacky.

Question 2: is there a way to access the original document from inside the finalize function in order to include some extra data in the results? That is, to project extra fields from the original documents, such as clientName:

{clientId: 'cid1', count: 2, clientName: 'Jon'}
{clientId: 'cid2', count: 1, clientName: 'Doe'}

Clarifications for Question 2: a) I could add the extra field inside the reducer function, but it feels redundant to include code that does not need to run on every iteration. b) I could use aggregation pipelines to achieve something like this, but I'm wondering whether I can do it with Collection.group here.
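
For comparison, the aggregation pipeline alternative mentioned in b) could look roughly like the sketch below. It is not the Collection.group approach asked about. The function name buildClientStatsPipeline, the cutoffDate parameter and the countBefore field are hypothetical, and the date comparison assumes dateOfPurchase is stored as a Date.

```javascript
// A sketch of an aggregation pipeline; all names introduced here are illustrative.
// `cutoffDate` is an external value embedded directly into the pipeline document.
function buildClientStatsPipeline(cutoffDate) {
    return [
        {
            $group: {
                _id: '$clientId',
                // Total number of documents per clientId.
                count: { $sum: 1 },
                // $first keeps the value from the first document seen in each group.
                clientName: { $first: '$clientName' },
                // Adds 1 only for purchases strictly before the cutoff
                // (assumes dateOfPurchase is stored as a Date).
                countBefore: {
                    $sum: { $cond: [{ $lt: ['$dateOfPurchase', cutoffDate] }, 1, 0] }
                }
            }
        },
        // Rename _id back to clientId to match the desired output shape.
        { $project: { _id: 0, clientId: '$_id', count: 1, clientName: 1, countBefore: 1 } }
    ];
}

const pipeline = buildClientStatsPipeline(new Date('2020-01-01'));
console.log(JSON.stringify(pipeline, null, 2));
```

The resulting array would then be passed to the collection's aggregate method. Because the pipeline is plain data, external parameters need no special scope mechanism.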

While digging around the documentation I found an answer to Question 1, which is to use the Code class for the reducer function. The Code constructor takes a second argument that functions exactly like scope in mapReduce, e.g.:

const myFunction = function(obj, prev) {
    if (prev.count < myLimit) // myLimit is available here because it is defined in the Code initialization below
        prev.count++;
}

const Code = require('mongodb').Code;
db.collection.group(
    'clientId',
    {},
    { count: 0 },
    new Code(myFunction, { myLimit: 5 }),
    true
);
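
To illustrate why a scope object is needed, the sketch below emulates the effect locally. The reducer is rebuilt from its source text, and the scope entries supply the variables it refers to but does not define. The names countByCutoff, bindScope and cutoff, and the ISO date strings, are all hypothetical. This only mimics the behaviour and does not show how the server actually evaluates a Code object.

```javascript
// Reducer with a free variable `cutoff`; it is not defined in this file,
// so calling countByCutoff directly would throw a ReferenceError.
function countByCutoff(obj, prev) {
    if (obj.dateOfPurchase < cutoff) {
        prev.before++;
    } else {
        prev.after++;
    }
}

// Hypothetical helper: rebuilds `fn` from its source text inside a wrapper
// whose parameters are the scope keys, so free variables resolve to scope values.
function bindScope(fn, scope) {
    const names = Object.keys(scope);
    const values = names.map((name) => scope[name]);
    return new Function(...names, 'return (' + fn.toString() + ');')(...values);
}

// ISO date strings compare correctly as plain strings (assumed format).
const reducer = bindScope(countByCutoff, { cutoff: '2020-06-01' });

const acc = { before: 0, after: 0 };
[
    { clientId: 'cid1', dateOfPurchase: '2020-01-15' },
    { clientId: 'cid1', dateOfPurchase: '2020-08-03' },
    { clientId: 'cid1', dateOfPurchase: '2020-03-22' }
].forEach((doc) => reducer(doc, acc));

console.log(acc);
```

With the driver, the corresponding call would pass the same reducer and scope as new Code(countByCutoff, { cutoff: ... }), following the pattern shown above.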
