Is it possible to iterate over mapReduce in MongoDB?

I am using mapReduce in MongoDB to generate the trending songs for a user from his/her friends network. To do that, I go over all users and check whether the user_id exists in their friends array. If it does, I emit their songs, then merge all the emitted songs to find the top songs across that user's friends network.

The problem is that I need to repeat this for every user in the collection, to find the network trending songs for each of them. How can I accomplish this? Is there something like a nested mapReduce, or do I have to iterate from the application layer, for example by executing mapReduce in a for loop?

The mapReduce I am currently using is this one:

var map = function() {
    // user_id is not a local variable: it is injected through the "scope" option of the command
    var friends = this.value.friends;
    var songs = this.value.songs;
    if (friends !== undefined && friends.length !== 0 && songs !== undefined && songs.length !== 0) {
        for (var x = 0; x < songs.length; x++) {
            // songs[x][0] is used as the song id, songs[x][1] as its play count
            emit({user_id: user_id, song_id: songs[x][0]}, {played: songs[x][1], counter: 1});
        }
    }
};
var reduce = function(key, values) {
    var counter = 0;
    var played = 0;
    values.forEach(function(val){
        counter += val.counter;
        played += val.played;
    });
    return {played : played, counter : counter};
};
db.runCommand({"mapreduce":"trending_users", "map":map, "reduce":reduce, "scope":{user_id: "111222333444"} ,"query":{'value.friends':{$in : ['111222333444'] }},'out':{merge:'trending_user_network'}})    
db.trending_user_network.find({'_id.user_id':'111222333444'}).sort({'value.counter':-1, 'value.played':-1})

You could certainly use a for loop in your application to cycle over the user IDs and run your map-reduce for each one. However, for something like this, you might have better luck using the aggregation framework to build a pipeline of aggregate operations that does it all at once.
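As a rough illustration of the for-loop option, the sketch below issues the question's command once per user ID. The helper name `runNetworkTrendingForUsers` is hypothetical, and `db` is assumed to be the mongo shell database handle (or any object exposing `runCommand`); `map` and `reduce` are the functions from the question.

```javascript
// Hypothetical helper: runs the question's mapReduce command once per user ID.
// Assumes `db` exposes runCommand(), as the mongo shell's db object does.
function runNetworkTrendingForUsers(db, userIds, map, reduce) {
  return userIds.map(function (userId) {
    return db.runCommand({
      mapreduce: "trending_users",
      map: map,
      reduce: reduce,
      // user_id becomes visible inside map() through the scope option
      scope: { user_id: userId },
      // only documents whose friends array contains this user
      query: { "value.friends": { $in: [userId] } },
      // merge keeps results already written for other users
      out: { merge: "trending_user_network" }
    });
  });
}

// Possible usage in the mongo shell (assumes user IDs live in _id.user_id):
//   var ids = db.trending_users.distinct("_id.user_id");
//   runNetworkTrendingForUsers(db, ids, map, reduce);
```

This runs one full map-reduce job per user, so the cost grows with the number of users.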

I don't know the precise details of your schema, but I think you could build an aggregation pipeline along these lines:

  • $unwind to get a flat list of users mapped to their friends' user IDs
  • $unwind again to map the friends' user IDs to their list of songs
  • $group to get the aggregates of each song in the resulting list
  • $sort to put the results in order

In reality your pipeline might require a few more steps, but I think that if you look at this problem in terms of aggregation rather than map-reduce, it will be easier.
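The four stages listed above might look like the sketch below. It assumes the document shape implied by the question's map function: `value.friends` is an array of user IDs and `value.songs` is an array of `[song_id, played]` pairs. The helper name `buildNetworkTrendingPipeline` is hypothetical, and `$arrayElemAt` requires MongoDB 3.2 or later.

```javascript
// Hypothetical helper: returns an aggregation pipeline for the assumed schema.
function buildNetworkTrendingPipeline() {
  return [
    // one document per (user, friend) pair
    { $unwind: "$value.friends" },
    // one document per (user, friend, song) triple; each song is [song_id, played]
    { $unwind: "$value.songs" },
    // credit each song to the friend whose network contains this user
    { $group: {
        _id: {
          user_id: "$value.friends",
          song_id: { $arrayElemAt: ["$value.songs", 0] }
        },
        played: { $sum: { $arrayElemAt: ["$value.songs", 1] } },
        counter: { $sum: 1 }
    } },
    // per user: most widely shared songs first, then most played
    { $sort: { "_id.user_id": 1, counter: -1, played: -1 } }
  ];
}

// Possible usage in the mongo shell:
//   db.trending_users.aggregate(buildNetworkTrendingPipeline());
```

Grouping on the unwound friend ID yields the same `{user_id, song_id}` key that the map function emits, but for every user in a single pass instead of one job per user.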
