MongoDB Map Reduce returning unexpected results in high data volume

I am new to PHP as well as MongoDB. I have a data set of 80,000 records, and this is a local deployment.

My data structure is simple:

(
    [_id] => MongoId Object
        (
            [$id] => 53c146aebc7d867d058b94b3
        )

    [name] => Mark
    [txnType] => Borrowed
    [amount] => 5876
)

I am running a Map Reduce job as defined below:

$map = new MongoCode("function ()
{
    { 
        emit({name:this.name,type:this.txnType},this.amount);
    }
}");
$reduce = new MongoCode("
    function (key, values)
    {
        var total=0;
        var count=0;
        for (var i in values) { 
            if (!isNaN(values[i])) {
                total+=values[i];
            };
            count++;
        }
        return {total:total, count:count};
    }
    ");

$sales =  $db->command(array(
    "mapreduce" => "data", 
    "map" => $map,
    "reduce" => $reduce,
    "out" => "sales"
    ));

The concept is basically that there are 4 guys who may have transactions of type Borrowed, Sold, Purchase and Lent. Each record represents one transaction.

I just want to create a data pivot that returns the data as:

Name : Type : Total Amount : Count of Txns

Somehow the data that comes out is messed up. The counts, when added up, should total 80000, but instead they add up to only 216.

I am not able to understand why this is happening. Can anyone please help me? Where am I going wrong, and what should I correct?

My need is basically to draw up analytics for the transactions.

The problem is that your emit is not outputting the same format as your reduce.

Here is what you emit for value:

this.amount

Here is what you return from reduce:

return {total:total, count:count};

In order for reduce to work correctly when it re-reduces (remember, reduce may be called zero, one, or multiple times for the same key), you must emit this format:

emit({name:this.name,type:this.txnType},{ total: this.amount, count: 1} );

And therefore the body of your reduce function should now be:

    var total=0;
    var count=0;
    for (var i in values) { 
        // each value is now an object of the form {total: ..., count: ...}
        if (!isNaN(values[i].total)) {
            total+=values[i].total;
        };
        count+=values[i].count;
    }
    return {total:total, count:count};

The two most important rules of mapReduce in MongoDB:

  1. emit the value in exactly the same format as your reduce function returns

  2. structure reduce so that it can be called zero, one, or multiple times for each key
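
To see why both rules matter, here is a minimal sketch in plain JavaScript, run outside MongoDB, that mimics a re-reduce by reducing two partial batches and then reducing their results. The helper `simulateReReduce`, the function names `brokenReduce` and `fixedReduce`, and the sample amounts are hypothetical and only illustrate the behavior; the server picks its own batch sizes.

```javascript
// Reduce from the question: assumes every value is a plain number.
function brokenReduce(key, values) {
    var total = 0;
    var count = 0;
    for (var i in values) {
        if (!isNaN(values[i])) {
            total += values[i];
        }
        count++;
    }
    return {total: total, count: count};
}

// Corrected reduce: values have the same {total, count} shape it returns.
function fixedReduce(key, values) {
    var total = 0;
    var count = 0;
    for (var i in values) {
        if (!isNaN(values[i].total)) {
            total += values[i].total;
        }
        count += values[i].count;
    }
    return {total: total, count: count};
}

// Hypothetical helper: reduce two batches separately, then reduce the partial results.
function simulateReReduce(reduce, key, batchA, batchB) {
    return reduce(key, [reduce(key, batchA), reduce(key, batchB)]);
}

var key = {name: "Mark", type: "Borrowed"};
var amounts = [100, 200, 300, 400];

// Question's version: map emits bare amounts.
var broken = simulateReReduce(brokenReduce, key, amounts.slice(0, 2), amounts.slice(2));

// Corrected version: map emits {total: amount, count: 1}.
var emitted = amounts.map(function (a) { return {total: a, count: 1}; });
var fixed = simulateReReduce(fixedReduce, key, emitted.slice(0, 2), emitted.slice(2));

console.log(broken);
console.log(fixed);
```

In this sketch the question's version ends up with total 0 and count 2, because the second pass receives objects rather than numbers and merely counts the two partial results. The corrected version gives total 1000 and count 4. This is consistent with the counts in the question collapsing to a small number instead of 80000.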

Note that you can perform the same aggregation much more efficiently with the Aggregation Framework, like so:

db.collection.aggregate([ {$group: 
    { _id : {name: "$name", type: "$txnType"},
      total: {$sum: "$amount"},
      count: {$sum: 1}
    }
} ])
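
For intuition about what this `$group` stage computes, a rough in-memory equivalent in plain JavaScript might look like the sketch below. The function `groupTransactions` and the sample records are hypothetical, the sketch assumes every `amount` is numeric, and it is not how MongoDB executes the pipeline internally.

```javascript
// Hypothetical in-memory equivalent of the $group stage, for illustration only.
function groupTransactions(records) {
    var groups = {};
    records.forEach(function (r) {
        // Composite key built from name and txnType, like the _id in $group.
        var k = JSON.stringify([r.name, r.txnType]);
        if (!groups[k]) {
            groups[k] = {_id: {name: r.name, type: r.txnType}, total: 0, count: 0};
        }
        groups[k].total += r.amount; // corresponds to {$sum: "$amount"}
        groups[k].count += 1;        // corresponds to {$sum: 1}
    });
    // Return one result document per distinct (name, type) pair.
    return Object.keys(groups).map(function (k) { return groups[k]; });
}

// Made-up sample records in the shape shown in the question.
var sample = [
    {name: "Mark", txnType: "Borrowed", amount: 5876},
    {name: "Mark", txnType: "Borrowed", amount: 124},
    {name: "Mark", txnType: "Sold", amount: 50}
];

var result = groupTransactions(sample);
console.log(result);
```

Each output document carries the name, the type, the total amount and the transaction count, which is the pivot the question asks for.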
