
When exactly does the combiner run for each mapper output?

Original 2014-01-29 16:58:28 2 1 hadoop/ mapreduce

When exactly does the combiner run? Even if you set the combiner class in your driver code, it is still up to Hadoop to decide whether it runs on each mapper's output. Could you please explain on what basis Hadoop decides whether to execute the combiner? Is there a rule of thumb, an equation, or a formula?

1 answer

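The combiner runs on the map side, after the mapper and before the reducer. It can be seen as part of the map task, so whenever it has run, the input of the reducer is the output of the combiner. It is an optimization, though, not a guaranteed step: Hadoop may call it zero, one, or several times for a given map output. In the standard implementation it is applied each time the in-memory map output buffer is spilled to disk, and again when the spill files are merged, provided there are at least `mapreduce.map.combine.minspills` of them (3 by default; the property was called `min.num.spills.for.combine` in older releases). It may also be applied while map outputs are merged on the reduce side. A job consists of many map tasks, and the combiner works on each task's output separately, which may be what confused you. It acts as a "mini-reducer": it groups all the values that share a key, but only within the data output by a single map task, not across all the data as the reducer does. Because the number of invocations is not fixed, the combiner must be written so that the final result is the same however often it runs.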
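As the question notes, Hadoop decides whether the combiner actually runs, so a job has to produce the same result with or without it. The following is a minimal plain-Java sketch of that property for a word count with a summing combiner. It does not use the Hadoop API: the class `CombinerSketch`, the methods `sumByKey` and `mapSide`, and the `spillSize` parameter are hypothetical names chosen for illustration, and the "spills" here are just fixed-size chunks of one map task's output.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CombinerSketch {

    // Hypothetical stand-in for a summing combiner or reducer:
    // groups records by key and adds up their values.
    public static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> records) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> record : records) {
            totals.merge(record.getKey(), record.getValue(), Integer::sum);
        }
        return totals;
    }

    // Simulates one map task: emits (word, 1) for every word, cut into
    // "spills" of spillSize records. The combiner is applied to each spill
    // only when runCombiner is true.
    public static List<Map.Entry<String, Integer>> mapSide(
            List<String> words, int spillSize, boolean runCombiner) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (int start = 0; start < words.size(); start += spillSize) {
            int end = Math.min(start + spillSize, words.size());
            List<Map.Entry<String, Integer>> spill = new ArrayList<>();
            for (String word : words.subList(start, end)) {
                spill.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
            if (runCombiner) {
                // Pre-aggregate within this spill only, like a "mini-reducer".
                emitted.addAll(sumByKey(spill).entrySet());
            } else {
                emitted.addAll(spill);
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "a", "b");

        List<Map.Entry<String, Integer>> raw = mapSide(words, 3, false);
        List<Map.Entry<String, Integer>> combined = mapSide(words, 3, true);

        // Fewer records reach the reducer when the combiner has run...
        System.out.println(raw.size() + " records without combiner, "
                + combined.size() + " with combiner");
        // ...but the reducer's result is the same either way.
        System.out.println(sumByKey(raw));
        System.out.println(sumByKey(combined));
    }
}
```

A sum works here because addition is commutative and associative. An operation such as an average would give different results depending on how often the combiner ran, so it cannot be used as a combiner in this direct form.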

See this Yahoo Tutorial for more details.

Mapper output doubled in combiner

When Exactly Combiner is called in MapReduce?

Definitive source for when Hadoop MapReduce Runs a Combiner

where does hadoop store the output files of mapper, partitioner and combiner?

Difference between combiner and in-mapper combiner in mapreduce?

Does combiner runs conditionally

What exactly is output of mapper and reducer function

How to use combiner, when the output VALUE of reducer is null?

What runs first: the partitioner or the combiner?

Why combiner output records = 0?




solution1 0 2014-01-30 07:51:17
