
When exactly does the combiner run for each mapper output?

Original post: 2014-01-29 16:58:28 | Tags: hadoop, mapreduce

When exactly does the combiner run? Even though you specify the combiner class in your driver code, it is still up to Hadoop to decide whether to run it on each mapper's output. Could you please explain on what basis Hadoop decides whether to execute the combiner? Is there a rule of thumb, an equation, or a formula?

1 answer

The combiner runs after the mapper and before the reducer, on the map side. It can be seen as part of the map phase, so wherever a combiner has been applied, the input of the reducer is actually the output of the combiners. It acts as a "mini-reducer": it groups all the values that share a key, but only within the output of a single map task, not across all the data as the reducer does. A job usually consists of many map tasks, each with its own output, which may be what got you confused.

Note that Hadoop does not promise to run the combiner exactly once per mapper output. It treats the combiner as an optimization and may apply it zero, one, or several times: typically when a map task spills its in-memory buffer to disk, and again when the spill files are merged, provided there are at least mapreduce.map.combine.minspills spills (3 by default). So there is no single formula to rely on. The combiner has to be written so that the job produces the same result no matter how often it runs.
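To make the "mini-reducer" idea concrete, here is a minimal sketch in plain Java. It does not use the Hadoop API at all: the class CombinerSketch and its methods map, combine, reduce and run are hypothetical names invented for this illustration, and word count is assumed as the job. Because Hadoop does not promise how many times a combiner is applied, the sketch applies it zero, one, or two times per map task and shows that a sum-style combiner gives the same final result each time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of combiner semantics (hypothetical names, no Hadoop classes).
final class CombinerSketch {

    // One intermediate (key, value) record.
    static final class Pair {
        final String key;
        final int value;

        Pair(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    // "Mapper": emits one (word, 1) pair per whitespace-separated token.
    static List<Pair> map(String split) {
        List<Pair> out = new ArrayList<>();
        for (String word : split.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new Pair(word, 1));
            }
        }
        return out;
    }

    // "Combiner": sums values per key, but only within ONE map task's output.
    static List<Pair> combine(List<Pair> pairs) {
        Map<String, Integer> sums = new TreeMap<>();
        for (Pair p : pairs) {
            sums.merge(p.key, p.value, Integer::sum);
        }
        List<Pair> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : sums.entrySet()) {
            out.add(new Pair(e.getKey(), e.getValue()));
        }
        return out;
    }

    // "Reducer": sums values per key across the outputs of ALL map tasks.
    static Map<String, Integer> reduce(List<List<Pair>> mapOutputs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (List<Pair> output : mapOutputs) {
            for (Pair p : output) {
                totals.merge(p.key, p.value, Integer::sum);
            }
        }
        return totals;
    }

    // Runs the pipeline, applying the combiner combinerRuns times per map task.
    static Map<String, Integer> run(List<String> splits, int combinerRuns) {
        List<List<Pair>> mapOutputs = new ArrayList<>();
        for (String split : splits) {
            List<Pair> pairs = map(split);
            for (int i = 0; i < combinerRuns; i++) {
                pairs = combine(pairs);
            }
            mapOutputs.add(pairs);
        }
        return reduce(mapOutputs);
    }

    public static void main(String[] args) {
        List<String> splits = Arrays.asList("a b a", "b a c");
        System.out.println(run(splits, 0)); // combiner never applied
        System.out.println(run(splits, 1)); // applied once per map task
        System.out.println(run(splits, 2)); // applied twice per map task
    }
}
```

All three lines print {a=3, b=2, c=1}. The only thing the combiner changes is how many records travel from the map side to the reducer: for the split "a b a" it shrinks three pairs to two. A combiner that is not safe to repeat, for example one that computes an average directly, would not pass this check.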

See this Yahoo Tutorial for more details.
