
How to disable the Hadoop combiner?

In the WordCount example, the combiner is explicitly set with:

job.setCombinerClass(IntSumReducer.class);

I would like to disable the combiner so that the mapper output is not processed by it. Is there a way to do that using the MR config files (i.e. without modifying and recompiling the WordCount code)?

Thanks

Suppose this is your command line:

hadoop jar your_hadoop_job.jar your_mr_driver \
-libjars all_your_dependency_jars \
command_line_arg1 command_line_arg2 command_line_arg3

Here, the following parameters

  • command_line_arg1
  • command_line_arg2
  • command_line_arg3

will be passed to your main method as arg[0], arg[1], and arg[2], respectively. Assume arg[0] and arg[1] identify the input and output folders. You can then use arg[2] to pass a boolean flag (such as '1', 'true', or 'yes') that tells the driver whether to set the combiner. In the example below, the combiner class is not set by default:

// Set the combiner only when the third argument is 1, true, yes, or y (case-insensitive).
if (arg.length > 2 && arg[2].matches("(?i)1|true|yes|y")) {
    job.setCombinerClass(IntSumReducer.class);
}
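The flag handling can also be pulled into a small standalone helper so it can be checked without a cluster. This is only a sketch: the class name CombinerFlag, the method useCombiner, and the list of accepted values are hypothetical and not part of Hadoop. A substring-based check would also accept the empty string or single letters such as "e", so the sketch compares against whole values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class CombinerFlag {
    // Hypothetical list of values treated as "on"; adjust to taste.
    private static final List<String> TRUE_VALUES = Arrays.asList("1", "true", "yes", "y");

    /** Returns true only when a third argument exists and is one of the accepted values. */
    public static boolean useCombiner(String[] args) {
        if (args.length < 3) {
            return false; // flag omitted: run without a combiner
        }
        String flag = args[2].trim().toLowerCase(Locale.ROOT);
        return TRUE_VALUES.contains(flag);
    }

    public static void main(String[] args) {
        // In a real driver, this is the point where
        // job.setCombinerClass(IntSumReducer.class) would be called when the flag is on.
        System.out.println("combiner enabled: " + useCombiner(args));
    }
}
```

In the driver, the condition would then read if (CombinerFlag.useCombiner(arg)) { job.setCombinerClass(IntSumReducer.class); }, which leaves the combiner off whenever the third argument is missing or unrecognized.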
