
Combiner without Reducer in Hadoop

Can I write a Hadoop job that has only Mappers and Combiners (i.e. mini-reducers), with no Reducer?

job.setMapperClass(WordCountMapper.class);
job.setCombinerClass(WordCountReducer.class);

conf.setInt("mapred.reduce.tasks", 0);

I tried this, but I always see one launched reduce task on the job tracker page:

Launched reduce tasks = 1

How can I remove the reducers while keeping the combiners? Is that possible?

You need to tell your job that you don't care about the reducer: see JobConf.html#setNumReduceTasks(int)

// old Hadoop API (org.apache.hadoop.mapred.JobConf)
jobConf.setNumReduceTasks(0);

// new Hadoop API (org.apache.hadoop.mapreduce.Job)
job.setNumReduceTasks(0);
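
For reference, a complete map-only driver along these lines might look like the following sketch. It assumes Hadoop 2.x (the new org.apache.hadoop.mapreduce API) and reuses the WordCountMapper / WordCountReducer classes from the question; the class name MapOnlyWordCount and the Text/IntWritable output types are assumptions.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapOnlyWordCount {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "map-only word count");
        job.setJarByClass(MapOnlyWordCount.class);

        job.setMapperClass(WordCountMapper.class);    // mapper from the question
        job.setCombinerClass(WordCountReducer.class); // combiner from the question

        // Zero reduce tasks: map output is written straight to the output format
        job.setNumReduceTasks(0);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

One caveat: the combiner only runs during the map-side sort/spill of output destined for reducers, and a map-only job skips that phase entirely, so with zero reduce tasks a configured combiner is never actually invoked.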

You can achieve the same thing with IdentityReducer.

Performs no reduction, writing all input values directly to the output.

I'm not sure whether you can keep the combiners, but I would start with the lines above.
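
As a sketch of that approach (assuming the old org.apache.hadoop.mapred API, where IdentityReducer lives, and old-API versions of the question's mapper and combiner classes):

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityReducer;

// Keep one reduce task, but make it a pure pass-through: the combiner
// still runs on the map side, and IdentityReducer writes its input
// key/value pairs to the output unchanged.
JobConf conf = new JobConf();
conf.setMapperClass(WordCountMapper.class);    // mapper from the question
conf.setCombinerClass(WordCountReducer.class); // combiner from the question
conf.setReducerClass(IdentityReducer.class);
conf.setNumReduceTasks(1);

With the new API there is no IdentityReducer class; the base org.apache.hadoop.mapreduce.Reducer already passes each key/value pair through unchanged, so leaving the reducer unset (or setting Reducer.class explicitly) has the same effect.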

In the case you describe, you should use reducers. Use as key: Context.getInputSplit().getPath() + Context.getInputSplit().getStart() - this combination is unique for each mapper.
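
A sketch of that keying scheme (getInputSplit() returns a generic InputSplit, so the cast to FileSplit assumes a file-based input format such as TextInputFormat; the class name SplitKeyedMapper is a placeholder):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class SplitKeyedMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private Text splitKey;

    @Override
    protected void setup(Context context) {
        // Path + start offset is unique per input split, hence per map task
        FileSplit split = (FileSplit) context.getInputSplit();
        splitKey = new Text(split.getPath().toString() + ":" + split.getStart());
    }

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // All records from this mapper share one key, so each reduce group
        // corresponds to exactly one mapper's output
        context.write(splitKey, new IntWritable(1));
    }
}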
