
Different context types in Hadoop for mapper and combiner

Hello, I'm trying to implement a Java Hadoop application. I want the mapper to be <Object, Text, NaicsAreaPair, LongWritable> (so the mapper outputs NaicsAreaPair as the key and LongWritable as the value). Then I need the combiner to be <NaicsAreaPair, LongWritable, Text, AreaWagePair>, so that its input matches the mapper output but its output differs from the mapper output.

I have this configuration in my main class:

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "NY statistics");
    job.setJarByClass(NYStatisticsOwnWritableComparable.class);
    // Classes for the map, combine and reduce phases
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(Combiner.class);
    job.setReducerClass(IntSumReducer.class);
    // Output key/value types declared for the job
    job.setOutputKeyClass(NaicsAreaPair.class);
    job.setOutputValueClass(LongWritable.class);
    //job.setPartitionerClass(Rozdelovac.class);
    // Input and output paths come from the command line
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    //job.setNumReduceTasks(3);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}

Here I have to specify which output key and output value classes will be used. Is there any way to configure one output key/value type for the mapper and a different one for the combiner?

Thanks a lot for your answer.

No, it is not possible. The combiner's output key/value types MUST be the same as the mapper's output key/value types.

Why do you want to use a combiner for this? Combiners exist for performance: they reduce the amount of data sent over the network. They come with several limitations: the combiner's input and output key/value types must both match the mapper's output key/value types (which are also the reducer's input key/value types), and the function it performs should be associative and commutative. See an example here: http://www.philippeadjiman.com/blog/2010/01/14/hadoop-tutorial-series-issue-4-to-use-or-not-to-use-a-combiner/
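
To make the associativity point concrete, here is a minimal sketch in plain Java with no Hadoop dependencies. It assumes that the framework is free to apply a combiner zero, one, or several times to partial groups of values, which is why the result must not depend on how the values were grouped. The class and method names (`CombinerContractSketch`, `sumCombine`, `meanCombine`) are hypothetical and exist only for this illustration.

```java
import java.util.List;

final class CombinerContractSketch {

    // Modeled combiner: many values in, one value of the SAME type out.
    // Summation is associative and commutative, so partial sums are safe.
    static long sumCombine(List<Long> values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    // Averaging is NOT safe as a combiner: a mean of partial means
    // generally differs from the mean of all values.
    static double meanCombine(List<Double> values) {
        double total = 0;
        for (double v : values) {
            total += v;
        }
        return total / values.size();
    }

    public static void main(String[] args) {
        // Sum without any combining step.
        long direct = sumCombine(List.of(1L, 2L, 3L, 4L, 5L));
        // Sum after combining two partial groups first.
        long combined = sumCombine(List.of(
                sumCombine(List.of(1L, 2L)),
                sumCombine(List.of(3L, 4L, 5L))));
        System.out.println(direct + " " + combined); // prints "15 15"

        // Mean without any combining step.
        double directMean = meanCombine(List.of(1.0, 2.0, 3.0, 4.0, 5.0));
        // Mean of the two partial means.
        double combinedMean = meanCombine(List.of(
                meanCombine(List.of(1.0, 2.0)),
                meanCombine(List.of(3.0, 4.0, 5.0))));
        System.out.println(directMean + " " + combinedMean); // prints "3.0 2.75"
    }
}
```

The sum is unaffected by the intermediate step, while the mean changes from 3.0 to 2.75. Because the combiner's output is fed back in as reducer (or combiner) input, it also has to keep the same key/value types as the mapper output.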

Whatever you wanted your combiner to do, do it in the reducer instead.
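
As a rough illustration of that advice, the sketch below simulates the map, combine, shuffle and reduce steps in plain Java rather than with the Hadoop API. The combiner keeps the map output types, and only the reducer converts to a different output type. Everything here is an assumption made for the example: `AreaWage` is a hypothetical stand-in for the asker's `AreaWagePair` (whose fields are not shown), and a `"naics|area"` string stands in for `NaicsAreaPair`. In a real job, the intermediate types would be declared with `job.setMapOutputKeyClass` and `job.setMapOutputValueClass`, and the reducer's types with `job.setOutputKeyClass` and `job.setOutputValueClass`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ReducerDoesConversionSketch {

    // Hypothetical stand-in for AreaWagePair; the real class is not shown.
    static final class AreaWage {
        final String area;
        final long totalWage;

        AreaWage(String area, long totalWage) {
            this.area = area;
            this.totalWage = totalWage;
        }

        @Override
        public String toString() {
            return "AreaWage(" + area + ", " + totalWage + ")";
        }
    }

    // Combiner: Long values in, a Long value out (same types as map output).
    static long combine(List<Long> values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return sum;
    }

    // Reducer: sums the values AND converts to the final output type.
    static AreaWage reduce(String naicsAreaKey, List<Long> values) {
        String area = naicsAreaKey.substring(naicsAreaKey.indexOf('|') + 1);
        return new AreaWage(area, combine(values));
    }

    // Simulates the pipeline; each inner list is the output of one map task.
    static Map<String, AreaWage> run(List<List<Map.Entry<String, Long>>> mapTasks,
                                     boolean useCombiner) {
        Map<String, List<Long>> shuffled = new TreeMap<>();
        for (List<Map.Entry<String, Long>> task : mapTasks) {
            // Group this task's output by key.
            Map<String, List<Long>> local = new TreeMap<>();
            for (Map.Entry<String, Long> kv : task) {
                local.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
            // Optionally pre-aggregate locally before the "shuffle".
            for (Map.Entry<String, List<Long>> e : local.entrySet()) {
                List<Long> target = shuffled.computeIfAbsent(e.getKey(), k -> new ArrayList<>());
                if (useCombiner) {
                    target.add(combine(e.getValue()));
                } else {
                    target.addAll(e.getValue());
                }
            }
        }
        Map<String, AreaWage> result = new TreeMap<>();
        for (Map.Entry<String, List<Long>> e : shuffled.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Map.Entry<String, Long>>> tasks = List.of(
                List.of(Map.entry("11|NY", 100L), Map.entry("11|NY", 50L)),
                List.of(Map.entry("11|NY", 25L), Map.entry("22|NJ", 10L)));
        System.out.println(run(tasks, false));
        System.out.println(run(tasks, true));
    }
}
```

Both calls in `main` yield the same totals, so the combiner stays an optional optimization and the type change happens in exactly one place, the reducer.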


 