
Can you explain the word count MapReduce program step by step?

Can you explain a MapReduce program step by step? For example, in the word count program below, the Map class is a class inside a class (a nested class). What is the meaning of the angle brackets? Why do we write the output parameters as well? What is the Context object? I know the logic, but I can't understand a few of the Java statements.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    // Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>:
    // input is (byte offset of line, line text), output is (word, 1).
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);   // emit (word, 1) for every token
            }
        }
    }

    // Reducer<KEYIN, VALUEIN, KEYOUT, VALUEOUT>: its input types must match
    // the mapper's output types; output is (word, total count).
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();           // add up all the 1s for this word
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Job.getInstance replaces the deprecated "new Job(conf, name)" constructor.
        Job job = Job.getInstance(conf, "wordcount");
        job.setJarByClass(WordCount.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Your Map class extends Hadoop's Mapper class. The angle brackets are Java generics: they declare the four types the mapper works with. The first two are the input key/value types and the last two are the output key/value types. You have to declare the output types too, because Hadoop must know at job-setup time how to serialize the pairs the mapper emits and what types the reducer should expect. The Mapper subclass overrides the map() method, which is where your mapper logic goes. The method accepts the declared input key and value, returns void, and instead emits its output key/value pairs by calling write() on the Context object, the framework's handle for collecting output (and for accessing configuration, counters, and job status).
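To make the angle brackets concrete, here is a minimal sketch of a different, purely hypothetical mapper (not part of the job above, and assuming the same imports), showing how the four type parameters line up:

public static class LineLengthMap
        extends Mapper<LongWritable, Text,   // input : (byte offset, line text)
                       Text, IntWritable> {  // output: (label, line length)
    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // key   = byte offset of the line (KEYIN  = LongWritable)
        // value = the line itself         (VALUEIN = Text)
        // Emit each line's length in bytes under a constant key.
        context.write(new Text("length"), new IntWritable(value.getLength()));
    }
}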

Your Reduce class extends Hadoop's Reducer class. Its input key/value types must match the output key/value types of the Mapper (or of the Combiner, if one is set). The Reducer subclass overrides the reduce() method, which is where your reducer logic goes. By the time reduce() is called, the framework has already grouped together all values that share the same key, so the method receives one key plus an Iterable of its values, returns void, and writes the aggregated pair back through the Context object.
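For intuition, here is a plain-Java simulation (no Hadoop involved, class name is made up) of one reduce() call. If the input text contained "hello" twice, the shuffle delivers both 1s together:

import java.util.Arrays;
import java.util.List;

public class ReduceSimulation {
    public static void main(String[] args) {
        // Grouped values for the key "hello" after shuffle/sort.
        List<Integer> values = Arrays.asList(1, 1);
        int sum = 0;
        for (int v : values) {
            sum += v;                        // same summation as reduce()
        }
        System.out.println("hello\t" + sum); // prints: hello	2
    }
}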

Between these two methods, Hadoop performs the optional combine step plus the sort and shuffle: mapper output is partitioned by key, transferred to the reducers, and sorted so that all values for the same key arrive together.
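Because summing counts is associative and commutative, the same Reduce class can also serve as the combiner, pre-aggregating each mapper's local output before the shuffle. This is optional but a common optimization; it is one extra line in main():

// Optional: pre-aggregate (word, 1) pairs on the map side to cut
// network traffic. Safe here because addition is associative/commutative.
job.setCombinerClass(Reduce.class);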

Your main method contains the code that configures and submits the Hadoop job: the output key/value types, the mapper and reducer classes, the input/output formats, and the input/output paths taken from the command-line arguments.
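Once the class is packaged into a jar (the jar name and paths below are only examples), the job is launched with something like hadoop jar wordcount.jar WordCount /user/hadoop/input /user/hadoop/output, where the two arguments become args[0] and args[1]. Note that the output directory must not already exist, or the job will fail at submission.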

A few more clarifications are available from macalester.edu and javacodegeeks.
