How is the Context object used within the run method of the Mapper class in Hadoop MapReduce?

Here is the source code for Mapper.run():

public void run(Context context) throws IOException, InterruptedException {
  setup(context);
  // Ask the context for the next key/value pair and pass it to map()
  while (context.nextKeyValue()) {
    map(context.getCurrentKey(), context.getCurrentValue(), context);
  }
  cleanup(context);
}

As you can see, context is used both for reading and writing. How is that possible? That is, context.getCurrentKey() and context.getCurrentValue() retrieve the current key/value pair from the context, and that pair is passed to the map function. Is the same context used for both input and output?

Yes, the same context is used for both input and output. It stores references to a RecordReader and a RecordWriter. Whenever context.getCurrentKey() and context.getCurrentValue() are called to retrieve the current key/value pair, the request is delegated to the RecordReader. And when context.write() is called, the call is delegated to the RecordWriter.
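For illustration, here is a minimal word-count style Mapper (the class name and tokenization are illustrative, not from the original question). The framework feeds each key/value pair in through the same context whose write() method emits the output:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
  private static final IntWritable ONE = new IntWritable(1);
  private final Text word = new Text();

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // key and value arrived via the context's RecordReader (see run() above);
    // context.write() hands each output pair to the RecordWriter.
    for (String token : value.toString().split("\\s+")) {
      word.set(token);
      context.write(word, ONE);
    }
  }
}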

Note that RecordReader and RecordWriter are actually abstract classes.
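Abridged from the Hadoop API (javadoc omitted), their core methods look like this:

import java.io.Closeable;
import java.io.IOException;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

// org.apache.hadoop.mapreduce.RecordReader (abridged)
public abstract class RecordReader<KEYIN, VALUEIN> implements Closeable {
  public abstract void initialize(InputSplit split, TaskAttemptContext context)
      throws IOException, InterruptedException;
  public abstract boolean nextKeyValue() throws IOException, InterruptedException;
  public abstract KEYIN getCurrentKey() throws IOException, InterruptedException;
  public abstract VALUEIN getCurrentValue() throws IOException, InterruptedException;
  public abstract float getProgress() throws IOException, InterruptedException;
  public abstract void close() throws IOException;
}

// org.apache.hadoop.mapreduce.RecordWriter (abridged)
public abstract class RecordWriter<K, V> {
  public abstract void write(K key, V value) throws IOException, InterruptedException;
  public abstract void close(TaskAttemptContext context)
      throws IOException, InterruptedException;
}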

Update:

org.apache.hadoop.mapreduce.Mapper$Context implements org.apache.hadoop.mapreduce.MapContext, which in turn extends org.apache.hadoop.mapreduce.TaskInputOutputContext.

Look at the source of org.apache.hadoop.mapreduce.task.MapContextImpl, which is itself a subclass of org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl, to see exactly where Context delegates input to the RecordReader and output to the RecordWriter.
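To make that delegation concrete, here is a self-contained sketch (the Sketch class names are hypothetical; the method bodies mirror the one-line pass-through calls found in MapContextImpl and TaskInputOutputContextImpl):

import java.io.IOException;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.RecordWriter;

// Output side (modeled on TaskInputOutputContextImpl): write() goes to the RecordWriter.
abstract class TaskOutputContextSketch<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
  private final RecordWriter<KEYOUT, VALUEOUT> output;

  TaskOutputContextSketch(RecordWriter<KEYOUT, VALUEOUT> output) {
    this.output = output;
  }

  public void write(KEYOUT key, VALUEOUT value) throws IOException, InterruptedException {
    output.write(key, value); // delegated to the RecordWriter
  }
}

// Input side (modeled on MapContextImpl): reads go to the RecordReader.
class MapContextSketch<KEYIN, VALUEIN, KEYOUT, VALUEOUT>
    extends TaskOutputContextSketch<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
  private final RecordReader<KEYIN, VALUEIN> reader;

  MapContextSketch(RecordReader<KEYIN, VALUEIN> reader,
                   RecordWriter<KEYOUT, VALUEOUT> output) {
    super(output);
    this.reader = reader;
  }

  public boolean nextKeyValue() throws IOException, InterruptedException {
    return reader.nextKeyValue(); // delegated to the RecordReader
  }

  public KEYIN getCurrentKey() throws IOException, InterruptedException {
    return reader.getCurrentKey(); // delegated to the RecordReader
  }

  public VALUEIN getCurrentValue() throws IOException, InterruptedException {
    return reader.getCurrentValue(); // delegated to the RecordReader
  }
}

The real classes carry more state (configuration, counters, status reporting), but the input and output paths reduce to exactly these pass-through calls.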
