
Use of generics in Hadoop MapReduce problems

My question may seem silly to Hadoop users, but I am a little confused about the use of generics in MapReduce problems such as "word count".

I know that generics are basically used for type safety and to avoid explicit type casts, but I cannot connect that concept to this case.

In the word count problem:

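import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends
        Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // TODO Auto-generated method stub
        ...
    }
}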

Can anyone please explain the use of generics here? Please correct me if I have made any mistakes in asking this question.

I now understand that the generics here describe the key-value pairs (KEYIN, VALUEIN, KEYOUT, VALUEOUT). But it is still not clear to me why generics are used for the key-value pairs. Is there no other way to do the same thing? What is the benefit of using generics here?

Thanks!

MapReduce uses generics in Mapper and Reducer to specify the types of the keys and values that are read as input and written as output.

In your example, WordCountMapper extends the Mapper class with the type arguments Mapper<LongWritable, Text, Text, LongWritable>. The first two, LongWritable and Text, are the input key and value types that the mapper expects to read. The last two, Text and LongWritable, are the output key and value types that the map method is expected to emit. Because these types are declared once on the class, the compiler can check every read and write against them, so no casts are needed inside map.
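To make the benefit concrete, here is a minimal sketch in plain Java that does not use Hadoop at all. SimpleMapper and Output are hypothetical stand-ins for Hadoop's Mapper and Context, and Long and String stand in for LongWritable and Text; all of these names are assumptions made for illustration only. The point it tries to show is that the four type parameters let the compiler reject a write with the wrong key or value type.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GenericsSketch {

    // Hypothetical stand-in for Hadoop's Context: accepts only (KEYOUT, VALUEOUT) pairs.
    static final class Output<KEYOUT, VALUEOUT> {
        final List<Map.Entry<KEYOUT, VALUEOUT>> pairs = new ArrayList<>();

        void write(KEYOUT key, VALUEOUT value) {
            pairs.add(new AbstractMap.SimpleEntry<>(key, value));
        }
    }

    // Hypothetical stand-in for Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>.
    abstract static class SimpleMapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
        abstract void map(KEYIN key, VALUEIN value, Output<KEYOUT, VALUEOUT> out);
    }

    // Input pair: (line offset, line text). Output pair: (word, 1).
    static final class WordCountMapper extends SimpleMapper<Long, String, String, Long> {
        @Override
        void map(Long offset, String line, Output<String, Long> out) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.write(word, 1L);
                    // out.write(1L, word); // would not compile: key and value types are swapped
                }
            }
        }
    }

    // Runs the mapper on one line and sums the emitted counts per word
    // (the summing plays the role of a reducer in this sketch).
    static Map<String, Long> countWords(String line) {
        Output<String, Long> out = new Output<>();
        new WordCountMapper().map(0L, line, out);
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> pair : out.pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        System.out.println(countWords("to be or not to be")); // prints {be=2, not=1, or=1, to=2}
    }
}
```

Without type parameters, map and write would have to take Object arguments, every use would need a cast, and a swapped key and value would only fail at run time with a ClassCastException instead of at compile time.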

This thread discussion gives more insight into why generics have been implemented in MapReduce. Also, this JIRA Issue gives more information.
