
Instantiate an object once for use by all map calls of one mapper in Hadoop

I want to instantiate an object once so that all map operations can use it. The instantiation requires a handful of parameters (about 10). I think I should do that in the Mapper.setup method and use the job configuration to pass the parameters, but I couldn't find a suitable example. (Note that I am new to Hadoop.)

Basically, what I am looking for is:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MyMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable one = new IntWritable(1);

    private static MyParser parser;

    @Override
    protected void setup(Context context)
            throws IOException, InterruptedException {

        String param1 = "";  // how to get those?
        String param2 = "";

        parser = new MyParser(param1, param2);
    }

    @Override
    protected void map(LongWritable offset, Text value, Context context)
            throws IOException, InterruptedException {

        String key = parser.parse(value.toString());
        context.write(new Text(key), one);
    }
}

Is this a suitable approach? Is there an alternative?

Sub-question: What if the parameters depend on the file that is processed?

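In the main method, add these lines right after declaring the Configuration object to set the parameters. Set them before you create the Job from this configuration, because the Job works on its own copy of it: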

Configuration con = new Configuration();
con.set("param1", "welcome"); // example value
con.set("param2", "hello");   // example value

Add these lines to the Mapper's setup method. The parameters can be retrieved from the Configuration object, which you get from the context:

Configuration conf = context.getConfiguration();
String param1 = conf.get("param1"); // returns "welcome"
String param2 = conf.get("param2"); // returns "hello"
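For the sub-question about parameters that depend on the file being processed, one possible approach is to store per-file overrides in the same configuration under keys that include the file name, and fall back to the job-wide key otherwise. The sketch below is only an illustration: the class PerFileParams, the method resolve, and the key convention "<param>.<fileName>" are all made up for this example, and a plain Map stands in for Hadoop's Configuration so that it runs without Hadoop on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of Hadoop): resolves a parameter for the
 * file being processed, falling back to the job-wide value.
 * The Map is a stand-in for org.apache.hadoop.conf.Configuration.
 */
final class PerFileParams {
    private PerFileParams() {}

    /** Looks up "<param>.<fileName>" first, then plain "<param>". */
    static String resolve(Map<String, String> conf, String fileName, String param) {
        String perFile = conf.get(param + "." + fileName);
        return perFile != null ? perFile : conf.get(param);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("param1", "welcome");             // job-wide value
        conf.put("param1.special.log", "bonjour"); // override for one file (assumed name)

        System.out.println(resolve(conf, "special.log", "param1")); // prints "bonjour"
        System.out.println(resolve(conf, "other.log", "param1"));   // prints "welcome"
    }
}
```

In a real mapper the same lookup would use conf.get(...) on context.getConfiguration(). With a plain file-based input format such as TextInputFormat, the file name should be available in setup through ((FileSplit) context.getInputSplit()).getPath().getName(); other input formats may hand you a different split type, so check that cast against your own job setup.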

You can make it a static parameter, and if the parameters are in a file you want to process, use the distributed cache.
