
Passing different parameters to each mapper

I have a job that uses multiple mappers and one reducer. The mappers are almost identical, except they differ in the value of a String that they use to produce the result.

Currently I have several classes, one for each value of the String I mentioned — it feels like there should be a better way, that doesn't require so much code duplication. Is there a way to pass these String values as parameters to the mappers?

My job looks like this:

Input File A  ---->  Mapper A using
                       String "Foo"  ----+
                                         |--->  Reducer
                     Mapper B using  ----+
Input File B  ---->    String "Bar" 

I want to turn it into something like this:

Input File A  ---->  GenericMapper parameterized
                               with String "Foo" ----+
                                                     |--->  Reducer
                     GenericMapper parameterized ----+ 
Input File B  ---->            with String "Bar"

Edit: Here are two simplified mapper classes that I currently have. They accurately represent my actual situation.

class MapperA extends Mapper<Text, Text, Text, Text> {
    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        context.write(key, new Text(value.toString() + "Foo"));
    }
}

class MapperB extends Mapper<Text, Text, Text, Text> {
    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        context.write(key, new Text(value.toString() + "Bar"));
    }
}

Edit: What string each mapper should use depends only on which file the data comes from. There is no way to differentiate between the files, except through the file name.

Assuming you use file input formats, you can get your current input file name in the mapper like this:

if (context.getInputSplit() instanceof FileSplit) {
    FileSplit fileSplit = (FileSplit) context.getInputSplit();
    Path inputPath = fileSplit.getPath();
    String fileId = ... //parse inputPath into a file id
    ...
}

You can parse inputPath however you want, e.g. use only the file name or only the partition id, to generate a unique id identifying the input file. For example:

/some/path/A -> A
/some/path/B -> B
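
One way to do that parsing, as a minimal sketch: take the last path segment as the id. It is plain Java working on the path as a string, so it can be tried without a cluster. The names FileIds and fileIdFromPath are made up for illustration; inside the mapper, inputPath.getName() should give you the same last component directly.

```java
final class FileIds {
    private FileIds() {}

    /** Returns the last non-empty segment of a slash-separated path. */
    static String fileIdFromPath(String path) {
        // Ignore trailing slashes so "/some/path/A/" is treated like "/some/path/A".
        int end = path.length();
        while (end > 0 && path.charAt(end - 1) == '/') {
            end--;
        }
        // Everything after the last remaining slash is the id.
        int lastSlash = path.lastIndexOf('/', end - 1);
        return path.substring(lastSlash + 1, end);
    }
}
```

If your file names carry extensions or part numbers, you would strip those here too, so that every split of the same logical input maps to the same id.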

Configure your properties for each possible file "id" in your driver:

conf.set("my.property.A", "foo");
conf.set("my.property.B", "bar"); 

In the mapper, compute the file "id" as described above and look up the value in the job configuration:

String value = context.getConfiguration().get("my.property." + fileId);
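
Note that get returns null when no property is set for an id, so it may be worth deciding what happens for an unexpected file. Below is a rough sketch of the key-building and fallback logic, using java.util.Properties as a stand-in for the Hadoop Configuration so it runs on its own. SuffixLookup and suffixFor are hypothetical names; with a real Configuration, the two-argument get(name, defaultValue) plays the same role.

```java
import java.util.Properties;

final class SuffixLookup {
    // Same key prefix as the properties set in the driver.
    static final String PREFIX = "my.property.";

    private SuffixLookup() {}

    /** Looks up the string configured for fileId, or returns fallback if none is set. */
    static String suffixFor(Properties conf, String fileId, String fallback) {
        return conf.getProperty(PREFIX + fileId, fallback);
    }
}
```

Doing this lookup once per task (for example in the mapper's setup method) rather than on every map call avoids repeating the same work for each record.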

You could use an if statement inside your mapper to choose between the strings. What determines which string should be used?

Or you could use an abstract Mapper class.

Maybe something like this?

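import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

abstract class AbstractMapper extends Mapper<Text, Text, Text, Text> {
    // Set by each subclass before delegating to the shared map logic.
    protected String text;

    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        context.write(key, new Text(value.toString() + text));
    }
}

class MapperImpl1 extends AbstractMapper {
    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        text = "foo";
        super.map(key, value, context);
    }
}

class MapperImpl2 extends AbstractMapper {
    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        text = "bar";
        super.map(key, value, context);
    }
}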
