I have a job that uses multiple mappers and one reducer. The mappers are almost identical, except they differ in the value of a String
that they use to produce the result.
Currently I have several classes, one for each value of the String
I mentioned. It feels like there should be a better way that doesn't require so much code duplication. Is there a way to pass these String
values as parameters to the mappers?
My job looks like this:
Input File A ----> Mapper A using
                   String "Foo"   ----+
                                      |---> Reducer
                   Mapper B using ----+
Input File B ----> String "Bar"
I want to turn it into something like this:
Input File A ----> GenericMapper parameterized
                   with String "Foo"           ----+
                                                   |---> Reducer
                   GenericMapper parameterized ----+
Input File B ----> with String "Bar"
Edit: Here are two simplified mapper classes that I currently have. They accurately represent my actual situation.
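import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

class MapperA extends Mapper<Text, Text, Text, Text> {
    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        context.write(key, new Text(value.toString() + "Foo"));
    }
}

class MapperB extends Mapper<Text, Text, Text, Text> {
    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        context.write(key, new Text(value.toString() + "Bar"));
    }
}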
Edit: What string each mapper should use depends only on which file the data comes from. There is no way to differentiate between the files, except through the file name.
Assuming you use file input formats, you can get your current input file name in the mapper like this:
if (context.getInputSplit() instanceof FileSplit) {
    FileSplit fileSplit = (FileSplit) context.getInputSplit();
    Path inputPath = fileSplit.getPath();
    String fileId = ... // parse inputPath into a file id
    ...
}
You can parse inputPath however you want, e.g. use the file name only or the partition id only, to generate a unique id identifying the input file. For example:
/some/path/A -> A
/some/path/B -> B
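The path-to-id step could be sketched roughly as follows. This is a Hadoop-free illustration under stated assumptions: `FileIds` and `fileIdFromPath` are hypothetical names, the path is handled as a plain string, and the id is simply the last path component. In a mapper, `fileSplit.getPath().getName()` should give that same last component.

```java
// Hypothetical helper (not part of Hadoop): derives a file id from a path string.
final class FileIds {
    private FileIds() {}

    /** Returns the last path component, e.g. "/some/path/A" -> "A". */
    static String fileIdFromPath(String inputPath) {
        String path = inputPath;
        // Ignore trailing slashes so "/some/path/A/" also maps to "A".
        while (path.length() > 1 && path.endsWith("/")) {
            path = path.substring(0, path.length() - 1);
        }
        int lastSlash = path.lastIndexOf('/');
        return lastSlash >= 0 ? path.substring(lastSlash + 1) : path;
    }

    public static void main(String[] args) {
        System.out.println(fileIdFromPath("/some/path/A"));
        System.out.println(fileIdFromPath("/some/path/B"));
    }
}
```

If your file names carry extensions or partition suffixes, you would strip or normalize them here so that the id matches the property names set in the driver.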
Configure your properties for each possible file "id" in your driver:
conf.set("my.property.A", "foo");
conf.set("my.property.B", "bar");
In the mapper, compute the file "id" as described above and look up the value in the job configuration:
context.getConfiguration().get("my.property." + fileId);
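Putting the driver and mapper halves together, the lookup might look like the sketch below. To keep it runnable without Hadoop, `java.util.Properties` stands in for the job configuration (an assumption of this sketch, not what the job would actually use), and `SuffixLookup`, `suffixFor` and `mapValue` are hypothetical names.

```java
import java.util.Properties;

// Stand-in sketch: java.util.Properties plays the role of the job configuration.
final class SuffixLookup {
    static final String PREFIX = "my.property.";

    private SuffixLookup() {}

    /** Driver side: register one string per file id. */
    static Properties configure() {
        Properties conf = new Properties();
        conf.setProperty(PREFIX + "A", "foo");
        conf.setProperty(PREFIX + "B", "bar");
        return conf;
    }

    /** Mapper side: resolve the string for a file id, failing fast if none is configured. */
    static String suffixFor(Properties conf, String fileId) {
        String suffix = conf.getProperty(PREFIX + fileId);
        if (suffix == null) {
            throw new IllegalArgumentException("No value configured for file id: " + fileId);
        }
        return suffix;
    }

    /** The value the generic mapper would emit for one input value. */
    static String mapValue(Properties conf, String fileId, String value) {
        return value + suffixFor(conf, fileId);
    }

    public static void main(String[] args) {
        Properties conf = configure();
        System.out.println(mapValue(conf, "A", "record-"));
        System.out.println(mapValue(conf, "B", "record-"));
    }
}
```

In a real mapper, assuming each split comes from a single file, the lookup would probably be done once in `setup(Context)` and cached in a field rather than repeated for every record.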
Maybe you could use an if statement inside your mapper to choose between the strings. What determines whether one string or the other is used?
Or maybe use an abstract Mapper class.
Maybe something like this?
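import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

abstract class AbstractMapper extends Mapper<Text, Text, Text, Text> {
    protected String text;

    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        context.write(key, new Text(value.toString() + text));
    }
}

class MapperImpl1 extends AbstractMapper {
    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        text = "foo";
        // Pass the arguments through; the shared logic lives in AbstractMapper.
        super.map(key, value, context);
    }
}

class MapperImpl2 extends AbstractMapper {
    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        text = "bar";
        super.map(key, value, context);
    }
}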
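A possible variation on the same idea: instead of assigning the field on every map call, each subclass could supply its string through an abstract method, so the shared logic is written once and the subclasses shrink to one line each. The sketch below is Hadoop-free and purely illustrative; `SuffixingMapper`, `FooMapper`, `BarMapper` and `mapValue` are hypothetical names standing in for the real mapper classes.

```java
// Hadoop-free stand-in for the mapper hierarchy; all names are hypothetical.
abstract class SuffixingMapper {
    /** Each subclass supplies its own string. */
    protected abstract String suffix();

    /** Shared logic, written once in the base class. */
    final String mapValue(String value) {
        return value + suffix();
    }
}

final class FooMapper extends SuffixingMapper {
    @Override
    protected String suffix() {
        return "foo";
    }
}

final class BarMapper extends SuffixingMapper {
    @Override
    protected String suffix() {
        return "bar";
    }
}

final class SuffixingDemo {
    public static void main(String[] args) {
        System.out.println(new FooMapper().mapValue("record-"));
        System.out.println(new BarMapper().mapValue("record-"));
    }
}
```

This still needs one class per string, so it reduces the duplication rather than removing it.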