
Is it possible to pass properties from a mapper to a reducer in Hadoop?

I have a value in the mapper that needs to be the first value to arrive in the reducer. Could I save this value to HDFS and then read it in the reducer's setup()? Is it possible to read a file from HDFS in the reducer's setup()?

There are multiple ways of doing this:

  1. Emit the value to each reducer through the standard key/value output, under "special" keys that guarantee the value arrives first at each reducer. (You may need to set a custom sort comparator and partitioner.)
  2. Use MultipleOutputs in the mapper to write text files, and FileSystem in the reducer's setup() to read them. I think you can also use FileSystem directly in the mapper to write. This is not an optimal solution, and you need to be careful with timing: make sure the file has been written and closed before you read it.
  3. Use DistributedCache. EDIT: Actually, I do not think that will work between a mapper and a reducer in the same job. Please ignore this option.

For option #1, let's say you pass Text keys and Text values from the mapper to the reducer, and you know that none of your keys can start with a space " ". You can then construct a special key " " + #, where # is a reducer partition id (from 0 to N-1, where N is the total number of reducers). In a loop, write the keys " 0", " 1", " 2", ... to the output, each with the value you need to pass to every reducer. Set up a custom partitioner that recognizes the "special" key and routes it to the corresponding partition:

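import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// The class name is illustrative.
public class SpecialKeyPartitioner extends Partitioner<Text, Text> {
    @Override
    public int getPartition(Text key, Text value, int numPartitions) {
        String k = key.toString();
        if (k.startsWith(" ")) {
            // Special key: the digits after the leading space are the target partition id.
            return Integer.parseInt(k.substring(1));
        }
        // Regular key: hash into the available partitions, masking off the sign bit.
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}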

Obviously, if you use other data types for keys, you can still come up with similar logic.
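
The special-key idea can be checked without a cluster. The following is a minimal sketch in plain Java, not Hadoop code: it uses String instead of Text, and a TreeMap per partition as a stand-in for the sort that happens during the shuffle. The names SpecialKeyDemo, mapOutput and shuffle are made up for this illustration. It assumes no regular key starts with a space or with a character that sorts below a space (such as a control character).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class SpecialKeyDemo {
    // Same routing rule as the partitioner in the answer, on plain Strings.
    static int getPartition(String key, int numPartitions) {
        if (key.startsWith(" ")) {
            // Special key: the partition id follows the leading space.
            return Integer.parseInt(key.substring(1));
        }
        // Regular key: hash into the available partitions.
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Mapper side (hypothetical helper): one special record per reducer,
    // followed by ordinary records.
    static List<String[]> mapOutput(int numReducers, String sharedValue, String... regularKeys) {
        List<String[]> records = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            records.add(new String[] {" " + i, sharedValue});
        }
        for (String k : regularKeys) {
            records.add(new String[] {k, "v-" + k});
        }
        return records;
    }

    // Stand-in for the shuffle: route each record to its partition and keep
    // every partition sorted by key. A TreeMap keeps one value per key, which
    // is a simplification; a real job groups all values of a key together.
    static List<TreeMap<String, String>> shuffle(List<String[]> records, int numPartitions) {
        List<TreeMap<String, String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new TreeMap<>());
        }
        for (String[] kv : records) {
            partitions.get(getPartition(kv[0], numPartitions)).put(kv[0], kv[1]);
        }
        return partitions;
    }

    public static void main(String[] args) {
        int numReducers = 3;
        List<String[]> records = mapOutput(numReducers, "shared-value",
                "apple", "banana", "cherry", "date", "0zero", "Zed");
        List<TreeMap<String, String>> partitions = shuffle(records, numReducers);
        for (int p = 0; p < numReducers; p++) {
            System.out.println("reducer " + p + " first key: '" + partitions.get(p).firstKey() + "'");
        }
    }
}
```

Running main prints ' 0', ' 1' and ' 2' as the first key of reducers 0, 1 and 2. A space (0x20) sorts before digits and letters, so each reducer sees its special record before any regular one. In a real job the reducer would still have to recognize the leading space and treat that record as the shared value rather than as data.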
