Hadoop JAVA MR工作

Question

嗨，我是Hadoop MR的新手。 我試圖寫一個簡單的MR作業來計算一個節點到其目的節點的最短路徑，基本上邏輯是這樣的：

如果輸入文本文件具有以下路徑：ABCD ABD ACD BED BD BACD

輸出應為：ABD BD

這僅給出了節點A和D之間的最短路徑以及B和D之間的最短路徑。

我得到的輸出是：[ABCD ABD ACD BED BD BACD]

我寫了以下MR來做同樣的事情。 但是它沒有給出理想的答案。 我在獨立模式下運行MR。

請讓我知道代碼及其解決方案出了什么問題。 非常感謝您的寶貴時間。

public class Shpath {


    public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, Text> {

        public void map(LongWritable key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
            String[] line = value.toString().split("\t");
            List<String> l = new ArrayList<String>();

            for(String lin :line){
                l.add(lin);
            }

            List <String>startEnd = new ArrayList<String>();
            for(String s : l){
                String g = s.substring(0,1)+s.substring((s.length())-1);
                if(!startEnd.contains(g))
                {
                    startEnd.add(g);
                }
            }

            List <String> uniqueStringList = new ArrayList<String>();
            java.util.Map finalMap = new HashMap();
            for(String s1 : startEnd){

                for(String s : l) {
                    if(s.startsWith(s1.substring(0,1)) && (s.endsWith(s1.substring((s1.length())-1)))){
                        uniqueStringList.add(s);
                    }
                 }
                 String smallestKey = null;
                 int minSize = Integer.MAX_VALUE;
                 String smallest = null;
                 for(String s2 : uniqueStringList){

                     if(s2.length() < minSize) {
                         minSize = s2.length();
                         smallest  = s2;
                         smallestKey  = s1;
                     }    
                     finalMap.put(s1,smallest);

                 }
                 uniqueStringList.clear();
            }output.collect(new Text(),new Text(finalMap.values().toString()));
        }
    }

    public static class Reduce extends MapReduceBase implements Reducer<Text, Text, Text, Text> {
        public void reduce(Text key, Iterator<Text> value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {

           while (value.hasNext()){
               output.collect(new Text(key),new Text(value.next()));
           }
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(Shpath.class);
        conf.setJobName("shpath");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);

        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(org.apache.hadoop.mapred.TextInputFormat.class);
        conf.setOutputFormat(org.apache.hadoop.mapred.TextOutputFormat.class);

        org.apache.hadoop.mapred.FileInputFormat.setInputPaths(conf, new Path(args[0]));
        org.apache.hadoop.mapred.FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}

Answer 1

我不確定，但是必須是這樣的：

public void map(LongWritable key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
        Map<String , HashMap<Integer, String> > outMap = new HashMap<String, HashMap<Integer, String> >();
        HashMap<Integer, String> tempMap = new HashMap<Integer, String>();
        tempMap.put(Integer.MAX_VALUE, "");
        outMap.put("AD", tempMap);
        outMap.put("BD", tempMap);

        String[] line = value.toString().split("\t");
        for (String path : line) {
            String tempPath = new String( new char[]{path.charAt(0) , path.charAt(path.length() - 1)});
            if(outMap.containsKey(tempPath)) {
                HashMap<Integer, String> tempOutMap = outMap.get(tempPath);
                for (Iterator itr =  tempOutMap.keySet().iterator(); itr.hasNext(); ) {
                    Integer count = (Integer) itr.next();
                    if(count > tempPath.length()){
                       tempMap.remove(count);
                       tempMap.put(tempPath.length(), tempPath);
                    }
                }
            }
        }
        for (String str : outMap.keySet()) {
          output.collect(new Text(str), new Text(outMap.get(str).values().toString()));    
        }        
    }


public void reduce(Text key, Iterator<Text> value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
           String outString;
           int smallest = Integer.MAX_VALUE;
           while (value.hasNext()){
               String str = value.next();
               if(str.length() < smallest) {
                  outString = str;
                  smallest = str.length();
               }
           }
           output.collect(new Text(key),new Text(outString));
    }

Hadoop JAVA MR工作

問題描述

1 個解決方案

解決方案1
0 2013-02-06 12:20:16

Hadoop JAVA MR工作

問題描述

1 個解決方案

解決方案1 0 2013-02-06 12:20:16

解決方案1
0 2013-02-06 12:20:16