
Hadoop Map Reduce - Read HDFS File - FileAlreadyExists error

I am new to Hadoop. I am trying to read an existing file on HDFS using the code below. The configuration seems fine and the file path is correct as well.

public static class Map extends Mapper<LongWritable, Text, Text, Text> {

    private static Text f1, f2, hdfsfilepath;
    private static HashMap<String, ArrayList<String>> friendsData = new HashMap<>();

    public void setup(Context context) throws IOException {
      Configuration conf = context.getConfiguration();
      Path path = new Path("hdfs://cshadoop1" + conf.get("hdfsfilepath"));
      FileSystem fs = FileSystem.get(path.toUri(), conf);
      if (fs.exists(path)) {
        BufferedReader br = new BufferedReader(
            new InputStreamReader(fs.open(path)));
        String line;
        // Read each comma-separated line: the first token is the friend,
        // the remaining tokens are that friend's details.
        while ((line = br.readLine()) != null) {
          StringTokenizer str = new StringTokenizer(line, ",");
          String friend = str.nextToken();
          ArrayList<String> friendDetails = new ArrayList<>();
          while (str.hasMoreTokens()) {
            friendDetails.add(str.nextToken());
          }
          friendsData.put(friend, friendDetails);
        }
        br.close();
      }
    }

    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      // Emit every entry loaded in setup(), regardless of the current input record.
      for (String k : friendsData.keySet()) {
        context.write(new Text(k), new Text(friendsData.get(k).toString()));
      }
    }
  }

I am getting the below exception when I run the code:

Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://cshadoop1/socNetData/userdata/userdata.txt already exists
        at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
        at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343)

I am just trying to read an existing file. Any ideas what I am missing here? Appreciate any help.

The exception tells you that your output directory already exists, but it should not. Delete it or change its name.
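As a sketch of that fix: the stale directory can be removed from the command line with hdfs dfs -rm -r <path>, or programmatically before the job is submitted. The class name and the argument below are placeholders, not part of the original post.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CleanOutputDir {                    // hypothetical helper class
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path outputPath = new Path(args[0]);         // your real output directory, never the input file
    FileSystem fs = FileSystem.get(outputPath.toUri(), conf);
    if (fs.exists(outputPath)) {
      fs.delete(outputPath, true);               // recursive delete of the leftover directory
    }
  }
}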

Moreover, the name of your output directory, 'userdata.txt', looks like the name of a file, so check that you have not mixed up your input and output directories.
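For reference, here is a hypothetical driver showing the usual wiring: the existing HDFS file is passed as a plain configuration value (which setup() reads via conf.get("hdfsfilepath")), while the job's output path must point to a directory that does not exist yet. Class names, the job name, and the argument order are assumptions, not the asker's actual code.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class FriendsDriver {                     // hypothetical driver class
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "read-friends-file");
    job.setJarByClass(FriendsDriver.class);
    // "FriendsJob" stands in for the outer class that holds the question's Mapper.
    job.setMapperClass(FriendsJob.Map.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    // The existing file read in setup() is passed as a configuration value...
    job.getConfiguration().set("hdfsfilepath", "/socNetData/userdata/userdata.txt");

    // ...while the job still needs a regular input path and an output directory
    // that does NOT exist yet (args[0] and args[1] are assumed command-line arguments).
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}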
