
Hadoop: Start of Reduce phase: FileNotFoundException output/file.out.index does not exist

I'm trying to get Hadoop 3.1.0 to run on a Windows 10 system.

The exception I'm receiving is:

2019-11-19 14:49:13,310 INFO mapreduce.Job:  map 100% reduce 0%
2019-11-19 14:49:13,325 INFO mapred.LocalJobRunner: reduce task executor complete.
2019-11-19 14:49:13,435 WARN mapred.LocalJobRunner: job_local1214244919_0001
java.lang.Exception: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:559)
Caused by: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
        at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:377)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:347)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
**Caused by: java.io.FileNotFoundException: File C:/tmp/hadoop-king%20lui/mapred/local/localRunner/king%20lui/jobcache/job_local1214244919_0001/attempt_local1214244919_0001_m_000001_0/output/file.out.index does not exist**
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:641)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
        at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:211)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:950)
        at org.apache.hadoop.io.SecureIOUtils.openFSDataInputStream(SecureIOUtils.java:152)
        at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:71)
        at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:62)
        at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:57)
        at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:125)
        at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.doCopy(LocalFetcher.java:103)
        at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.run(LocalFetcher.java:86)
**2019-11-19 14:49:14,357 INFO mapreduce.Job: Job job_local1214244919_0001 failed with state FAILED due to: NA**
2019-11-19 14:49:14,372 INFO mapreduce.Job: Counters: 18
        File System Counters
                FILE: Number of bytes read=840
                FILE: Number of bytes written=991668
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Map input records=2
                Map output records=8
                Map output bytes=82
                Map output materialized bytes=85
                Input split bytes=204
                Combine input records=8
                Combine output records=6
                Spilled Records=6
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=0
                Total committed heap usage (bytes)=577241088
        File Input Format Counters
                Bytes Read=48

(I highlighted the parts I consider important with **)

I'm running the introductory example code from the Hadoop tutorial.

  • The folder in which the missing file should be located exists.
  • I deleted the entire tmp folder -> didn't help.
  • I tried the suggestions from "MapReduce doesn't produce an output", but there was no such entry. Adding one didn't help either.
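The telltale clue in the trace above is the `%20` in `C:/tmp/hadoop-king%20lui/...`: the space in the OS user name gets URL-encoded when Hadoop builds its local job directory, and later lookups fail to resolve the resulting path. A minimal shell sketch of the effect (the encoding step here is illustrative, not Hadoop's actual code):

```shell
# Illustrative only: show how a space in the user name turns into %20
# in the derived local job directory path.
user="king lui"
encoded=$(printf '%s' "$user" | sed 's/ /%20/g')
echo "local dir: /tmp/hadoop-${encoded}/mapred/local"
# prints: local dir: /tmp/hadoop-king%20lui/mapred/local
```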

You need to run Hadoop, and your code, as a user without spaces in the name (not king lui).

One way to test this scenario is:

set HADOOP_USER_NAME=myuser 
yarn jar your-app.jar

It might then fail saying that /user/myuser doesn't exist, but you can create that directory.
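Putting the two steps together, a sketch of the workaround (assumes a working Hadoop install on the PATH; `myuser` and `your-app.jar` are placeholders; on Windows cmd, use `set` instead of `export`):

```shell
# Run the job under a space-free user name (workaround sketch).
export HADOOP_USER_NAME=myuser   # Windows cmd: set HADOOP_USER_NAME=myuser

# If the job then complains that /user/myuser does not exist,
# create the user directory first:
hdfs dfs -mkdir -p /user/myuser

# Re-submit the job as that user:
yarn jar your-app.jar
```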

Changing the username to one without spaces worked.

I did this before trying set HADOOP_USER_NAME=myuser, so I don't know whether that would have helped.
