java.io.FileNotFoundException when running Spark in cluster mode using YARN

I have a Spark application that runs as expected on a single node.

I am now using YARN to run it across multiple nodes. However, it fails with a file-not-found exception. I first changed the file path from a relative to an absolute path, but the error persisted. I then read that it may be necessary to prefix the path with file:// in case the default filesystem is HDFS. The file in question is JSON.
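
For reference, a minimal sketch of how such a file might be read (the SparkSession setup and the path are assumptions for illustration, not the asker's actual code):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("JsonLoadExample").getOrCreate()
// Hypothetical path: without a scheme prefix, Spark resolves it against the
// default filesystem, which is typically HDFS when running under YARN.
val df = spark.read.json("/absolute/dir/file.json")
df.show()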

Despite using the absolute path and prefixing it with file://, the error persists:

16/11/10 10:19:56 INFO yarn.Client:
     client token: N/A
     diagnostics: User class threw exception: java.io.FileNotFoundException: file://absolute/dir/file.json (No such file or directory)

Why does this work correctly on one node but not in cluster mode with YARN?

You're missing a slash /. Try:

file:///absolute/dir/file.json

The file:// prefix specifies the local file system scheme, and the absolute path that follows must itself begin with a forward slash, which is why three forward slashes are needed in total.
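
A minimal sketch of the corrected read, reusing the hypothetical path from above:

// "file://" supplies the scheme; the absolute path contributes its own
// leading "/", hence three slashes in total.
val df = spark.read.json("file:///absolute/dir/file.json")

Note that a file:// path is resolved locally on whichever machine executes the read, so in cluster mode the file must exist at that path on the node where the driver (and any executor that touches it) runs.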
