java.io.FileNotFoundException when running Spark in cluster mode using YARN
I have a Spark application that runs as expected on a single node. I am now using YARN to run it across multiple nodes. However, this fails with a file-not-found exception. I first changed the file path from a relative to an absolute path, but the error persisted. I then read that it may be necessary to prefix the path with `file://` in case the default scheme is HDFS. The file in question is JSON. Despite using the absolute path and prefixing it with `file://`, the error persists:
16/11/10 10:19:56 INFO yarn.Client: client token: N/A diagnostics: User class threw exception: java.io.FileNotFoundException: file://absolute/dir/file.json (No such file or directory)
Why does this work correctly on one node but not in cluster mode with YARN?
You're missing a slash (`/`). Try:

file:///absolute/dir/file.json
The `file://` prefix specifies the local file system, and the absolute path after it must itself begin with a forward slash, so three forward slashes are required in total.
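The three-slash requirement follows from generic URI syntax: the text between `//` and the next `/` is the authority (host), not part of the path. A quick sketch using Python's standard library shows how the two forms parse, and why `file://absolute/dir/file.json` loses the first path component:

```python
from urllib.parse import urlparse

# With only two slashes, "absolute" is parsed as the host (authority),
# so the path that remains is /dir/file.json, which does not exist.
two_slashes = urlparse("file://absolute/dir/file.json")
print(two_slashes.netloc, two_slashes.path)   # absolute /dir/file.json

# With three slashes, the authority is empty and the full absolute
# path /absolute/dir/file.json is preserved.
three_slashes = urlparse("file:///absolute/dir/file.json")
print(three_slashes.netloc, three_slashes.path)  # '' /absolute/dir/file.json
```

This is the same parsing Hadoop's `FileSystem` layer applies to the path, which is why the exception message shows the mangled `file://absolute/dir/file.json`.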