
Spark-shell with 'yarn-client' tries to load config from wrong location

I'm trying to launch bin/spark-shell and bin/pyspark from my laptop, connecting to a Yarn cluster in yarn-client mode, and I get the same error:

WARN ScriptBasedMapping: Exception running
/etc/hadoop/conf.cloudera.yarn1/topology.py 10.0.240.71
java.io.IOException: Cannot run program "/etc/hadoop/conf.cloudera.yarn1/topology.py" 
(in directory "/Users/eugenezhulenev/projects/cloudera/spark"): error=2, 
No such file or directory
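
For context, such a session is typically launched like this in Spark 1.x (a sketch; the exact invocation used in the question is not shown):

# launch the shells against YARN in client mode (Spark 1.x syntax)
./bin/spark-shell --master yarn-client
./bin/pyspark --master yarn-client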

Spark is trying to run /etc/hadoop/conf.cloudera.yarn1/topology.py on my laptop, rather than on a worker node in the Yarn cluster.
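
The path being executed is the Hadoop rack-awareness script: ScriptBasedMapping runs whatever script is named by net.topology.script.file.name in the core-site.xml that the driver picks up. You can check which script is configured with something like:

# net.topology.script.file.name is the standard Hadoop property behind
# ScriptBasedMapping; $HADOOP_CONF_DIR is wherever the client-side conf lives
grep -A 2 'net.topology.script.file.name' "$HADOOP_CONF_DIR/core-site.xml"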

This problem appeared after updating from Spark 1.2.0 to 1.3.0 (CDH 5.4.2).

The following steps are a temporary workaround for this issue on CDH 5.4.4:

cd ~
mkdir -p test-spark/
cd test-spark/

Then copy all files from /etc/hadoop/conf.cloudera.yarn1 on one of the worker nodes to the (local) directory above, and then run spark-shell from ~/test-spark/.
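
A possible way to do the copy (a sketch; the user and worker hostname are placeholders):

# pull the Yarn client conf from one worker node into the local directory
scp user@worker-node:/etc/hadoop/conf.cloudera.yarn1/* ~/test-spark/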

The problem is related to infrastructure where the Hadoop conf files are not copied to all nodes the way the Spark conf files are. Some nodes may be missing those files, and if you happen to use one of the nodes where they are missing, you will hit this problem.

When Spark starts, it looks for the conf files:

1. First at the same location where HADOOP_CONF is located.
2. If that location is missing, at the location from which Spark was started.
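
That lookup order also suggests a slightly cleaner variant of the workaround above: point HADOOP_CONF_DIR at the copied directory instead of relying on the current working directory (a sketch, assuming the files were copied to ~/test-spark):

# make the client pick up the copied conf explicitly
export HADOOP_CONF_DIR=~/test-spark
spark-shell --master yarn-client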

To solve this problem, find the missing folder: check the other nodes, and if it is available on one of them, copy it to the node where you see the problem. Otherwise you can simply copy the hadoop conf folder as the yarn conf folder to the same location.
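
On the affected node, that could look roughly like this (a sketch; the exact directory names depend on how CDH laid out /etc/hadoop on your cluster):

# run on the node that is missing the yarn conf directory
sudo cp -r /etc/hadoop/conf /etc/hadoop/conf.cloudera.yarn1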
