
Why does launching spark-shell with yarn-client fail with "java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream"?

I am trying to set up a cluster at home for my personal needs (learning). First I set up Hadoop + YARN; MR2 is working. Second, I am trying to add Spark, but I am getting an error about missing classes.

[root@master conf]# spark-shell --master yarn-client
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
...
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream

I followed these instructions and added the following to spark-env.sh:

export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop)
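(For reference, the Spark "Hadoop free" build documentation sets this variable from the output of hadoop classpath rather than from a directory; a minimal sketch using the Hadoop path above:)

export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)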

More info:
OS: CentOS x86_64
Hadoop dir: /usr/local/hadoop

Hadoop version:

[root@master conf]# hadoop version
Hadoop 2.7.1
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 15ecc87ccf4a0228f35af08fc56de536e6ce657
Compiled by jenkins on 2015-06-29T06:04Z
Compiled with protoc 2.5.0
From source with checksum fc0a1a23fc1868e4d5ee7fa2b28a58a
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.7.1.jar

The manual says that I must set one of two variables: HADOOP_CONF_DIR or YARN_CONF_DIR.

[root@master conf]# echo $HADOOP_CONF_DIR
/usr/local/hadoop/etc/hadoop
[root@master conf]# echo $YARN_CONF_DIR
/usr/local/hadoop/etc/hadoop
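(These can also be exported from Spark's conf/spark-env.sh; a minimal sketch using the same paths as above:)

export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
export YARN_CONF_DIR=/usr/local/hadoop/etc/hadoop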

Spark is spark-1.5.0-bin-without-hadoop.tgz -> /usr/local/spark

I am trying to launch spark-shell --master yarn-client while Hadoop + YARN are up and available at http://master:50070/dfshealth.html#tab-overview, http://master:8088/cluster/apps and http://master:19888/jobhistory.

I have no Scala installed, if that matters. Any ideas what I could be missing in the Spark settings? Thank you.

Answering my own question: first of all, this was my own mistake. When calling spark-shell I was launching it from the old (wrong) place, /opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/bin/spark-shell. I was sure that I had deleted everything from my CDH tests with yum remove cloudera*.

[root@master bin]# type spark-shell
spark-shell is hashed (/usr/bin/spark-shell)
[root@master bin]# hash -d spark-shell

Now, launching it from the old spark-1.5.0-bin-without-hadoop.tgz still gave me the same error. I downloaded spark-1.5.0-bin-hadoop2.6, added export SPARK_DIST_CLASSPATH=$HADOOP_HOME, and spark-shell is working now.

I was getting this error because, by typing spark-shell, /usr/bin/spark-shell was getting executed.

To call my specific spark-shell, I ran the following command from inside my own Spark build:

./bin/spark-shell
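An alternative sketch, assuming the intended build lives under /usr/local/spark: put its bin directory first in PATH so that a plain spark-shell no longer resolves to /usr/bin/spark-shell.

export PATH=/usr/local/spark/bin:$PATH
hash -r          # clear bash's cached command locations
type spark-shell # should now point at /usr/local/spark/bin/spark-shell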

Instead of spark-1.5.0-bin-without-hadoop.tgz, download one of the builds for Hadoop 2.x. They are simpler to set up because they come with the Hadoop client libraries.
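A sketch of fetching and unpacking such a build; the archive URL and install location are assumptions, not part of the original answer:

wget https://archive.apache.org/dist/spark/spark-1.5.0/spark-1.5.0-bin-hadoop2.6.tgz
tar -xzf spark-1.5.0-bin-hadoop2.6.tgz -C /usr/local
ln -s /usr/local/spark-1.5.0-bin-hadoop2.6 /usr/local/spark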
