
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.CanSetDropBehind issue in Eclipse

I have the following Spark word-count program:

    package com.sample.spark;
    import java.util.Arrays;
    import java.util.List;
    import java.util.Map;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.*;
    import org.apache.spark.api.java.function.FlatMapFunction;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.api.java.function.Function2;
    import org.apache.spark.api.java.function.PairFlatMapFunction;
    import org.apache.spark.api.java.function.PairFunction;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import scala.Tuple2;


    public class SparkWordCount {

        public static void main(String[] args) {
            // Run locally against the standalone Spark 1.4.0 install
            SparkConf conf = new SparkConf().setAppName("wordcountspark").setMaster("local").setSparkHome("/Users/hadoop/spark-1.4.0-bin-hadoop1");
            JavaSparkContext sc = new JavaSparkContext(conf);
            //SparkConf conf = new SparkConf();
            //JavaSparkContext sc = new JavaSparkContext("hdfs", "Simple App","/Users/hadoop/spark-1.4.0-bin-hadoop1", new String[]{"target/simple-project-1.0.jar"});

            // Read the input from HDFS and split each line into words
            JavaRDD<String> textFile = sc.textFile("hdfs://localhost:54310/data/wordcount");
            JavaRDD<String> words = textFile.flatMap(new FlatMapFunction<String, String>() {
                public Iterable<String> call(String s) { return Arrays.asList(s.split(" ")); }
            });

            // Map each word to a (word, 1) pair
            JavaPairRDD<String, Integer> pairs = words.mapToPair(new PairFunction<String, String, Integer>() {
                public Tuple2<String, Integer> call(String s) { return new Tuple2<String, Integer>(s, 1); }
            });

            // Sum the counts per word and write the result back to HDFS
            JavaPairRDD<String, Integer> counts = pairs.reduceByKey(new Function2<Integer, Integer, Integer>() {
                public Integer call(Integer a, Integer b) { return a + b; }
            });
            counts.saveAsTextFile("hdfs://localhost:54310/data/output/spark/outfile");

        }


    }

I get the Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.CanSetDropBehind exception when I run the code from Eclipse; however, if I export it as a runnable jar and run it from the terminal as below, it works:

      bin/spark-submit --class com.sample.spark.SparkWordCount --master local  /Users/hadoop/spark-1.4.0-bin-hadoop1/finalJars/SparkJar-v2.jar

The Maven pom looks like this:

    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
        <modelVersion>4.0.0</modelVersion>
        <groupId>com.sample.spark</groupId>
        <artifactId>SparkRags</artifactId>
        <packaging>jar</packaging>
        <version>1.0-SNAPSHOT</version>
        <name>SparkRags</name>
        <url>http://maven.apache.org</url>
        <dependencies>
            <dependency>
                <groupId>junit</groupId>
                <artifactId>junit</artifactId>
                <version>3.8.1</version>
                <scope>test</scope>
            </dependency>
            <dependency> <!-- Spark dependency -->
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-core_2.10</artifactId>
                <version>1.4.0</version>
                <scope>compile</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-common</artifactId>
                <version>0.23.11</version>
                <scope>compile</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-core</artifactId>
                <version>1.2.1</version>
                <scope>compile</scope>
            </dependency>
        </dependencies>
    </project>

When you run in Eclipse, the referenced jars are the only source for your program to run. So the hadoop-core jar (that is where CanSetDropBehind is present) is, for some reason, not being added properly to your Eclipse build from the local repository. You need to identify whether that is a proxy issue or some other problem with the pom.
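
If it turns out to be a proxy problem, Maven cannot download hadoop-core into the local repository in the first place. A minimal sketch of the relevant part of ~/.m2/settings.xml is shown below; the id, host and port are placeholders, not values taken from the question:

    <settings>
        <proxies>
            <proxy>
                <!-- hypothetical proxy entry; fill in your own host and port -->
                <id>corporate-proxy</id>
                <active>true</active>
                <protocol>http</protocol>
                <host>proxy.example.com</host>
                <port>8080</port>
            </proxy>
        </proxies>
    </settings>

If it is not a proxy issue, running mvn dependency:tree should show whether hadoop-core is actually being resolved from the pom.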

When you run the jar from the terminal, it works most likely because the required jar is present on the classpath that is referenced. Also, while running from the terminal, you could choose to build those dependencies into a fat jar (so that hadoop-core is included); I assume you are not using this option when creating your jar. With a fat jar, the reference would be picked up from inside your jar, without depending on the classpath.
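
If you do go the fat-jar route, a minimal sketch of a maven-shade-plugin configuration that bundles the compile-scope dependencies (the plugin version below is an assumption; use whichever suits your Maven) could look like this:

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <!-- assumed version -->
                <version>2.4.1</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

mvn package then produces a single jar that spark-submit can run without relying on what happens to be on the classpath (in practice spark-core itself is usually marked as provided so it is not bundled as well).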

Verify each step, and it will help you identify the cause. Happy coding.

Found that this was caused because the hadoop-common jar for version 0.23.11 did not have the class. I changed the version to 2.7.0 and also added the dependency below:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>2.7.0</version>
    </dependency>
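
For completeness, the hadoop-common entry in the pom, bumped to the matching 2.7.0 version, then reads:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.7.0</version>
        <scope>compile</scope>
    </dependency>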

That got rid of the error, but I am still seeing the error below:

    Exception in thread "main" java.io.EOFException: End of File Exception between local host is: "mbr-xxxx.local/127.0.0.1"; destination host is: "localhost":54310; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException
