NoClassDefFoundError: org/apache/spark/SparkConf

I have a simple Spark application and, for the life of me, I cannot run the output jar. I'm simply running mvn clean install and then running the jar with java -jar SparkUdemy2-1.0-SNAPSHOT.jar.

Below I've attached both the Maven file and the small code snippet.

I made sure that the dependencies exist in my local .m2. What is happening? Everything imports with no issues.

Maven

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.example</groupId>
    <artifactId>SparkUdemy2</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <java.version>1.8</java.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>15.0</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.12</artifactId>
            <version>2.4.5</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>2.0.0</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.2.0</version>
            <scope>compile</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.5.1</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-jar-plugin</artifactId>
                <version>3.0.2</version>
                <configuration>
                    <archive>
                        <manifest>
                            <mainClass>com.learning.SparkMain</mainClass>
                        </manifest>
                    </archive>
                </configuration>
            </plugin>
        </plugins>
    </build>

</project>

Code

package com.learning;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

import java.util.Arrays;

public class SparkMain {
  public static void main(String[] args) {
    // Configure Spark in a local cluster - use all cores available on the machine
    // Without this, the application would run on a single thread
    SparkConf conf = new SparkConf().setAppName("LearningSpark");
    JavaSparkContext sc = new JavaSparkContext(conf);

    JavaRDD<String> initialRDD = sc.textFile("s3n://s3-spark-data-bucket/input.txt");
    JavaPairRDD<Long, String> dat =
        initialRDD
            .map(sentence -> sentence.replaceAll("[^a-zA-Z\\s]", ""))
            .filter(line -> line.trim().length() > 0)
            .flatMap(line -> Arrays.asList(line.split(" ")).iterator())
            .mapToPair(word -> new Tuple2<>(word, 1L))
            .reduceByKey((v1, v2) -> v1 + v2)
            .mapToPair(tuple -> new Tuple2<>(tuple._2, tuple._1))
            .sortByKey(false);

    dat.foreach(item -> System.out.println(item));

    sc.close();
  }
}
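
A side note on the snippet above: the comment mentions using all available cores, but setMaster is never called on the SparkConf, so when the jar is launched directly with java -jar (instead of through spark-submit) a master URL also has to be supplied, either in code or as -Dspark.master=local[*] on the java command line. A minimal sketch of what a local configuration could look like - the setMaster("local[*]") call is an assumption and is not part of the code above:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical local setup: setMaster("local[*]") is assumed here,
// it is not present in the original code.
SparkConf conf = new SparkConf()
        .setAppName("LearningSpark")
        .setMaster("local[*]");   // run locally, using all available cores
JavaSparkContext sc = new JavaSparkContext(conf);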

Java does not know which dependencies it has to include on the classpath when you run a jar file. That's why you have to tell Maven to create a so-called fat jar (a jar that contains your classes plus all of their dependencies) using the shade plugin.

Try changing the build section of your pom to the following:

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.5.1</version>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
            </configuration>
        </plugin>
        <plugin>
            <artifactId>maven-jar-plugin</artifactId>
            <version>3.0.2</version>
            <configuration>
                <archive>
                    <manifest>
                        <mainClass>com.learning.SparkMain</mainClass>
                    </manifest>
                </archive>
            </configuration>
        </plugin>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <executions>
                <execution>
                    <id>anything</id>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
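
With this in place, the shade goal runs during the package phase and the fat jar replaces the regular artifact, keeping the Main-Class entry from the maven-jar-plugin manifest. You can then build and run exactly as before: mvn clean install (or just mvn clean package) followed by java -jar target/SparkUdemy2-1.0-SNAPSHOT.jar. Note that the maven-shade-plugin declaration above has no version element, so Maven will warn and resolve one for you; pinning a recent release (for example 3.2.4) keeps the build reproducible.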
