Spark + Kafka streaming NoClassDefFoundError kafka/serializer/StringDecoder

I'm trying to send messages from my Kafka producer and stream them with Spark Streaming, but I get the following error when I run my application with spark-submit.

Error:

 Exception in thread "main" java.lang.NoClassDefFoundError: kafka/serializer/StringDecoder
        at com.spark_stream.Main.main(Main.java:37)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: kafka.serializer.StringDecoder
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 10 more

The application code is as follows:

Main.java

package com.spark_stream;

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

import kafka.serializer.StringDecoder;

public class Main {

    public static void main(String[] args) {
        System.out.println("spark started!");

        SparkConf conf = new SparkConf()
                .setAppName("kafka-sandbox")
                .setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaStreamingContext ssc = new JavaStreamingContext(sc, new Duration(2000));

        // Direct (receiver-less) Kafka stream; the broker list points at a local Kafka instance
        Map<String, String> kafkaParams = new HashMap<String, String>();
        kafkaParams.put("metadata.broker.list", "localhost:9092");
        Set<String> topics = Collections.singleton("speed");

        JavaPairInputDStream<String, String> directKafkaStream = KafkaUtils.createDirectStream(ssc,
                String.class, String.class, StringDecoder.class, StringDecoder.class, kafkaParams, topics);

        // For each micro-batch, print the partition/record counts and every record value
        directKafkaStream.foreachRDD(rdd -> {
            System.out.println("--- New RDD with " + rdd.partitions().size()
                    + " partitions and " + rdd.count() + " records");
            rdd.foreach(record -> System.out.println(record._2));
        });

        System.out.println("connection completed");

        ssc.start();
        ssc.awaitTermination();

        System.out.println("spark ended!");
    }

}

pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.spark_stream</groupId>
  <artifactId>com.spark_stream</artifactId>
  <version>0.0.1-SNAPSHOT</version>


    <dependencies>

    <dependency> <!-- Spark dependency -->
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.6.0</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.10</artifactId>
        <version>1.6.0</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka_2.10</artifactId>
        <version>1.6.0</version>
    </dependency>


</dependencies>

    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
    </properties>
</project>

I couldn't find a solution for this error. Any help would be appreciated.

Have a look at the doc: http://spark.apache.org/docs/latest/submitting-applications.html#launching-applications-with-spark-submit

More specifically, the part:

Path to a bundled jar including your application and all dependencies.

Your pom.xml, however, shows that the jar you are building does not include its dependencies. That's why spark-submit cannot find the class kafka.serializer.StringDecoder.
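
As a quick check, you can also ask spark-submit to fetch the Kafka integration at launch time with --packages instead of rebuilding the jar. A minimal sketch, assuming the default jar name Maven derives from the pom above:

spark-submit \
  --class com.spark_stream.Main \
  --master local[*] \
  --packages org.apache.spark:spark-streaming-kafka_2.10:1.6.0 \
  target/com.spark_stream-0.0.1-SNAPSHOT.jar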

What you want is a plugin that bundles your dependencies inside your jar; the Maven Assembly Plugin can do this, as in the sketch below.
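
A minimal sketch of that approach, added under <build><plugins> in the pom above (the mainClass is taken from the question; the plugin version is indicative):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-assembly-plugin</artifactId>
    <version>2.6</version>
    <configuration>
        <descriptorRefs>
            <!-- produce an extra jar that bundles all runtime dependencies -->
            <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
        <archive>
            <manifest>
                <mainClass>com.spark_stream.Main</mainClass>
            </manifest>
        </archive>
    </configuration>
    <executions>
        <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals>
                <goal>single</goal>
            </goals>
        </execution>
    </executions>
</plugin>

Running mvn package should then produce an additional target/com.spark_stream-0.0.1-SNAPSHOT-jar-with-dependencies.jar that you can pass to spark-submit.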

It seems the class loader cannot find the Kafka jars at runtime because they are not on the classpath. Try adding the dependency below to your pom file, and check that it matches the Kafka version you are using.

<!-- https://mvnrepository.com/artifact/org.apache.kafka/kafka_2.10 -->
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka_2.10</artifactId>
    <version>0.8.0</version>
</dependency>
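
Note that spark-streaming-kafka_2.10 already pulls in a Kafka client transitively, so before pinning a version it is worth inspecting what is already on the classpath, for example:

mvn dependency:tree -Dincludes=org.apache.kafka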

This usually happens when you don't bundle all the dependencies your application needs; try building an uber jar that contains all of them.

I have added the relevant portion of a sample pom file that does this with the Maven Shade Plugin.

<build>
        <sourceDirectory>src/main/scala</sourceDirectory>
        <plugins>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.1.6</version>
                <executions>
                    <execution>
                        <phase>compile</phase>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.3</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <shadedArtifactAttached>true</shadedArtifactAttached>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <artifactSet>
                                <includes>
                                    <include>*:*</include>
                                </includes>
                            </artifactSet>
                            <transformers>
                                <transformer
                                        implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                                    <resource>reference.conf</resource>
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

        </plugins>
</build>
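
With shadedArtifactAttached set to true, the shaded jar is attached with a -shaded classifier by default, so after mvn clean package you would submit something like the following (the exact jar name depends on your artifactId and version):

mvn clean package
spark-submit --class com.spark_stream.Main --master local[*] target/com.spark_stream-0.0.1-SNAPSHOT-shaded.jar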
