簡體   English   中英

Apache Spark Streaming 與 Java & Kafka

[英]Apache Spark Streaming with Java & Kafka

我正在嘗試從Spark 官方網站運行 Spark Streaming 示例

這些是我在 pom 文件中使用的依賴項:

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>2.3.1</version>
</dependency>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>2.3.1</version>
</dependency>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
    <version>2.2.0</version>
</dependency>

這是我的 Java 代碼:

package com.myproject.spark;

import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

import com.myproject.spark.serialization.JsonDeserializer;

import scala.Tuple2;

public class MainEntryPoint {
  public static void main(String[] args) {
    Map<String, Object> kafkaParams = new HashMap<String, Object>();
    kafkaParams.put("bootstrap.servers", "localhost:9092,localhost:9093,localhost:9094");
    kafkaParams.put("key.deserializer", StringDeserializer.class);
    kafkaParams.put("value.deserializer",JsonDeserializer.class.getName());
    kafkaParams.put("group.id", "ttk-event-listener");
    kafkaParams.put("auto.offset.reset", "latest");
    kafkaParams.put("enable.auto.commit", false);

    Collection<String> topics = Arrays.asList("topic1", "topic2");

    SparkConf conf = new SparkConf()
        .setMaster("local[*]")
        .setAppName("EMSStreamingApp");
    JavaStreamingContext streamingContext =
        new JavaStreamingContext(conf, Durations.seconds(1));

    JavaInputDStream<ConsumerRecord<String, String>> stream =
      KafkaUtils.createDirectStream(
        streamingContext,
        LocationStrategies.PreferConsistent(),
        ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams)
      );

    stream.mapToPair(record -> new Tuple2<>(record.key(), record.value()));


    streamingContext.start();
    try {
      streamingContext.awaitTermination();
    } catch (InterruptedException e) {
      // TODO Auto-generated catch block
      e.printStackTrace();
    }
  }
}

當我嘗試從 Eclipse 運行它時,出現以下異常:

18/07/16 13:35:27 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.1.106, 51604, None)
18/07/16 13:35:27 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.1.106, 51604, None)
Exception in thread "main" java.lang.AbstractMethodError
at org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:99)
at org.apache.spark.streaming.kafka010.KafkaUtils$.initializeLogIfNecessary(KafkaUtils.scala:39)
at org.apache.spark.internal.Logging$class.log(Logging.scala:46)
at org.apache.spark.streaming.kafka010.KafkaUtils$.log(KafkaUtils.scala:39)
at org.apache.spark.internal.Logging$class.logWarning(Logging.scala:66)
at org.apache.spark.streaming.kafka010.KafkaUtils$.logWarning(KafkaUtils.scala:39)
at org.apache.spark.streaming.kafka010.KafkaUtils$.fixKafkaParams(KafkaUtils.scala:201)
at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.<init>(DirectKafkaInputDStream.scala:63)
at org.apache.spark.streaming.kafka010.KafkaUtils$.createDirectStream(KafkaUtils.scala:147)
at org.apache.spark.streaming.kafka010.KafkaUtils$.createDirectStream(KafkaUtils.scala:124)
at org.apache.spark.streaming.kafka010.KafkaUtils$.createDirectStream(KafkaUtils.scala:168)
at org.apache.spark.streaming.kafka010.KafkaUtils.createDirectStream(KafkaUtils.scala)
at com.myproject.spark.MainEntryPoint.main(MainEntryPoint.java:47)
18/07/16 13:35:28 INFO SparkContext: Invoking stop() from shutdown hook

我從我的 IDE (eclipse) 運行它。 我是否必須創建 JAR 並將其部署到 spark 中才能運行。 如果有人知道異常,請分享您的經驗。 提前致謝

嘗試將 2.3.1 也用於 spark-streaming-kafka 依賴項。

還檢查有關java.lang.AbstractMethodError的其他相關問題及其答案。

這通常意味着使用的庫與其接口/實現之間的不匹配。

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM