Kafka topic details not displaying in Spark

I have created a topic in Kafka named my-topic and I am trying to fetch the topic's data in Spark. But I am facing some difficulty displaying the Kafka topic details, as I am getting a long list of errors. I am using Java to fetch the data.

Below is my code:

// Imports needed by this snippet (Spark 2.x Streaming with the Kafka 0.10 integration):
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;
import scala.Tuple2;

public static void main(String s[]) throws InterruptedException {
    SparkConf conf = new SparkConf().setMaster("local[*]").setAppName("Sampleapp");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

    Map<String, Object> kafkaParams = new HashMap<>();
    kafkaParams.put("bootstrap.servers", "localhost:9092");
    kafkaParams.put("key.deserializer", StringDeserializer.class);
    kafkaParams.put("value.deserializer", StringDeserializer.class);
    kafkaParams.put("group.id", "Different id is allotted for different stream");
    kafkaParams.put("auto.offset.reset", "latest");
    kafkaParams.put("enable.auto.commit", false);

    Collection<String> topics = Arrays.asList("my-topic");

    final JavaInputDStream<ConsumerRecord<String, String>> stream =
      KafkaUtils.createDirectStream(
        jssc,
        LocationStrategies.PreferConsistent(),
        ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams)
      );

    JavaPairDStream<String, String> jPairDStream =  stream.mapToPair(
            new PairFunction<ConsumerRecord<String, String>, String, String>() {
                /**
                 * 
                 */
                private static final long serialVersionUID = 1L;

                @Override
                public Tuple2<String, String> call(ConsumerRecord<String, String> record) throws Exception {
                    return new Tuple2<>(record.key(), record.value());
                }
            });

    jPairDStream.foreachRDD(jPairRDD -> {
           jPairRDD.foreach(rdd -> {
                System.out.println("key="+rdd._1()+" value="+rdd._2());
            });
        });

    jssc.start();            
    jssc.awaitTermination(); 

    stream.mapToPair(
            new PairFunction<ConsumerRecord<String, String>, String, String>() {
                /**
                 * 
                 */
                private static final long serialVersionUID = 1L;

                @Override
                public Tuple2<String, String> call(ConsumerRecord<String, String> record) throws Exception {
                    return new Tuple2<>(record.key(), record.value());
                }
            });
}

Below is the error I am getting:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/09/04 11:41:15 INFO SparkContext: Running Spark version 2.1.0
17/09/04 11:41:15 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/09/04 11:41:15 INFO SecurityManager: Changing view acls to: 11014525
17/09/04 11:41:15 INFO SecurityManager: Changing modify acls to: 11014525
17/09/04 11:41:15 INFO SecurityManager: Changing view acls groups to:
17/09/04 11:41:15 INFO SecurityManager: Changing modify acls groups to:
17/09/04 11:41:15 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(11014525); groups with view permissions: Set(); users with modify permissions: Set(11014525); groups with modify permissions: Set()
17/09/04 11:41:15 INFO Utils: Successfully started service 'sparkDriver' on port 56668.
17/09/04 11:41:15 INFO SparkEnv: Registering MapOutputTracker
17/09/04 11:41:15 INFO SparkEnv: Registering BlockManagerMaster
17/09/04 11:41:15 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/09/04 11:41:15 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/09/04 11:41:15 INFO DiskBlockManager: Created local directory at C:\Users\11014525\AppData\Local\Temp\blockmgr-cba489b9-2458-455a-8c03-4c4395a01d44
17/09/04 11:41:15 INFO MemoryStore: MemoryStore started with capacity 896.4 MB
17/09/04 11:41:16 INFO SparkEnv: Registering OutputCommitCoordinator
17/09/04 11:41:16 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/09/04 11:41:16 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://172.16.202.21:4040
17/09/04 11:41:16 INFO Executor: Starting executor ID driver on host localhost
17/09/04 11:41:16 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 56689.
17/09/04 11:41:16 INFO NettyBlockTransferService: Server created on 172.16.202.21:56689
17/09/04 11:41:16 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/09/04 11:41:16 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 172.16.202.21, 56689, None)
17/09/04 11:41:16 INFO BlockManagerMasterEndpoint: Registering block manager 172.16.202.21:56689 with 896.4 MB RAM, BlockManagerId(driver, 172.16.202.21, 56689, None)
17/09/04 11:41:16 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 172.16.202.21, 56689, None)
17/09/04 11:41:16 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 172.16.202.21, 56689, None)
17/09/04 11:41:16 WARN KafkaUtils: overriding enable.auto.commit to false for executor
17/09/04 11:41:16 WARN KafkaUtils: overriding auto.offset.reset to none for executor
17/09/04 11:41:16 WARN KafkaUtils: overriding executor group.id to spark-executor-Different id is allotted for different stream
17/09/04 11:41:16 WARN KafkaUtils: overriding receive.buffer.bytes to 65536 see KAFKA-3135
17/09/04 11:41:16 INFO DirectKafkaInputDStream: Slide time = 10000 ms
17/09/04 11:41:16 INFO DirectKafkaInputDStream: Storage level = Serialized 1x Replicated
17/09/04 11:41:16 INFO DirectKafkaInputDStream: Checkpoint interval = null
17/09/04 11:41:16 INFO DirectKafkaInputDStream: Remember interval = 10000 ms
17/09/04 11:41:16 INFO DirectKafkaInputDStream: Initialized and validated org.apache.spark.streaming.kafka010.DirectKafkaInputDStream@23a3407b
17/09/04 11:41:16 INFO MappedDStream: Slide time = 10000 ms
17/09/04 11:41:16 INFO MappedDStream: Storage level = Serialized 1x Replicated
17/09/04 11:41:16 INFO MappedDStream: Checkpoint interval = null
17/09/04 11:41:16 INFO MappedDStream: Remember interval = 10000 ms
17/09/04 11:41:16 INFO MappedDStream: Initialized and validated org.apache.spark.streaming.dstream.MappedDStream@140030a9
17/09/04 11:41:16 INFO ForEachDStream: Slide time = 10000 ms
17/09/04 11:41:16 INFO ForEachDStream: Storage level = Serialized 1x Replicated
17/09/04 11:41:16 INFO ForEachDStream: Checkpoint interval = null
17/09/04 11:41:16 INFO ForEachDStream: Remember interval = 10000 ms
17/09/04 11:41:16 INFO ForEachDStream: Initialized and validated org.apache.spark.streaming.dstream.ForEachDStream@65041548
17/09/04 11:41:16 ERROR StreamingContext: Error starting the context, marking it as stopped
org.apache.kafka.common.config.ConfigException: Missing required configuration "partition.assignment.strategy" which has no default value.
    at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:124)
    at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:48)
    at org.apache.kafka.clients.consumer.ConsumerConfig.<init>(ConsumerConfig.java:194)
    at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:380)
    at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:363)
    at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:350)
    at org.apache.spark.streaming.kafka010.Subscribe.onStart(ConsumerStrategy.scala:83)
    at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.consumer(DirectKafkaInputDStream.scala:75)
    at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.start(DirectKafkaInputDStream.scala:243)
    at org.apache.spark.streaming.DStreamGraph$$anonfun$start$5.apply(DStreamGraph.scala:49)
    at org.apache.spark.streaming.DStreamGraph$$anonfun$start$5.apply(DStreamGraph.scala:49)
    at scala.collection.parallel.mutable.ParArray$ParArrayIterator.foreach_quick(ParArray.scala:143)
    at scala.collection.parallel.mutable.ParArray$ParArrayIterator.foreach(ParArray.scala:136)
    at scala.collection.parallel.ParIterableLike$Foreach.leaf(ParIterableLike.scala:972)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:49)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:48)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:48)
    at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:51)
    at scala.collection.parallel.ParIterableLike$Foreach.tryLeaf(ParIterableLike.scala:969)
    at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:152)
    at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:443)
    at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
    at ... run in separate thread using org.apache.spark.util.ThreadUtils ... ()
    at org.apache.spark.streaming.StreamingContext.liftedTree1$1(StreamingContext.scala:578)
    at org.apache.spark.streaming.StreamingContext.start(StreamingContext.scala:572)
    at org.apache.spark.streaming.api.java.JavaStreamingContext.start(JavaStreamingContext.scala:556)
    at Json.ExcelToJson.SparkConsumingKafka.main(SparkConsumingKafka.java:56)
17/09/04 11:41:16 INFO ReceiverTracker: ReceiverTracker stopped
17/09/04 11:41:16 INFO JobGenerator: Stopping JobGenerator immediately
17/09/04 11:41:16 INFO RecurringTimer: Stopped timer for JobGenerator after time -1
17/09/04 11:41:16 INFO JobGenerator: Stopped JobGenerator
17/09/04 11:41:16 INFO JobScheduler: Stopped JobScheduler
Exception in thread "main" org.apache.kafka.common.config.ConfigException: Missing required configuration "partition.assignment.strategy" which has no default value.
    at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:124)
    at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:48)
    at org.apache.kafka.clients.consumer.ConsumerConfig.<init>(ConsumerConfig.java:194)
    at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:380)
    at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:363)
    at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:350)
    at org.apache.spark.streaming.kafka010.Subscribe.onStart(ConsumerStrategy.scala:83)
    at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.consumer(DirectKafkaInputDStream.scala:75)
    at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.start(DirectKafkaInputDStream.scala:243)
    at org.apache.spark.streaming.DStreamGraph$$anonfun$start$5.apply(DStreamGraph.scala:49)
    at org.apache.spark.streaming.DStreamGraph$$anonfun$start$5.apply(DStreamGraph.scala:49)
    at scala.collection.parallel.mutable.ParArray$ParArrayIterator.foreach_quick(ParArray.scala:143)
    at scala.collection.parallel.mutable.ParArray$ParArrayIterator.foreach(ParArray.scala:136)
    at scala.collection.parallel.ParIterableLike$Foreach.leaf(ParIterableLike.scala:972)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:49)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:48)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:48)
    at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:51)
    at scala.collection.parallel.ParIterableLike$Foreach.tryLeaf(ParIterableLike.scala:969)
    at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:152)
    at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:443)
    at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
    at ... run in separate thread using org.apache.spark.util.ThreadUtils ... ()
    at org.apache.spark.streaming.StreamingContext.liftedTree1$1(StreamingContext.scala:578)
    at org.apache.spark.streaming.StreamingContext.start(StreamingContext.scala:572)
    at org.apache.spark.streaming.api.java.JavaStreamingContext.start(JavaStreamingContext.scala:556)
    at Json.ExcelToJson.SparkConsumingKafka.main(SparkConsumingKafka.java:56)
17/09/04 11:41:16 INFO SparkContext: Invoking stop() from shutdown hook
17/09/04 11:41:16 INFO SparkUI: Stopped Spark web UI at http://172.16.202.21:4040
17/09/04 11:41:16 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/09/04 11:41:16 INFO MemoryStore: MemoryStore cleared
17/09/04 11:41:16 INFO BlockManager: BlockManager stopped
17/09/04 11:41:16 INFO BlockManagerMaster: BlockManagerMaster stopped
17/09/04 11:41:16 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/09/04 11:41:16 INFO SparkContext: Successfully stopped SparkContext
17/09/04 11:41:16 INFO ShutdownHookManager: Shutdown hook called
17/09/04 11:41:16 INFO ShutdownHookManager: Deleting directory C:\Users\11014525\AppData\Local\Temp\spark-37334cdc-9680-4801-8e50-ef3024ed1d8a

pom.xml

<dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.11</artifactId>
        <version>2.1.0</version>
</dependency>
<dependency>
        <groupId>commons-lang</groupId>
        <artifactId>commons-lang</artifactId>
        <version>2.6</version>
</dependency>
<dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.10</artifactId>
        <version>0.8.2.0</version>
</dependency>
<dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_2.10</artifactId>
        <version>2.1.1</version>
</dependency>

From the log, your Spark version is 2.1.0. You have not shared the build file with the other dependencies. It looks like you have both spark-streaming-kafka-0-8_2.11-2.1.0.jar and spark-streaming-kafka-0-10_2.11-2.1.0.jar on the classpath, and the wrong class is being loaded. If you are using Maven, you would need dependencies like the ones below. Please check and update your project.

<dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.1.0</version>
</dependency>
<dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.1.0</version>
</dependency>
<dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.11</artifactId>
        <version>2.1.0</version>
</dependency>  
<dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
        <version>2.1.0</version>
</dependency> 
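
If you are not sure which of these artifacts actually end up on the classpath, one quick check (assuming a standard Maven build; this command is a general suggestion rather than something from the original post) is to print the project's dependency tree:

mvn dependency:tree

If both the spark-streaming-kafka-0-8 and spark-streaming-kafka-0-10 integration jars, or two different kafka-clients versions, show up in the output, remove or exclude the one you do not need.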

EDIT

As you have edited the question and posted the dependencies, I am editing my answer. You are using Kafka version 0.8.* while your spark-streaming-kafka version is 0.10.*. Please use the same Kafka version for the Kafka dependencies. Use the dependency below for org.apache.kafka:

<dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.11</artifactId>
        <version>0.10.2.0</version>
</dependency>
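
If you want to confirm at runtime which Kafka client is actually loaded after fixing the dependencies, a small diagnostic such as the sketch below can help. This is only an illustrative snippet (the class name KafkaClasspathCheck is made up for the example), not part of the original code; it uses plain JDK calls to print the jar that the consumer classes come from.

import org.apache.kafka.clients.consumer.ConsumerConfig;

public class KafkaClasspathCheck {
    public static void main(String[] args) {
        // Prints the location of the jar that ConsumerConfig was loaded from,
        // which makes a stale 0.8.x Kafka jar on the classpath easy to spot.
        System.out.println(ConsumerConfig.class
                .getProtectionDomain()
                .getCodeSource()
                .getLocation());
    }
}

If the printed path points at a kafka_2.10-0.8.2.0 jar instead of the 0.10.x client, the old dependency is still winning and needs to be removed or excluded.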
