
Spark Structured Streaming getting messages for last Kafka partition

I am using Spark Structured Streaming to read from a Kafka topic.

Without any partitions, the Spark Structured Streaming consumer can read data.

But when I added partitions to the topic, the client shows messages from the last partition only. I.e., if there are 4 partitions in the topic and I push numbers like 1, 2, 3, 4 into it, the client prints only 4 and not the other values.

I am using the latest samples and binaries from the Spark Structured Streaming website.

    Dataset<Row> df = spark
        .readStream()
        .format("kafka")
        .option("kafka.bootstrap.servers", "host1:port1,host2:port2")
        .option("subscribe", "topic1")
        .load();

Am I missing anything?

The issue was resolved by replacing kafka-clients-0.10.1.1.jar with kafka-clients-0.10.0.1.jar.
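If the project is built with Maven rather than by swapping jars on the classpath, the same fix can be expressed as a pinned dependency. This is a minimal sketch, assuming a Maven build; the exact version must match the answer above and your Spark release:

```xml
<!-- Pin kafka-clients to the version that resolved the issue.
     This overrides whatever version the Spark Kafka connector
     would otherwise pull in transitively. -->
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>0.10.0.1</version>
</dependency>
```

Declaring the dependency explicitly makes Maven's "nearest wins" resolution pick this version over the transitive one, which is less fragile than replacing jar files by hand.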

Found the reference here: Spark Structured Stream get messages from only one partition of Kafka

