Spark Structured Streaming getting messages for last Kafka partition
I am using Spark Structured Streaming to read from a Kafka topic.
Without any partitions, the Spark Structured Streaming consumer can read data.
But when I added partitions to the topic, the client shows messages from the last partition only. That is, if there are 4 partitions in the topic and I push numbers like 1,2,3,4 into the topic, the client prints only 4, not the other values.
I am using the latest samples and binaries from the Spark Structured Streaming website.
Dataset<Row> df = spark
  .readStream()
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:port1,host2:port2")
  .option("subscribe", "topic1")
  .load();
Am I missing anything?
The issue was resolved by changing kafka-clients-0.10.1.1.jar to kafka-clients-0.10.0.1.jar.
Found the reference here: Spark Structured Stream get messages from only one partition of Kafka
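For anyone hitting the same symptom: if the project is built with Maven (an assumption; the post only mentions swapping the jars directly), forcing the older Kafka client might look like the sketch below. The `spark-sql-kafka-0-10` artifact and `2.11` Scala suffix are illustrative, not taken from the post; match them to your own Spark version.

```xml
<!-- Hypothetical pom.xml fragment: exclude the transitive kafka-clients
     and pin 0.10.0.1, the version the post reports as working. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql-kafka-0-10_2.11</artifactId>
  <version>${spark.version}</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka-clients</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka-clients</artifactId>
  <version>0.10.0.1</version>
</dependency>
```

Verify which version actually ends up on the classpath with `mvn dependency:tree` before assuming the override took effect.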