KafkaSpout reads messages twice in Storm topology
I'm trying to simulate stream traffic by sending data from Kafka to Storm. I used `KafkaSpout` to read messages from a topic populated by a producer that reads tweets and sends them to that topic.

My problem is that after the topology consumes all the tweets sent to this topic, it continues reading and processes each message in the topic twice. How can I stop `KafkaSpout` from reading twice? (The replication factor is set to 1.)
The configuration looks fine to me.

Maybe the issue is double acking. Make sure you're only acking each tuple once in `execute`.
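A minimal sketch of what acking each tuple exactly once looks like in a `BaseRichBolt` (the class name, field layout, and emitted values are illustrative; it assumes Storm 2.x package names):

```java
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

import java.util.Map;

public class TweetLengthRichBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map<String, Object> topoConf, TopologyContext context,
                        OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple tuple) {
        try {
            String tweet = tuple.getString(0);
            // Anchor the emitted tuple to the input tuple for reliability tracking.
            collector.emit(tuple, new Values(tweet.length()));
            // Ack exactly once, and only after the tuple is fully processed.
            collector.ack(tuple);
        } catch (Exception e) {
            // On error, fail the tuple instead of acking it; never do both.
            collector.fail(tuple);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("length"));
    }
}
```

If a tuple is acked more than once, or both acked and failed, the acker's tracking state becomes inconsistent and the spout may replay the message.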
As mentioned in a comment, please consider upgrading to a newer Kafka version, as well as switching to `storm-kafka-client`.
Also something that may make your life a little easier: consider extending `BaseBasicBolt` instead of `BaseRichBolt`. `BaseBasicBolt` automatically acks the tuple for you if running `execute` doesn't throw an error. If you want to fail a tuple, you can throw `FailedException`. `BaseRichBolt` should only be used if you want to do more complicated acking, e.g. aggregating tuples from many `execute` invocations in memory before acking.
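The same bolt written against `BaseBasicBolt` might look like this (a sketch; the class name and the null-check failure condition are illustrative, and Storm 2.x package names are assumed):

```java
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.FailedException;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class TweetLengthBasicBolt extends BaseBasicBolt {
    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        String tweet = tuple.getString(0);
        if (tweet == null) {
            // Throwing FailedException fails the tuple explicitly.
            throw new FailedException("missing tweet field");
        }
        collector.emit(new Values(tweet.length()));
        // No explicit ack: the framework acks the input tuple
        // automatically when execute returns without throwing.
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("length"));
    }
}
```

Because the framework handles acking and anchoring, there is no way to accidentally ack the same tuple twice here.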