
Topic and partition discovery for Kafka consumer

I am fairly new to Flink and Kafka, and I have some data aggregation jobs written in Scala that run in Apache Flink. The jobs consume data from Kafka, perform aggregations, and produce the results back to Kafka.

I need the jobs to consume data from any new Kafka topic that matches a pattern and is created while the job is running. I got this working by setting the following properties on my consumer:

val properties = new Properties()
properties.setProperty("bootstrap.servers", "my-kafka-server")
properties.setProperty("group.id", "my-group-id")
properties.setProperty("zookeeper.connect", "my-zookeeper-server")
properties.setProperty("security.protocol", "PLAINTEXT")
properties.setProperty("flink.partition-discovery.interval-millis", "500")
properties.setProperty("enable.auto.commit", "true")
properties.setProperty("auto.offset.reset", "earliest")

val consumer = new FlinkKafkaConsumer011[String](
  Pattern.compile("my-topic-start-.*"), new SimpleStringSchema(), properties)

The consumer works fine and consumes data from existing topics which start with "my-topic-start-".

When I publish data to a new topic, say "my-topic-start-test1", for the first time, my consumer does not recognise the topic until up to 500 milliseconds after the topic was created, which matches the discovery interval set in the properties above. When the consumer discovers the topic, it does not read the first record published and only starts with subsequent records, so effectively I lose that first record every time data is published to a new topic.

Is there a setting I am missing, or is this just how Kafka works? Any help would be appreciated.

Thanks, Shravan

I think part of the issue is that my producer was creating the topic and publishing the message in one go, so by the time the consumer discovered the new partition, that message had already been produced.

As a temporary solution, I updated my producer to create the topic if it does not exist and then publish the message (making it a two-step process), and this works.
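The two-step producer workaround could be sketched roughly like this, using Kafka's `AdminClient` (available in kafka-clients 0.11+, matching `FlinkKafkaConsumer011`). This is only a sketch: the broker address, topic name, partition count, replication factor, and the sleep duration are assumptions, not values from the original setup.

```scala
import java.util.{Collections, Properties}
import org.apache.kafka.clients.admin.{AdminClient, NewTopic}

// Hypothetical broker address; adjust for your cluster.
val adminProps = new Properties()
adminProps.setProperty("bootstrap.servers", "my-kafka-server")
val admin = AdminClient.create(adminProps)

val topic = "my-topic-start-test1" // hypothetical topic name

// Step 1: create the topic explicitly if it does not exist yet
// (here: 1 partition, replication factor 1 - assumed values).
if (!admin.listTopics().names().get().contains(topic)) {
  admin.createTopics(Collections.singleton(new NewTopic(topic, 1, 1.toShort)))
    .all().get() // block until the topic is actually created
}
admin.close()

// Step 2: wait at least one discovery interval (500 ms in the consumer
// config above) so the running Flink job can pick up the new topic,
// then publish the first record with the normal producer.
Thread.sleep(500)
```

Blocking on `all().get()` before producing is what makes the two steps distinct; the sleep gives the consumer's partition discovery a chance to run before the first record arrives, though the needed delay depends on the configured `flink.partition-discovery.interval-millis`.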

Would be nice to have a more robust consumer-side solution though :)

