Copy messages from one Kafka topic to another using offsets/timestamps
For some data processing, we need to reprocess all the messages between two timestamps, say between 1 January and 15 January.
To control the upper bound, we are planning to create a new topic that will hold these messages, so that once the task is complete we can delete the topic too. The new topic will have data starting from particular offsets of the source topic:
partition 1 - from offset 100
partition 2 - from offset 2400... and so on
What is the most suitable solution for this? Approximately 10 lakh (1 million) messages fall in this range.
Use .assign for the partitions you want to copy, and .seek for each starting offset of those partitions. You can use the offsetsForTimes method to get the offsets for a specific timestamp; then you can pass those on to the seek method.
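The approach above could be sketched roughly as follows with the Java client. This is only a minimal sketch under several assumptions: the topic names, broker address, and the year 2024 are hypothetical placeholders, keys and values are copied as raw bytes, and the class name TopicRangeCopier and the helper resolveOffsets are illustrative names, not part of Kafka. Because offsetsForTimes returns the earliest offset whose timestamp is at or after the requested time, the upper bound is looked up the same way and treated as an exclusive stop offset. If the starting offsets are already known (for example 100 and 2400), they could be put into the start map directly instead of being looked up by timestamp.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;

import java.time.Duration;
import java.time.Instant;
import java.util.*;

public class TopicRangeCopier {

    // offsetsForTimes maps a partition to null when it has no record at or
    // after the timestamp; in that case fall back to the partition's end offset.
    static Map<TopicPartition, Long> resolveOffsets(
            Map<TopicPartition, OffsetAndTimestamp> byTime,
            Map<TopicPartition, Long> endOffsets) {
        Map<TopicPartition, Long> resolved = new HashMap<>();
        for (Map.Entry<TopicPartition, Long> e : endOffsets.entrySet()) {
            OffsetAndTimestamp found = byTime.get(e.getKey());
            resolved.put(e.getKey(), found != null ? found.offset() : e.getValue());
        }
        return resolved;
    }

    public static void main(String[] args) {
        String source = "source-topic";     // hypothetical topic name
        String target = "reprocess-topic";  // hypothetical topic name
        // Hypothetical range: 1 Jan (inclusive) to 16 Jan (exclusive), UTC.
        long startMs = Instant.parse("2024-01-01T00:00:00Z").toEpochMilli();
        long endMs = Instant.parse("2024-01-16T00:00:00Z").toEpochMilli();

        String bytesDe = "org.apache.kafka.common.serialization.ByteArrayDeserializer";
        String bytesSer = "org.apache.kafka.common.serialization.ByteArraySerializer";
        Properties cp = new Properties();
        cp.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        cp.put("enable.auto.commit", "false");          // no group, no commits
        cp.put("key.deserializer", bytesDe);
        cp.put("value.deserializer", bytesDe);
        Properties pp = new Properties();
        pp.put("bootstrap.servers", "localhost:9092");
        pp.put("key.serializer", bytesSer);
        pp.put("value.serializer", bytesSer);

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(cp);
             KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(pp)) {

            // Manually assign every partition of the source topic.
            List<TopicPartition> partitions = new ArrayList<>();
            consumer.partitionsFor(source).forEach(
                    p -> partitions.add(new TopicPartition(source, p.partition())));
            consumer.assign(partitions);

            // Translate both timestamps into per-partition offsets.
            Map<TopicPartition, Long> startQuery = new HashMap<>();
            Map<TopicPartition, Long> endQuery = new HashMap<>();
            for (TopicPartition tp : partitions) {
                startQuery.put(tp, startMs);
                endQuery.put(tp, endMs);
            }
            Map<TopicPartition, Long> ends = consumer.endOffsets(partitions);
            Map<TopicPartition, Long> from =
                    resolveOffsets(consumer.offsetsForTimes(startQuery), ends);
            Map<TopicPartition, Long> to =
                    resolveOffsets(consumer.offsetsForTimes(endQuery), ends);

            for (TopicPartition tp : partitions) {
                consumer.seek(tp, from.get(tp));
            }

            Set<TopicPartition> active = new HashSet<>(partitions);
            while (true) {
                // Pause partitions whose position has reached the stop offset.
                Set<TopicPartition> done = new HashSet<>();
                for (TopicPartition tp : active) {
                    if (consumer.position(tp) >= to.get(tp)) done.add(tp);
                }
                consumer.pause(done);
                active.removeAll(done);
                if (active.isEmpty()) break;

                for (ConsumerRecord<byte[], byte[]> r : consumer.poll(Duration.ofSeconds(1))) {
                    TopicPartition tp = new TopicPartition(r.topic(), r.partition());
                    if (r.offset() < to.get(tp)) {
                        // Null partition lets the producer's partitioner choose;
                        // the original timestamp and headers are carried over.
                        producer.send(new ProducerRecord<>(
                                target, null, r.timestamp(), r.key(), r.value(), r.headers()));
                    }
                }
            }
            producer.flush();
        }
    }
}
```

Note that record timestamps are not guaranteed to be strictly increasing within a partition when producers set them, so a few records inside the offset range may carry timestamps slightly outside the requested window; an extra check on r.timestamp() could filter those if that matters. The target topic is assumed to exist already, and its partition layout need not match the source because the partition is left to the producer.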