
How to achieve exactly-once semantics when archiving Kafka messages into AWS S3?

How can I store Kafka message data together with its partition offset in a single S3 PutObject call to achieve exactly-once semantics? Is it possible?

Yes, it should be possible. One way to do that is to take control of offset management yourself.

Your consumer can read one message from Kafka at a time and put it into S3 as an object, using the partition name plus the offset as the object key. Now suppose your client crashes. When it comes back up, query S3 to find the last offset stored there and resume reading from that point. For additional protection, before you put a message into S3, check whether an object with that key already exists (it would be even better if your producer generates a UUID for each message and you use that). If it does, skip the message instead of overwriting it.
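The write side of this idea can be sketched as follows. This is only an illustration under stated assumptions: the `bucket` map is a hypothetical in-memory stand-in for an S3 bucket (a real implementation would go through the S3 client instead), and the key layout `topic/partition/zero-padded-offset` is one possible naming scheme, not something S3 or Kafka prescribes.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentArchiveSketch {
    // Hypothetical in-memory stand-in for an S3 bucket: object key -> object body.
    static final Map<String, byte[]> bucket = new ConcurrentHashMap<>();

    // Build a deterministic key from topic, partition and offset.
    // Zero-padding the offset keeps keys in offset order when sorted as strings.
    static String objectKey(String topic, int partition, long offset) {
        return String.format(Locale.ROOT, "%s/%d/%020d", topic, partition, offset);
    }

    // Store the record only if no object with that key exists yet.
    // Returns true if the record was written, false if it was skipped as a duplicate.
    static boolean archive(String topic, int partition, long offset, byte[] value) {
        return bucket.putIfAbsent(objectKey(topic, partition, offset), value) == null;
    }

    public static void main(String[] args) {
        byte[] body = "hello".getBytes();
        System.out.println(archive("events", 0, 42L, body)); // first delivery is written
        System.out.println(archive("events", 0, 42L, body)); // redelivery is skipped
        System.out.println(objectKey("events", 0, 42L));
    }
}
```

Because the key is derived only from the record's position in Kafka, replaying the same record after a crash maps to the same object, so a replay cannot create a second copy. Against real S3 the existence check and the put are separate requests, so this sketch assumes only one consumer writes a given partition at a time.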

import java.util.Arrays;
import java.util.Collection;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

// On every partition assignment, seek to the offset recovered from S3.
// kafkaConsumer, topicName and startingOffset are assumed to be defined elsewhere.
kafkaConsumer.subscribe(Arrays.asList(topicName), new ConsumerRebalanceListener() {
    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {}

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        for (TopicPartition topicPartition : partitions) {
            System.out.println("Current offset is " + kafkaConsumer.position(topicPartition)
                    + ", committed offset is " + kafkaConsumer.committed(topicPartition));
            System.out.println("Resetting offset to " + startingOffset);
            // Resume from the S3-derived offset rather than Kafka's committed offset.
            kafkaConsumer.seek(topicPartition, startingOffset);
        }
    }
});
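How the offset passed to `seek` might be derived is not shown in the answer, so here is a minimal sketch. It assumes object keys end in a zero-padded offset after the last `/` (for example `events/0/00000000000000000042`), and that the list of keys for one partition has already been fetched from S3. The `resumeOffset` helper and its `fallback` parameter are hypothetical names introduced for illustration.

```java
import java.util.List;
import java.util.OptionalLong;

public class ResumeOffsetSketch {
    // Given the object keys stored for one partition, return the offset to seek to:
    // one past the highest archived offset, or the fallback if nothing is stored yet.
    static long resumeOffset(List<String> keys, long fallback) {
        OptionalLong last = keys.stream()
                // Assumed key layout: the offset is the segment after the last '/'.
                .mapToLong(k -> Long.parseLong(k.substring(k.lastIndexOf('/') + 1)))
                .max();
        return last.isPresent() ? last.getAsLong() + 1 : fallback;
    }

    public static void main(String[] args) {
        List<String> keys = List.of(
                "events/0/00000000000000000041",
                "events/0/00000000000000000042");
        System.out.println(resumeOffset(keys, 0L));      // one past the last archived offset
        System.out.println(resumeOffset(List.of(), 0L)); // nothing archived, use the fallback
    }
}
```

Seeking to the last stored offset itself, as the answer describes, also works as long as records whose key already exists are skipped. Starting one past it simply avoids one redundant read.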

Hope that helps.
