How do I test that exactly-once semantics work in my Kafka Streams application?

I have a Kafka Streams DSL application with a requirement for exactly-once processing. To meet it, I added the following configuration:

streamConfig.put("processing.guarantee", "exactly_once");

I am using Kafka 2.7 and have two questions:

  1. What is the difference between exactly_once and exactly_once_beta?
  2. How do I test this functionality to be sure my messages are processed only once?

Thanks!

exactly_once_beta is an improvement over exactly_once. While exactly_once uses a transactional producer for each stream task (a combination of sub-topology and input partition), exactly_once_beta uses a transactional producer for each stream thread of a Kafka Streams client. Every producer comes with its own memory buffers, its own thread, and its own network connections, which might limit how far you can scale the number of input partitions (i.e., the number of tasks). A high number of producers might also put more load on the brokers. Hence, exactly_once_beta has better scaling characteristics. You can find more details in KIP-447.
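To make the scaling difference concrete, here is a rough back-of-the-envelope sketch. The class and method names are hypothetical, the partition and thread counts are made-up numbers, and it assumes each sub-topology creates one task per input partition.

```java
// Hypothetical helper for illustration only; not part of Kafka.
final class EosProducerCount {

    // exactly_once: one transactional producer per task. This sketch assumes
    // each sub-topology contributes one task per input partition.
    static int producersWithExactlyOnce(int[] partitionsPerSubTopology) {
        int tasks = 0;
        for (int partitions : partitionsPerSubTopology) {
            tasks += partitions;
        }
        return tasks;
    }

    // exactly_once_beta: one transactional producer per stream thread.
    static int producersWithExactlyOnceBeta(int instances, int threadsPerInstance) {
        return instances * threadsPerInstance;
    }

    public static void main(String[] args) {
        int[] partitions = {32, 32}; // assumed: two sub-topologies, 32 partitions each
        System.out.println(producersWithExactlyOnce(partitions));   // prints 64
        System.out.println(producersWithExactlyOnceBeta(2, 4));     // prints 8
    }
}
```

With these assumed numbers, the application as a whole would hold 64 transactional producers under exactly_once but only 8 under exactly_once_beta, and the second number does not grow when partitions are added.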

Note that exactly_once will be deprecated and exactly_once_beta will be renamed to exactly_once_v2 in Apache Kafka 3.0. See KIP-732 for more details.
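As a minimal sketch of how the guarantee could be selected, the helper below builds the configuration with plain string keys so it runs without the Kafka libraries. The class name, method name, application id, and broker address are placeholders, not anything from the question. In a real application you would more likely use the StreamsConfig constants (for example StreamsConfig.PROCESSING_GUARANTEE_CONFIG), and keep in mind that exactly_once_beta needs brokers on version 2.5 or newer.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not part of Kafka.
final class EosConfigSketch {

    // guarantee is one of "exactly_once", "exactly_once_beta",
    // or (from Kafka 3.0 on) "exactly_once_v2".
    static Properties streamsProperties(String applicationId,
                                        String bootstrapServers,
                                        String guarantee) {
        Properties props = new Properties();
        props.setProperty("application.id", applicationId);
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("processing.guarantee", guarantee);
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values; on Kafka 2.7 clients the improved mode is "exactly_once_beta".
        Properties props = streamsProperties("my-eos-app", "localhost:9092", "exactly_once_beta");
        System.out.println(props.getProperty("processing.guarantee"));
    }
}
```

The resulting Properties object can be passed to the KafkaStreams constructor in the same way as the streamConfig from the question.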

For tests, you can get inspiration from the tests in the Apache Kafka repo.

Basically, you need to create a failover scenario and verify that messages are not produced multiple times to the output topics. Note that messages may be processed multiple times, but the results in the output topics must appear as if they were processed only once. You can find a pretty good talk about exactly-once semantics that also explains the failover scenarios here: https://www.confluent.io/kafka-summit-london18/dont-repeat-yourself-introducing-exactly-once-semantics-in-apache-kafka/
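The verification step might look roughly like the sketch below. It is a hypothetical helper, not code from the Kafka repo. It assumes every input record carries a unique id that shows up in exactly one output record, and that the observed ids were read from the output topic by a consumer with isolation.level=read_committed, since otherwise records from aborted transactions would show up as false duplicates.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Kafka.
final class ExactlyOnceOutputCheck {

    // Returns a list of violations. An empty list means the output looks as if
    // every input record was processed exactly once.
    static List<String> violations(List<String> expectedIds, List<String> observedIds) {
        // Count how often each id appears in the output topic.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String id : observedIds) {
            counts.merge(id, 1, Integer::sum);
        }

        List<String> problems = new ArrayList<>();
        for (String id : expectedIds) {
            int n = counts.getOrDefault(id, 0);
            if (n == 0) {
                problems.add("missing: " + id);
            } else if (n > 1) {
                problems.add("duplicated " + n + "x: " + id);
            }
        }

        // Anything in the output that no input accounts for is also a failure.
        Set<String> expected = new HashSet<>(expectedIds);
        for (String id : counts.keySet()) {
            if (!expected.contains(id)) {
                problems.add("unexpected: " + id);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> expected = List.of("a", "b", "c");
        List<String> observed = List.of("a", "b", "b"); // simulated output after a crash
        System.out.println(violations(expected, observed));
        // prints [duplicated 2x: b, missing: c]
    }
}
```

In a failover test you would produce the input, kill an application instance mid-processing (ideally in the middle of a transaction), let another instance take over, then read the output topic to the end and assert that violations(...) is empty.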
