Can I set task.commit.ms to 1 ms?

I have an Apache Samza project, and I have a problem with duplicate data.

This is my checkpoint configuration:

task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory
task.checkpoint.system=kafka
task.checkpoint.replication.factor=2
task.commit.ms=20000

The documentation says:

If task.checkpoint.factory is configured, this property determines how often a checkpoint is written. The value is the time between checkpoints, in milliseconds. The frequency of checkpointing affects failure recovery: if a container fails unexpectedly (e.g. due to crash or machine failure) and is restarted, it resumes processing at the last checkpoint. Any messages processed since the last checkpoint on the failed container are processed again. Checkpointing more frequently reduces the number of messages that may be processed twice, but also uses more resources.
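To make the quoted trade-off concrete, here is a minimal sketch of the worst-case number of messages that could be processed twice after a crash, for a few commit intervals. It is plain arithmetic, not Samza code: the function name `max_replayed_messages` and the throughput of 1000 messages per second are assumptions chosen for illustration, and the bound assumes the crash happens just before the next checkpoint is written.

```python
def max_replayed_messages(commit_ms: int, messages_per_second: float) -> float:
    """Rough upper bound on messages reprocessed after an unexpected restart.

    Assumes a steady input rate and a crash just before the next checkpoint,
    so everything processed during one full commit interval is replayed.
    This helper only accepts positive intervals.
    """
    if commit_ms <= 0:
        raise ValueError("commit_ms must be positive")
    return messages_per_second * commit_ms / 1000.0


if __name__ == "__main__":
    assumed_rate = 1000.0  # hypothetical throughput, messages per second
    for commit_ms in (20000, 5000, 250, 1):
        bound = max_replayed_messages(commit_ms, assumed_rate)
        print(f"task.commit.ms={commit_ms}: up to ~{bound:g} duplicates per crash")
```

With the assumed rate, going from 20000 ms to 250 ms shrinks the worst case from roughly 20000 duplicates to roughly 250 per crash; it never reaches zero, because messages processed since the last checkpoint are still replayed.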

So can I change task.commit.ms=20000 to 250 ms, or even 1 ms? Is that a good idea or a very bad one? I have a very good cluster.
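The "uses more resources" side of the question can be estimated the same way. The sketch below assumes, as a simplification, that every task writes exactly one checkpoint per commit interval and that the job can actually keep up with that interval; the function name `checkpoint_writes_per_second` and the task count of 8 are hypothetical values for illustration, not something taken from this job.

```python
def checkpoint_writes_per_second(commit_ms: int, task_count: int) -> float:
    """Approximate checkpoint writes per second across all tasks.

    Simplifying assumption: one checkpoint write per task per commit interval.
    """
    if commit_ms <= 0:
        raise ValueError("commit_ms must be positive")
    if task_count < 0:
        raise ValueError("task_count must not be negative")
    return task_count * 1000.0 / commit_ms


if __name__ == "__main__":
    assumed_tasks = 8  # hypothetical number of tasks in the job
    for commit_ms in (20000, 5000, 250, 1):
        rate = checkpoint_writes_per_second(commit_ms, assumed_tasks)
        print(f"task.commit.ms={commit_ms}: ~{rate:g} checkpoint writes/s")
```

Under these assumptions, 1 ms would mean thousands of checkpoint writes per second to the checkpoint topic, whereas 250 ms stays in the tens per second. Whether either is acceptable depends on the cluster and is best measured rather than assumed.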

The reason I need to change this is that this Samza worker crashes 1-3 times each week. For now, my temporary solution is to commit the offset each time.


Documentation references:

Apache-Samza

Apache-Samza-Configuration

My solution (I know it is not a solution for every problem) was to set task.commit.ms to the same value as task.shutdown.ms=5000.

Atlas-Samza-Configuration Shutdown
