Does the min.insync.replicas configuration affect Kafka producer throughput?

From the Kafka documentation:

When a producer sets acks to "all" (or "-1"), min.insync.replicas specifies the minimum number of replicas that must acknowledge a write for the write to be considered successful.

It says the write is successful once the minimum number of in-sync replicas acknowledge it, but when I run a performance test with min.insync.replicas set to 1 and to 3 (for a topic with 1 partition and RF=5 on a 5-broker setup), the performance of the Kafka producer with acks='all' is the same.

So, does the per-topic min.insync.replicas configuration affect Kafka producer throughput (run in isolation) with acks="all"?

If you use acks='all', the leader waits until the in-sync replicas have the message before sending back an acknowledgment or an error, so performance is affected compared with acks=1. However, the leader waits for every replica currently in the ISR, not for just min.insync.replicas of them. min.insync.replicas is only a threshold: if the ISR shrinks below it, the broker rejects the write with a NotEnoughReplicas error instead of accepting it. With min.insync.replicas=1 versus 3 and all five replicas in sync, the leader does the same replication work in both cases, so the producer should see the same throughput.

Your results are consistent with that. What changes produce latency under acks='all' is the size of the ISR and the latency between your brokers, so you should see a difference if you start the brokers in different datacenters/regions or change the replication factor, not if you change min.insync.replicas.
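To make the point above concrete, here is a toy model of how a partition leader answers one produce request. It is a minimal sketch, not Kafka code: the function name `produce_ack`, the millisecond delays, and the status strings are all made up for illustration, and the model assumes the acknowledgment time under acks=all is set by the slowest follower currently in the ISR.

```python
def produce_ack(acks, isr_follower_delays_ms, min_insync_replicas, leader_write_ms=1.0):
    """Return (status, latency_ms) for one produce request in a toy model.

    isr_follower_delays_ms holds the replication delay of each follower that is
    currently in the ISR. The leader is an ISR member too, so the ISR size is
    len(isr_follower_delays_ms) + 1. All numbers are hypothetical.
    """
    isr_size = len(isr_follower_delays_ms) + 1

    if acks != "all":
        # acks=0 or acks=1: min.insync.replicas is not consulted at all.
        return ("OK", 0.0 if acks == "0" else leader_write_ms)

    if isr_size < min_insync_replicas:
        # Too few in-sync replicas: the write is rejected, not delayed.
        return ("NOT_ENOUGH_REPLICAS", 0.0)

    # acks=all: wait for every follower in the ISR, i.e. for the slowest one.
    slowest_follower = max(isr_follower_delays_ms, default=0.0)
    return ("OK", leader_write_ms + slowest_follower)


# RF=5 with all replicas in sync: one leader plus four followers.
followers = [2.0, 3.0, 2.5, 4.0]
for min_isr in (1, 3):
    print(min_isr, produce_ack("all", followers, min_isr))
```

In this model both settings give the same latency, because the ISR is the same in both runs; min.insync.replicas only matters once the ISR shrinks below it.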

min.insync.replicas is the minimum number of replicas that must acknowledge that data was received successfully for a write to be successful.

Throughput is affected when you produce with acks=all, because the leader waits for the in-sync replicas before acknowledging; setting min.insync.replicas to 3 adds the requirement that at least 3 replicas be in sync for the write to be accepted. Throughput is not affected if you set acks=0 or 1, because min.insync.replicas is then not checked, but in that case there is a possibility of data loss if the leader fails.

If you do NOT set acks='all' together with min.insync.replicas > 1, be aware that you are risking data loss. If the leader goes down, there is no guarantee that the replica that takes over holds a full copy of the leader's data. Preventing exactly this case is the main idea behind replication in Kafka as a distributed system.

When a producer sets acks to "all" (or "-1"), min.insync.replicas specifies the minimum number of replicas that must acknowledge a write for the write to be considered successful.

When used together, min.insync.replicas and acks allow you to enforce greater durability guarantees. A typical scenario would be to create a topic with a replication factor of 3, set min.insync.replicas to 2, and produce with acks of "all".
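The arithmetic behind that typical scenario can be sketched as follows. The helper name `failure_budget` is hypothetical, and the sketch assumes that a failed broker's replica simply drops out of the ISR and that producers use acks=all.

```python
def failure_budget(replication_factor, min_insync_replicas):
    """Return (writes_survive, acked_data_survives) as broker-failure counts.

    writes_survive: how many replicas can be lost before acks=all writes
    start being rejected for lack of in-sync replicas.
    acked_data_survives: how many replicas holding an acknowledged write can
    be lost while at least one copy of that write remains.
    """
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("need 1 <= min.insync.replicas <= replication factor")
    writes_survive = replication_factor - min_insync_replicas
    acked_data_survives = min_insync_replicas - 1
    return (writes_survive, acked_data_survives)


print(failure_budget(3, 2))  # the typical scenario: RF=3, min.insync.replicas=2
print(failure_budget(5, 3))  # the setup from the question
```

With RF=3 and min.insync.replicas=2, one broker can be down while writes continue, and every acknowledged write exists on at least two replicas. Setting min.insync.replicas equal to the replication factor leaves no room for a broker outage on the write path.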

The more replicas the leader of a partition has to wait for before acknowledging a write, the lower the performance. Note that with acks of "all" that number is the current size of the ISR; min.insync.replicas sets its lower bound rather than the number itself.

