Does the min.insync.replicas configuration affect Kafka producer throughput?
From the Kafka documentation:
When a producer sets acks to "all" (or "-1"), this min.insync.replicas configuration specifies the minimum number of replicas that must acknowledge a write for the write to be considered successful.
It says that a write is successful once the minimum number of in-sync replicas acknowledge it. But when I run a performance test with min.insync.replicas set to 1 and then to 3 (for a topic with 1 partition and RF=5 in a 5-broker setup), the performance of the Kafka producer with acks='all' is the same.
So, does the per-topic min.insync.replicas configuration affect Kafka producer throughput (run in isolation) with acks="all"?
If you use acks='all', the leader waits until the in-sync replicas have the message before sending back an acknowledgment or an error, so performance is affected. That wait covers every replica currently in the ISR, not just min.insync.replicas of them. With min.insync.replicas=1, the producer gets a response once the message is written to the leader alone only if the leader is the sole in-sync replica. With min.insync.replicas=3, at least 2 followers must be in sync and have the message before it can be considered committed; if fewer are in sync, the write fails with a NotEnoughReplicas error.
Your results suggest that all five replicas stayed in sync during both runs, which is expected when the latency between your brokers is very low; in that case the leader waits for the same replicas either way. I believe you would see the two settings diverge if you start the brokers in different datacenters/regions, where slow followers can fall out of the ISR.
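The acknowledgment rules discussed in this answer can be sketched as a small Python model. It is a toy illustration, not broker code: `decide_ack`, `AckDecision`, and the convention that `isr_size` counts the leader are assumptions made for this example. It follows the Kafka documentation's description that acks=all waits for the full set of in-sync replicas, while min.insync.replicas only decides whether the write is accepted at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AckDecision:
    accepted: bool            # False models a NotEnoughReplicas-style rejection
    replicas_waited_for: int  # copies (leader included) that must hold the record before the ack


def decide_ack(acks: str, isr_size: int, min_insync_replicas: int) -> AckDecision:
    """Toy model of when a partition leader acknowledges a produce request.

    isr_size counts the leader itself (an assumption of this sketch).
    """
    if isr_size < 1:
        raise ValueError("the leader is part of the ISR, so isr_size must be >= 1")
    if acks == "0":
        return AckDecision(True, 0)   # producer does not wait for any broker response
    if acks == "1":
        return AckDecision(True, 1)   # the leader's own write is enough
    if acks in ("all", "-1"):
        if isr_size < min_insync_replicas:
            return AckDecision(False, 0)    # too few in-sync replicas: write is rejected
        return AckDecision(True, isr_size)  # waits for the whole current ISR
    raise ValueError(f"unknown acks value: {acks!r}")


if __name__ == "__main__":
    # The setup from the question: RF=5, all replicas in sync, acks=all.
    for min_isr in (1, 3):
        print(min_isr, decide_ack("all", isr_size=5, min_insync_replicas=min_isr))
```

With all 5 replicas in sync, the model returns the same decision for min.insync.replicas=1 and 3, which is consistent with the identical throughput reported in the question. The settings only differ once the ISR shrinks below 3.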
min.insync.replicas is the minimum number of replicas that must acknowledge that data was received successfully for a write to be successful.
Throughput will be affected if you set min.insync.replicas to 3 and acks=all, since the leader must wait for in-sync replicas before acknowledging. It won't be affected if you set acks=0 or acks=1, because min.insync.replicas only applies to acks=all, but when you do this there is a possibility of data loss if the leader fails.
If you do NOT set acks='all' together with min.insync.replicas > 1, be aware that you are risking data loss. If the leader goes down, there is no guarantee that the replica taking over is a complete copy of the leader. Preventing such cases was actually the main idea behind Kafka as a distributed system.
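The data-loss risk described here can be illustrated with a simplified Python sketch. The function names and the record labels `m1` to `m3` are hypothetical, and the model assumes that every follower passed in is an in-sync replica and that the new leader is chosen from those followers.

```python
from typing import List, Sequence


def acked_records(leader_log: Sequence[str],
                  follower_logs: Sequence[Sequence[str]],
                  acks: str) -> List[str]:
    """Records the producer has been told are successful, in this simplified model."""
    if acks == "1":
        # The leader's own write triggers the acknowledgment.
        return list(leader_log)
    if acks == "all":
        # Acknowledged only once every in-sync follower also holds the record.
        return [r for r in leader_log if all(r in log for log in follower_logs)]
    raise ValueError(f"unsupported acks value in this sketch: {acks!r}")


def lost_after_failover(acked: Sequence[str], new_leader_log: Sequence[str]) -> List[str]:
    """Acknowledged records that the replica taking over does not have."""
    return [r for r in acked if r not in new_leader_log]


if __name__ == "__main__":
    leader = ["m1", "m2", "m3"]
    follower = ["m1"]  # the follower has not yet fetched m2 and m3
    for acks in ("1", "all"):
        acked = acked_records(leader, [follower], acks)
        print(acks, lost_after_failover(acked, follower))
```

In this model, acks=1 acknowledges records that exist only on the leader, so a leader failure can lose them, while acks=all never acknowledges a record that the surviving in-sync follower lacks.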
When a producer sets acks to "all" (or "-1"), min.insync.replicas specifies the minimum number of replicas that must acknowledge a write for the write to be considered successful.
When used together, min.insync.replicas and acks allow you to enforce greater durability guarantees. A typical scenario would be to create a topic with a replication factor of 3, set min.insync.replicas to 2, and produce with acks of "all".
The higher min.insync.replicas is, the more in-sync copies the leader for that partition requires before it accepts a write with acks=all. The cost of writing those copies synchronously, and hence the lower performance, comes from acks=all itself: the leader waits for every replica currently in the ISR, whatever the minimum is.
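For the typical scenario quoted above (replication factor 3, min.insync.replicas 2, acks of "all"), the trade-off can be worked out as simple arithmetic. The helper below is a hypothetical sketch, not part of any Kafka API; it only counts how many replicas may be out of sync before acks=all writes start being rejected.

```python
def tolerated_out_of_sync_replicas(replication_factor: int, min_insync_replicas: int) -> int:
    """Replicas that can be down or lagging while acks=all writes are still accepted."""
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("expected 1 <= min_insync_replicas <= replication_factor")
    return replication_factor - min_insync_replicas


if __name__ == "__main__":
    # (replication factor, min.insync.replicas) pairs mentioned on this page
    for rf, min_isr in ((3, 2), (5, 1), (5, 3)):
        print(rf, min_isr, tolerated_out_of_sync_replicas(rf, min_isr))
```

With a replication factor of 3 and min.insync.replicas of 2, one replica can be unavailable and writes still succeed, while every acknowledged record exists on at least two brokers.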