Kafka min.insync.replicas < replication.factor

Question

Suppose I have a cluster with 3 Kafka brokers. I set:

min.insync.replicas=2
default.replication.factor=3

All brokers are up, the ISR is fine, and a message is produced with acks=all. Since min.insync.replicas=2, two copies of the message are stored for sure. 1) Will one more copy (because replication=3) be made in the background? 2) If it fails, it does not matter, correct? Cluster health is just fine.
One broker is down, ISR=2 can be maintained, and the message is saved to two brokers. After some time, the broker that was down comes up again. 3) Since replication=3, will it try to catch up with the others in the background?

I am trying to figure out a practical example where setting the replication factor higher than min.insync.replicas would make sense. A real example I could "touch" and understand. If this is a duplicate, please refer me to it. Thank you.

Answer 1

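Yes, the third replica is made as well. Followers fetch from the leader continuously, and min.insync.replicas is only the minimum ISR size required for an acks=all write to be accepted.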

Yes, after a restart the broker catches up all of its out-of-sync replicas.
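As a rough illustration of the catch-up step, here is a toy model in Python. It is not Kafka code: the function `catch_up`, the `batch_size` parameter, and the list-based logs are all made up for this sketch. It only shows the idea that a restarted follower fetches from its own log end offset until it matches the leader, and only then rejoins the ISR.

```python
def catch_up(leader_log, follower_log, isr, follower_id, batch_size=2):
    """Toy model: copy missing records in batches, then rejoin the ISR.

    Returns the number of fetch rounds that were needed.
    """
    fetches = 0
    while len(follower_log) < len(leader_log):
        start = len(follower_log)  # the follower's current log end offset
        follower_log.extend(leader_log[start:start + batch_size])
        fetches += 1
    isr.add(follower_id)  # fully caught up, so it counts as in sync again
    return fetches


leader_log = ["m0", "m1", "m2", "m3", "m4"]
follower_log = ["m0", "m1"]  # broker "c" was down while m2..m4 were written
isr = {"a", "b"}

fetches = catch_up(leader_log, follower_log, isr, "c")
print(fetches, follower_log == leader_log, sorted(isr))
```

While "c" is still behind, the ISR is only {a, b}, which is enough for min.insync.replicas=2.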

If min.insync.replicas is ever equal to the replication factor, then you cannot lose any brokers (due to maintenance or failure) without acks=all writes being rejected. Therefore, the replication factor should always be greater than min.insync.replicas.
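The arithmetic behind this can be written down in a few lines. This is a minimal sketch, assuming every replica is in sync before the failure; the helper name `tolerable_failures` is invented for the illustration and is not part of any Kafka API.

```python
def tolerable_failures(replication_factor, min_insync_replicas):
    """How many replica-hosting brokers can fail while acks=all writes still succeed."""
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("need 1 <= min.insync.replicas <= replication factor")
    return replication_factor - min_insync_replicas


for rf, min_isr in [(3, 2), (3, 3), (5, 3)]:
    print(f"RF={rf}, min.insync.replicas={min_isr}: "
          f"can lose {tolerable_failures(rf, min_isr)} broker(s)")
```

With RF=3 and min.insync.replicas=3 the result is 0, which is the case the paragraph above warns about.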

Answer 2

The other answer is absolutely correct, but it took me quite a while to figure out. IMHO, this is somewhat subtle, and though my understanding might be a little incorrect here and there, it helped me build a mental model of what is going on.

Suppose I have a cluster of 3 brokers (below, "ISR" stands for the min.insync.replicas requirement of 2, and "RF" for the brokers that hold a replica):

[a, b, c]  ->  brokers
[a, b]     -> ISR
[a, b, c]  -> RF

How many brokers can I tolerate being down? The answer is 1.

If I lose broker "c", the ISR requirement can still be satisfied and the cluster will work just fine.
If I lose broker "a" (the explanation is the same if I lose "b"), a leader election has to happen for the partitions that "a" was leading. The controller picks the new leader from the replicas that were in sync before the failure. There were 3 of them, part of RF = a, b, c. Since I lost "a", there are two left now that are in sync: "b" and "c". One of them becomes the leader, and the ISR requirement is satisfied by "b" and "c".
This means that I can lose any one broker from the cluster and still work fine. It might be trivial here, but the next example is less so, IMHO.
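The three cases above can be replayed with a small simulation. This is only a sketch of the mental model, not of Kafka internals: `state_after_failure` is a hypothetical helper, it assumes all replicas were in sync before the failure, and it picks the first surviving replica as the new leader purely for illustration.

```python
def state_after_failure(replicas, leader, down, min_insync_replicas):
    """Return (isr, leader, accepts_acks_all) after the brokers in `down` fail."""
    isr = [b for b in replicas if b not in down]
    if leader not in isr:
        # The old leader is gone: elect a new one from the surviving ISR.
        leader = isr[0] if isr else None
    accepts_acks_all = len(isr) >= min_insync_replicas
    return isr, leader, accepts_acks_all


replicas = ["a", "b", "c"]
for lost in replicas:
    isr, leader, ok = state_after_failure(replicas, "a", {lost}, 2)
    print(f"lost {lost}: ISR={isr}, leader={leader}, acks=all accepted={ok}")
```

Whichever single broker is lost, two in-sync replicas remain, so the min.insync.replicas=2 requirement still holds.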

Suppose I have a cluster with 5 brokers (an artificial example):

[a, b, c, d, e]  -> brokers
[a, b]           -> ISR
[a, b, c]        -> RF

How many brokers can I tolerate being down now? Initially I thought 2, but that can't be correct.

If I lose "d" and "e", it's simple: the cluster will continue to work just fine.
If I lose "a" and "b", in theory a leader election has to happen. But which brokers were part of RF, and so could have been in sync, before I lost "a" and "b"? [a, b, c]. Only "c" is left, so there is no way to satisfy the ISR requirement of 2 if two of those brokers are down.
This means that I can't tolerate any two brokers being down, so this set-up is not really fault tolerant to 2 broker failures.
It can only tolerate two brokers being down if my set-up is different:
```
5 -> brokers
3 -> ISR
5 -> RF
```
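To check both 5-broker set-ups without reasoning case by case, one can enumerate every combination of failed brokers. The sketch below does that in Python; `survives_any` is a name invented here, and it assumes all replicas are in sync before the failure.

```python
from itertools import combinations


def survives_any(brokers, replicas, min_insync_replicas, failures):
    """True if acks=all writes survive EVERY way of losing `failures` brokers."""
    for down in combinations(brokers, failures):
        alive = [r for r in replicas if r not in down]
        if len(alive) < min_insync_replicas:
            return False  # this combination drops below min.insync.replicas
    return True


brokers = list("abcde")

# RF=3 on [a, b, c] with min.insync.replicas=2
print(survives_any(brokers, list("abc"), 2, failures=1))
print(survives_any(brokers, list("abc"), 2, failures=2))

# RF=5 on all brokers with min.insync.replicas=3
print(survives_any(brokers, brokers, 3, failures=2))
```

The second call fails on combinations such as ("a", "b"), while the RF=5 set-up passes for any two brokers.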

And this is where the other answer is correct and makes total sense:

If min.insync.replicas <= replication factor, then you cannot lose more brokers than the difference between the two values.

Answer 1
0 2022-12-26 17:38:04

Answer 2
0 2022-12-28 11:35:11
