Running Kafka cost-effectively at the expense of lower resilience

Let's say I have a cheap and less reliable datacenter A, and an expensive and more reliable datacenter B. I want to run Kafka in the most cost-effective way, even if that means risking data loss and/or downtime. I can run any number of brokers in either datacenter, but remember that costs need to be as low as possible.

For this scenario, assume that no costs are incurred if brokers are not running. Also assume that producers/consumers run completely reliably with no concern for their cost.

Two ideas I have are as follows:

  1. Provision two completely separate Kafka clusters, one in each datacenter, but keep the cluster in the more expensive datacenter (B) powered off. Upon detecting an outage in A, power on the cluster in B. Producers/consumers will have logic to switch between clusters.
  2. Run the ZooKeeper cluster in B, with powered-on brokers in A and powered-off brokers in B. If there is an outage in A, then brokers in B come online to pick up where A left off.

Option 1 would be cheaper, but requires more complexity in the producers/consumers (a minimal sketch of that switching logic follows). Option 2 would be more expensive, but requires less complexity in the producers/consumers.
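As an illustration of what Option 1 asks of the clients, here is a minimal sketch of the switching logic, assuming the Python kafka-python client; the broker addresses, topic name, and fallback order are placeholders, not a definitive implementation:

    from kafka import KafkaProducer
    from kafka.errors import NoBrokersAvailable

    # Hypothetical addresses: cluster A is the cheap primary, cluster B the standby
    PRIMARY_BOOTSTRAP = ["a-broker1:9092", "a-broker2:9092"]
    STANDBY_BOOTSTRAP = ["b-broker1:9092", "b-broker2:9092"]

    def connect_producer():
        """Try the cheap cluster first, fall back to the standby cluster in B."""
        for servers in (PRIMARY_BOOTSTRAP, STANDBY_BOOTSTRAP):
            try:
                return KafkaProducer(bootstrap_servers=servers, acks="all")
            except NoBrokersAvailable:
                continue  # this cluster is unreachable, try the next one
        raise RuntimeError("Neither cluster is reachable")

    producer = connect_producer()
    producer.send("events", b"payload")  # "events" is a placeholder topic
    producer.flush()

Consumers would need equivalent logic, plus a decision about where to resume from, since offsets committed against cluster A do not exist in cluster B.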

Is Option 2 even possible? If there is an outage in A, is there any way to have brokers in B come online, get elected as leaders for the topics, and have the producers/consumers seamlessly start sending to them? Again, data loss is okay, and so is switchover downtime. But whichever option is chosen must not require manual intervention.

Is there any other approach that I can consider?

Neither is feasible.

Topics and their records are unique to each cluster. Only one leader can exist for any Kafka partition within a cluster.

With these two pieces of information, example scenarios include:

  • Producers cut over to the new cluster and find the new leaders, until the old cluster comes back.
  • Even if the above could happen instantaneously, or with minimal retries, where are consumers then supposed to read from? They cannot aggregate data from more than one bootstrap.servers value at any time.
  • So, now you get into a situation where both clusters always need to be available, with N consumer threads for the N partitions existing in the other cluster, and M threads for the original cluster.
  • Meanwhile, producers are back to writing to the appropriate (cheaper) cluster, so data will potentially be out of order, since you have no control over which consumer threads process which data first.
  • Only after you track the consumer lag of the more expensive cluster's consumers will you be able to reasonably stop those threads and shut that cluster down, once lag reaches zero across all consumers (a sketch of such a lag check follows this list).
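As a sketch of that last step, assuming the Python kafka-python client, with a placeholder group id and a placeholder address for the expensive cluster, the remaining lag could be checked roughly like this before powering cluster B off:

    from kafka import KafkaAdminClient, KafkaConsumer

    EXPENSIVE_BOOTSTRAP = ["b-broker1:9092"]   # hypothetical address of cluster B
    GROUP_ID = "orders-consumers"              # hypothetical consumer group

    admin = KafkaAdminClient(bootstrap_servers=EXPENSIVE_BOOTSTRAP)
    # Committed offsets per partition for the group: TopicPartition -> OffsetAndMetadata
    committed = admin.list_consumer_group_offsets(GROUP_ID)

    consumer = KafkaConsumer(bootstrap_servers=EXPENSIVE_BOOTSTRAP)
    # Log-end offsets for the same partitions
    end_offsets = consumer.end_offsets(list(committed))

    total_lag = sum(end_offsets[tp] - meta.offset for tp, meta in committed.items())
    if total_lag == 0:
        print("Zero lag everywhere: safe to stop the threads and shut cluster B down")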

Another thing to keep in mind is that topic creation/update/delete events aren't automatically synced across clusters, so Kafka Streams apps, especially, will be unable to maintain state with this approach.


You can use tools like MirrorMaker or Confluent Replicator / Cluster Linking to help with all of this, but I've personally never seen the client failover piece handled very well, especially when record order and idempotent processing matter.
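For reference, a minimal MirrorMaker 2 properties sketch for one-way replication from the cheap cluster (alias A) into the expensive one (alias B); the broker addresses are placeholders:

    # connect-mirror-maker.properties (run with: bin/connect-mirror-maker.sh <this file>)
    clusters = A, B
    A.bootstrap.servers = a-broker1:9092
    B.bootstrap.servers = b-broker1:9092

    # Replicate everything from A into B; with the default replication policy,
    # topics arrive in B prefixed as "A.<topic>"
    A->B.enabled = true
    A->B.topics = .*
    B->A.enabled = false

Note that this only helps while both clusters are running, which is exactly the constraint raised further down.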


Ultimately, this is what availability zones are for. From what I understand, the chances of a cloud provider losing more than one availability zone at a time are extremely low. So, you'd set up one Kafka cluster across 3 or more availability zones and configure "rack awareness" so that Kafka accounts for where each broker is installed.
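A sketch of the relevant broker settings (server.properties), assuming three zones named az1/az2/az3; the zone names and replication values are illustrative:

    # On each broker: declare which availability zone it lives in
    broker.rack=az1
    # Let consumers that set client.rack fetch from the nearest replica (KIP-392)
    replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector
    # Keep copies in more than one zone so losing a single zone does not lose data
    default.replication.factor=3
    min.insync.replicas=2

Consumers would then set client.rack to their own zone (e.g. client.rack=az1) so reads stay local where possible.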

If you want to keep the target/passive cluster shut down while it is not operational and then spin it up when needed, you should be OK as long as you don't need any history and don't care about the consumer lag gap in the source cluster; obviously this is use-case dependent.

MM2 or any sort of asynchronous directional replication requires the cluster to be active all the time.

A stretch cluster is not really doable because of the two-datacenter constraint: whether you use Raft or ZooKeeper, you need a third DC for the quorum, and that would probably be your most expensive option (see the quorum sketch below).
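To illustrate why the third site matters: both ZooKeeper and KRaft need a majority of their quorum alive, so with only two DCs, losing the DC that holds two of the three voters takes the whole cluster down. A sketch of how a KRaft controller quorum would be spread across three sites, with placeholder hostnames:

    # controller.properties on each KRaft controller node (node.id differs per controller)
    process.roles=controller
    node.id=1
    controller.quorum.voters=1@dc-a-ctrl:9093,2@dc-b-ctrl:9093,3@dc-c-ctrl:9093
    controller.listener.names=CONTROLLER
    listeners=CONTROLLER://:9093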

Redpanda has the capability of offloading all of your log segments to S3 and then indexing them so they can be used by other clusters, so if you continuously wrote one copy of your log segments to a storage array with an S3 interface in your standby DC, it might be palatable. Then, whenever needed, you just spin up a cluster on demand in the target DC, point it at the object store, and you can immediately start producing and consuming with your new clients.
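A hedged sketch of that setup with Redpanda's rpk CLI, assuming tiered storage is available in your Redpanda edition; the bucket and topic names are placeholders, and endpoint/credential/region settings are omitted:

    # On the source cluster: enable tiered storage against the S3-compatible store in the standby DC
    rpk cluster config set cloud_storage_enabled true
    rpk cluster config set cloud_storage_bucket standby-dc-bucket

    # Per topic: ship closed segments to object storage and allow them to be read back remotely
    rpk topic create events -c redpanda.remote.write=true -c redpanda.remote.read=true

The on-demand cluster in the target DC would then be pointed at the same bucket so it can serve the historical segments to new producers and consumers.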
