
Redis failover in a K8s cluster

I am trying to get Redis failover to work in Kubernetes in a worker-node failure scenario. I have a K8s cluster that consists of a master node and two worker nodes. The master node does not schedule pods. The Redis manifests define a master and a slave instance in one StatefulSet and 3 sentinels in another StatefulSet. The manifests use affinity rules to steer the pods onto separate worker nodes. If I drain a worker node that hosts the master instance and one sentinel, failover works like a champ.

If, however, 2 sentinels are evicted along with the master instance, no master is elected, and the 2 sentinels that are restarted on the remaining worker node report: -failover-abort-no-good-slave master jnpr-ipb-redis-masters 10.244.1.209 7380. The IP address in the log message is the IP address of the former slave (which I expected to be promoted to the new master).
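To make the log line above easier to read, here is a minimal parsing sketch. It assumes the Sentinel event layout `<event> <instance-type> <name> <ip> <port>`, optionally followed by `@ <master-name> <master-ip> <master-port>` when the instance is not itself a master. The `SentinelEvent` type and the `parse_sentinel_event` helper are hypothetical names made up for this illustration, not part of Redis or any client library.

```python
from typing import NamedTuple, Optional


class SentinelEvent(NamedTuple):
    """Fields of one Sentinel event line (hypothetical helper type)."""
    event: str
    instance_type: str
    name: str
    ip: str
    port: int
    master_name: Optional[str] = None
    master_ip: Optional[str] = None
    master_port: Optional[int] = None


def parse_sentinel_event(line: str) -> SentinelEvent:
    """Split an event line into named fields.

    Assumed layout: <event> <instance-type> <name> <ip> <port>
    with an optional trailing "@ <master-name> <master-ip> <master-port>".
    """
    head, _, tail = line.strip().partition(" @ ")
    parts = head.split()
    if len(parts) != 5:
        raise ValueError(f"unexpected event layout: {line!r}")
    event, instance_type, name, ip, port = parts

    if not tail:
        # No "@ ..." section: the line describes a master instance.
        return SentinelEvent(event, instance_type, name, ip, int(port))

    master_parts = tail.split()
    if len(master_parts) != 3:
        raise ValueError(f"unexpected master section: {line!r}")
    m_name, m_ip, m_port = master_parts
    return SentinelEvent(event, instance_type, name, ip, int(port),
                         m_name, m_ip, int(m_port))


# The exact line quoted in the question.
sample = ("-failover-abort-no-good-slave master "
          "jnpr-ipb-redis-masters 10.244.1.209 7380")
parsed = parse_sentinel_event(sample)
print(parsed.instance_type, parsed.ip, parsed.port)
```

If that layout holds, the address in the message belongs to the instance the Sentinels label as `master`, so it may be worth comparing it with what `SENTINEL master jnpr-ipb-redis-masters` and `SENTINEL replicas jnpr-ipb-redis-masters` report on the restarted Sentinels.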

Is there a bit of wizardry to make this work? Is this a valid cluster config? I'm not really sure what I should be looking at to get an idea of what is happening.

What you want is a PodDisruptionBudget. That will at least keep voluntary evictions from breaking things. Beyond that, you can use hard anti-affinities to force the pods to be scheduled on different nodes. Failures are still possible, though: if you lose two nodes at the same time, the Sentinels can desync. This is a large part of why Redis Sentinel is mostly no longer used, with Cluster Mode preferred instead.
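As a rough sketch of the two suggestions above, the snippet below builds a PodDisruptionBudget and a hard pod anti-affinity stanza as plain Python dictionaries and prints the budget as JSON, which `kubectl apply -f` accepts. The label `app: redis-sentinel`, the object name `redis-sentinel-pdb`, and the helper function names are assumptions for illustration; the question does not show the real manifests. The budget is sized to the Sentinel majority (2 of 3), on the assumption that keeping a majority running is what the budget is meant to protect.

```python
import json


def sentinel_majority(total_sentinels: int) -> int:
    """Smallest number of Sentinels that forms a strict majority."""
    return total_sentinels // 2 + 1


def build_pdb(name: str, match_labels: dict, min_available: int) -> dict:
    """Build a policy/v1 PodDisruptionBudget manifest as a dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": dict(match_labels)},
        },
    }


def build_hard_anti_affinity(match_labels: dict) -> dict:
    """Build an `affinity` stanza that forbids two matching pods per node."""
    return {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {"matchLabels": dict(match_labels)},
                    "topologyKey": "kubernetes.io/hostname",
                }
            ]
        }
    }


# Hypothetical label; replace with the labels on the real Sentinel pods.
SENTINEL_LABELS = {"app": "redis-sentinel"}

pdb = build_pdb("redis-sentinel-pdb", SENTINEL_LABELS, sentinel_majority(3))
affinity = build_hard_anti_affinity(SENTINEL_LABELS)

print(json.dumps(pdb, indent=2))
```

With a budget like this, a drain can evict one Sentinel and is then refused further evictions until the replacement is Ready, so a voluntary drain should not remove two Sentinels in one step. Note that a hard anti-affinity on `kubernetes.io/hostname` needs at least as many schedulable nodes as replicas; with 3 Sentinels and only two worker nodes, one pod would stay Pending.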
