
Restore Etcd Quorum

I have a Kubernetes cluster deployed on AWS with Kops, consisting of 3 master nodes, each in a different AZ. Kops deploys the cluster so that Etcd runs on each master node in two pods, each of which mounts an EBS volume to store its state. If you lose the volumes of 2 of the 3 masters, Etcd loses quorum and the masters can no longer reach consensus.
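To make the quorum arithmetic behind this concrete, here is a minimal sketch. It assumes the usual majority rule for a Raft-based store such as Etcd (quorum = floor(n/2) + 1); the function names are made up for illustration and are not part of any Etcd or Kops API.

```python
def quorum_size(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    return members // 2 + 1


def has_quorum(members: int, healthy: int) -> bool:
    """True if the healthy members still form a majority."""
    return healthy >= quorum_size(members)


def fault_tolerance(members: int) -> int:
    """How many members can be lost while a majority remains."""
    return members - quorum_size(members)


if __name__ == "__main__":
    total = 3  # three masters, one Etcd member each
    for healthy in range(total, -1, -1):
        state = "quorum" if has_quorum(total, healthy) else "NO quorum"
        print(f"{healthy}/{total} members healthy: {state}")
    print(f"A {total}-member cluster tolerates {fault_tolerance(total)} failure(s)")
```

With 3 members the majority is 2, so losing one member is survivable, while losing two leaves a single member that cannot accept writes or membership changes on its own.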

Is there a way to use the data on the only master that still has the cluster state to restore quorum among the three masters from that state? I recreated this scenario, but the cluster becomes unavailable, and I can no longer access the Etcd pods on any of the 3 masters, because those pods fail with an error. Moreover, Etcd itself becomes read-only, so it is impossible to add or remove cluster members in order to intervene manually.

Any tips? Thanks to all of you.

This is documented here. There's also another guide here.

You basically have to back up your cluster and create a brand new one.
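As a rough sketch of what "back up and create a new one" can look like with the Etcd v3 tooling, the snippet below only builds and prints `etcdctl snapshot save` and `etcdctl snapshot restore` command lines; it does not run them. The endpoint, member name, peer URL and paths are placeholder assumptions, TLS flags are omitted, and a Kops-managed cluster may need a different procedure, so treat this as an illustration rather than the exact steps from the linked guides. Whether a snapshot can still be taken from a member that has lost quorum is not confirmed by this thread.

```python
import shlex


def snapshot_save_cmd(endpoint: str, snapshot_path: str) -> list[str]:
    """Command line to save a snapshot from one Etcd member (v3 API)."""
    return ["etcdctl", "--endpoints", endpoint, "snapshot", "save", snapshot_path]


def snapshot_restore_cmd(snapshot_path: str, name: str,
                         peer_url: str, data_dir: str) -> list[str]:
    """Command line to restore a snapshot into a fresh single-member cluster."""
    return [
        "etcdctl", "snapshot", "restore", snapshot_path,
        "--name", name,
        "--initial-cluster", f"{name}={peer_url}",
        "--initial-advertise-peer-urls", peer_url,
        "--data-dir", data_dir,
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    save = snapshot_save_cmd("http://127.0.0.1:2379", "/backup/etcd-snapshot.db")
    restore = snapshot_restore_cmd(
        "/backup/etcd-snapshot.db",
        name="master-a",
        peer_url="http://10.0.0.10:2380",
        data_dir="/var/lib/etcd-restored",
    )
    # Older etcdctl builds need ETCDCTL_API=3 in the environment.
    print("ETCDCTL_API=3", shlex.join(save))
    print("ETCDCTL_API=3", shlex.join(restore))
```

Once the restored member is running as a new one-member cluster, the remaining masters would be added back as new members one at a time, so that quorum is kept at every step.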

