繁体   English   中英

ETCD群集设置失败

[英]Etcd cluster setup failure

我正在尝试在Ubuntu机器上设置3节点etcd集群作为用于网络的docker数据存储。 我使用etcd docker镜像成功创建了etcd集群。 现在,当我尝试复制它时,步骤在一个节点上失败。 即使从逐步删除故障节点后,群集仍在寻找已删除的节点。 当我使用etcd binary时,将面临相同的错误。

通过在所有节点上相应地更改ip来使用以下命令:

docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
 --name etcd quay.io/coreos/etcd \
 -name etcd0 \
 -advertise-client-urls http://172.27.59.141:2379,http://172.27.59.141:4001 \
 -listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
 -initial-advertise-peer-urls http://172.27.59.141:2380 \
 -listen-peer-urls http://0.0.0.0:2380 \
 -initial-cluster-token etcd-cluster-1 \
 -initial-cluster etcd0=http://172.27.59.141:2380,etcd1=http://172.27.59.244:2380,etcd2=http://172.27.59.232:2380 \
 -initial-cluster-state new

两个节点正确连接,但第三个节点的服务停止。 以下是第三个节点的日志。

2016-06-16 17:16:34.293248 I | etcdmain: etcd Version: 2.3.6
2016-06-16 17:16:34.294368 I | etcdmain: Git SHA: 128344c
2016-06-16 17:16:34.294584 I | etcdmain: Go Version: go1.6.2
2016-06-16 17:16:34.294781 I | etcdmain: Go OS/Arch: linux/amd64
2016-06-16 17:16:34.294962 I | etcdmain: setting maximum number of CPUs to 2, total number of available CPUs is 2
2016-06-16 17:16:34.295142 W | etcdmain: no data-dir provided, using default data-dir ./node2.etcd
2016-06-16 17:16:34.295438 I | etcdmain: listening for peers on http://0.0.0.0:2380
2016-06-16 17:16:34.295654 I | etcdmain: listening for client requests on http://0.0.0.0:2379
2016-06-16 17:16:34.295846 I | etcdmain: listening for client requests on http://0.0.0.0:4001
2016-06-16 17:16:34.296193 I | etcdmain: stopping listening for client requests on http://0.0.0.0:4001
2016-06-16 17:16:34.301139 I | etcdmain: stopping listening for client requests on http://0.0.0.0:2379
2016-06-16 17:16:34.301454 I | etcdmain: stopping listening for peers on http://0.0.0.0:2380
2016-06-16 17:16:34.301718 I | etcdmain: --initial-cluster must include node2=http://172.27.59.232:2380 given --initial-advertise-peer-urls=http://172.27.59.232:2380

即使删除了发生故障的节点,我也可以看到两个节点正在等待第三个节点连接。

2016-06-16 17:16:12.063893 N | etcdserver: added member 17879927ec74147b [http://172.27.59.232:238] to cluster ba4424e006edb53e
2016-06-16 17:16:12.064431 N | etcdserver: added local member 24d9feabb7e2f26f [http://172.27.59.244:2380] to cluster ba4424e006edb53e
2016-06-16 17:16:12.065229 N | etcdserver: added member 2bda70be57138cfe [http://172.27.59.141:2380] to cluster ba4424e006edb53e
2016-06-16 17:16:12.218560 I | raft: 24d9feabb7e2f26f [term: 1] received a MsgVote message with higher term from 2bda70be57138cfe [term: 29]
2016-06-16 17:16:12.218964 I | raft: 24d9feabb7e2f26f became follower at term 29
2016-06-16 17:16:12.219276 I | raft: 24d9feabb7e2f26f [logterm: 1, index: 3, vote: 0] voted for 2bda70be57138cfe [logterm: 1, index: 3] at term 29
2016-06-16 17:16:12.222667 I | raft: raft.node: 24d9feabb7e2f26f elected leader 2bda70be57138cfe at term 29
2016-06-16 17:16:12.335904 I | etcdserver: published {Name:node1 ClientURLs:[http://172.27.59.244:2379 http://172.27.59.244:4001]} to cluster ba4424e006edb53e
2016-06-16 17:16:12.336459 N | etcdserver: set the initial cluster version to 2.2
2016-06-16 17:16:42.059177 W | rafthttp: the connection to peer 17879927ec74147b is unhealthy
2016-06-16 17:17:12.060313 W | rafthttp: the connection to peer 17879927ec74147b is unhealthy
2016-06-16 17:17:42.060986 W | rafthttp: the connection to peer 17879927ec74147b is unhealthy

可以看出,尽管从两个节点开始集群,但它仍在搜索第三个节点。

本地磁盘上是否有保存数据的位置,并且尽管未提供数据,但仍会拾取旧数据。

请提出我所缺少的。

本地磁盘上是否有保存数据的位置,并且尽管未提供数据,但仍会拾取旧数据。

是的,成员资格数据已经存储在node0.etcdnode1.etcd

您可以从日志中获得以下消息,该消息指示服务器已经属于集群:

etcdmain: the server is already initialized as member before, starting as etcd member...

为了运行具有两个成员的新集群,只需在命令中添加另一个参数:

--data-dir bak

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM