Docker-swarm overlay network is not working for containers in different hosts
We have a networking problem in docker-swarm. The problem is below:
Where should I check? Any advice?
server-1:~$ docker version
Client:
Version: 17.03.0-ce
API version: 1.26
Go version: go1.7.5
Git commit: 3a232c8
Built: Tue Feb 28 08:01:32 2017
OS/Arch: linux/amd64
Server:
Version: 17.03.0-ce
API version: 1.26 (minimum version 1.12)
Go version: go1.7.5
Git commit: 3a232c8
Built: Tue Feb 28 08:01:32 2017
OS/Arch: linux/amd64
Experimental: true
PS: I checked this post, but I have the latest version of docker / docker-swarm, so the issue should be fixed.
PS 2: Similar problem: https://github.com/docker/swarm/issues/2687
"VTEP Port is reserved or restricted for VMware use, any virtual machine cannot use this port for other purpose or for any other application."
But we can change the docker swarm data-path-port (4789 by default) to another port:
docker swarm init --data-path-port=7789
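As far as I know, --data-path-port is only accepted at docker swarm init time and needs Docker 19.03 or newer, so an existing swarm has to be re-created to change it. To check which port a node actually ended up with, here is a small sketch. The function name swarm_data_path_port is my own, and it assumes the plain docker info output contains a "Data Path Port:" line:

```shell
# Print the overlay data-path (VXLAN) UDP port of the local swarm node.
# Assumes the plain `docker info` output has a "Data Path Port:" line.
swarm_data_path_port() {
    docker info 2>/dev/null | awk -F': *' '/Data Path Port/ { print $2 }'
}
```

Run swarm_data_path_port on every node after initializing; each one should report the port you passed to docker swarm init.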
Out of curiosity, in your VMware environment, do you have NSX deployed? I may have an answer, but it only applies if NSX is deployed in the environment.
ESXi will apparently drop OUTBOUND packets from VMs if the destination port is the same as the port configured for VXLAN VTEP communication.
NSX uses port 4789/udp for VXLAN VTEP communication (by default, as of 6.2.3; prior to that, it was 8472/udp). (If the VMs are on the same host, the traffic is not dropped because, while it may be OUTBOUND traffic, it does not egress the host and so never reaches the stage within the VMkernel where it would be dropped.)
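One way to see whether overlay packets are being lost between hosts is to capture on both VMs while pinging from a container on one host to a container on the other. This is only a sketch: the function name watch_overlay_traffic and the eth0 default are my assumptions, and it needs root plus tcpdump installed.

```shell
# Watch overlay (VXLAN) traffic on one host.
# $1: network interface (eth0 is only an assumed default)
# $2: UDP data-path port (4789 is the Docker default)
watch_overlay_traffic() {
    tcpdump -n -i "${1:-eth0}" udp port "${2:-4789}"
}
```

If the sending VM shows packets leaving for the data-path port and the receiving VM never sees them, something between the two guests is dropping them, which would be consistent with the ESXi behavior described here.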
The wording in KB2079386 is a little off. It states:
VXLAN port 8472 is reserved or restricted for VMware use, any virtual machine cannot use this port for other purpose or for any other application.
But it should read:
VTEP Port is reserved or restricted for VMware use, any virtual machine cannot use this port for other purpose or for any other application.
If you are using NSX, you could try changing the port used for the VXLAN VTEPs, but port 4789/udp is required if you are going to use hardware VTEPs at all.
(I can't take full credit for this. I stumbled across this blog post talking about similar behavior when troubleshooting a similar issue.)
The first thing I would check for overlay networking is your firewall rules. You need the following open between the hosts:
- TCP port 2377 for cluster management
- TCP and UDP port 7946 for communication among nodes
- UDP port 4789 for overlay network traffic
- Protocol 50 (ESP) if you use encrypted overlay networks ( iptables -A INPUT -p 50 -j ACCEPT )
If that doesn't help, look into using netshoot to debug where the traffic is getting stopped.
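As a rough sketch, the ports Docker documents for swarm mode (TCP 2377, TCP/UDP 7946, UDP 4789 for the overlay data path, plus protocol 50 for encrypted overlays) could be turned into iptables rules like this. The function name swarm_firewall_rules is made up, and it only prints the rules so you can review them before applying:

```shell
# Print iptables rules for the ports Docker documents for swarm mode.
# $1: overlay data-path (VXLAN) UDP port; 4789 is the Docker default.
swarm_firewall_rules() {
    data_port="${1:-4789}"
    echo "iptables -A INPUT -p tcp --dport 2377 -j ACCEPT"         # cluster management
    echo "iptables -A INPUT -p tcp --dport 7946 -j ACCEPT"         # node communication (TCP)
    echo "iptables -A INPUT -p udp --dport 7946 -j ACCEPT"         # node communication (UDP)
    echo "iptables -A INPUT -p udp --dport ${data_port} -j ACCEPT" # overlay network traffic
    echo "iptables -A INPUT -p 50 -j ACCEPT"                       # ESP, encrypted overlays only
}

# Review the output first; nothing is applied by this script.
swarm_firewall_rules 4789
```

If you initialized the swarm with a different data-path port, pass that port instead of 4789. You would normally also restrict the source addresses to your other swarm nodes rather than accepting from anywhere.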
If your nodes are not on the same subnet (e.g. they all have public IPs), make sure you use the --advertise-addr option, specifying an IP address that the other nodes can reach, whenever a node (other managers AND workers) joins the swarm.
Otherwise the overlay network will not route correctly between hosts, even though stack deployment, node registration, etc. appear to be working fine.
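A minimal sketch of what that looks like on the command line. The two addresses are hypothetical (taken from the RFC 5737 documentation range) and the function names are my own:

```shell
# Hypothetical addresses; replace them with addresses your nodes can
# actually reach each other on.
MANAGER_IP=203.0.113.10
WORKER_IP=203.0.113.20

# Run on the first manager.
init_manager() {
    docker swarm init --advertise-addr "$MANAGER_IP"
}

# Run on each worker. $1 is the token printed by
# `docker swarm join-token -q worker` on the manager.
join_worker() {
    docker swarm join --advertise-addr "$WORKER_IP" --token "$1" "$MANAGER_IP:2377"
}
```

The point is that every node advertises its own reachable address, not just the first manager.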
See the detailed explanation for my case in the same GitHub issue: https://github.com/docker/swarm/issues/2687
Resolution to the issue mentioned above.
Use the following when you initialize the swarm:
docker swarm init --advertise-addr=YOURIP --listen-addr=0.0.0.0 --data-path-port=7779 --force-new-cluster=true
Resources:
Docker:
VMware:
Thanks @Izkuru