
Why can't Docker containers communicate with each other?

I have created a small project to test Docker clustering. Basically, the cluster.sh script launches three identical containers and uses pipework to configure a bridge (bridge1) on the host and add a NIC (eth1) to each container.
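For context, here is a minimal sketch of what such a cluster.sh might look like. The image and container names are illustrative, not taken from the actual project:

```shell
#!/bin/bash
# Hypothetical sketch of cluster.sh; image and container names are illustrative.
set -e

for i in 1 2 3; do
  # Launch an identical container (detached, kept alive by a long-running command).
  docker run -d --name "node$i" ubuntu sleep infinity

  # pipework creates bridge1 on the host if it does not already exist, then
  # adds an eth1 interface with the given address inside the container.
  sudo pipework bridge1 "node$i" "172.17.99.$i/24"
done
```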

If I log into one of the containers, I can arping the other containers:

# 172.17.99.1
root@d01eb56fce52:/# arping 172.17.99.2
ARPING 172.17.99.2
42 bytes from aa:b3:98:92:0b:08 (172.17.99.2): index=0 time=1.001 sec
42 bytes from aa:b3:98:92:0b:08 (172.17.99.2): index=1 time=1.001 sec
42 bytes from aa:b3:98:92:0b:08 (172.17.99.2): index=2 time=1.001 sec
42 bytes from aa:b3:98:92:0b:08 (172.17.99.2): index=3 time=1.001 sec
^C
--- 172.17.99.2 statistics ---
5 packets transmitted, 4 packets received,  20% unanswered (0 extra)

So it seems packets can go through bridge1.

But the problem is I can't ping the other containers, nor can I send any IP packets with tools like telnet or netcat.

In contrast, the bridge docker0 and NIC eth0 work correctly in all containers.

Here's my route table:

# 172.17.99.1
root@d01eb56fce52:/# ip route
default via 172.17.42.1 dev eth0 
172.17.0.0/16 dev eth0  proto kernel  scope link  src 172.17.0.17 
172.17.99.0/24 dev eth1  proto kernel  scope link  src 172.17.99.1

and the bridge config:

# host
$ brctl show
bridge name bridge id       STP enabled interfaces
bridge1     8000.8a6b21e27ae6   no      veth1pl25432
                                        veth1pl25587
                                        veth1pl25753
docker0     8000.56847afe9799   no      veth7c87801
                                        veth953a086
                                        vethe575fe2

# host
$ brctl showmacs bridge1
port no mac addr        is local?   ageing timer
  1 8a:6b:21:e2:7a:e6   yes        0.00
  2 8a:a3:b8:90:f3:52   yes        0.00
  3 f6:0c:c4:3d:f5:b2   yes        0.00

# host
$ ifconfig
bridge1   Link encap:Ethernet  HWaddr 8a:6b:21:e2:7a:e6  
          inet6 addr: fe80::48e9:e3ff:fedb:a1b6/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:163 errors:0 dropped:0 overruns:0 frame:0
          TX packets:68 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:8844 (8.8 KB)  TX bytes:12833 (12.8 KB)

# I'm showing only one veth here for simplicity
veth1pl25432 Link encap:Ethernet  HWaddr 8a:6b:21:e2:7a:e6  
          inet6 addr: fe80::886b:21ff:fee2:7ae6/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:155 errors:0 dropped:0 overruns:0 frame:0
          TX packets:162 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:12366 (12.3 KB)  TX bytes:23180 (23.1 KB)

...

and the IP FORWARD chain:

# host
$ sudo iptables -x -v --line-numbers -L FORWARD
Chain FORWARD (policy ACCEPT 10675 packets, 640500 bytes)
num      pkts      bytes target     prot opt in     out     source               destination         
1       15018 22400195 DOCKER     all  --  any    docker0  anywhere             anywhere            
2       15007 22399271 ACCEPT     all  --  any    docker0  anywhere             anywhere             ctstate RELATED,ESTABLISHED
3        8160   445331 ACCEPT     all  --  docker0 !docker0  anywhere             anywhere            
4          11      924 ACCEPT     all  --  docker0 docker0  anywhere             anywhere            
5          56     4704 ACCEPT     all  --  bridge1 bridge1  anywhere             anywhere            

Note that the pkts count for rule 5 isn't 0, which means the ping has been routed correctly (the FORWARD chain is traversed after routing, right?), but somehow the packets didn't reach the destination.

I'm out of ideas as to why docker0 and bridge1 behave differently. Any suggestions?

Update 1

Here's the tcpdump output on the target container when pinged from another container:

$ tcpdump -i eth1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
22:11:17.754261 IP 192.168.1.65 > 172.17.99.1: ICMP echo request, id 26443, seq 1, length 6

Note that the source IP is 192.168.1.65, which is the host's eth0 address, so some SNAT seems to be happening on the bridge.

Finally, printing the nat IP table revealed the cause of the problem:

$ sudo iptables -L -t nat
...
Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
MASQUERADE  all  --  172.17.0.0/16        anywhere
...

Because my container's eth0 IP is in 172.17.0.0/16, outgoing packets have their source IP rewritten. This is why the ping responses can't make it back to the source.
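The overlap can be checked mechanically. Here is a small pure-bash sketch (the helper names are my own) showing that the eth1 address 172.17.99.1 falls inside the 172.17.0.0/16 range matched by the MASQUERADE rule:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip2int() {
  local IFS=. a b c d
  read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed (exit 0) if address $1 falls inside the network $2 with prefix length $3.
in_subnet() {
  local addr net bits mask
  addr=$(ip2int "$1"); net=$(ip2int "$2"); bits=$3
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# The eth1 address is inside the range the MASQUERADE rule matches:
in_subnet 172.17.99.1 172.17.0.0 16 && echo "matched by MASQUERADE rule"
```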

Conclusion

The solution is to put the container's eth1 (the bridge1 NIC) on a different network from that of the default docker0, so its addresses fall outside the 172.17.0.0/16 range matched by the MASQUERADE rule.
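Two possible ways to apply that, sketched with illustrative values (the 192.168.99.0/24 subnet and the container names are my own choices, not from the original project):

```shell
# Option 1: give the pipework bridge a subnet outside 172.17.0.0/16,
# so the MASQUERADE rule never matches container-to-container packets.
sudo pipework bridge1 node1 192.168.99.1/24
sudo pipework bridge1 node2 192.168.99.2/24
sudo pipework bridge1 node3 192.168.99.3/24

# Option 2: keep 172.17.99.0/24 but exempt bridge1 traffic from NAT;
# an ACCEPT target in the nat POSTROUTING chain stops rule traversal
# before the MASQUERADE rule is reached.
sudo iptables -t nat -I POSTROUTING -s 172.17.99.0/24 -d 172.17.99.0/24 -j ACCEPT
```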

