
Kubernetes: Can't ping pods across nodes

I am currently following this tutorial (except that I am on AWS, and I can do nothing about that).
I am currently at the 10th step and seem to be having problems when trying to reach the pods of one worker from another.

Here are logs from two workers that illustrate the problem:

worker-0:

root@worker-0:/home/admin# ip addr show eth0 | grep 'inet '                                                                                                                                                        
inet 10.240.1.230/24 brd 10.240.1.255 scope global eth0
root@worker-0:/home/admin# traceroute 10.200.1.10 -n -i cnio0 -I -m 5                                                                                                                                              
traceroute to 10.200.1.10 (10.200.1.10), 5 hops max, 60 byte packets
 1  10.200.1.10  0.135 ms  0.079 ms  0.073 ms
root@worker-0:/home/admin# ping 10.240.1.232
PING 10.240.1.232 (10.240.1.232) 56(84) bytes of data.
64 bytes from 10.240.1.232: icmp_seq=1 ttl=64 time=0.151 ms
^C
--- 10.240.1.232 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.151/0.151/0.151/0.000 ms
root@worker-0:/home/admin# traceroute 10.200.3.5 -g 10.240.1.232 -n -i eth0 -I -m 5                                                                                                                                
traceroute to 10.200.3.5 (10.200.3.5), 5 hops max, 72 byte packets
 1  * * *
 2  * * *
 3  * * *
 4  * * *
 5  * * *
root@worker-0:/home/admin#

worker-2:

root@worker-2:/home/admin# ip addr show eth0 | grep 'inet '
    inet 10.240.1.232/24 brd 10.240.1.255 scope global eth0
root@worker-2:/home/admin# traceroute 10.200.3.5 -n -i cnio0 -I -m 5                                                                                                                                                
traceroute to 10.200.3.5 (10.200.3.5), 5 hops max, 60 byte packets
 1  10.200.3.5  0.140 ms  0.077 ms  0.072 ms
root@worker-2:/home/admin# ping 10.200.3.5
PING 10.200.3.5 (10.200.3.5) 56(84) bytes of data.
64 bytes from 10.200.3.5: icmp_seq=1 ttl=64 time=0.059 ms
64 bytes from 10.200.3.5: icmp_seq=2 ttl=64 time=0.047 ms
^C
--- 10.200.3.5 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1017ms
rtt min/avg/max/mdev = 0.047/0.053/0.059/0.006 ms
root@worker-2:/home/admin#

The pods deploy correctly (I have tried spawning 11 instances of busybox); here is the result:

admin@ip-10-240-1-250:~$ kubectl get pods
busybox-68654f944b-vjs2s    1/1       Running     69         2d
busybox0-7665ddff5d-2856g   1/1       Running     69         2d
busybox1-f9585ffdb-tg2lj    1/1       Running     68         2d
busybox2-78c5d7bdb6-fhfdc   1/1       Running     68         2d
busybox3-74fd4b4f98-pp4kz   1/1       Running     69         2d
busybox4-55d568f8c4-q9hk9   1/1       Running     68         2d
busybox5-69f77b4fdb-d7jf2   1/1       Running     68         2d
busybox6-b5b869f4-2vnkz     1/1       Running     69         2d
busybox7-7df7958c4b-4bxzx   0/1       Completed   68         2d
busybox8-6d78f4f5d6-cvfx7   1/1       Running     69         2d
busybox9-86d49fdf4-75ddn    1/1       Running     68         2d

Thank you for your insights.

EDIT: Adding info for the workers

worker-0:

root@worker-0:/home/admin# ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 02:2b:ed:df:b7:58 brd ff:ff:ff:ff:ff:ff
    inet 10.240.1.230/24 brd 10.240.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::2b:edff:fedf:b758/64 scope link
       valid_lft forever preferred_lft forever
root@worker-0:/home/admin# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.240.1.1      0.0.0.0         UG    0      0        0 eth0
10.200.1.0      0.0.0.0         255.255.255.0   U     0      0        0 cnio0
10.240.1.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0

worker-2:

root@worker-2:/home/admin# ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 02:b0:2b:67:73:9e brd ff:ff:ff:ff:ff:ff
    inet 10.240.1.232/24 brd 10.240.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::b0:2bff:fe67:739e/64 scope link
       valid_lft forever preferred_lft forever
root@worker-2:/home/admin# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.240.1.1      0.0.0.0         UG    0      0        0 eth0
10.200.3.0      0.0.0.0         255.255.255.0   U     0      0        0 cnio0
10.240.1.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0

Your nodes are missing routes to the other nodes' pod subnets.

To get it working, you need to either add static routes on the worker nodes or add routes to all of the pod subnets on the default gateway 10.240.1.1.
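
A quick way to see this from the shell (a sketch; the pod IP is the one from the traceroute output above) is to ask the kernel which route it would pick:

# On worker-0: with no 10.200.3.0/24 entry installed, this resolves via the
# default gateway 10.240.1.1, which knows nothing about the pod subnets either,
# hence the failing traceroute.
ip route get 10.200.3.5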

The first case:

Run on the worker-0 node:

route add -net 10.200.3.0/24 netmask 255.255.255.0 gw 10.240.1.232

Run on the worker-2 node:

route add -net 10.200.1.0/24 netmask 255.255.255.0 gw 10.240.1.230

In this case, traffic will go directly from one worker node to another, but if your cluster grows you will have to update the route tables on all workers accordingly. However, these subnets will not be reachable from other VPC hosts without adding routes to the cloud router.
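
As a side note, the same routes can be written in iproute2 syntax (a sketch equivalent to the route commands above; like them, these entries are not persistent and would have to be re-added after a reboot or made permanent through your distribution's network configuration):

# On worker-0: send worker-2's pod subnet via worker-2's eth0 address.
ip route add 10.200.3.0/24 via 10.240.1.232
# On worker-2: send worker-0's pod subnet via worker-0's eth0 address.
ip route add 10.200.1.0/24 via 10.240.1.230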

The second case:

On the default router (10.240.1.1):

route add -net 10.200.3.0/24 netmask 255.255.255.0 gw 10.240.1.232
route add -net 10.200.1.0/24 netmask 255.255.255.0 gw 10.240.1.230

In this case, traffic will be routed by the default router, and if you add new nodes to your cluster you will only need to update a single route table on the default router.
This solution is used in the Routes part of “Kubernetes the hard way”.

This article would be helpful for creating routes using the AWS CLI.
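
Since you are on AWS, the "default router" is really the VPC route table, so the second case translates into VPC route entries pointing each pod CIDR at the corresponding worker instance. A rough sketch with the AWS CLI (the route-table and instance IDs are placeholders you would replace with your own; the source/destination check also has to be disabled on the workers so they may forward traffic that is not addressed to them):

# Placeholder IDs: substitute your VPC route table and the workers' instance IDs.
aws ec2 create-route --route-table-id rtb-xxxxxxxx --destination-cidr-block 10.200.1.0/24 --instance-id i-worker0xxxx
aws ec2 create-route --route-table-id rtb-xxxxxxxx --destination-cidr-block 10.200.3.0/24 --instance-id i-worker2xxxx

# Let the workers forward pod traffic (disable the source/dest check).
aws ec2 modify-instance-attribute --instance-id i-worker0xxxx --no-source-dest-check
aws ec2 modify-instance-attribute --instance-id i-worker2xxxx --no-source-dest-check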

Thanks @VAS, it was helpful.

On the Kubernetes master:

# edit /etc/hosts

192.168.2.150 master master.localdomain
192.168.2.151 node1 node1.localdomain
192.168.2.152 node2 node2.localdomain
...

# then add routes
$ route add -net 10.244.1.0/24 gw node1
$ route add -net 10.244.2.0/24 gw node2
...
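
The per-node pod CIDRs used in those routes do not have to be guessed; assuming your cluster allocates a podCIDR per node (a kubeadm/flannel setup typically does), you can read them from the API, for example:

# Print each node together with the pod CIDR assigned to it.
kubectl get nodes -o custom-columns=NAME:.metadata.name,POD_CIDR:.spec.podCIDR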

That's because

"..flannel gives each host an IP subnet (/24 by default).." “..flannel 为每个主机提供一个 IP 子网(默认情况下为 /24)..”

Flannel: A Network Fabric for Containers
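
If you want to confirm which /24 flannel actually leased to a given host, flannel records it in a subnet file on that node (the path below is flannel's default; adjust it if your setup differs):

# FLANNEL_SUBNET is the /24 leased to this host, FLANNEL_NETWORK the overall pod network.
cat /run/flannel/subnet.env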
