
Kubernetes: Container not able to ping www.google.com

I have a Kubernetes cluster running on 4 Raspberry Pi devices, of which 1 acts as the master and the other 3 act as workers, i.e. w1, w2, w3. I have started a DaemonSet deployment, so each worker is running a pod with 2 containers.

w2 is running a pod with 2 containers. If I exec into any container on w2 and ping www.google.com from inside it, I get a response. But if I do the same on w1 and w3, it says temporary failure in name resolution. All the pods in kube-system are running. I am using Weave for networking. Below are all the pods in kube-system:

NAME                                READY     STATUS    RESTARTS   AGE
etcd-master-pi                      1/1       Running   1          23h
kube-apiserver-master-pi            1/1       Running   1          23h
kube-controller-manager-master-pi   1/1       Running   1          23h
kube-dns-7b6ff86f69-97vtl           3/3       Running   3          23h
kube-proxy-2tmgw                    1/1       Running   0          14m
kube-proxy-9xfx9                    1/1       Running   2          22h
kube-proxy-nfgwg                    1/1       Running   1          23h
kube-proxy-xbdxl                    1/1       Running   3          23h
kube-scheduler-master-pi            1/1       Running   1          23h
weave-net-7sh5n                     2/2       Running   1          14m
weave-net-c7x8p                     2/2       Running   3          23h
weave-net-mz4c4                     2/2       Running   6          22h
weave-net-qtgmw                     2/2       Running   10         23h

If I start the containers with a plain docker run command instead of through the Kubernetes deployment, I do not see this issue. I think this is because of kube-dns. How can I debug this issue?
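For reference, this is roughly how I am testing on each worker (the pod and container names below are placeholders for whatever the DaemonSet created on that node):

# list the DaemonSet pods together with the nodes they landed on
kubectl get pods -o wide

# exec into a container of the pod running on the worker under test and try DNS
kubectl exec -it <pod-on-w1> -c <container-name> -- ping -c 3 www.google.com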

You can start by checking whether DNS is working.

Run nslookup on kubernetes.default from inside the pod and check whether it works.

[root@metrics-master-2 /]# nslookup kubernetes.default
Server:     10.96.0.10
Address:    10.96.0.10#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.96.0.1
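If you don't want to open an interactive shell, the same lookup can be run from outside the pod (the pod name here is just the one from the example above, and this assumes nslookup is available in the image):

kubectl exec -it metrics-master-2 -- nslookup kubernetes.default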

Check the local DNS configuration inside the pods:

[root@metrics-master-2 /]# cat /etc/resolv.conf 
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local ec2.internal
options ndots:5
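The nameserver shown there should be the ClusterIP of the kube-dns service; a quick way to confirm that (in a default kubeadm setup the service is named kube-dns even when CoreDNS is serving it):

# the CLUSTER-IP column should match the nameserver in /etc/resolv.conf
kubectl get svc kube-dns -n kube-system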

Finally, check the kube-dns container logs while you run the ping command; they will show you the possible reasons why the name is not resolving.

kubectl logs kube-dns-86f4d74b45-7c4ng -c kubedns -n kube-system
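Since the pod name will differ on your cluster, you can also pick the DNS pod by label instead (this assumes the usual k8s-app=kube-dns label and, as in the command above, the kubedns container of a kube-dns pod):

# tail the kubedns container of whichever DNS pod is running, without knowing its exact name
kubectl logs -n kube-system -l k8s-app=kube-dns -c kubedns --tail=50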

Hope this helps.

This might not be applicable to your scenario, but I wanted to document the solution I found. My issues ended up being related to a flannel network overlay setup on our master nodes.

# kubectl get pods --namespace kube-system
NAME                         READY   STATUS    RESTARTS   AGE
coredns-qwer                 1/1     Running   0          4h54m
coredns-asdf                 1/1     Running   0          4h54m
etcd-h1                      1/1     Running   0          4h53m
etcd-h2                      1/1     Running   0          4h48m
etcd-h3                      1/1     Running   0          4h48m
kube-apiserver-h1            1/1     Running   0          4h53m
kube-apiserver-h2            1/1     Running   0          4h48m
kube-apiserver-h3            1/1     Running   0          4h48m
kube-controller-manager-h1   1/1     Running   2          4h53m
kube-controller-manager-h2   1/1     Running   0          4h48m
kube-controller-manager-h3   1/1     Running   0          4h48m
kube-flannel-ds-amd64-asdf   1/1     Running   0          4h48m
kube-flannel-ds-amd64-qwer   1/1     Running   1          4h48m
kube-flannel-ds-amd64-zxcv   1/1     Running   0          3h51m
kube-flannel-ds-amd64-wert   1/1     Running   0          4h54m
kube-flannel-ds-amd64-sdfg   1/1     Running   1          4h41m
kube-flannel-ds-amd64-xcvb   1/1     Running   1          4h42m
kube-proxy-qwer              1/1     Running   0          4h42m
kube-proxy-asdf              1/1     Running   0          4h54m
kube-proxy-zxcv              1/1     Running   0          4h48m
kube-proxy-wert              1/1     Running   0          4h41m
kube-proxy-sdfg              1/1     Running   0          4h48m
kube-proxy-xcvb              1/1     Running   0          4h42m
kube-scheduler-h1            1/1     Running   1          4h53m
kube-scheduler-h2            1/1     Running   1          4h48m
kube-scheduler-h3            1/1     Running   0          4h48m
tiller-deploy-asdf           1/1     Running   0          4h28m

If I exec'd into any container and pinged google.com from the container, I got a bad address response.

# ping google.com
ping: bad address 'google.com'

# ip route
default via 10.168.3.1 dev eth0
10.168.3.0/24 dev eth0 scope link  src 10.168.3.22
10.244.0.0/16 via 10.168.3.1 dev eth0

The pod's ip route output differs from the ip route output on the master node.

Altering my pod's deployment configuration to include hostNetwork: true allowed me to ping outside my container.
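For anyone wondering where that setting lives, it goes in the pod template's spec. A minimal sketch (names and image are placeholders; dnsPolicy is only needed if you still want cluster DNS while on the host network):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      hostNetwork: true                     # pod shares the node's network namespace
      dnsPolicy: ClusterFirstWithHostNet    # keep resolving cluster services on the host network
      containers:
      - name: example
        image: busybox
        command: ["sleep", "3600"]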

The ip route output from my newly running pod:

# ip route
default via 172.25.10.1 dev ens192  metric 100
10.168.0.0/24 via 10.168.0.0 dev flannel.1 onlink
10.168.1.0/24 via 10.168.1.0 dev flannel.1 onlink
10.168.2.0/24 via 10.168.2.0 dev flannel.1 onlink
10.168.3.0/24 dev cni0 scope link  src 10.168.3.1
10.168.4.0/24 via 10.168.4.0 dev flannel.1 onlink
10.168.5.0/24 via 10.168.5.0 dev flannel.1 onlink
172.17.0.0/16 dev docker0 scope link  src 172.17.0.1
172.25.10.0/23 dev ens192 scope link  src 172.25.11.35  metric 100
192.168.122.0/24 dev virbr0 scope link  src 192.168.122.1

# ping google.com
PING google.com (172.217.6.110): 56 data bytes
64 bytes from 172.217.6.110: seq=0 ttl=55 time=3.488 ms

Update 1

My associate and I found a number of different websites which advise against setting hostNetwork: true . We then found this issue and are currently investigating it as a possible solution, sans hostNetwork: true .

Usually you'd do this with the '--ip-masq' flag to flannel, which is 'false' by default and is defined as "setup IP masquerade rule for traffic destined outside of overlay network". That sounds like what you want.
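In the stock kube-flannel manifest that flag is passed to flanneld as a container argument, roughly like this excerpt (the image tag is just an example):

      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr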

Update 2

It turns out that our flannel network overlay was misconfigured. We needed to ensure that the Network field in the net-conf.json of our flannel ConfigMap matched our networking.podSubnet (kubeadm config view). Changing these networks to match resolved our networking woes, and we were then able to remove hostNetwork: true from our deployments.
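Concretely, these are the two values we compared (the ConfigMap name and namespace are those used by the stock flannel manifest; adjust if yours differ):

# the pod CIDR the cluster was initialised with
kubeadm config view | grep podSubnet

# look at the Network field inside net-conf.json
kubectl -n kube-system get cm kube-flannel-cfg -o yaml

The Network value inside net-conf.json should be the same CIDR as podSubnet; once they matched, our pods could resolve and reach external addresses without hostNetwork: true.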
