Kubernetes Pods not accessible within the cluster

I tried to install Kubernetes with kubeadm on 3 virtual machines running Debian on my laptop, one as the master node and the other two as worker nodes. I followed the tutorials on kubernetes.io exactly. I initialized the cluster with kubeadm init --pod-network-cidr=10.244.0.0/16 and joined the workers with the corresponding kubeadm join command. I installed Flannel as the network overlay with kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml .
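
For reference, the join step had the standard form that kubeadm init prints at the end of its output (the token and hash below are placeholders, not the real values):

sudo kubeadm join 192.168.1.100:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>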

The response of kubectl get nodes -o wide looks fine:

NAME        STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION   CONTAINER-RUNTIME
k8smaster   Ready    master   20h   v1.18.3   192.168.1.100   <none>        Debian GNU/Linux 10 (buster)   4.19.0-9-amd64   docker://19.3.9
k8snode1    Ready    <none>   20h   v1.18.3   192.168.1.101   <none>        Debian GNU/Linux 10 (buster)   4.19.0-9-amd64   docker://19.3.9
k8snode2    Ready    <none>   20h   v1.18.3   192.168.1.102   <none>        Debian GNU/Linux 10 (buster)   4.19.0-9-amd64   docker://19.3.9

The response of kubectl get pods --all-namespaces -o wide doesn't show any errors:

NAMESPACE     NAME                                READY   STATUS    RESTARTS   AGE    IP              NODE        NOMINATED NODE   READINESS GATES
kube-system   coredns-66bff467f8-7hlnp             1/1     Running   9          20h    10.244.0.22     k8smaster   <none>           <none>
kube-system   coredns-66bff467f8-wmvx4             1/1     Running   11         20h    10.244.0.23     k8smaster   <none>           <none>
kube-system   etcd-k8smaster                      1/1     Running   11         20h    192.168.1.100   k8smaster   <none>           <none>
kube-system   kube-apiserver-k8smaster            1/1     Running   9          20h    192.168.1.100   k8smaster   <none>           <none>
kube-system   kube-controller-manager-k8smaster   1/1     Running   11         20h    192.168.1.100   k8smaster   <none>           <none>
kube-system   kube-flannel-ds-amd64-9c5rr          1/1     Running   17         20h    192.168.1.102   k8snode2    <none>           <none>
kube-system   kube-flannel-ds-amd64-klw2p          1/1     Running   21         20h    192.168.1.101   k8snode1    <none>           <none>
kube-system   kube-flannel-ds-amd64-x7vm7          1/1     Running   11         20h    192.168.1.100   k8smaster   <none>           <none>
kube-system   kube-proxy-jdfzg                    1/1     Running   11         19h    192.168.1.101   k8snode1    <none>           <none>
kube-system   kube-proxy-lcdvb                    1/1     Running   6          19h    192.168.1.102   k8snode2    <none>           <none>
kube-system   kube-proxy-w6jmf                    1/1     Running   11         20h    192.168.1.100   k8smaster   <none>           <none>
kube-system   kube-scheduler-k8smaster            1/1     Running   10         20h    192.168.1.100   k8smaster   <none>           <none>

Then I tried to create a Pod with kubectl apply -f podexample.yml, using the following manifest:

apiVersion: v1
kind: Pod
metadata:
  name: example 
spec:
  containers:
  - name: nginx 
    image: nginx

kubectl get pods -o wide shows that the Pod was created on worker node1 and is in the Running state.

NAME      READY   STATUS    RESTARTS   AGE    IP            NODE       NOMINATED NODE   READINESS GATES
example   1/1     Running   0          135m   10.244.1.14   k8snode1   <none>           <none>

The thing is, when I try to connect to the Pod with curl -I 10.244.1.14 from the master node, I get the following response:

curl: (7) Failed to connect to 10.244.1.14 port 80: Connection timed out

but the same command on worker node1 succeeds with:

HTTP/1.1 200 OK
Server: nginx/1.17.10
Date: Sat, 23 May 2020 19:45:05 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 14 Apr 2020 14:19:26 GMT
Connection: keep-alive
ETag: "5e95c66e-264"
Accept-Ranges: bytes

I thought maybe that's because kube-proxy is somehow not running on the master node, but ps aux | grep kube-proxy shows that it is running:

root     16747  0.0  1.6 140412 33024 ?        Ssl  13:18   0:04 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf --hostname-override=k8smaster
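
kube-proxy can also be checked from the API side, for example (the pod name is taken from the listing above):

kubectl -n kube-system get pods -l k8s-app=kube-proxy -o wide
kubectl -n kube-system logs kube-proxy-w6jmf --tail=20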

Then I checked the kernel routing table with ip route, and it shows that packets destined for 10.244.1.0/24 are routed to the flannel.1 interface.

default via 192.168.1.1 dev enp0s3 onlink 
10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1 
10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink 
10.244.2.0/24 via 10.244.2.0 dev flannel.1 onlink 
169.254.0.0/16 dev enp0s3 scope link metric 1000 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
192.168.1.0/24 dev enp0s3 proto kernel scope link src 192.168.1.100 
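
Since 10.244.1.0/24 is reached over the flannel.1 VXLAN device, the overlay itself can also be inspected. These are generic iproute2/tcpdump checks (flannel's VXLAN backend uses UDP port 8472), not output from my machines:

ip -d link show flannel.1               # VXLAN id, local address and underlying device
bridge fdb show dev flannel.1           # forwarding entries flannel programs for the other nodes
sudo tcpdump -ni enp0s3 udp port 8472   # does VXLAN traffic actually leave the node?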

Everything looks fine to me, and I don't know what else I should check to find the problem. Am I missing something?

UPDATE1:

If I start an NGINX container on worker node1 and map its port 80 to port 80 of the worker node1 host, then I can connect to it with curl -I 192.168.1.101 from the master node. Also, I didn't add any iptables rules, and there is no firewall daemon like UFW installed on the machines. So I don't think it's a firewall issue.
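
The test container was started with a plain Docker port mapping, roughly like this:

docker run -d --name nginx-test -p 80:80 nginx
curl -I 192.168.1.101    # from the master node: returns HTTP/1.1 200 OK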

UPDATE2:

I recreated the cluster and used Canal instead of Flannel, but still no luck.

UPDATE3:

I took a look at the Canal and Flannel logs with the following commands, and everything seems fine:

kubectl logs -n kube-system canal-c4wtk calico-node
kubectl logs -n kube-system canal-c4wtk kube-flannel
kubectl logs -n kube-system canal-b2fkh calico-node
kubectl logs -n kube-system canal-b2fkh kube-flannel 

UPDATE4:

For the sake of completeness, here are the logs of the containers mentioned above.

UPDATE5:

I tried to install specific versions of the Kubernetes components and Docker with the following commands, to check whether there is a version-mismatch issue:

sudo apt-get install docker-ce=18.06.1~ce~3-0~debian
sudo apt-get install -y kubelet=1.12.2-00 kubeadm=1.12.2-00 kubectl=1.12.2-00 kubernetes-cni=0.6.0-00
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml

but nothing changed.

I even updated /etc/bash.bashrc on all nodes to clear any proxy settings, just to make sure it's not a proxy issue:

export HTTP_PROXY=
export http_proxy=
export NO_PROXY=127.0.0.0/8,192.168.0.0/16,172.0.0.0/8,10.0.0.0/8

and also added the following environment settings to the Docker systemd unit file /lib/systemd/system/docker.service on all nodes:

Environment="HTTP_PROXY="
Environment="NO_PROXY="

Then I rebooted all nodes, and when I logged in I still got curl: (7) Failed to connect to 10.244.1.12 port 80: Connection timed out

UPDATE6:

I even tried to set up the cluster on CentOS machines, thinking maybe there was something related to Debian. I also stopped and disabled firewalld to make sure the firewall wasn't causing the problem, but I got the exact same result again: Failed to connect to 10.244.1.2 port 80: Connection timed out.

The only thing I'm now suspicious about is the VirtualBox network configuration of the virtual machines: they are attached to a Bridged Adapter bound to my wireless network interface.
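
In such a setup it is worth confirming which host interface flannel bound the VXLAN tunnel to; flannel logs this at startup. Something like the following shows it (pod and container names as in the earlier listings; they differ after recreating the cluster):

kubectl -n kube-system logs kube-flannel-ds-amd64-x7vm7 | grep -i "using interface"
kubectl -n kube-system logs canal-c4wtk -c kube-flannel | grep -i "using interface"

If it picked the wrong NIC, flannel can be pinned to a specific one with the --iface=<name> argument in its DaemonSet manifest.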

UPDATE7:

I went inside the created Pod and found that there is no internet connectivity inside the Pod. So I created another Pod from an NGINX image that has tools like curl, wget, ping and traceroute, and tried curl https://www.google.com -I, which gave: curl: (6) Could not resolve host: www.google.com. I checked the /etc/resolv.conf file and found that the DNS server address inside the Pod is 10.96.0.10. I changed the DNS to 8.8.8.8, but curl https://www.google.com -I still results in curl: (6) Could not resolve host: www.google.com. I tried ping 8.8.8.8 and the result is 56 packets transmitted, 0 received, 100% packet loss, time 365ms. As a last step I tried traceroute 8.8.8.8 and got the following result:

 1  10.244.1.1 (10.244.1.1)  0.116 ms  0.056 ms  0.052 ms
 2  * * *
 3  * * *
 ...        (hops 4 through 29 also time out)
30  * * *

I don't know whether the fact that there is no internet connectivity inside the Pod has anything to do with the problem that I can't connect to the Pod from nodes other than the one it is deployed on.
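
To separate DNS failures from plain connectivity failures, the usual busybox DNS check can be run (busybox:1.28 is commonly suggested because nslookup in later busybox builds gives misleading output); this is a generic diagnostic, not output from my setup:

kubectl -n kube-system get svc kube-dns    # the cluster DNS Service, normally 10.96.0.10
kubectl run -it --rm dnstest --image=busybox:1.28 --restart=Never -- nslookup kubernetes.default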

Debian Buster uses nftables as the backend for the iptables tooling, which is not compatible with the Kubernetes network setup (kube-proxy). So you have to switch iptables to the legacy backend instead of nftables with the following commands:

sudo update-alternatives --set iptables /usr/sbin/iptables-legacy 
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
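
A quick way to confirm the switch took effect (rebooting afterwards, or at least restarting docker and kubelet, lets kube-proxy and the CNI plugin reprogram their rules through the legacy backend):

sudo update-alternatives --query iptables | grep Value
iptables --version    # should now report "(legacy)" rather than "(nf_tables)"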
