Kubernetes - NodePort service can be accessed only on the node where the pod is deployed
I have set up a Kubernetes cluster on three CentOS 8 VMs and deployed a pod running nginx.
IP addresses of the VMs:
kubemaster 192.168.56.20
kubenode1 192.168.56.21
kubenode2 192.168.56.22
On each VM, the interfaces and routes are defined as follows:
ip addr:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:d2:1b:97 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
valid_lft 75806sec preferred_lft 75806sec
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:df:77:05 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.22/24 brd 192.168.56.255 scope global noprefixroute enp0s8
valid_lft forever preferred_lft forever
4: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether 52:54:00:ff:47:9a brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
5: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
link/ether 52:54:00:ff:47:9a brd ff:ff:ff:ff:ff:ff
6: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:19:52:19:b1 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
7: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
link/ether 22:b8:b4:5a:5a:26 brd ff:ff:ff:ff:ff:ff
inet 10.244.2.0/32 brd 10.244.2.0 scope global flannel.1
valid_lft forever preferred_lft forever
ip route:
default via 10.0.2.2 dev enp0s3 proto dhcp metric 100
default via 192.168.56.1 dev enp0s8 proto static metric 101
10.0.2.0/24 dev enp0s3 proto kernel scope link src 10.0.2.15 metric 100
10.244.0.0/24 via 10.244.0.0 dev flannel.1 onlink
10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.56.0/24 dev enp0s8 proto kernel scope link src 192.168.56.22 metric 101
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
Each VM has two network adapters: a NAT adapter for internet access (enp0s3) and a host-only adapter that the three VMs use to communicate with each other (enp0s8). The host-only network works; I tested it with ping.
On each VM I applied the following firewall rules:
firewall-cmd --permanent --add-port=6443/tcp # Kubernetes API server
firewall-cmd --permanent --add-port=2379-2380/tcp # etcd server client API
firewall-cmd --permanent --add-port=10250/tcp # Kubelet API
firewall-cmd --permanent --add-port=10251/tcp # kube-scheduler
firewall-cmd --permanent --add-port=10252/tcp # kube-controller-manager
firewall-cmd --permanent --add-port=8285/udp # Flannel
firewall-cmd --permanent --add-port=8472/udp # Flannel
firewall-cmd --add-masquerade --permanent
firewall-cmd --reload
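The rules above do not open any NodePort. As a hedged sketch, assuming the cluster uses the default Kubernetes NodePort range (30000-32767) and the same firewalld setup, the helper below prints the extra commands one might run on each node. The function name `nodeport_firewall_cmds` is hypothetical, it only prints the commands, and the thread does not confirm that this alone resolves the problem.

```shell
#!/bin/sh
# Print (do not execute) firewalld commands that open a NodePort range.
# Assumption: the default NodePort range 30000-32767 unless one is given.
nodeport_firewall_cmds() {
    range="${1:-30000-32767}"
    printf 'firewall-cmd --permanent --add-port=%s/tcp\n' "$range"
    printf 'firewall-cmd --reload\n'
}

# Show the commands for the default range.
nodeport_firewall_cmds
```

Passing a single port, for example `nodeport_firewall_cmds 30086`, prints the commands for that port only.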
Finally, I deployed the cluster and nginx with the following commands:
sudo kubeadm init --apiserver-advertise-address=192.168.56.20 --pod-network-cidr=10.244.0.0/16 # for Flannel CNI
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl create deployment nginx --image=nginx
kubectl create service nodeport nginx --tcp=80:80
More general information about my cluster:
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kubemaster Ready master 3h8m v1.19.2 192.168.56.20 <none> CentOS Linux 8 (Core) 4.18.0-193.19.1.el8_2.x86_64 docker://19.3.13
kubenode1 Ready <none> 3h6m v1.19.2 192.168.56.21 <none> CentOS Linux 8 (Core) 4.18.0-193.19.1.el8_2.x86_64 docker://19.3.13
kubenode2 Ready <none> 165m v1.19.2 192.168.56.22 <none> CentOS Linux 8 (Core) 4.18.0-193.19.1.el8_2.x86_64 docker://19.3.13
kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default nginx-6799fc88d8-mrvsg 1/1 Running 0 3h 10.244.1.3 kubenode1 <none> <none>
kube-system coredns-f9fd979d6-6qxk9 1/1 Running 0 3h9m 10.244.1.2 kubenode1 <none> <none>
kube-system coredns-f9fd979d6-bj2fd 1/1 Running 0 3h9m 10.244.0.2 kubemaster <none> <none>
kube-system etcd-kubemaster 1/1 Running 0 3h9m 192.168.56.20 kubemaster <none> <none>
kube-system kube-apiserver-kubemaster 1/1 Running 0 3h9m 192.168.56.20 kubemaster <none> <none>
kube-system kube-controller-manager-kubemaster 1/1 Running 0 3h9m 192.168.56.20 kubemaster <none> <none>
kube-system kube-flannel-ds-fdv4p 1/1 Running 0 166m 192.168.56.22 kubenode2 <none> <none>
kube-system kube-flannel-ds-vvhsz 1/1 Running 0 3h6m 192.168.56.21 kubenode1 <none> <none>
kube-system kube-flannel-ds-vznl5 1/1 Running 0 3h6m 192.168.56.20 kubemaster <none> <none>
kube-system kube-proxy-45tmz 1/1 Running 0 3h9m 192.168.56.20 kubemaster <none> <none>
kube-system kube-proxy-nb7jt 1/1 Running 0 3h7m 192.168.56.21 kubenode1 <none> <none>
kube-system kube-proxy-tl9n5 1/1 Running 0 166m 192.168.56.22 kubenode2 <none> <none>
kube-system kube-scheduler-kubemaster 1/1 Running 0 3h9m 192.168.56.20 kubemaster <none> <none>
kubectl get service -o wide
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 3h10m <none>
nginx NodePort 10.102.152.25 <none> 80:30086/TCP 179m app=nginx
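The service is exposed on NodePort 30086, so it should answer on every node IP. A minimal sketch for probing each node from any VM, assuming `curl` is installed; the function names `nodeport_url` and `check_nodeport` are hypothetical helpers, not part of kubectl.

```shell
#!/bin/sh
# Build the URL of a NodePort service on one node.
nodeport_url() {
    printf 'http://%s:%s/\n' "$1" "$2"
}

# Probe the given port on each node IP; prints "<ip> OK" or "<ip> FAIL".
check_nodeport() {
    port="$1"; shift
    for ip in "$@"; do
        if curl --silent --output /dev/null --max-time 5 "$(nodeport_url "$ip" "$port")"; then
            echo "$ip OK"
        else
            echo "$ip FAIL"
        fi
    done
}
```

For this cluster the call would be `check_nodeport 30086 192.168.56.20 192.168.56.21 192.168.56.22`. Running it on each VM in turn shows from which nodes the service is reachable.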
Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T13:41:02Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T13:32:58Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
iptables version:
iptables v1.8.4 (nf_tables)
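The `(nf_tables)` suffix in the version string is the relevant detail here. A small sketch that classifies the backend from the output of `iptables --version`; the function name `iptables_backend` is hypothetical, and versions that print no suffix are reported as `unknown` rather than guessed.

```shell
#!/bin/sh
# Classify the iptables backend from an `iptables --version` string.
# iptables 1.8+ appends "(nf_tables)" or "(legacy)"; anything else
# is reported as "unknown".
iptables_backend() {
    case "$1" in
        *"(nf_tables)"*) echo "nf_tables" ;;
        *"(legacy)"*)    echo "legacy" ;;
        *)               echo "unknown" ;;
    esac
}
```

On a node it can be called as `iptables_backend "$(iptables --version)"`.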
Results and issue:
What I tried to debug:
sudo netstat -antup | grep kube-proxy
tcp 0 0 0.0.0.0:30086 0.0.0.0:* LISTEN 4116/kube-proxy
tcp 0 0 127.0.0.1:10249 0.0.0.0:* LISTEN 4116/kube-proxy
tcp 0 0 192.168.56.20:49812 192.168.56.20:6443 ESTABLISHED 4116/kube-proxy
tcp6 0 0 :::10256 :::* LISTEN 4116/kube-proxy
So kube-proxy appears to be listening on port 30086 on each VM, which is as expected.
I also tried applying the following rule on each node (found in another ticket), without success:
iptables -A FORWARD -j ACCEPT
Do you have any idea why I can't reach the service from the master node and node 2?
First update:
kubectl logs kube-flannel-ds-nn6v4 -n kube-system:
I0929 06:19:36.842149 1 main.go:531] Using interface with name enp0s8 and address 192.168.56.22
I0929 06:19:36.842243 1 main.go:548] Defaulting external address to interface address (192.168.56.22)
Even after fixing these two things, I still have the same issue...
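With two adapters per VM, it is worth confirming that every Flannel pod picked the host-only interface. A hedged sketch that extracts the interface name and address from log lines in the format shown above; the function name `flannel_iface` is hypothetical.

```shell
#!/bin/sh
# Read flannel log lines on stdin and print "<interface> <address>"
# for each "Using interface with name ... and address ..." line.
flannel_iface() {
    sed -n 's/.*Using interface with name \([^ ]*\) and address \([0-9.]*\).*/\1 \2/p'
}
```

It can be fed from the logs of a Flannel pod, for example `kubectl logs <flannel-pod> -n kube-system | flannel_iface`, where `<flannel-pod>` is a placeholder for a pod name from `kubectl get pods`.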
Second update:
The final solution was to flush iptables on each VM with the following commands:
systemctl stop kubelet
systemctl stop docker
iptables --flush
iptables -t nat --flush
systemctl start kubelet
systemctl start docker
Now it is working correctly :)
This is because you are running k8s on CentOS 8.
According to the Kubernetes documentation, the supported host operating systems are:
- Ubuntu 16.04+
- Debian 9+
- CentOS 7
- Red Hat Enterprise Linux (RHEL) 7
- Fedora 25+
- HypriotOS v1.0.1+
- Flatcar Container Linux (tested with 2512.3.0)
This article mentions that there are network issues on RHEL 8:
(2020/02/11 Update: After installation, I kept facing pod network issues: deployed pods were unable to reach the external network, and pods deployed on different workers were unable to ping each other, even though all nodes (master, worker1 and worker2) were shown as ready by kubectl get nodes. After checking the official Kubernetes.io website, I observed that the nftables backend is not compatible with the current kubeadm packages. Please refer to "Ensure iptables tooling does not use the nftables backend" at the following link.)
The simplest solution here is to reinstall the nodes on a supported operating system.
I finally found the solution after switching to CentOS 7 and correcting the Flannel configuration (see other comments). I had noticed some issues in the pods where coredns is running. Here is an example of what happens inside one of these pods:
kubectl logs coredns-f9fd979d6-8gtlp -n kube-system:
E0929 07:09:40.200413 1 reflector.go:178] pkg/mod/k8s.io/client-go@v0.18.3/tools/cache/reflector.go:125: Failed to list *v1.Endpoints: Get "https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0": dial tcp 10.96.0.1:443: connect: no route to host
[INFO] plugin/ready: Still waiting on: "kubernetes"
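The address in this error, 10.96.0.1:443, is the ClusterIP of the `kubernetes` service, so the pod cannot reach the API server through the service network. A minimal sketch for listing every endpoint that fails this way in a log; the function name `unreachable_targets` is hypothetical.

```shell
#!/bin/sh
# Read log lines on stdin and print each distinct "ip:port" that failed
# with "connect: no route to host".
unreachable_targets() {
    sed -n 's/.*dial tcp \([0-9.]*:[0-9]*\): connect: no route to host.*/\1/p' | sort -u
}
```

Typical use would be `kubectl logs <coredns-pod> -n kube-system | unreachable_targets`, with `<coredns-pod>` standing in for a real pod name.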
The final solution was to flush iptables on each VM with the following commands:
systemctl stop kubelet
systemctl stop docker
iptables --flush
iptables -t nat --flush
systemctl start kubelet
systemctl start docker
After that, I could access the deployed service from each VM :)
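Since the same sequence has to be repeated on every VM, it can be wrapped in a small script. This is only a sketch of the commands listed above, with a hypothetical `DRY_RUN` switch and hypothetical function names (`run`, `flush_and_restart`) so the steps can be reviewed before they touch the firewall. Flushing is a runtime change, and the rules that Docker and kube-proxy need are re-created when the services start again.

```shell
#!/bin/sh
# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop the services, flush the filter and nat tables, start the services.
flush_and_restart() {
    run systemctl stop kubelet
    run systemctl stop docker
    run iptables --flush
    run iptables -t nat --flush
    run systemctl start kubelet
    run systemctl start docker
}
```

Setting `DRY_RUN=1` before calling `flush_and_restart` prints the six steps without running them; calling it as root without `DRY_RUN` executes them.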
I am still not sure I clearly understand what the issue was. Here is some information:
I will keep investigating and post more information here.