Why doesn't kube-proxy route traffic to another worker node?

I've deployed several different services and always get the same error.

The service is reachable on the node port from the machine where the pod is running. On the two other nodes I get timeouts.
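For reference, the behaviour looks roughly like this when testing the node port on each worker with curl (node IPs are listed further down; assuming, just for this example, that the backing pod runs on kuben1):

# NodePort 30002 only answers on the node that runs the pod;
# on the other two workers the connection times out.
curl --connect-timeout 5 http://192.168.178.77:30002   # kuben1: works
curl --connect-timeout 5 http://192.168.178.78:30002   # kuben2: times out
curl --connect-timeout 5 http://192.168.178.79:30002   # kuben3: times out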

The kube-proxy is running on all worker nodes, and I can see in the kube-proxy log files that the service port was added and the node port was opened. In this case I've deployed the stars demo from Calico.

Kube-proxy log output:

Mar 11 10:25:10 kuben1 kube-proxy[659]: I0311 10:25:10.229458     659 service.go:309] Adding new service port "management-ui/management-ui:" at 10.32.0.133:9001/TCP
Mar 11 10:25:10 kuben1 kube-proxy[659]: I0311 10:25:10.257483     659 proxier.go:1427] Opened local port "nodePort for management-ui/management-ui:" (:30002/tcp)

The kube-proxy is listening on port 30002:

root@kuben1:/tmp# netstat -lanp | grep 30002
tcp6       0      0 :::30002                :::*                    LISTEN      659/kube-proxy   

There are also some iptables rules defined:

root@kuben1:/tmp# iptables -L -t nat | grep management-ui
KUBE-MARK-MASQ  tcp  --  anywhere             anywhere             /* management-ui/management-ui: */ tcp dpt:30002
KUBE-SVC-MIYW5L3VT4JVLCIZ  tcp  --  anywhere             anywhere             /* management-ui/management-ui: */ tcp dpt:30002
KUBE-MARK-MASQ  tcp  -- !10.200.0.0/16        10.32.0.133          /* management-ui/management-ui: cluster IP */ tcp dpt:9001
KUBE-SVC-MIYW5L3VT4JVLCIZ  tcp  --  anywhere             10.32.0.133          /* management-ui/management-ui: cluster IP */ tcp dpt:9001

The interesting part is that I can reach the service IP from any worker node:

root@kubem1:/tmp# kubectl get svc -n management-ui
NAME            TYPE       CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
management-ui   NodePort   10.32.0.133   <none>        9001:30002/TCP   52m

The service IP/port can be accessed from any worker node if I do a "curl http://10.32.0.133:9001".

I don't understand why kube-proxy does not "route" this properly...
Does anyone have a hint where I can find the error?


Here are some cluster specs:

This is a hand-built cluster inspired by Kelsey Hightower's "Kubernetes the Hard Way" guide.

  • 6 nodes (3 masters, 3 workers), local VMs
  • OS: Ubuntu 18.04
  • K8s: v1.13.0
  • Docker: 18.9.3
  • CNI: Calico

Component status on the master nodes looks okay

root@kubem1:/tmp# kubectl get componentstatus
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok                  
scheduler            Healthy   ok                  
etcd-0               Healthy   {"health":"true"}   
etcd-1               Healthy   {"health":"true"}   
etcd-2               Healthy   {"health":"true"}     

The worker nodes are looking okay if I trust kubectl

root@kubem1:/tmp# kubectl get nodes -o wide
NAME     STATUS   ROLES    AGE   VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
kuben1   Ready    <none>   39d   v1.13.0   192.168.178.77   <none>        Ubuntu 18.04.2 LTS   4.15.0-46-generic   docker://18.9.3
kuben2   Ready    <none>   39d   v1.13.0   192.168.178.78   <none>        Ubuntu 18.04.2 LTS   4.15.0-46-generic   docker://18.9.3
kuben3   Ready    <none>   39d   v1.13.0   192.168.178.79   <none>        Ubuntu 18.04.2 LTS   4.15.0-46-generic   docker://18.9.3

As asked by P Ekambaram:

root@kubem1:/tmp# kubectl get po -n kube-system
NAME                                   READY   STATUS    RESTARTS   AGE
calico-node-bgjdg                      1/1     Running   5          40d
calico-node-nwkqw                      1/1     Running   5          40d
calico-node-vrwn4                      1/1     Running   5          40d
coredns-69cbb76ff8-fpssw               1/1     Running   5          40d
coredns-69cbb76ff8-tm6r8               1/1     Running   5          40d
kubernetes-dashboard-57df4db6b-2xrmb   1/1     Running   5          40d

I've found a solution for my "problem".
This behavior was caused by a change in Docker v1.13.x (the default policy of the iptables FORWARD chain was changed to DROP), and the issue was fixed in Kubernetes with version 1.8.
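This can be checked on a worker node; on an affected node the default policy of the FORWARD chain is DROP (the expected output is shown as a comment, it is not captured from this cluster):

# Show the default policy of the FORWARD chain
iptables -S FORWARD | head -1
# expected on an affected node: -P FORWARD DROP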

The easy solution was to change the forward rules via iptables.
Run the following command on all worker nodes: "iptables -A FORWARD -j ACCEPT"
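A minimal sketch of that workaround, assuming root SSH access to the workers (note that the rule is not persistent and disappears after a reboot or an iptables flush):

# Append a blanket ACCEPT rule to the FORWARD chain on every worker node
for node in kuben1 kuben2 kuben3; do
  ssh root@"$node" 'iptables -A FORWARD -j ACCEPT'
done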

To fix it the right way, I had to tell kube-proxy the CIDR for the pods. In theory, that could be solved in two ways:

  • Add "--cluster-cidr=10.0.0.0/16" as argument to the kube-proxy command line(in my case in the systemd service file) 将“ --cluster-cidr = 10.0.0.0 / 16”添加为kube-proxy命令行的参数(在我的情况下为systemd服务文件)
  • Add 'clusterCIDR: "10.0.0.0/16"' to the kubeconfig file for kube-proxy 将'clusterCIDR:“ 10.0.0.0/16”'添加到kubeconfig文件以获取kube-proxy
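A sketch of both variants; the file paths follow the "Kubernetes the Hard Way" layout and are assumptions that have to match the actual setup, and the CIDR has to match the cluster's pod network:

# Variant 1: pass the pod CIDR on the kube-proxy command line.
# Assumed systemd unit path; the flag goes on the ExecStart line:
#   /etc/systemd/system/kube-proxy.service
#   ExecStart=/usr/local/bin/kube-proxy --cluster-cidr=10.0.0.0/16 ...

# Variant 2: set clusterCIDR in the kube-proxy configuration file.
# Assumed path; verify the setting is present:
grep clusterCIDR /var/lib/kube-proxy/kube-proxy-config.yaml
# should print: clusterCIDR: "10.0.0.0/16"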

In my case the command line argument didn't have any effect.
Once I added the line to the kube-proxy configuration file and restarted kube-proxy on all worker nodes, everything worked well.
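For completeness, the restart and re-test looked roughly like this (the unit name and node IPs as above; both are assumptions about this particular setup):

# On every worker node: reload systemd (only needed if the unit file changed) and restart kube-proxy
systemctl daemon-reload
systemctl restart kube-proxy

# The node port should now answer from every worker, not only the one running the pod
curl --connect-timeout 5 http://192.168.178.78:30002
curl --connect-timeout 5 http://192.168.178.79:30002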

Here is the GitHub merge request for this "FORWARD" issue: link
