
kube-proxy upgrade from 1.21 to 1.22 inserts wrong iptables rule

I'm experiencing an issue upgrading kube-proxy from 1.21 to 1.22. I already updated the control-plane components (apiserver, scheduler and controller-manager) to 1.22 without any problem. When I updated the first worker node (kubelet and kube-proxy) from 1.21 to 1.22, the LoadBalancer Service on the node became unreachable; reverting to 1.21 fixed the problem.

I verified that ARP requests receive replies with the correct MAC address, and I see the correct traffic flow with tcpdump on the node's NIC.
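
For reference, a minimal sketch of that capture, assuming the node's NIC is eth0 and the LoadBalancer IP is 192.0.2.10 (both placeholders):

# Watch ARP traffic and traffic to/from the LoadBalancer IP on the node's NIC
tcpdump -ni eth0 'arp or host 192.0.2.10'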

After a bit of investigation into the iptables rules on the worker node, I noticed that on the 1.22 node I have this rule (nat table):

-A KUBE-XLB-GYH4OE6JZWRDML2Y -m comment --comment "swp-customer/swpc-25abfa45-ac5c-487f-81b9-178602c569f3:http has no local endpoints" -j KUBE-MARK-DROP

On 1.21, instead, I have these rules:

-A KUBE-XLB-B67G6CBBIZ3WMS7Y -m comment --comment "Balancing rule 0 for swp-customer/swpc-2ad2a9e3-25cf-430e-893b-dbd4ec77b197:http" -j KUBE-SEP-3LIV6VCSPFRWVHFU
-A KUBE-SEP-3LIV6VCSPFRWVHFU -p tcp -m comment --comment "swp-customer/swpc-2ad2a9e3-25cf-430e-893b-dbd4ec77b197:http" -m tcp -j DNAT --to-destination 10.244.1.219:80

The second set, on the 1.21 node, contains the correct rule to NAT traffic to the container.
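
To compare the two nodes, the relevant chains can be dumped from the nat table; a minimal sketch (the chain-name patterns match the rules above):

# Dump the KUBE-XLB and KUBE-SEP chains from the nat table for a side-by-side comparison
iptables-save -t nat | grep -E 'KUBE-XLB|KUBE-SEP'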

I guess that kube-proxy 1.22 thinks there are no local endpoints (reverting to kube-proxy 1.21 on the same node works fine), but I can't figure out why. kube-proxy seems to start normally and there is nothing strange in its logs.
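
The KUBE-XLB chains are used for Services with externalTrafficPolicy: Local, so one sanity check is whether the API server actually attributes an endpoint of this Service to the node; a minimal sketch, assuming the swp-customer namespace from the rules above:

# List the Endpoints and EndpointSlices for the Service and check which nodes they point to
kubectl -n swp-customer get endpoints
kubectl -n swp-customer get endpointslices -o wide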

My environment:

  • k8s nodes: VMs based on CentOS 7, with the VNIC bridged to the physical NIC on the hypervisor
  • Container runtime: docker://19.3.5
  • k8s cluster deployment mode: from scratch
  • k8s network plugin: flannel + metallb

Thanks a lot for any help.

Deleting and recreating the Service, with the same spec, solved the problem.

I don't know why, because I compared the saved YAML files from before deletion and after recreation, and they have the same fields.
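
For completeness, the delete/recreate workaround looks roughly like this; a minimal sketch, where the Service name swpc-example is a placeholder:

# Save the current Service spec, delete the Service, then recreate it from the saved manifest
# (server-generated fields such as resourceVersion, uid and status may need to be stripped from svc.yaml first)
kubectl -n swp-customer get svc swpc-example -o yaml > svc.yaml
kubectl -n swp-customer delete svc swpc-example
kubectl -n swp-customer apply -f svc.yaml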

The issue: https://github.com/kubernetes/kubernetes/issues/110208

You can just restart/recreate one of the Service backends as a workaround.
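
If the backends are managed by a Deployment, that restart can be a single command; a minimal sketch, where the Deployment name swpc-backend is a placeholder:

# Restart the pods backing the Service so fresh Endpoints/EndpointSlices are created
kubectl -n swp-customer rollout restart deployment swpc-backend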
