
kube-proxy upgrade from 1.21 to 1.22 inserts wrong iptables rule

I'm experiencing an issue upgrading kube-proxy from 1.21 to 1.22. I already updated the control-plane components (apiserver, scheduler and controller-manager) to 1.22 without any problem. When I updated the first worker node (kubelet and kube-proxy) from 1.21 to 1.22, the LoadBalancer Service on the node became unreachable; reverting to 1.21 fixed the problem.

I verified that ARP requests receive replies with the correct MAC address, and I see the correct traffic flow with tcpdump on the node's NIC.
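
For reference, a minimal sketch of that capture, assuming the node's NIC is eth0 and the LoadBalancer IP is 192.0.2.10 (both placeholders):

# Watch ARP traffic and traffic to/from the LoadBalancer IP on the node's NIC
tcpdump -ni eth0 'arp or host 192.0.2.10'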

After a bit of investigation into the iptables rules on the worker node, I noticed that on the 1.22 node I have this rule (nat table):

-A KUBE-XLB-GYH4OE6JZWRDML2Y -m comment --comment "swp-customer/swpc-25abfa45-ac5c-487f-81b9-178602c569f3:http has no local endpoints" -j KUBE-MARK-DROP

On 1.21, instead, I have these rules:

-A KUBE-XLB-B67G6CBBIZ3WMS7Y -m comment --comment "Balancing rule 0 for swp-customer/swpc-2ad2a9e3-25cf-430e-893b-dbd4ec77b197:http" -j KUBE-SEP-3LIV6VCSPFRWVHFU
-A KUBE-SEP-3LIV6VCSPFRWVHFU -p tcp -m comment --comment "swp-customer/swpc-2ad2a9e3-25cf-430e-893b-dbd4ec77b197:http" -m tcp -j DNAT --to-destination 10.244.1.219:80

The second set, on the 1.21 node, contains the correct rule to NAT traffic to the container.
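
To compare the two nodes, the relevant chains can be dumped from the nat table; a minimal sketch (the chain-name patterns match the rules above):

# Dump the KUBE-XLB and KUBE-SEP chains from the nat table for a side-by-side comparison
iptables-save -t nat | grep -E 'KUBE-XLB|KUBE-SEP'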

I guess that kube-proxy 1.22 thinks there are no local endpoints (reverting to kube-proxy 1.21 on the same node works fine), but I can't figure out why. kube-proxy seems to start normally and there is nothing strange in its logs.
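
The KUBE-XLB chains are used for Services with externalTrafficPolicy: Local, so one sanity check is whether the API server actually attributes an endpoint of this Service to the node; a minimal sketch, assuming the swp-customer namespace from the rules above:

# List the Endpoints and EndpointSlices for the Service and check which nodes they point to
kubectl -n swp-customer get endpoints
kubectl -n swp-customer get endpointslices -o wide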

My environment:

  • k8s nodes: VMs based on CentOS 7, with the VNIC bridged to the physical NIC on the hypervisor
  • Container runtime: docker://19.3.5
  • k8s cluster deployment mode: from scratch
  • k8s network plugin: flannel + metallb

Thanks a lot for any help.

Deleting and recreating the Service, with the same spec, solved the problem.

I don't know why, because I compared the saved YAML files from before deletion and after recreation, and they have the same fields.
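
For completeness, the delete/recreate workaround looks roughly like this; a minimal sketch, where the Service name swpc-example is a placeholder:

# Save the current Service spec, delete the Service, then recreate it from the saved manifest
# (server-generated fields such as resourceVersion, uid and status may need to be stripped from svc.yaml first)
kubectl -n swp-customer get svc swpc-example -o yaml > svc.yaml
kubectl -n swp-customer delete svc swpc-example
kubectl -n swp-customer apply -f svc.yaml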

The issue: https://github.com/kubernetes/kubernetes/issues/110208

You can just restart/recreate one of the Service backends as a workaround.
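
If the backends are managed by a Deployment, that restart can be a single command; a minimal sketch, where the Deployment name swpc-backend is a placeholder:

# Restart the pods backing the Service so fresh Endpoints/EndpointSlices are created
kubectl -n swp-customer rollout restart deployment swpc-backend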
