Prevent kube-system pods from running on a specific node
I have a cluster running on GKE. I created 2 separate node pools. My first node pool (let's call it main-pool) scales from 1 to 10 nodes. The second one (let's call it db-pool) scales from 0 to 10 nodes. The db-pool nodes have specific needs, as I have to dynamically create some pretty big databases requesting a lot of memory, while the main-pool is for "light" workers. I used node selectors so that my workers are created on the right nodes, and everything works fine.
The problem I have is that the db-pool nodes, because they request a lot of memory, are way more expensive, and I want them to scale down to 0 when no database is running. It was working fine until I added the node selectors (I am not 100% sure, but that seems to be when it happened); now it will not scale down below 1 node. I believe it is because some kube-system pods are running on this node:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
heapster-v1.6.0-beta.1-6c9dfdb9f5-2htn7 3/3 Running 0 39m 10.56.18.22 gke-padawan-cluster-ipf-db-pool-bb2827a7-99pm <none>
metrics-server-v0.3.1-5b4d6d8d98-7h659 2/2 Running 0 39m 10.56.18.21 gke-padawan-cluster-ipf-db-pool-bb2827a7-99pm <none>
fluentd-gcp-v3.2.0-jmlcv 2/2 Running 0 1h 10.132.15.241 gke-padawan-cluster-ipf-db-pool-bb2827a7-99pm <none>
kube-proxy-gke-padawan-cluster-ipf-db-pool-bb2827a7-99pm 1/1 Running 0 1h 10.132.15.241 gke-padawan-cluster-ipf-db-pool-bb2827a7-99pm <none>
prometheus-to-sd-stfz4 1/1 Running 0 1h 10.132.15.241 gke-padawan-cluster-ipf-db-pool-bb2827a7-99pm <none>
Is there any way to prevent this from happening?
System pods like fluentd and kube-proxy are DaemonSets and are required on each node; these should not block scale-down, though. Pods like Heapster and metrics-server are not node-critical, and they can block the node pool from scaling down to 0.
The best way to stop these non-node-critical system pods from being scheduled on your expensive node pool is to use taints and tolerations. The taint will prevent pods from being scheduled onto the nodes; you just need to make sure the db pods do get scheduled on the larger node pool by setting a toleration along with the node selector.
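As a sketch of what that looks like, here is a hypothetical pod spec for one of the databases. The taint key/value (dedicated=db) and the pool name are assumptions for illustration; on GKE, the cloud.google.com/gke-nodepool label is applied automatically to every node and can be used as the node selector:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: big-database          # hypothetical name
spec:
  # Only schedule onto nodes in db-pool (label set automatically by GKE)
  nodeSelector:
    cloud.google.com/gke-nodepool: db-pool
  # Tolerate the taint placed on db-pool nodes (taint key/value are an
  # assumption; they must match whatever taint you configure on the pool)
  tolerations:
  - key: dedicated
    operator: Equal
    value: db
    effect: NoSchedule
  containers:
  - name: db
    image: postgres:11        # example image
    resources:
      requests:
        memory: "64Gi"        # example of a large memory request
```

Pods without this toleration (including the optional kube-system pods) will be rejected by the taint and scheduled elsewhere.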
You should configure the node taints when you create the node pool, so that new nodes come up with the taint already in place. With the proper taints and tolerations, your node pool should be able to scale down to 0 without issue.
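For example, a pool could be created with the taint baked in using the --node-taints flag of gcloud container node-pools create. The cluster name, machine type, and taint key/value below are placeholders, not values from your setup:

```shell
gcloud container node-pools create db-pool \
  --cluster=my-cluster \
  --node-taints=dedicated=db:NoSchedule \
  --enable-autoscaling --min-nodes=0 --max-nodes=10 \
  --machine-type=n1-highmem-16
```

Because the taint is part of the node pool configuration, every node the autoscaler adds later carries it automatically, so the optional kube-system pods never land there in the first place.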