
Kubeadm and the Risks of Scheduling Pods on Master Node (Pods always Pending)

While following the Kubernetes article Using kubeadm to Create a Cluster, I got stuck: the add-on pods I was trying to install (Nginx, Tiller, Grafana, InfluxDB, Dashboard) always stayed in the Pending state.

Running kubectl describe pod tiller-deploy-df4fdf55d-jwtcz --namespace=kube-system showed the following event:

Type     Reason            Age                From               Message
----     ------            ----               ----               -------
Warning  FailedScheduling  51s (x15 over 3m)  default-scheduler  0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.

When I ran the command from the Master Isolation section, kubectl taint nodes --all node-role.kubernetes.io/master- , the add-ons installed as expected.
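To see which taint the scheduler message refers to, it can help to list each node's taints before and after running that command. The following is a minimal sketch, assuming kubectl is already configured for the cluster; the function name list_taints and the column headers are made up for illustration.

```shell
# Print each node's name with the keys and effects of its taints.
# On a freshly initialised kubeadm master the key is expected to be
# node-role.kubernetes.io/master with effect NoSchedule (an assumption
# based on the command quoted above, not on output from this cluster).
list_taints() {
  kubectl get nodes \
    -o 'custom-columns=NAME:.metadata.name,TAINT_KEYS:.spec.taints[*].key,EFFECTS:.spec.taints[*].effect'
}
```

Calling list_taints after the taint has been removed should show no key for the master, which is why the pending pods could then be placed on it.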

At this point I can only suspect (since the pods are now running on the master node) that the reason was that I hadn't yet joined a worker node to the cluster for the scheduler to place the pods on.

The documentation states "your cluster will not schedule pods on the master for security reasons". I know this is a non-production environment, so there is little risk here, but what is the risk of removing that taint in a production cluster?

Follow-up: If this is a risk, how can I re-add that taint so I can then uninstall the add-on pods and have the scheduler place them on my worker node?

Environment Details:
Operating System - CentOS 7.4.1708 (Core)
Kubernetes Version - 1.10

the reason was that I hadn't connected a worker node to the cluster yet for the scheduler to schedule the pods on.

100% correct. You will for sure want some worker nodes, otherwise the idea of "scheduling work" becomes very weird.

but what is the risk of removing that taint in a production cluster?

I am not a Kubernetes security expert, but a pragmatic risk is CPU, I/O, and/or memory exhaustion on the master nodes, which would have very severe consequences for the health of the cluster. There is almost never a reason to run a workload on a master node, and doing so adds almost nothing but risk, so the advice "just don't do it" is well founded.

how can I re-add that taint so I can then uninstall the AddOn pods and try to have the scheduler install them on my Worker Node?

I'm not sure I follow that question, but I would for sure start by just adding a worker node before trying anything complicated with taints and tolerations.
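For the follow-up itself, here is a rough sketch of how the taint could be put back and a pod pushed off the master. It assumes the default kubeadm taint is node-role.kubernetes.io/master with an empty value and the NoSchedule effect; the function names and the idea of passing the node name as an argument are illustrative, not something from the kubeadm guide.

```shell
# Re-apply the taint that was removed with "node-role.kubernetes.io/master-".
# $1 is the master's node name, as shown by "kubectl get nodes".
retaint_master() {
  kubectl taint nodes "$1" node-role.kubernetes.io/master=:NoSchedule
}

# NoSchedule only blocks new placements; pods already running on the
# master stay where they are. Deleting a pod that belongs to a Deployment
# makes the controller recreate it, and the scheduler then has to pick a
# node whose taints the pod tolerates.
# $1 is the pod name, $2 is the namespace.
reschedule_pod() {
  kubectl delete pod "$1" --namespace="$2"
}
```

For example, retaint_master with the master's node name, followed by reschedule_pod tiller-deploy-df4fdf55d-jwtcz kube-system. If no worker node has joined the cluster yet, the recreated pod (which will have a new name) should simply go back to Pending until one does.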
