
1 node(s) had taints that the pod didn't tolerate in kubernetes cluster

Today my kubernetes cluster v1.15.2 gives me this error: 1 node(s) had taints that the pod didn't tolerate, and the pods could not start.

It tells me one node has taints, but when I check the node status everything looks fine. How can I know exactly what taints it has?

I searched the internet, and everything tells me that the master node does not accept pods by default. But my kubernetes pods are not running on the master node.

  • What may cause my node to have taints (for example, the node not having enough resources)?
  • What should I do to find out the taints on the node and fix them?

You can use kubectl describe node <nodename> to check taints.

kubectl describe node masternode
Name:               masternode
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-10-0-0-115
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/master=
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 10.0.0.115/24
                    projectcalico.org/IPv4IPIPTunnelAddr: 192.168.217.0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Thu, 18 Jun 2020 10:21:48 +0530
Taints:             node-role.kubernetes.io/master:NoSchedule
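
If you want to survey every node at once instead of describing them one by one, a custom-columns query also works (a minimal sketch; the column names are arbitrary):

# Show each node's name alongside its taints array ("<none>" when untainted)
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints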

The node controller automatically taints a Node when certain conditions are true. The following taints are built in:

  • node.kubernetes.io/not-ready: Node is not ready. This corresponds to the NodeCondition Ready being "False".
  • node.kubernetes.io/unreachable: Node is unreachable from the node controller. This corresponds to the NodeCondition Ready being "Unknown".
  • node.kubernetes.io/out-of-disk: Node becomes out of disk.
  • node.kubernetes.io/memory-pressure: Node has memory pressure.
  • node.kubernetes.io/disk-pressure: Node has disk pressure.
  • node.kubernetes.io/network-unavailable: Node's network is unavailable.
  • node.kubernetes.io/unschedulable: Node is unschedulable.
  • node.cloudprovider.kubernetes.io/uninitialized: When the kubelet is started with an "external" cloud provider, this taint is set on a node to mark it as unusable. After a controller from the cloud-controller-manager initializes this node, the kubelet removes this taint.
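
As an aside (my addition, not from the original answer): pods admitted through the API server normally receive tolerations for the first two of these taints automatically, added by the DefaultTolerationSeconds admission plugin with a 300-second grace period. You can inspect them on any running pod; the pod name nginx here is an assumption:

# Print the tolerations the API server attached to the pod (pod name is a placeholder)
kubectl get pod nginx -o jsonpath='{.spec.tolerations}'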

Along with the above, a special taint node-role.kubernetes.io/master:NoSchedule is added by default to master nodes.

The error typically occurs when there is a taint on a node for which you don't have a corresponding toleration in the pod spec.

Below is an example pod with a toleration.

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  tolerations:
  - key: "example-key"
    operator: "Exists"
    effect: "NoSchedule"

By default the master node is tainted, meaning no pod or workload will be scheduled on it. This is best practice, because the master node is meant to run cluster components such as etcd and kube-apiserver, while all application pods should go onto worker nodes. That is why the taint is applied to the master node by default. Taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes. One or more taints are applied to a node.

To check whether a node has a taint:

kubectl describe node <nodename> | grep Taints

and you will get something like this if any taint is present on the node:

node-role.kubernetes.io/master:NoSchedule

If you want to keep the taint on the node as it is and still schedule a particular pod on that node, then include this in your pod/deployment.yaml file:

tolerations:
- key: "key"
  operator: "Exists"
  effect: "NoSchedule"

To get more info about this, check this section: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/

If you want to remove the taint from that node, follow these steps.

First, check whether the taint is present, using the node name:

kubectl describe node <nodename> | grep Taints

and you will get something like this (on a master or worker node):

node-role.kubernetes.io/master:NoSchedule

To remove the taint from the node, just run this (here, in my case, it is the master node):

kubectl taint node master node-role.kubernetes.io/master:NoSchedule-

Make sure you add the trailing - after NoSchedule.
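
Two follow-ups worth noting (my additions, not part of the original answer): on clusters set up with kubeadm 1.24 or newer the taint key is node-role.kubernetes.io/control-plane rather than ...master, and you can confirm the removal worked by grepping again:

# On newer kubeadm clusters the taint key is control-plane instead of master
kubectl taint node master node-role.kubernetes.io/control-plane:NoSchedule-

# Verify: the output should now be "Taints: <none>"
kubectl describe node master | grep Taints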

kubectl describe node <nodename> | grep Taints

kubectl taint node master node-role.kubernetes.io/master:NoSchedule-

This one works fine.

This will list them all for you:

kubectl get nodes -o json | jq '.items[].spec.taints'

[
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/unschedulable",
    "timeAdded": "2022-03-29T15:13:37Z"
  }
]
null
null
[
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/unschedulable",
    "timeAdded": "2022-03-29T15:13:37Z"
  }
]
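
The null entries are nodes without taints, which makes it hard to tell which node is which. A small variation pairs each node name with its taints (same jq tool; the query is my own):

# List each node's name together with its taints (null means untainted)
kubectl get nodes -o json | jq '.items[] | {name: .metadata.name, taints: .spec.taints}'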

Try this:

kubectl describe node <nodename> | grep Taints

you will get something like this:

node-role.kubernetes.io/master:NoSchedule

To remove the taint:

kubectl taint node <master-nodename> node-role.kubernetes.io/master:NoSchedule-
