Kubernetes pod not READY

I am super new to Kubernetes. I have inherited a side project, really an in-progress POC, from another developer who recently left the team. Before he abruptly left, he did a demo from a VM that we still have access to. After he left we were able to go through his demo and things were working. One of the team members restarted the VM and now things are broken. I've been assigned to figure things out. I've been able to bring all the components back up aside from the Kubernetes part, which all stack traces point to as the issue at the moment.

As mentioned, I am new to Kubernetes, so I lack the vocabulary to do proper searches online.

I have run a few commands and pasted their output below. If I understand correctly, the issue is that the k8s deployment is not running: kubectl get all

NAME                        TYPE        CLUSTER-IP      EXTERNAL-IP     PORT(S)                                             AGE
service/kubernetes          ClusterIP   10.96.0.1       <none>          443/TCP                                             14d
service/app-service-5x7z    NodePort    10.96.215.11    <none>          3000:32155/TCP,3001:32762/TCP,27017:30770/TCP       3d

NAME                                    READY   UP-TO-DATE  AVAILABLE   AGE
deployment.apps/app-deployment-5x7z     0/1     1           0           3d

NAME                                    DESIRED     CURRENT     READY       AGE
replicaset.apps/app-deployment-5x7z     1           1           0           3d

I'm guessing that the issue is that the READY state is 0/1.

Can someone please guide me as to how I can bring this deployment back up? Also, I see a lot of heavy documentation online; is there a gentler introduction I can use to get into how Kubernetes works? I'm very excited about this opportunity, but it just hasn't been a smooth start.

Reset the node:

kubeadm reset -f

Clean up the old flannel installation:

rm -rf /var/lib/cni/
rm -rf /run/flannel
rm -rf /etc/cni/

ip link delete cni0
ip link delete flannel.1

and then install Kubernetes:

kubeadm init --pod-network-cidr=10.244.0.0/16

Install flannel:

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml
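
Before redeploying anything, it may help to confirm that the node reaches Ready and that the system pods start. The sketch below wraps two standard kubectl commands in a helper; the name check_cluster is hypothetical, and it assumes kubectl is already configured for the new cluster (kubeadm init prints the steps for that at the end of its output).

```shell
# check_cluster is a hypothetical helper name, not part of kubectl.
# It assumes kubectl is already configured to talk to the new cluster.
check_cluster() {
  # The node should move from NotReady to Ready once the network plugin is up.
  kubectl get nodes -o wide
  # The flannel, DNS, and control-plane pods should eventually reach Running.
  kubectl get pods --all-namespaces
}
```

Run it as check_cluster and repeat until everything settles. On a single-node cluster, application pods may stay Pending because kubeadm taints the control-plane node; kubectl describe nodes lists the taints if that turns out to be the case.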

Writing down my two cents on your solution:

  • After you did the kubeadm reset and kubeadm init, your cluster was empty.

1st Problem:

I applied those changes, but now when I run kubectl get all I only get the first line "service/kubernetes" and I no longer get anything regarding app-service-5x7z. Any chance you could give me a hint as to how to accomplish that [getting back the deployment]?

Solution:

  • The reset wiped the previous cluster state, so the application has to be deployed again. You need to find the yaml files responsible for deploying the application; they are probably called app-deployment or something similar.
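
If the manifests are not where you expect, a search along these lines might locate them. This is only a sketch: find_manifests is a hypothetical helper, the directory you pass in is whatever folder the previous developer's project lives in, and it only matches files with a top-level kind: Deployment or kind: Service line.

```shell
# find_manifests is a hypothetical helper; pass it the directory to search.
find_manifests() {
  # List YAML files that declare a Deployment or a Service at the top level.
  find "$1" -type f \( -name '*.yaml' -o -name '*.yml' \) \
    -exec grep -lE '^kind:[[:space:]]*(Deployment|Service)[[:space:]]*$' {} +
}
```

Once the files are found, kubectl apply -f <yaml_folder> should recreate the objects, and kubectl get all should list app-service-5x7z again if the manifests use the same names.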

2nd Issue:

Here's my current situation. Since I have no idea what this guy used, I built the docker images, and I updated the service and deployment yaml files to use them. I then ran kubectl apply -f <yaml_folder>, which succeeds, but when I run kubectl get pods --watch I see the following: justpaste.it/3p9r1. Any suggestions how I could debug and get to the root cause? My understanding is that it's not able to pull the docker image, but since I just created it and it's located on the same machine (not in a registry), I'm not sure what the problem is.

Solution:

From the PrePulledImages documentation:

By default, the kubelet will try to pull each image from the specified registry. However, if the imagePullPolicy property of the container is set to IfNotPresent or Never, then a local image is used (preferentially or exclusively, respectively). All pods will have read access to any pre-pulled images.

If a docker image exists only in the local image store of the node, you have to set imagePullPolicy: Never (or IfNotPresent, as quoted above) in the deployment file. Note that the image must be present in the local image store of every node to ensure availability.
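
As a rough illustration of where that setting goes, the helper below prints a minimal Deployment manifest for a locally built image. Everything in it is a placeholder: print_local_image_deployment is a hypothetical name, and the deployment name, label, and image tag are assumptions rather than values from the original project.

```shell
# print_local_image_deployment is a hypothetical helper. The deployment name
# and labels below are placeholders, not values from the original project.
print_local_image_deployment() {
  image="$1"   # a locally built tag, for example my-app:dev (hypothetical)
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: ${image}
          # Never: use only the image already present on the node.
          imagePullPolicy: Never
EOF
}
```

It could be applied with print_local_image_deployment my-app:dev | kubectl apply -f -, and the Events section of kubectl describe pod <pod_name> should then show whether the image was found. One possible explanation for the original pull failure: when the image tag is latest or omitted, the pull policy defaults to Always, so the kubelet contacts a registry even though the image exists locally. This also assumes the kubelet's container runtime can see images built with docker, which may not hold on every setup.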

It's also good to create a Docker Hub private repository to ensure availability and integrity.
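
If you go the private repository route, the cluster needs credentials to pull from it. A minimal sketch, assuming Docker Hub and a secret named regcred (the helper name create_registry_secret and the secret name are both hypothetical):

```shell
# create_registry_secret is a hypothetical helper; regcred is a placeholder
# secret name. $1 is the Docker Hub username, $2 the password or access token.
create_registry_secret() {
  kubectl create secret docker-registry regcred \
    --docker-server=https://index.docker.io/v1/ \
    --docker-username="$1" \
    --docker-password="$2"
}
```

The deployment's pod spec would then reference the secret under imagePullSecrets with name: regcred. Keep in mind that passing a password on the command line can leave it in your shell history.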
