Kubernetes' container creation with flannel gets stuck in "ContainerCreating" state
Context
I installed Docker following this instruction on my Ubuntu 18.04 LTS (Server) and later on installed Kubernetes via kubeadm.
After initializing (kubeadm init --pod-network-cidr=10.10.10.10/24) and joining a second node (I started with a two-node cluster), I cannot get my coredns or the later applied Web UI (Dashboard) to actually reach the status Running.
As pod network I tried both Flannel (kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml) and Weave Net. Nothing changed. The pods still show the status ContainerCreating, even after hours of waiting.
Question
Why doesn't the container creation work as expected, and what might be the root cause? And most importantly: how do I solve this?
Edit
Summing up my answer below, here are the reasons why:
- Docker was set up with cgroups instead of systemd
- iptables was not set up correctly
- kubeadm init was run with the wrong CIDR, since flannel's standard YAML requires --pod-network-cidr to be 10.244.0.0/16
Since answering this question took me a lot of time, I want to share what got me out of it. There might be more code than necessary, but I want everything in one place in case I or someone else has to redo all the steps.
First it all started with Docker...
I figured out that it presumably all started with the way I installed Docker. Following the linked online instructions, I used sudo apt-get install docker.io to install Docker and used it with cgroups by doing sudo usermod -aG docker $USER.
Well, according to the official instructions from Kubernetes, this was a mistake: systemd is the recommended way to go!
So I completely purged everything I had ever done with Docker by following these great instructions from Mayur Bhandare:
sudo apt-get purge -y docker-engine docker docker.io docker-ce
sudo apt-get autoremove -y --purge docker-engine docker docker.io docker-ce
sudo rm -rf /var/lib/docker /etc/docker
sudo rm /etc/apparmor.d/docker
sudo groupdel docker
sudo rm -rf /var/run/docker.sock
# Reboot to be sure
Afterwards I reinstalled Docker the official way (keep in mind that this might change in the future):
# Install Docker CE
## Set up the repository:
### Install packages to allow apt to use a repository over HTTPS
apt-get update && apt-get install -y \
apt-transport-https ca-certificates curl software-properties-common gnupg2
### Add Docker’s official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
### Add Docker apt repository.
add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
## Install Docker CE.
apt-get update && apt-get install -y \
containerd.io=1.2.10-3 \
docker-ce=5:19.03.4~3-0~ubuntu-$(lsb_release -cs) \
docker-ce-cli=5:19.03.4~3-0~ubuntu-$(lsb_release -cs)
# Setup daemon.
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
mkdir -p /etc/systemd/system/docker.service.d
# Restart docker.
systemctl daemon-reload
systemctl restart docker
Note that this explicitly uses systemd!
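To confirm that the setting is picked up, docker info --format '{{.CgroupDriver}}' should print systemd once the daemon has restarted. The following is only a minimal sketch that reads the value from the config file instead, so it also works before the restart. The helper name daemon_cgroup_driver is made up for illustration, and it assumes the option is written as "native.cgroupdriver=<name>" as in the file above.

```shell
# Hypothetical helper: print the cgroup driver configured in a Docker daemon.json.
# Assumes the option appears as native.cgroupdriver=<name>, as in the file above.
daemon_cgroup_driver() {
  sed -n 's/.*native\.cgroupdriver=\([a-z]*\).*/\1/p' "$1" | head -n 1
}

# Usage: warn if the configured driver is not systemd (skipped if the file is absent).
if [ -r /etc/docker/daemon.json ]; then
  driver=$(daemon_cgroup_driver /etc/docker/daemon.json)
  [ "$driver" = "systemd" ] || echo "cgroup driver is '${driver:-unset}', expected systemd" >&2
fi
```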
... and then it went on with Flannel...
Above I wrote that my sudo kubeadm init was done with --pod-network-cidr=10.10.10.10/24, since the latter was the IP of my master. Well, as pointed out here, not using the officially recommended --pod-network-cidr=10.244.0.0/16 results in an error, for example when using kubectl proxy or during container creation, when using the provided kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml. This is because 10.244.0.0/16 is hard-coded in the .yaml and hence mandatory. Alternatively, you can change it in the .yaml.
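One way to catch this mismatch before applying the manifest is to compare the network written in a downloaded copy of kube-flannel.yml with the CIDR passed to kubeadm init. This is only a rough sketch: the function names are made up for illustration, and it assumes the manifest carries the network as a "Network": "..." line in its net-conf.json section.

```shell
# Hypothetical helper: print the "Network" value found in a kube-flannel.yml manifest.
# Assumes the manifest contains a line of the form  "Network": "<cidr>",
flannel_network() {
  sed -n 's/.*"Network": *"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Hypothetical helper: succeed only if the manifest's network equals the given pod CIDR.
cidr_matches_flannel() {
  [ "$(flannel_network "$1")" = "$2" ]
}

# Usage, with a local copy of the manifest:
#   cidr_matches_flannel kube-flannel.yml 10.244.0.0/16 || echo "CIDR mismatch" >&2
```

If the check fails, either rerun kubeadm init with the manifest's network or edit the "Network" value in the local copy before running kubectl apply -f on it.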
In order to get rid of the faulty configuration I did a full reset. This can be achieved using sudo kubeadm reset and by deleting the config with sudo rm -r ~/.kube/config. Anyhow, since I had messed things up so badly, I did a full reset by uninstalling and reinstalling kubeadm and making sure it used iptables this time (which I had also forgotten to do before...).
Here is a nice link on how to fully uninstall all kubeadm parts.
kubeadm reset
sudo apt-get purge kubeadm kubectl kubelet kubernetes-cni kube*
sudo apt-get autoremove
sudo rm -rf ~/.kube
For the sake of completeness, here is the reinstall as well:
# ensure legacy binaries are installed
sudo apt-get install -y iptables arptables ebtables
# switch to legacy versions
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
sudo update-alternatives --set arptables /usr/sbin/arptables-legacy
sudo update-alternatives --set ebtables /usr/sbin/ebtables-legacy
# Install Kubernetes with kubeadm
sudo apt-get update && sudo apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
#reboot
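To see which iptables backend is active after the switch, the version string can be inspected. A small sketch, assuming iptables 1.8 or newer, which prints the backend in parentheses (for example "iptables v1.8.4 (legacy)"); older releases print only the version number and are reported as unknown here. The function name iptables_backend is made up for illustration.

```shell
# Hypothetical helper: classify the backend from one line of `iptables --version` output.
iptables_backend() {
  case "$1" in
    *"(legacy)"*)    echo legacy ;;
    *"(nf_tables)"*) echo nf_tables ;;
    *)               echo unknown ;;
  esac
}

# Usage: report the backend of the installed iptables, if there is one.
if command -v iptables >/dev/null 2>&1; then
  iptables_backend "$(iptables --version)"
fi
```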
... and finally it worked!
After the clean reinstallation I did the following:
# Initialize with correct cidr
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml
And then be astounded by the result:
kubectl get pods --all-namespaces
On a side note: This also resolved the /run/flannel/subnet.env: no such file or directory error I encountered prior to these steps when describing the uncreated coredns.
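A quick way to check for that file on a node is sketched below. The helper name flannel_env is made up for illustration, and the sketch assumes the file holds plain KEY=value lines such as FLANNEL_NETWORK and FLANNEL_SUBNET.

```shell
# Hypothetical helper: print the value of one KEY=value entry from flannel's subnet.env.
flannel_env() {
  sed -n "s/^$2=//p" "$1" | head -n 1
}

# Usage: show what flannel wrote on this node, or report that the file is missing.
env_file=/run/flannel/subnet.env
if [ -r "$env_file" ]; then
  echo "flannel network: $(flannel_env "$env_file" FLANNEL_NETWORK)"
  echo "node subnet:     $(flannel_env "$env_file" FLANNEL_SUBNET)"
else
  echo "$env_file is missing; flannel has probably not started successfully on this node" >&2
fi
```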
So I had the same issue as stated above. For me, this was the perfect solution, but other pods were still stuck in either Pending or ContainerCreating. In addition to the fix above, my flannel had hit an unnoticed error, so I needed to rerun the flannel creation:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
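To spot pods that are still stuck after reapplying, the pod list can be filtered by status. A minimal sketch, assuming the default column order of kubectl get pods --all-namespaces (NAMESPACE, NAME, READY, STATUS, ...); the function name stuck_pods is made up for illustration.

```shell
# Hypothetical helper: read `kubectl get pods --all-namespaces --no-headers` output
# on stdin and print namespace, name and status of every pod that is neither
# Running nor Completed.
stuck_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1, $2, $4 }'
}

# Usage: list stuck pods on the current cluster, if kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods --all-namespaces --no-headers | stuck_pods
fi
```

For each pod listed, kubectl describe pod <name> -n <namespace> shows the events that explain why it is not starting.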