
Kubelet failed to get cgroup stats for “/system.slice/docker.service”

Question

What does this kubelet (1.8.3 on CentOS 7) error message actually mean, and how can it be resolved?

Nov 19 22:32:24 master kubelet[4425]: E1119 22:32:24.269786 4425 summary.go:92] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get con

Nov 19 22:32:24 master kubelet[4425]: E1119 22:32:24.269802 4425 summary.go:92] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get conta
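
These messages come from the kubelet's stats summary (summary.go), which asks the embedded cAdvisor for stats of the named cgroup paths; the error means cAdvisor cannot find stats for /system.slice/kubelet.service and /system.slice/docker.service. One way to check whether those paths actually exist in the cgroup hierarchies is the diagnostic sketch below (the /sys/fs/cgroup layout shown is the usual one for systemd on CentOS 7):

# Which cgroups is the kubelet process actually in?
cat /proc/$(pidof kubelet)/cgroup

# The service should exist in the systemd (named) hierarchy ...
ls /sys/fs/cgroup/systemd/system.slice/kubelet.service

# ... but if it is missing from the resource controllers, cAdvisor has nothing to read
ls /sys/fs/cgroup/cpu/system.slice/kubelet.service
ls /sys/fs/cgroup/cpu/system.slice/docker.service

# Overview of the whole systemd cgroup tree
systemd-cgls --no-pager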

Research

I found the same error reported elsewhere and followed the suggested workaround of updating the kubelet service unit as below, but it did not work.

/etc/systemd/system/kubelet.service

[Unit]
Description=kubelet: The Kubernetes Node Agent
Documentation=http://kubernetes.io/docs/

[Service]
ExecStart=/usr/bin/kubelet --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice
Restart=always
StartLimitInterval=0
RestartSec=10

[Install]
WantedBy=multi-user.target
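
For an edit like this to take effect, systemd has to reload its unit files and kubelet has to be restarted; watching the journal afterwards shows whether the summary.go errors recur (standard systemd commands):

systemctl daemon-reload
systemctl restart kubelet

# watch whether the "failed to get cgroup stats" errors come back
journalctl -u kubelet -f

One possible reason the workaround appears to do nothing: on kubeadm installs a drop-in such as /etc/systemd/system/kubelet.service.d/10-kubeadm.conf usually redefines ExecStart, so flags added to kubelet.service itself may be silently overridden. systemctl cat kubelet prints the unit together with every drop-in that actually applies.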

Background

I am setting up a Kubernetes cluster by following Install kubeadm. The Installing Docker section of that document says the following about aligning the cgroup driver:

Note: Make sure that the cgroup driver used by kubelet is the same as the one used by Docker. To ensure compatibility you can either update Docker, like so:

cat << EOF > /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
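
Before changing anything, it is worth checking which driver each side currently uses. docker info reports a "Cgroup Driver" line; for kubelet the sketch below just inspects the running process's command line (kubelet 1.8 defaults to cgroupfs when the flag is absent):

# Docker's cgroup driver
docker info 2>/dev/null | grep -i 'cgroup driver'

# kubelet's cgroup driver, if it was passed as a flag
tr '\0' ' ' < /proc/$(pidof kubelet)/cmdline | grep -o 'cgroup-driver=[a-z]*'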

But applying that daemon.json change caused the docker service to fail to start with:

unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag ...
Nov 19 16:55:56 localhost.localdomain systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
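
That daemon error means the same directive (the cgroup driver, set via exec-opts) is now given both as a dockerd command-line flag and in daemon.json, and Docker refuses to start rather than pick one. Finding and removing the duplicate is usually enough; a sketch, where the searched paths are the common ones on CentOS 7 and should be treated as assumptions:

# show the unit plus all drop-ins, looking for the flag
systemctl cat docker | grep -nE 'ExecStart|native.cgroupdriver'

# search the usual override locations for the same option
grep -n 'native.cgroupdriver' /usr/lib/systemd/system/docker.service /etc/sysconfig/docker 2>/dev/null
grep -rn 'native.cgroupdriver' /etc/systemd/system/docker.service.d/ 2>/dev/null

# keep the option in exactly one place (daemon.json OR the flag), then
systemctl daemon-reload && systemctl restart docker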

The master node is Ready, with all system pods running.

$ kubectl get pods --all-namespaces
NAMESPACE     NAME                             READY     STATUS    RESTARTS   AGE
kube-system   etcd-master                      1/1       Running   0          39m
kube-system   kube-apiserver-master            1/1       Running   0          39m
kube-system   kube-controller-manager-master   1/1       Running   0          39m
kube-system   kube-dns-545bc4bfd4-mqqqk        3/3       Running   0          40m
kube-system   kube-flannel-ds-fclcs            1/1       Running   2          13m
kube-system   kube-flannel-ds-hqlnb            1/1       Running   0          39m
kube-system   kube-proxy-t7z5w                 1/1       Running   0          40m
kube-system   kube-proxy-xdw42                 1/1       Running   0          13m
kube-system   kube-scheduler-master            1/1       Running   0          39m

Environment

Kubernetes 1.8.3 on CentOS with Flannel.

$ kubectl version -o json | python -m json.tool
{
    "clientVersion": {
        "buildDate": "2017-11-08T18:39:33Z",
        "compiler": "gc",
        "gitCommit": "f0efb3cb883751c5ffdbe6d515f3cb4fbe7b7acd",
        "gitTreeState": "clean",
        "gitVersion": "v1.8.3",
        "goVersion": "go1.8.3",
        "major": "1",
        "minor": "8",
        "platform": "linux/amd64"
    },
    "serverVersion": {
        "buildDate": "2017-11-08T18:27:48Z",
        "compiler": "gc",
        "gitCommit": "f0efb3cb883751c5ffdbe6d515f3cb4fbe7b7acd",
        "gitTreeState": "clean",
        "gitVersion": "v1.8.3",
        "goVersion": "go1.8.3",
        "major": "1",
        "minor": "8",
        "platform": "linux/amd64"
    }
}


$ kubectl describe node master
Name:               master
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=master
                    node-role.kubernetes.io/master=
Annotations:        flannel.alpha.coreos.com/backend-data={"VtepMAC":"86:b6:7a:d6:7b:b3"}
                    flannel.alpha.coreos.com/backend-type=vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager=true
                    flannel.alpha.coreos.com/public-ip=10.0.2.15
                    node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:             node-role.kubernetes.io/master:NoSchedule
CreationTimestamp:  Sun, 19 Nov 2017 22:27:17 +1100
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  OutOfDisk        False   Sun, 19 Nov 2017 23:04:56 +1100   Sun, 19 Nov 2017 22:27:13 +1100   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure   False   Sun, 19 Nov 2017 23:04:56 +1100   Sun, 19 Nov 2017 22:27:13 +1100   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Sun, 19 Nov 2017 23:04:56 +1100   Sun, 19 Nov 2017 22:27:13 +1100   KubeletHasNoDiskPressure     kubelet has no disk pressure
  Ready            True    Sun, 19 Nov 2017 23:04:56 +1100   Sun, 19 Nov 2017 22:32:24 +1100   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  192.168.99.10
  Hostname:    master
Capacity:
 cpu:     1
 memory:  3881880Ki
 pods:    110
Allocatable:
 cpu:     1
 memory:  3779480Ki
 pods:    110
System Info:
 Machine ID:                 ca0a351004604dd49e43f8a6258ddd77
 System UUID:                CA0A3510-0460-4DD4-9E43-F8A6258DDD77
 Boot ID:                    e9060efa-42be-498d-8cb8-8b785b51b247
 Kernel Version:             3.10.0-693.el7.x86_64
 OS Image:                   CentOS Linux 7 (Core)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://1.12.6
 Kubelet Version:            v1.8.3
 Kube-Proxy Version:         v1.8.3
PodCIDR:                     10.244.0.0/24
ExternalID:                  master
Non-terminated Pods:         (7 in total)
  Namespace                  Name                              CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------                  ----                              ------------  ----------  ---------------  -------------
  kube-system                etcd-master                       0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-apiserver-master             250m (25%)    0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-controller-manager-master    200m (20%)    0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-dns-545bc4bfd4-mqqqk         260m (26%)    0 (0%)      110Mi (2%)       170Mi (4%)
  kube-system                kube-flannel-ds-hqlnb             0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-proxy-t7z5w                  0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-scheduler-master             100m (10%)    0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  810m (81%)    0 (0%)      110Mi (2%)       170Mi (4%)
Events:
  Type    Reason                   Age                From                Message
  ----    ------                   ----               ----                -------
  Normal  Starting                 38m                kubelet, master     Starting kubelet.
  Normal  NodeAllocatableEnforced  38m                kubelet, master     Updated Node Allocatable limit across pods
  Normal  NodeHasSufficientDisk    37m (x8 over 38m)  kubelet, master     Node master status is now: NodeHasSufficientDisk
  Normal  NodeHasSufficientMemory  37m (x8 over 38m)  kubelet, master     Node master status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    37m (x7 over 38m)  kubelet, master     Node master status is now: NodeHasNoDiskPressure
  Normal  Starting                 37m                kube-proxy, master  Starting kube-proxy.
  Normal  Starting                 32m                kubelet, master     Starting kubelet.
  Normal  NodeAllocatableEnforced  32m                kubelet, master     Updated Node Allocatable limit across pods
  Normal  NodeHasSufficientDisk    32m                kubelet, master     Node master status is now: NodeHasSufficientDisk
  Normal  NodeHasSufficientMemory  32m                kubelet, master     Node master status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    32m                kubelet, master     Node master status is now: NodeHasNoDiskPressure
  Normal  NodeNotReady             32m                kubelet, master     Node master status is now: NodeNotReady
  Normal  NodeReady                32m                kubelet, master     Node master status is now: NodeReady

The reason for this problem is that the Docker version installed on the nodes differs from the Docker version that Kubernetes requires.

You can uninstall Docker directly and reinstall the required version on each node, then restart Docker, and the node will be back online immediately.

The Docker images and pods already installed will not be affected, because their physical folders on disk are still there.

yum remove -y docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-engine

yum install -y docker-ce-18.09.7 docker-ce-cli-18.09.7 containerd.io
systemctl enable docker
systemctl start docker
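
After the reinstall, a quick check that the node recovers and that the existing images survived (yum remove does not delete /var/lib/docker, which is why images and pod data are preserved):

docker version           # the newly installed engine is running
docker images            # previously pulled images should still be listed
systemctl status kubelet # kubelet reconnects on its own
kubectl get nodes        # the node should return to Ready shortly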

I had exactly the same issue. I added the parameters to ExecStart as mentioned above but was still getting the same error. Then I did kubeadm reset and systemctl daemon-reload and recreated the cluster. The error seems to be gone. Testing now...
