
Can't install third Kubernetes master node: kubelet TLS bootstrapping timeout in kubeadm join

When trying to set up an HA cluster with Kubernetes 1.12 and external etcd, I ran into a timeout when using the following command:

kubeadm join <load balancer>:443 --token <token> --discovery-token-ca-cert-hash sha256:3dfa042fcc28a26da9335c14802718bbc36b82bb71b4e5dfaa70c004454932da --experimental-control-plane

Output:

[preflight] running pre-flight checks
[discovery] Trying to connect to API Server "<load balancer>:443"
[discovery] Created cluster-info discovery client, requesting info from "https://<load balancer>:443"
[discovery] Requesting info from "https://<load balancer>:443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "<load balancer>:443"
[discovery] Successfully established connection with API Server "<load balancer>:443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
I1005 12:48:29.896403    8131 join.go:334] [join] running pre-flight checks before initializing the new control plane instance
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[certificates] Using the existing apiserver certificate and key.
[certificates] Using the existing apiserver-kubelet-client certificate and key.
[certificates] Using the existing front-proxy-client certificate and key.
[certificates] valid certificates and keys now exist in "/etc/kubernetes/pki"
[certificates] Using the existing sa key.
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.12" ConfigMap in the kube-system namespace
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[preflight] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'
timed out waiting for the condition

Two master nodes had already been installed successfully before I hit this error. I followed this guide for the installation: https://kubernetes.io/docs/setup/independent/high-availability/#set-up-the-cluster

My load balancer is running on the same node that I'm trying to join to the cluster, but I don't see why that should be an issue (maybe it is?).

The kubelet logs don't show anything critical:

   kubelet[26132]: I1005 09:34:32.667360   26132 server.go:408] Version: v1.12.0
   kubelet[26132]: I1005 09:34:32.667520   26132 plugins.go:99] No cloud provider specified.
   kubelet[26132]: W1005 09:34:32.667553   26132 server.go:553] standalone mode, no API client
   kubelet[26132]: W1005 09:34:32.745120   26132 server.go:465] No api server defined - no events will be sent to API server.
   kubelet[26132]: I1005 09:34:32.745178   26132 server.go:667] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
   kubelet[26132]: I1005 09:34:32.745944   26132 container_manager_linux.go:247] container manager verified user specified cgroup-root exists: []
   kubelet[26132]: I1005 09:34:32.745974   26132 container_manager_linux.go:252] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: En
   kubelet[26132]: I1005 09:34:32.746237   26132 container_manager_linux.go:271] Creating device plugin manager: true
   kubelet[26132]: I1005 09:34:32.746368   26132 state_mem.go:36] [cpumanager] initializing new in-memory state store
   kubelet[26132]: I1005 09:34:32.747800   26132 kubelet.go:279] Adding pod path: /etc/kubernetes/manifests
   kubelet[26132]: I1005 09:34:32.752107   26132 client.go:75] Connecting to docker on unix:///var/run/docker.sock
   kubelet[26132]: I1005 09:34:32.752172   26132 client.go:104] Start docker client with request timeout=2m0s
   kubelet[26132]: W1005 09:34:32.754889   26132 docker_service.go:540] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
   kubelet[26132]: I1005 09:34:32.754954   26132 docker_service.go:236] Hairpin mode set to "hairpin-veth"
   kubelet[26132]: W1005 09:34:32.755195   26132 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
   kubelet[26132]: W1005 09:34:32.759325   26132 hostport_manager.go:68] The binary conntrack is not installed, this can cause failures in network connection cleanup.
   kubelet[26132]: I1005 09:34:32.762094   26132 docker_service.go:251] Docker cri networking managed by kubernetes.io/no-op
   kubelet[26132]: I1005 09:34:32.789329   26132 docker_service.go:256] Docker Info: &{ID:LJUT:6WWB:WNW2:UJGM:R5HT:4POO:QL2M:PFOI:OKZN:OBP2:ODQS:SSJU Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:19 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type true] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan nul
   kubelet[26132]: I1005 09:34:32.789503   26132 docker_service.go:269] Setting cgroupDriver to cgroupfs
   kubelet[26132]: I1005 09:34:32.820067   26132 kuberuntime_manager.go:197] Container runtime docker initialized, version: 17.06.2-ce, apiVersion: 1.30.0
   kubelet[26132]: I1005 09:34:32.822547   26132 server.go:1013] Started kubelet
   kubelet[26132]: W1005 09:34:32.822599   26132 kubelet.go:1387] No api server defined - no node status update will be sent.
   kubelet[26132]: E1005 09:34:32.822622   26132 kubelet.go:1287] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data in memory cache
   kubelet[26132]: I1005 09:34:32.822624   26132 server.go:133] Starting to listen on 127.0.0.1:10250
   kubelet[26132]: I1005 09:34:32.823855   26132 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
   kubelet[26132]: I1005 09:34:32.823900   26132 status_manager.go:148] Kubernetes client is nil, not starting status manager.
   kubelet[26132]: I1005 09:34:32.823919   26132 kubelet.go:1804] Starting kubelet main sync loop.
   kubelet[26132]: I1005 09:34:32.823971   26132 kubelet.go:1821] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
   kubelet[26132]: I1005 09:34:32.824016   26132 volume_manager.go:248] Starting Kubelet Volume Manager
   kubelet[26132]: I1005 09:34:32.824094   26132 desired_state_of_world_populator.go:130] Desired state populator starts to run
   kubelet[26132]: I1005 09:34:32.824656   26132 server.go:318] Adding debug handlers to kubelet server.
   kubelet[26132]: I1005 09:34:32.924253   26132 kubelet.go:1821] skipping pod synchronization - [container runtime is down]
   kubelet[26132]: I1005 09:34:33.072557   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.077937   26132 cpu_manager.go:155] [cpumanager] starting with none policy
   kubelet[26132]: I1005 09:34:33.077967   26132 cpu_manager.go:156] [cpumanager] reconciling every 10s
   kubelet[26132]: I1005 09:34:33.077976   26132 policy_none.go:42] [cpumanager] none policy: Start
   kubelet[26132]: W1005 09:34:33.078616   26132 manager.go:527] Failed to retrieve checkpoint for "kubelet_internal_checkpoint": checkpoint is not found
   kubelet[26132]: I1005 09:34:33.078989   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.124726   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.130955   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.136320   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.136580   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.142780   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.143667   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.224945   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "ca-certs" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-ca-certs") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
   kubelet[26132]: I1005 09:34:33.225058   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "etcd-certs-0" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-etcd-certs-0") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
   kubelet[26132]: I1005 09:34:33.225200   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "etc-pki" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-etc-pki") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
   kubelet[26132]: I1005 09:34:33.325745   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "flexvolume-dir" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-flexvolume-dir") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
   kubelet[26132]: I1005 09:34:33.325834   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "etc-pki" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-etc-pki") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
   kubelet[26132]: I1005 09:34:33.325890   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kubeconfig" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-kubeconfig") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
   kubelet[26132]: I1005 09:34:33.326047   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kubeconfig" (UniqueName: "kubernetes.io/host-path/dd3b0cd7d636afb2b116453dc6524f26-kubeconfig") pod "kube-scheduler-" (UID: "dd3b0cd7d636afb2b116453dc6524f26")
   kubelet[26132]: I1005 09:34:33.326393   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "k8s-certs" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-k8s-certs") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
   kubelet[26132]: I1005 09:34:33.326524   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "k8s-certs" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-k8s-certs") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
   kubelet[26132]: I1005 09:34:33.326645   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "ca-certs" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-ca-certs") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
   kubelet[26132]: I1005 09:34:33.326693   26132 reconciler.go:154] Reconciler: start to sync state
   dockerd[24966]: time="2018-10-05T09:34:33.789690025+02:00" level=warning msg="Unknown healthcheck type 'NONE' (expected 'CMD') in container 40806fa9041d3a65d39fdc1a68e2415f0d77f84e0c4f8c163d3bd48fec0d763f"
   kubelet[26132]: W1005 09:34:33.792727   26132 docker_container.go:202] Deleted previously existing symlink file: "/var/log/pods/92f250670b6bc27fc8b90703d1196aa3/kube-controller-manager/0.log"
   dockerd[24966]: time="2018-10-05T09:34:33.820145872+02:00" level=warning msg="Unknown healthcheck type 'NONE' (expected 'CMD') in container 19328df83a640d71faf86310d1a4052f3af42e75513d9745a2775532803ba122"
   kubelet[26132]: W1005 09:34:33.822612   26132 docker_container.go:202] Deleted previously existing symlink file: "/var/log/pods/dd3b0cd7d636afb2b116453dc6524f26/kube-scheduler/0.log"
   dockerd[24966]: time="2018-10-05T09:34:33.836511632+02:00" level=warning msg="Unknown healthcheck type 'NONE' (expected 'CMD') in container 6b9e3036a5027b42a4340ad0779be6030593d1a10df4367c0a0ca54ff1345f16"
   kubelet[26132]: I1005 09:34:33.851661   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.865408   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.874766   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: W1005 09:34:34.841803   26132 kubelet_getters.go:264] Path "/var/lib/kubelet/pods/43fc349d-c86e-11e8-a0aa-001018759bc8/volumes" does not exist
   kubelet[26132]: W1005 09:34:34.841888   26132 kubelet_getters.go:264] Path "/var/lib/kubelet/pods/7c7d1db45cb11bf12de2eac803da8b77/volumes" does not exist
   kubelet[26132]: W1005 09:34:34.841935   26132 kubelet_getters.go:264] Path "/var/lib/kubelet/pods/43fbcf1b-c86e-11e8-a0aa-001018759bc8/volumes" does not exist
   kubelet[26132]: I1005 09:34:34.880168   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:34.880564   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:34.880645   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:43.121992   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:53.165661   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   sshd[26621]: Connection closed by 172.29.2.56 port 50080 [preauth]
   kubelet[26132]: I1005 09:35:03.210021   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:35:13.252179   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:35:23.295605   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach

Any ideas?

EDIT:

When comparing the kubelets on the nodes, I discovered that the kubelet was started like this on the other two nodes:

kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni

After the TLS timeout, I ran this command on the third node, which led to:

  I1005  .008343  server.go:408] Version: v1.12.0
  I1005  .008857  plugins.go:99] No cloud provider specified.
  I1005  .045644  certificate_store.go:131] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
  I1005  .134861  server.go:667] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /
  I1005  .135501  container_manager_linux.go:247] container manager verified user specified cgroup-root exists: []
  I1005  .135551  container_manager_linux.go:252] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>} {Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerReconcilePeriod:10s ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms}
  I1005  .135777  container_manager_linux.go:271] Creating device plugin manager: true
  I1005  .135829  state_mem.go:36] [cpumanager] initializing new in-memory state store
  I1005  .136055  state_mem.go:84] [cpumanager] updated default cpuset: ""
  I1005  .136084  state_mem.go:92] [cpumanager] updated cpuset assignments: "map[]"
  I1005  .136410  kubelet.go:279] Adding pod path: /etc/kubernetes/manifests
  I1005  .136461  kubelet.go:304] Watching apiserver
  I1005  .141009  client.go:75] Connecting to docker on unix:///var/run/docker.sock
  I1005  .141054  client.go:104] Start docker client with request timeout=2m0s
  W1005  .143351  docker_service.go:540] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
  I1005  .143395  docker_service.go:236] Hairpin mode set to "hairpin-veth"
  W1005  .143618  cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
  W1005  .147722  hostport_manager.go:68] The binary conntrack is not installed, this can cause failures in network connection cleanup.
  W1005  .147880  cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
  I1005  .147944  docker_service.go:251] Docker cri networking managed by cni
  I1005  .177322  docker_service.go:256] Docker Info: &{ID:LJUT:6WWB:WNW2:UJGM:R5HT:4POO:QL2M:PFOI:OKZN:OBP2:ODQS:SSJU Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:19 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type true] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:18 OomKillDisable:true NGoroutines:27 SystemTime:2018-10-05T .158551524+02:00 LoggingDriver:json-file CgroupDriver:cgroupfs NEventsListener:0 KernelVersion:4.18.5-1.el7.elrepo.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc4201e65b0 NCPU:40 MemTotal:134664974336 GenericResources:[] DockerRootDir:/export/data/docker HTTPProxy: HTTPSProxy: NoProxy: Name:dax Labels:[] ExperimentalBuild:false ServerVersion:17.06.2-ce ClusterStore: ClusterAdvertise: Runtimes:map[runc:{Path:docker-runc Args:[]}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil>} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID:6e23458c129b551d5c9871e5174f6b1b7f6d1170 Expected:6e23458c129b551d5c9871e5174f6b1b7f6d1170} RuncCommit:{ID:810190ceaa507aa2727d7ae6f4790c76ec150bd2 Expected:810190ceaa507aa2727d7ae6f4790c76ec150bd2} InitCommit:{ID:949e6fa Expected:949e6fa} SecurityOptions:[name=seccomp,profile=default]}
  I1005  .177565  docker_service.go:269] Setting cgroupDriver to cgroupfs
  I1005  .211074  kuberuntime_manager.go:197] Container runtime docker initialized, version: 17.06.2-ce, apiVersion: 1.30.0
  I1005  .213560  server.go:1013] Started kubelet
  E1005  .213611  kubelet.go:1287] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data in memory cache
  I1005  .213712  server.go:133] Starting to listen on 0.0.0.0:10250
  I1005  .216143  fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
  I1005  .216334  status_manager.go:152] Starting to sync pod status with apiserver
  I1005  .216447  kubelet.go:1804] Starting kubelet main sync loop.
  I1005  .216962  kubelet.go:1821] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
  I1005  .218285  volume_manager.go:248] Starting Kubelet Volume Manager
  I1005  .218904  desired_state_of_world_populator.go:130] Desired state populator starts to run
  I1005  .220387  server.go:318] Adding debug handlers to kubelet server.
  W1005  .221605  cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
  E1005  .221954  kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
  E1005  .317227  kubelet.go:2236] node "dax" not found
  I1005  .317229  kubelet.go:1821] skipping pod synchronization - [container runtime is down]
  I1005  .318558  kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
  I1005  .323926  kubelet_node_status.go:70] Attempting to register node dax
  I1005  .332022  kubelet_node_status.go:73] Successfully registered node dax
  I1005  .417546  kuberuntime_manager.go:910] updating runtime config through cri with podcidr 10.244.3.0/24
  I1005  .418060  docker_service.go:345] docker cri received runtime config &RuntimeConfig{NetworkConfig:&NetworkConfig{PodCidr:10.244.3.0/24,},}
  I1005  .418505  kubelet_network.go:75] Setting Pod CIDR: -> 10.244.3.0/24
  I1005  .465985  cpu_manager.go:155] [cpumanager] starting with none policy
  I1005  .466004  cpu_manager.go:156] [cpumanager] reconciling every 10s
  I1005  .466012  policy_none.go:42] [cpumanager] none policy: Start
  W1005  .466606  manager.go:527] Failed to retrieve checkpoint for "kubelet_internal_checkpoint": checkpoint is not found
  W1005  .467018  container_manager_linux.go:803] CPUAccounting not enabled for pid: 
  W1005  .467029  container_manager_linux.go:806] MemoryAccounting not enabled for pid: 
  W1005  .467770  cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
  E1005  .467952  kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
  I1005  .520111  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "lib-modules" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-lib-modules") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .520186  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "var-run-calico" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-var-run-calico") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .520296  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "run" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-run") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .520485  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "cni-net-dir" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-cni-net-dir") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .520581  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kube-proxy" (UniqueName: "kubernetes.io/configmap/dde74f33-c893-11e8-a0aa-001018759bc8-kube-proxy") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
  I1005  .520641  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "lib-modules" (UniqueName: "kubernetes.io/host-path/dde74f33-c893-11e8-a0aa-001018759bc8-lib-modules") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
  I1005  .520697  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "var-lib-calico" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-var-lib-calico") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .520755  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "flannel-cfg" (UniqueName: "kubernetes.io/configmap/dde7c5af-c893-11e8-a0aa-001018759bc8-flannel-cfg") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .520855  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "cni-bin-dir" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-cni-bin-dir") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .520952  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "canal-token-nsdwz" (UniqueName: "kubernetes.io/secret/dde7c5af-c893-11e8-a0aa-001018759bc8-canal-token-nsdwz") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .521094  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "xtables-lock" (UniqueName: "kubernetes.io/host-path/dde74f33-c893-11e8-a0aa-001018759bc8-xtables-lock") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
  I1005  .521160  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kube-proxy-token-zjtdh" (UniqueName: "kubernetes.io/secret/dde74f33-c893-11e8-a0aa-001018759bc8-kube-proxy-token-zjtdh") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
  I1005  .521232  reconciler.go:154] Reconciler: start to sync state
  E1005  .537905  summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
  E1005  .574965  summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
  E1005  .613275  summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
  E1005  .656607  summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"

I found the solution myself: a config file in /etc/systemd/system/kubelet.service.d used the wrong startup parameters. I changed them and that resolved my problem.

The file 20-etcd-service-manager.conf, containing the values

ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true

caused my problem. I changed it to

ExecStart=/usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni

because these were the parameters used on my other nodes. It might be even better to just delete the file so that it doesn't override any other settings.
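
For reference, here is a sketch of what the complete corrected drop-in could look like. The [Service] header and the blank ExecStart= line are standard systemd override conventions for replacing an ExecStart value, assumed here rather than copied from my original file:

   # /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
   [Service]
   # clear the ExecStart inherited from kubelet.service before overriding it
   ExecStart=
   ExecStart=/usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni

After editing (or deleting) the file, a systemctl daemon-reload followed by systemctl restart kubelet is typically needed before the change takes effect.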

Thanks so much for adding your solution! This is what I did in my case:

  1. Uninstall and purge kubelet, kubeadm and kubectl.
  2. Clear /etc/systemd/system/kubelet.service.d
  3. Reinstall and retry.

On Ubuntu:

apt-get remove --purge kubelet kubeadm kubectl
rm -rf /etc/systemd/system/kubelet.service.d
apt-get install kubelet kubeadm kubectl
kubeadm join ...
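
One way to double-check afterwards that systemd no longer applies a stale drop-in (plain systemd tooling, nothing kubeadm-specific):

   systemctl cat kubelet      # prints kubelet.service plus every drop-in that is still applied
   systemctl status kubelet   # the "Drop-In:" lines list the override files in effect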
