
Problem with Kubernetes in Google Cloud stuck with ContainerCreating status

I'm having a problem with my GKE cluster: all the pods are stuck in ContainerCreating status. When I run kubectl get events, I see this error:

Failed create pod sandbox: rpc error: code = Unknown desc = Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

Does anyone know what is happening? I can't find a solution for this anywhere.
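In case it helps, here is roughly how I've been inspecting the problem; a sketch, assuming the node name and zone from my cluster (adjust them for yours) and that events support filtering by reason on this cluster version:

```shell
# Show only the pod-sandbox failures from recent cluster events
kubectl get events --all-namespaces \
  --field-selector reason=FailedCreatePodSandBox

# SSH into one of the nodes and test reachability of the image
# registry directly, with a 10-second timeout
gcloud compute ssh gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-bgb6 \
  --zone southamerica-east1-a \
  -- curl -sS -m 10 https://k8s.gcr.io/v2/
```

If the curl from the node also times out, the problem is node-level egress to the registry rather than anything Kubernetes-specific.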

EDIT: I saw this post https://github.com/kubernetes/kubernetes/issues/44273 saying that GKE instances smaller than the default GKE instance type (n1-standard-1) can have network problems, so I changed my instances to the default type, but without success. Here are my node and pod descriptions:

Name:               gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-bgb6
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/fluentd-ds-ready=true
                    beta.kubernetes.io/instance-type=n1-standard-1
                    beta.kubernetes.io/os=linux
                    cloud.google.com/gke-nodepool=pool-nodes-dev
                    failure-domain.beta.kubernetes.io/region=southamerica-east1
                    failure-domain.beta.kubernetes.io/zone=southamerica-east1-a
                    kubernetes.io/hostname=gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-bgb6
Annotations:        node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
CreationTimestamp:  Thu, 27 Sep 2018 20:27:47 -0300
Taints:             <none>
Unschedulable:      false
Conditions:
  Type                          Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                          ------  -----------------                 ------------------                ------                       -------
  KernelDeadlock                False   Fri, 28 Sep 2018 09:58:58 -0300   Thu, 27 Sep 2018 20:27:16 -0300   KernelHasNoDeadlock          kernel has no deadlock
  FrequentUnregisterNetDevice   False   Fri, 28 Sep 2018 09:58:58 -0300   Thu, 27 Sep 2018 20:32:18 -0300   UnregisterNetDevice          node is functioning properly
  NetworkUnavailable            False   Thu, 27 Sep 2018 20:27:48 -0300   Thu, 27 Sep 2018 20:27:48 -0300   RouteCreated                 NodeController create implicit route
  OutOfDisk                     False   Fri, 28 Sep 2018 09:59:03 -0300   Thu, 27 Sep 2018 20:27:47 -0300   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure                False   Fri, 28 Sep 2018 09:59:03 -0300   Thu, 27 Sep 2018 20:27:47 -0300   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure                  False   Fri, 28 Sep 2018 09:59:03 -0300   Thu, 27 Sep 2018 20:27:47 -0300   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure                   False   Fri, 28 Sep 2018 09:59:03 -0300   Thu, 27 Sep 2018 20:27:47 -0300   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                         True    Fri, 28 Sep 2018 09:59:03 -0300   Thu, 27 Sep 2018 20:28:07 -0300   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  10.0.0.2
  ExternalIP:
  Hostname:    gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-bgb6
Capacity:
 cpu:                1
 ephemeral-storage:  98868448Ki
 hugepages-2Mi:      0
 memory:             3787608Ki
 pods:               110
Allocatable:
 cpu:                940m
 ephemeral-storage:  47093746742
 hugepages-2Mi:      0
 memory:             2702168Ki
 pods:               110
System Info:
 Machine ID:                 1e8e0ecad8f5cc7fb5851bc64513d40c
 System UUID:                1E8E0ECA-D8F5-CC7F-B585-1BC64513D40C
 Boot ID:                    971e5088-6bc1-4151-94bf-b66c6c7ee9a3
 Kernel Version:             4.14.56+
 OS Image:                   Container-Optimized OS from Google
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://17.3.2
 Kubelet Version:            v1.10.7-gke.2
 Kube-Proxy Version:         v1.10.7-gke.2
PodCIDR:                     10.0.32.0/24
ProviderID:                  gce://aditumpay/southamerica-east1-a/gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-bgb6
Non-terminated Pods:         (11 in total)
  Namespace                  Name                                                              CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------                  ----                                                              ------------  ----------  ---------------  -------------
  kube-system                event-exporter-v0.2.1-5f5b89fcc8-xsvmg                            0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                fluentd-gcp-scaler-7c5db745fc-vttc9                               0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                fluentd-gcp-v3.1.0-sz8r8                                          0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                heapster-v1.5.3-75486b456f-sj7k8                                  138m (14%)    138m (14%)  301856Ki (11%)   301856Ki (11%)
  kube-system                kube-dns-788979dc8f-99xvh                                         260m (27%)    0 (0%)      110Mi (4%)       170Mi (6%)
  kube-system                kube-dns-788979dc8f-9sz2b                                         260m (27%)    0 (0%)      110Mi (4%)       170Mi (6%)
  kube-system                kube-dns-autoscaler-79b4b844b9-6s8x2                              20m (2%)      0 (0%)      10Mi (0%)        0 (0%)
  kube-system                kube-proxy-gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-bgb6    100m (10%)    0 (0%)      0 (0%)           0 (0%)
  kube-system                kubernetes-dashboard-598d75cb96-6nhcd                             50m (5%)      100m (10%)  100Mi (3%)       300Mi (11%)
  kube-system                l7-default-backend-5d5b9874d5-8wk6h                               10m (1%)      10m (1%)    20Mi (0%)        20Mi (0%)
  kube-system                metrics-server-v0.2.1-7486f5bd67-fvddz                            53m (5%)      148m (15%)  154Mi (5%)       404Mi (15%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests        Limits
  --------  --------        ------
  cpu       891m (94%)      396m (42%)
  memory    817952Ki (30%)  1391392Ki (51%)
Events:     <none>

The other node:

Name:               gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-m7bz
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/fluentd-ds-ready=true
                    beta.kubernetes.io/instance-type=n1-standard-1
                    beta.kubernetes.io/os=linux
                    cloud.google.com/gke-nodepool=pool-nodes-dev
                    failure-domain.beta.kubernetes.io/region=southamerica-east1
                    failure-domain.beta.kubernetes.io/zone=southamerica-east1-a
                    kubernetes.io/hostname=gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-m7bz
Annotations:        node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
CreationTimestamp:  Thu, 27 Sep 2018 20:30:05 -0300
Taints:             <none>
Unschedulable:      false
Conditions:
  Type                          Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                          ------  -----------------                 ------------------                ------                       -------
  KernelDeadlock                False   Fri, 28 Sep 2018 10:11:03 -0300   Thu, 27 Sep 2018 20:29:34 -0300   KernelHasNoDeadlock          kernel has no deadlock
  FrequentUnregisterNetDevice   False   Fri, 28 Sep 2018 10:11:03 -0300   Thu, 27 Sep 2018 20:34:36 -0300   UnregisterNetDevice          node is functioning properly
  NetworkUnavailable            False   Thu, 27 Sep 2018 20:30:06 -0300   Thu, 27 Sep 2018 20:30:06 -0300   RouteCreated                 NodeController create implicit route
  OutOfDisk                     False   Fri, 28 Sep 2018 10:11:49 -0300   Thu, 27 Sep 2018 20:30:05 -0300   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure                False   Fri, 28 Sep 2018 10:11:49 -0300   Thu, 27 Sep 2018 20:30:05 -0300   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure                  False   Fri, 28 Sep 2018 10:11:49 -0300   Thu, 27 Sep 2018 20:30:05 -0300   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure                   False   Fri, 28 Sep 2018 10:11:49 -0300   Thu, 27 Sep 2018 20:30:05 -0300   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                         True    Fri, 28 Sep 2018 10:11:49 -0300   Thu, 27 Sep 2018 20:30:25 -0300   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  10.0.0.3
  ExternalIP:
  Hostname:    gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-m7bz
Capacity:
 cpu:                1
 ephemeral-storage:  98868448Ki
 hugepages-2Mi:      0
 memory:             3787608Ki
 pods:               110
Allocatable:
 cpu:                940m
 ephemeral-storage:  47093746742
 hugepages-2Mi:      0
 memory:             2702168Ki
 pods:               110
System Info:
 Machine ID:                 f1d5cf2a0b2c5472cf6509778a7941a7
 System UUID:                F1D5CF2A-0B2C-5472-CF65-09778A7941A7
 Boot ID:                    f35bebb8-acd7-4a2f-95d6-76604638aef9
 Kernel Version:             4.14.56+
 OS Image:                   Container-Optimized OS from Google
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://17.3.2
 Kubelet Version:            v1.10.7-gke.2
 Kube-Proxy Version:         v1.10.7-gke.2
PodCIDR:                     10.0.33.0/24
ProviderID:                  gce://aditumpay/southamerica-east1-a/gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-m7bz
Non-terminated Pods:         (7 in total)
  Namespace                  Name                                                              CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------                  ----                                                              ------------  ----------  ---------------  -------------
  default                    aditum-payment-7d966c494c-wpk2t                                   100m (10%)    0 (0%)      0 (0%)           0 (0%)
  default                    aditum-portal-dev-5c69d76bb6-n5d5b                                100m (10%)    0 (0%)      0 (0%)           0 (0%)
  default                    aditum-vtexapi-5c758fcfb7-rhvsn                                   100m (10%)    0 (0%)      0 (0%)           0 (0%)
  default                    admin-mongo-dev-7d9f7f7d46-rrj42                                  100m (10%)    0 (0%)      0 (0%)           0 (0%)
  default                    mongod-0                                                          200m (21%)    0 (0%)      200Mi (7%)       0 (0%)
  kube-system                fluentd-gcp-v3.1.0-pgwfx                                          0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-proxy-gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-m7bz    100m (10%)    0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests    Limits
  --------  --------    ------
  cpu       700m (74%)  0 (0%)
  memory    200Mi (7%)  0 (0%)
Events:     <none>

All of the cluster's pods are stuck:

NAMESPACE     NAME                                                             READY     STATUS              RESTARTS   AGE
default       aditum-payment-7d966c494c-wpk2t                                  0/1       ContainerCreating   0          13h
default       aditum-portal-dev-5c69d76bb6-n5d5b                               0/1       ContainerCreating   0          13h
default       aditum-vtexapi-5c758fcfb7-rhvsn                                  0/1       ContainerCreating   0          13h
default       admin-mongo-dev-7d9f7f7d46-rrj42                                 0/1       ContainerCreating   0          13h
default       mongod-0                                                         0/1       ContainerCreating   0          13h
kube-system   event-exporter-v0.2.1-5f5b89fcc8-xsvmg                           0/2       ContainerCreating   0          13h
kube-system   fluentd-gcp-scaler-7c5db745fc-vttc9                              0/1       ContainerCreating   0          13h
kube-system   fluentd-gcp-v3.1.0-pgwfx                                         0/2       ContainerCreating   0          16h
kube-system   fluentd-gcp-v3.1.0-sz8r8                                         0/2       ContainerCreating   0          16h
kube-system   heapster-v1.5.3-75486b456f-sj7k8                                 0/3       ContainerCreating   0          13h
kube-system   kube-dns-788979dc8f-99xvh                                        0/4       ContainerCreating   0          13h
kube-system   kube-dns-788979dc8f-9sz2b                                        0/4       ContainerCreating   0          13h
kube-system   kube-dns-autoscaler-79b4b844b9-6s8x2                             0/1       ContainerCreating   0          13h
kube-system   kube-proxy-gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-bgb6   0/1       ContainerCreating   0          13h
kube-system   kube-proxy-gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-m7bz   0/1       ContainerCreating   0          13h
kube-system   kubernetes-dashboard-598d75cb96-6nhcd                            0/1       ContainerCreating   0          13h
kube-system   l7-default-backend-5d5b9874d5-8wk6h                              0/1       ContainerCreating   0          13h
kube-system   metrics-server-v0.2.1-7486f5bd67-fvddz                           0/2       ContainerCreating   0          13h

Description of one of the stuck pods:

Name:           aditum-payment-7d966c494c-wpk2t
Namespace:      default
Node:           gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-m7bz/10.0.0.3
Start Time:     Thu, 27 Sep 2018 20:30:47 -0300
Labels:         io.kompose.service=aditum-payment
                pod-template-hash=3852270507
Annotations:    kubernetes.io/limit-ranger=LimitRanger plugin set: cpu request for container aditum-payment
Status:         Pending
IP:
Controlled By:  ReplicaSet/aditum-payment-7d966c494c
Containers:
  aditum-payment:
    Container ID:
    Image:          gcr.io/aditumpay/aditumpaymentwebapi:latest
    Image ID:
    Port:           5000/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:  100m
    Environment:
      CONNECTIONSTRING:  <set to the key 'CONNECTIONSTRING' of config map 'aditum-payment-config'>  Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qsc9k (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  default-token-qsc9k:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-qsc9k
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                  From                                                          Message
  ----     ------                  ----                 ----                                                          -------
  Warning  FailedCreatePodSandBox  3m (x1737 over 13h)  kubelet, gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-m7bz  Failed create pod sandbox: rpc error: code = Unknown desc = Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

Thanks!

Sorry for taking so long to respond. It was a very silly problem. After I reached Google Cloud support, I noticed that my NAT machine was not working properly: the Private Google Access route was passing through my NAT. Thanks everyone for the help.
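For anyone hitting the same thing, this is roughly how the misrouting can be spotted; a sketch, where SUBNET_NAME and NETWORK_NAME are placeholders for your own subnet and network:

```shell
# Check whether Private Google Access is enabled on the nodes' subnet
# (should print True for private nodes to reach Google registries)
gcloud compute networks subnets describe SUBNET_NAME \
  --region southamerica-east1 \
  --format="get(privateIpGoogleAccess)"

# List the routes on the network to see which next hop traffic
# to Google APIs is actually taking (e.g. a NAT instance)
gcloud compute routes list --filter="network=NETWORK_NAME"
```

If a custom route sends registry traffic through a broken NAT instance, image pulls time out exactly as in the error above.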

In addition to the description of your nodes, it can depend on where you are launching them from.

As mentioned in kubernetes/minikube issue 2148 and kubernetes/minikube issue 3142, pulling from k8s.gcr.io won't work from China.

The workaround in that case is to pull the image from another source, then tag it with the name the kubelet expects:

minikube ssh \
"docker pull registry.cn-hangzhou.aliyuncs.com/google-containers/pause-amd64:3.0
docker tag registry.cn-hangzhou.aliyuncs.com/google-containers/pause-amd64:3.0 gcr.io/google_containers/pause-amd64:3.0"
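The same idea can be applied on a GKE node rather than minikube; a sketch, assuming the node name and zone from this cluster and that the mirror registry still hosts the image:

```shell
# SSH into the node, pull the pause image from a reachable mirror,
# then retag it so the kubelet finds it under the expected name
gcloud compute ssh gke-aditum-k8scluster--pool-nodes-dev-500ebc8b-m7bz \
  --zone southamerica-east1-a -- \
  "docker pull registry.cn-hangzhou.aliyuncs.com/google-containers/pause-amd64:3.0 && \
   docker tag registry.cn-hangzhou.aliyuncs.com/google-containers/pause-amd64:3.0 \
     gcr.io/google_containers/pause-amd64:3.0"
```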
