
GCloud Kubernetes cluster with 1 Insufficient cpu error

I created a Kubernetes cluster on Google Cloud using:

gcloud container clusters create my-app-cluster --num-nodes=1

Then I deployed my 3 apps (backend, frontend and a scraper) and created a load balancer. I used the following configuration file:

apiVersion: apps/v1
kind: Deployment
metadata:
    name: my-app-deployment
    labels:
        app: my-app
spec:
    replicas: 1
    selector:
        matchLabels:
            app: my-app
    template:
        metadata:
            labels:
                app: my-app
        spec:
            containers:
              - name: my-app-server
                image: gcr.io/my-app/server
                ports:
                  - containerPort: 8009
                envFrom:
                  - secretRef:
                        name: my-app-production-secrets
              - name: my-app-scraper
                image: gcr.io/my-app/scraper
                ports:
                  - containerPort: 8109
                envFrom:
                  - secretRef:
                        name: my-app-production-secrets
              - name: my-app-frontend
                image: gcr.io/my-app/frontend
                ports:
                  - containerPort: 80
                envFrom:
                  - secretRef:
                        name: my-app-production-secrets

---

apiVersion: v1
kind: Service
metadata:
    name: my-app-lb-service
spec:
    type: LoadBalancer
    selector:
        app: my-app
    ports:
      - name: my-app-server-port
        protocol: TCP
        port: 8009
        targetPort: 8009
      - name: my-app-scraper-port
        protocol: TCP
        port: 8109
        targetPort: 8109
      - name: my-app-frontend-port
        protocol: TCP
        port: 80
        targetPort: 80

Running kubectl get pods gives:

NAME                                   READY     STATUS    RESTARTS   AGE
my-app-deployment-6b49c9b5c4-5zxw2   0/3       Pending   0          12h

When investigating in the Google Cloud console, I see an "Unschedulable" state with an "insufficient cpu" error on the pod:

[Image: "Unschedulable" state due to insufficient CPU]

On the Clusters page, in the Nodes section for my cluster, I see 681 mCPU requested and 940 mCPU allocatable.

What is wrong? Why doesn't my pod start?

Every container has a default CPU request (in GKE I've noticed it's 0.1 CPU, or 100m). Assuming these defaults, the three containers in your pod request another 0.3 CPU in total.
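
As a small illustration of the units used here, the following Python sketch converts Kubernetes-style CPU quantities to millicores, showing that `0.1` and `100m` describe the same request. The helper name `cpu_to_millicores` is hypothetical, and it handles only plain decimals and the `m` suffix, not every quantity format Kubernetes accepts.

```python
from decimal import Decimal


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "100m" or "0.1" to millicores.

    Hypothetical helper for illustration: only plain decimal values and
    the "m" (milli) suffix are handled.
    """
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # Decimal avoids float rounding surprises when multiplying by 1000.
    return int(Decimal(quantity) * 1000)


# Three containers at the assumed 100m default request 300m together.
default_request = cpu_to_millicores("0.1")
pod_request = 3 * default_request
print(default_request, pod_request)
```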

The node has 0.68 CPU (680m) requested by other workloads and a total allocatable limit of 0.94 CPU (940m).

If you want to see which workloads are reserving that 0.68 CPU, you need to inspect the pods on the node. On the GKE page that shows the resource allocations and limits per node, clicking the node takes you to a page that provides this information.
In my case I can see 2 kube-dns pods taking 0.26 CPU each, amongst others. These are system pods that are needed to operate the cluster correctly. What you see will also depend on which add-on services you have selected, for example HTTP Load Balancing (Ingress), Kubernetes Dashboard and so on.

Your pod would bring the total CPU requested on the node to 0.98 CPU (0.68 + 0.3), which is more than the 0.94 allocatable limit, which is why your pod cannot start.

Note that scheduling is based on the amount of CPU requested for each workload, not on how much it actually uses, or on its limit.
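
The check can be sketched as simple arithmetic on requests. This Python snippet is a simplification of what the scheduler does for a single resource on a single node, using the figures quoted in this answer; the function name `fits_on_node` is made up for illustration.

```python
def fits_on_node(allocatable_m: int, already_requested_m: int, pod_request_m: int) -> bool:
    """Return True if the pod's CPU request fits in the node's remaining allocatable CPU."""
    return already_requested_m + pod_request_m <= allocatable_m


ALLOCATABLE_M = 940        # node allocatable CPU, in millicores
ALREADY_REQUESTED_M = 680  # requested by system pods and other workloads
POD_REQUEST_M = 3 * 100    # three containers at the assumed 100m default

total_m = ALREADY_REQUESTED_M + POD_REQUEST_M
print(f"total requested: {total_m}m of {ALLOCATABLE_M}m allocatable")
print("fits:", fits_on_node(ALLOCATABLE_M, ALREADY_REQUESTED_M, POD_REQUEST_M))
print(f"shortfall: {total_m - ALLOCATABLE_M}m")
```

Actual CPU usage never appears in this calculation, which is why a node that looks mostly idle can still reject the pod.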

Your options:

  1. Turn off any add-on service that is taking CPU resource and that you don't need.
  2. Add more CPU resource to your cluster. To do that you will either need to change your node pool to use VMs with more CPU, or increase the number of nodes in your existing pool. You can do this in the GKE console or via the gcloud command line.
  3. Make explicit requests for less CPU in your containers; these will override the defaults.

apiVersion: apps/v1
kind: Deployment
...
        spec:
            containers:
              - name: my-app-server
                image: gcr.io/my-app/server
                ...
                resources:
                  requests:
                    cpu: "50m"
              - name: my-app-scraper
                image: gcr.io/my-app/scraper
                ...
                resources:
                  requests:
                    cpu: "50m"
              - name: my-app-frontend
                image: gcr.io/my-app/frontend
                ...
                resources:
                  requests:
                    cpu: "50m"
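
To see why 50m per container is enough, and how much room is left, here is a rough Python calculation based on the same node figures (940m allocatable, 680m already requested). It assumes those figures stay constant and that the pod has three containers with equal requests; the variable names are illustrative only.

```python
ALLOCATABLE_M = 940        # node allocatable CPU, in millicores
ALREADY_REQUESTED_M = 680  # requested by other workloads on the node
CONTAINERS = 3             # containers in the pod

# CPU that can still be requested on the node.
free_m = ALLOCATABLE_M - ALREADY_REQUESTED_M
# Largest equal request per container that still fits (integer millicores).
max_per_container_m = free_m // CONTAINERS

# The explicit 50m requests from the manifest.
pod_request_m = CONTAINERS * 50
headroom_m = free_m - pod_request_m

print(f"free: {free_m}m, max equal request per container: {max_per_container_m}m")
print(f"pod at 50m each requests {pod_request_m}m, leaving {headroom_m}m")
```

Lowering requests only changes what the scheduler reserves; the containers still need enough CPU at runtime to do their work.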
