
Resource monitoring for Kubernetes Pods

I am using the kubernetes-client Java library for the K8s REST API. I want to explore the resource monitoring feature described here: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/

I set the resources for the Pods when creating a deployment, like this:

    // ******************* RESOURCES *********************
    // Note: a memory quantity without a suffix is in bytes, so "400" requests 400 bytes;
    // use a suffix such as "400Mi" to request mebibytes.
    Quantity memLimit = new Quantity();
    memLimit.setAmount("400");
    Map<String, Quantity> memMap = new HashMap<String, Quantity>();
    memMap.put("memory", memLimit);
    ResourceRequirements resourceRequirements = new ResourceRequirementsBuilder()
      .withRequests(memMap)
      .build();

    // ******************* DEPLOYMENT *********************
    Deployment deployment = new DeploymentBuilder()
        .withNewMetadata()
        .withName("first-deployment")
        .endMetadata()
        .withNewSpec()
        .withReplicas(3)
        .withNewTemplate()
        .withNewMetadata()
        .addToLabels(namespaceID, "hello-world-example")
        .endMetadata()
        .withNewSpec()
        .addNewContainer()      
        .withName("nginx-one")
        .withImage("nginx")
        .addNewPort()
        .withContainerPort(80)
        .endPort()
        .withResources(resourceRequirements)
        .endContainer()
        .endSpec()
        .endTemplate()
        .endSpec()
        .build();
    deployment = client.extensions().deployments().inNamespace(namespace).create(deployment);

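How do I now find out how much of the memory allocated to the pod is actually being used? The documentation says it is part of the pod status, but the pod status has the following form: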

     (conditions=
    [PodCondition
    (lastProbeTime=null, lastTransitionTime=2018-01-09T15:53:28Z, 
    message=null, reason=null, 
status=True, type=PodScheduled, 
    additionalProperties={})],
 containerStatuses=[], hostIP=null, 
    initContainerStatuses=[],
 message=null, phase=Pending, podIP=null,
 qosClass=Burstable, reason=null, 
startTime=null, additionalProperties={})

and the container status is:

(containerID=null, image=nginx, 
imageID=, lastState=ContainerState(running=null, terminated=null, waiting=null, additionalProperties={}),
 name=nginx-one, ready=false, restartCount=0, state=ContainerState(running=null, terminated=null, waiting=
ContainerStateWaiting(message=null, reason=ContainerCreating, additionalProperties={}), additionalProperties={}), 
additionalProperties={})

Is there an example of monitoring Pod resources?

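Take an hour to watch the video "Load Testing Kubernetes: How to Optimize Your Cluster Resource Allocation in Production", which covers a couple of techniques along with recommendations on how to size your resource configuration based on load testing. The examples in the video make use of cAdvisor, so once your Pods/containers are up and running you can use that mechanism to capture at least a basic view of how many resources the containers are consuming.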

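I am not sure whether the k8s API server provides an endpoint for performance-related metrics, but with fabric8 you will not be able to monitor resource consumption from the Pod object, even when the Pod is in the Running state.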

Here is the Pod response JSON:

{
  "kind": "Pod",
  "apiVersion": "v1",
  "metadata": {
    "name": "nginx-41cbe3-10-json-9cc655bcc-w576m",
    "generateName": "nginx-41cbe3-10-json-9cc655bcc-",
    "namespace": "default",
    "selfLink": "/api/v1/namespaces/default/pods/nginx-41cbe3-10-json-9cc655bcc-w576m",
    "uid": "e14a955f-18b7-11e8-a642-42010a800090",
    "resourceVersion": "12765988",
    "creationTimestamp": "2018-02-23T16:37:47Z",
    "labels": {
      "app": "nginx",
      "cliqr": "99911519403865240",
      "pod-template-hash": "577211677"
    },
    "annotations": {
      "kubernetes.io/created-by": "{\"kind\":\"SerializedReference\",\"apiVersion\":\"v1\",\"reference\":{\"kind\":\"ReplicaSet\",\"namespace\":\"default\",\"name\":\"nginx-41cbe3-10-json-9cc655bcc\",\"uid\":\"e1493bd0-18b7-11e8-a642-42010a800090\",\"apiVersion\":\"extensions\",\"resourceVersion\":\"12765971\"}}\n",
      "kubernetes.io/limit-ranger": "LimitRanger plugin set: cpu request for container nginx"
    },
    "ownerReferences": [
      {
        "apiVersion": "extensions/v1beta1",
        "kind": "ReplicaSet",
        "name": "nginx-41cbe3-10-json-9cc655bcc",
        "uid": "e1493bd0-18b7-11e8-a642-42010a800090",
        "controller": true,
        "blockOwnerDeletion": true
      }
    ]
  },
  "spec": {
    "volumes": [
      {
        "name": "default-token-zrhj5",
        "secret": {
          "secretName": "default-token-zrhj5",
          "defaultMode": 420
        }
      }
    ],
    "containers": [
      {
        "name": "nginx",
        "image": "nginx:latest",
        "ports": [
          {
            "containerPort": 80,
            "protocol": "TCP"
          }
        ],
        "resources": {
          "requests": {
            "cpu": "100m"
          }
        },
        "volumeMounts": [
          {
            "name": "default-token-zrhj5",
            "readOnly": true,
            "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount"
          }
        ],
        "terminationMessagePath": "/dev/termination-log",
        "terminationMessagePolicy": "File",
        "imagePullPolicy": "Always"
      }
    ],
    "restartPolicy": "Always",
    "terminationGracePeriodSeconds": 30,
    "dnsPolicy": "ClusterFirst",
    "serviceAccountName": "default",
    "serviceAccount": "default",
    "nodeName": "gke-rishi-k8-cluster-default-pool-6ca1467e-xtmw",
    "securityContext": {},
    "schedulerName": "default-scheduler",
    "tolerations": [
      {
        "key": "node.alpha.kubernetes.io/notReady",
        "operator": "Exists",
        "effect": "NoExecute",
        "tolerationSeconds": 300
      },
      {
        "key": "node.alpha.kubernetes.io/unreachable",
        "operator": "Exists",
        "effect": "NoExecute",
        "tolerationSeconds": 300
      }
    ]
  },
  "status": {
    "phase": "Running",
    "conditions": [
      {
        "type": "Initialized",
        "status": "True",
        "lastProbeTime": null,
        "lastTransitionTime": "2018-02-23T16:37:47Z"
      },
      {
        "type": "Ready",
        "status": "True",
        "lastProbeTime": null,
        "lastTransitionTime": "2018-02-23T16:37:53Z"
      },
      {
        "type": "PodScheduled",
        "status": "True",
        "lastProbeTime": null,
        "lastTransitionTime": "2018-02-23T16:37:47Z"
      }
    ],
    "hostIP": "10.240.0.23",
    "podIP": "10.20.3.164",
    "startTime": "2018-02-23T16:37:47Z",
    "containerStatuses": [
      {
        "name": "nginx",
        "state": {
          "running": {
            "startedAt": "2018-02-23T16:37:52Z"
          }
        },
        "lastState": {},
        "ready": true,
        "restartCount": 0,
        "image": "nginx:latest",
        "imageID": "docker-pullable://nginx@sha256:600bff7fb36d7992512f8c07abd50aac08db8f17c94e3c83e47d53435a1a6f7c",
        "containerID": "docker://2c227a901bcde4705c5b79aedf1963079dfb345fae5849616d29e8cc7af0fd74"
      }
    ],
    "qosClass": "Burstable"
  }
}

I know this question is two years old, but the answers here do not actually answer it.

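To get CPU and memory utilization, you need to install the Kubernetes Metrics Server on your cluster (if you use Helm, see also the official Helm chart). Once the metrics server is installed, you can run kubectl commands that report resource usage. For example, kubectl top pods -A lists CPU and memory usage for the pods in all namespaces (add --sort-by=cpu to sort them by CPU usage), and kubectl top nodes lists the usage of each node. With the metrics server installed, kubectl describe pods and the Kubernetes Dashboard will also report CPU and memory utilization figures.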
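For reference, kubectl top reads from the resource metrics API (metrics.k8s.io/v1beta1) that the metrics server registers with the API server. The following is a minimal sketch, not a definitive client, of building the same request from plain Java (11 or later). It assumes that kubectl proxy is running locally on its default port 8001, so no authentication is needed; the class name MetricsApiRequest and the namespace and pod names used in main are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class MetricsApiRequest {
    // Assumption: `kubectl proxy` is listening on its default address.
    static final String PROXY = "http://127.0.0.1:8001";

    // Path of the PodMetrics resource served through the metrics API.
    static String podMetricsPath(String namespace, String pod) {
        return "/apis/metrics.k8s.io/v1beta1/namespaces/" + namespace + "/pods/" + pod;
    }

    // Builds (but does not send) a GET request for one pod's metrics.
    static HttpRequest podMetricsRequest(String namespace, String pod) {
        return HttpRequest.newBuilder(URI.create(PROXY + podMetricsPath(namespace, pod)))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical namespace and pod name; replace with real ones.
        HttpRequest request = podMetricsRequest("default", "nginx-one");
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
            System.out.println(response.body()); // JSON with per-container cpu/memory usage
        } catch (IOException e) {
            System.out.println("Could not reach kubectl proxy: " + e);
        }
    }
}
```

If the metrics server is not installed, the API server has no metrics.k8s.io group to serve and the request returns an error status rather than usage data.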

To answer the specific question about fabric8: once the metrics server is running, you can get CPU and memory utilization with the following code:

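import io.fabric8.kubernetes.api.model.metrics.v1beta1.NodeMetrics;
import io.fabric8.kubernetes.api.model.metrics.v1beta1.NodeMetricsList;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

// `logger` is assumed to be an SLF4J-style logger defined elsewhere.
KubernetesClient k8s = new KubernetesClientBuilder().build();
NodeMetricsList nodeMetricsList = k8s.top().nodes().metrics();
for (NodeMetrics nodeMetrics : nodeMetricsList.getItems()) {
    logger.info("{} {} {}",
        nodeMetrics.getMetadata().getName(),
        nodeMetrics.getUsage().get("cpu"),
        nodeMetrics.getUsage().get("memory")
    );
}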
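The usage values come back as Kubernetes quantities (for example, a memory value with a Ki or Mi suffix), so comparing them with the request set on a container means converting both to bytes first. The following is a minimal sketch of that conversion in plain Java, without fabric8. The class MemoryQuantity and its methods are hypothetical helpers, and only plain amounts with the binary (Ki, Mi, Gi, Ti) and decimal (k, M, G, T) suffixes are handled.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;

final class MemoryQuantity {
    // Suffix -> multiplier in bytes. Insertion order is preserved so the
    // two-letter binary suffixes are checked before the one-letter decimal ones.
    private static final Map<String, BigDecimal> SUFFIXES = new LinkedHashMap<>();
    static {
        SUFFIXES.put("Ki", BigDecimal.valueOf(1L << 10));
        SUFFIXES.put("Mi", BigDecimal.valueOf(1L << 20));
        SUFFIXES.put("Gi", BigDecimal.valueOf(1L << 30));
        SUFFIXES.put("Ti", BigDecimal.valueOf(1L << 40));
        SUFFIXES.put("k", BigDecimal.valueOf(1_000L));
        SUFFIXES.put("M", BigDecimal.valueOf(1_000_000L));
        SUFFIXES.put("G", BigDecimal.valueOf(1_000_000_000L));
        SUFFIXES.put("T", BigDecimal.valueOf(1_000_000_000_000L));
    }

    // Converts a memory quantity string such as "400Mi" to a byte count.
    static long toBytes(String quantity) {
        String q = quantity.trim();
        for (Map.Entry<String, BigDecimal> e : SUFFIXES.entrySet()) {
            if (q.endsWith(e.getKey())) {
                String number = q.substring(0, q.length() - e.getKey().length());
                return new BigDecimal(number).multiply(e.getValue()).longValue();
            }
        }
        return new BigDecimal(q).longValue(); // no suffix: the amount is already in bytes
    }

    // Usage as a percentage of the requested amount.
    static double percentUsed(String usage, String request) {
        return 100.0 * toBytes(usage) / toBytes(request);
    }

    public static void main(String[] args) {
        System.out.println(toBytes("400"));                   // 400
        System.out.println(toBytes("400Mi"));                 // 419430400
        System.out.println(percentUsed("102400Ki", "400Mi")); // 25.0
    }
}
```

Note that a bare "400", as in the request from the question, is 400 bytes, so any real usage figure would far exceed it; a request such as "400Mi" gives a meaningful percentage.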
