
Resource monitoring for Kubernetes Pods

I am using the kubernetes-client Java library for the K8s REST API. I want to explore the resource monitoring feature described here: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/

I set the resources for the Pods when creating a Deployment, like this:

    // ******************* RESOURCES *********************

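    // Note: a bare "400" is read as 400 bytes; a suffix such as "400Mi" would express mebibytes.
    Quantity memLimit = new Quantity();
    memLimit.setAmount("400");
    Map<String, Quantity> memMap = new HashMap<String,Quantity>();
    memMap.put("memory", memLimit);
    // withRequests sets the memory request (what the scheduler reserves), not a hard limit.
    ResourceRequirements resourceRequirements = new ResourceRequirementsBuilder()
      .withRequests(memMap)
      .build();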

    // ******************* DEPLOYMENT *********************
    Deployment deployment = new DeploymentBuilder()
        .withNewMetadata()
        .withName("first-deployment")
        .endMetadata()
        .withNewSpec()
        .withReplicas(3)
        .withNewTemplate()
        .withNewMetadata()
        .addToLabels(namespaceID, "hello-world-example")
        .endMetadata()
        .withNewSpec()
        .addNewContainer()      
        .withName("nginx-one")
        .withImage("nginx")
        .addNewPort()
        .withContainerPort(80)
        .endPort()
        .withResources(resourceRequirements)
        .endContainer()
        .endSpec()
        .endTemplate()
        .endSpec()
        .build();
    deployment = client.extensions().deployments().inNamespace(namespace).create(deployment);

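How do I now find out how much of the memory allocated to the pod is actually being used? The documentation says this is part of the pod status, but the pod status has the following form: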

     (conditions=
    [PodCondition
    (lastProbeTime=null, lastTransitionTime=2018-01-09T15:53:28Z, 
    message=null, reason=null, 
status=True, type=PodScheduled, 
    additionalProperties={})],
 containerStatuses=[], hostIP=null, 
    initContainerStatuses=[],
 message=null, phase=Pending, podIP=null,
 qosClass=Burstable, reason=null, 
startTime=null, additionalProperties={})

and the container status:

(containerID=null, image=nginx, 
imageID=, lastState=ContainerState(running=null, terminated=null, waiting=null, additionalProperties={}),
 name=nginx-one, ready=false, restartCount=0, state=ContainerState(running=null, terminated=null, waiting=
ContainerStateWaiting(message=null, reason=ContainerCreating, additionalProperties={}), additionalProperties={}), 
additionalProperties={})

Is there an example of monitoring Pod resources?

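Spend an hour watching the video "Load Testing Kubernetes: How to Optimize Your Cluster Resource Allocation in Production". It presents two techniques, along with advice on how to size resource configurations based on load testing. The examples in the video use cAdvisor, so once the Pod and its containers are up and running you can use that mechanism to capture at least a basic view of how much resource each container consumes.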

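I am not sure whether the K8s API server provides an endpoint for performance-related metrics, but with fabric8 the Pod object does not let you monitor resource consumption, even when the Pod is in the Running state.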

This is the Pod response JSON:

{
  "kind": "Pod",
  "apiVersion": "v1",
  "metadata": {
    "name": "nginx-41cbe3-10-json-9cc655bcc-w576m",
    "generateName": "nginx-41cbe3-10-json-9cc655bcc-",
    "namespace": "default",
    "selfLink": "/api/v1/namespaces/default/pods/nginx-41cbe3-10-json-9cc655bcc-w576m",
    "uid": "e14a955f-18b7-11e8-a642-42010a800090",
    "resourceVersion": "12765988",
    "creationTimestamp": "2018-02-23T16:37:47Z",
    "labels": {
      "app": "nginx",
      "cliqr": "99911519403865240",
      "pod-template-hash": "577211677"
    },
    "annotations": {
      "kubernetes.io/created-by": "{\"kind\":\"SerializedReference\",\"apiVersion\":\"v1\",\"reference\":{\"kind\":\"ReplicaSet\",\"namespace\":\"default\",\"name\":\"nginx-41cbe3-10-json-9cc655bcc\",\"uid\":\"e1493bd0-18b7-11e8-a642-42010a800090\",\"apiVersion\":\"extensions\",\"resourceVersion\":\"12765971\"}}\n",
      "kubernetes.io/limit-ranger": "LimitRanger plugin set: cpu request for container nginx"
    },
    "ownerReferences": [
      {
        "apiVersion": "extensions/v1beta1",
        "kind": "ReplicaSet",
        "name": "nginx-41cbe3-10-json-9cc655bcc",
        "uid": "e1493bd0-18b7-11e8-a642-42010a800090",
        "controller": true,
        "blockOwnerDeletion": true
      }
    ]
  },
  "spec": {
    "volumes": [
      {
        "name": "default-token-zrhj5",
        "secret": {
          "secretName": "default-token-zrhj5",
          "defaultMode": 420
        }
      }
    ],
    "containers": [
      {
        "name": "nginx",
        "image": "nginx:latest",
        "ports": [
          {
            "containerPort": 80,
            "protocol": "TCP"
          }
        ],
        "resources": {
          "requests": {
            "cpu": "100m"
          }
        },
        "volumeMounts": [
          {
            "name": "default-token-zrhj5",
            "readOnly": true,
            "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount"
          }
        ],
        "terminationMessagePath": "/dev/termination-log",
        "terminationMessagePolicy": "File",
        "imagePullPolicy": "Always"
      }
    ],
    "restartPolicy": "Always",
    "terminationGracePeriodSeconds": 30,
    "dnsPolicy": "ClusterFirst",
    "serviceAccountName": "default",
    "serviceAccount": "default",
    "nodeName": "gke-rishi-k8-cluster-default-pool-6ca1467e-xtmw",
    "securityContext": {},
    "schedulerName": "default-scheduler",
    "tolerations": [
      {
        "key": "node.alpha.kubernetes.io/notReady",
        "operator": "Exists",
        "effect": "NoExecute",
        "tolerationSeconds": 300
      },
      {
        "key": "node.alpha.kubernetes.io/unreachable",
        "operator": "Exists",
        "effect": "NoExecute",
        "tolerationSeconds": 300
      }
    ]
  },
  "status": {
    "phase": "Running",
    "conditions": [
      {
        "type": "Initialized",
        "status": "True",
        "lastProbeTime": null,
        "lastTransitionTime": "2018-02-23T16:37:47Z"
      },
      {
        "type": "Ready",
        "status": "True",
        "lastProbeTime": null,
        "lastTransitionTime": "2018-02-23T16:37:53Z"
      },
      {
        "type": "PodScheduled",
        "status": "True",
        "lastProbeTime": null,
        "lastTransitionTime": "2018-02-23T16:37:47Z"
      }
    ],
    "hostIP": "10.240.0.23",
    "podIP": "10.20.3.164",
    "startTime": "2018-02-23T16:37:47Z",
    "containerStatuses": [
      {
        "name": "nginx",
        "state": {
          "running": {
            "startedAt": "2018-02-23T16:37:52Z"
          }
        },
        "lastState": {},
        "ready": true,
        "restartCount": 0,
        "image": "nginx:latest",
        "imageID": "docker-pullable://nginx@sha256:600bff7fb36d7992512f8c07abd50aac08db8f17c94e3c83e47d53435a1a6f7c",
        "containerID": "docker://2c227a901bcde4705c5b79aedf1963079dfb345fae5849616d29e8cc7af0fd74"
      }
    ],
    "qosClass": "Burstable"
  }
}

I know this question is two years old, but the answers here do not actually answer it.

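To get CPU and memory utilization, you need to install the Kubernetes Metrics Server on your Kubernetes cluster (if you use Helm, see also the official Helm chart). Once the Metrics Server is installed, you can run kubectl commands that report usage. For example, kubectl top pods -A lists the CPU and memory usage of the pods in all namespaces (add --sort-by=cpu to sort them by CPU usage), and kubectl top nodes lists the usage of each node. With the Metrics Server installed, the Kubernetes Dashboard can also report CPU and memory utilization figures; note that kubectl describe pods shows the configured requests and limits rather than live usage.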
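As far as I know, kubectl top reads these numbers from the Metrics API (the metrics.k8s.io group), which the API server serves once the Metrics Server is running. Below is a minimal sketch of how a request against that API could be built with the JDK's own HTTP client. The class name MetricsApiRequest, the server address, and the token placeholder are all hypothetical; only the /apis/metrics.k8s.io/v1beta1 paths are taken from the Metrics API itself. The request is built but not sent, so the sketch runs without a cluster.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds (but does not send) requests against the Metrics API.
class MetricsApiRequest {
    // API group and version served once the Metrics Server is installed.
    static final String METRICS_API = "/apis/metrics.k8s.io/v1beta1";

    // Path for the usage of a single pod in a namespace.
    static String podMetricsPath(String namespace, String podName) {
        return METRICS_API + "/namespaces/" + namespace + "/pods/" + podName;
    }

    // Path for the usage of all nodes.
    static String nodeMetricsPath() {
        return METRICS_API + "/nodes";
    }

    // apiServer and bearerToken are placeholders; real values come from your cluster config.
    static HttpRequest build(String apiServer, String path, String bearerToken) {
        return HttpRequest.newBuilder()
            .uri(URI.create(apiServer + path))
            .header("Authorization", "Bearer " + bearerToken)
            .header("Accept", "application/json")
            .GET()
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build(
            "https://kubernetes.example:6443",   // assumed API server address
            podMetricsPath("default", "nginx-41cbe3-10-json-9cc655bcc-w576m"),
            "<token>");                          // assumed service account token
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The same path can be checked from the command line with kubectl get --raw followed by the path, which is a quick way to confirm that the Metrics Server is answering before writing any client code.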

To answer the specific question about fabric8: once the Metrics Server is running, you can get CPU and memory utilization with the following code:

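// Imports assume the fabric8 kubernetes-client 6.x package layout.
import io.fabric8.kubernetes.api.model.metrics.v1beta1.NodeMetrics;
import io.fabric8.kubernetes.api.model.metrics.v1beta1.NodeMetricsList;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

KubernetesClient k8s = new KubernetesClientBuilder().build();
// Reads current node usage from the Metrics API; this fails if the Metrics Server is not installed.
NodeMetricsList nodeMetricsList = k8s.top().nodes().metrics();
for (NodeMetrics nodeMetrics : nodeMetricsList.getItems()) {
    // logger is assumed to be an SLF4J Logger declared elsewhere.
    logger.info("{} {} {}",
        nodeMetrics.getMetadata().getName(),
        nodeMetrics.getUsage().get("cpu"),
        nodeMetrics.getUsage().get("memory")
    );
}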
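The usage values come back as Kubernetes quantities with suffixes (for example memory in Ki and CPU in fractions of a core), so answering "how much of the allocated memory is used" still needs a comparison against the request. Here is a rough sketch of that arithmetic in plain Java. QuantityMath, parse, and percentUsed are hypothetical names, the sample strings in main are made up, and the sketch assumes the quantities are available as strings of the form amount plus suffix. It covers only the common suffixes and does not handle exponent notation.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: converts Kubernetes quantity strings to plain numbers.
class QuantityMath {
    private static final Map<String, BigDecimal> FACTORS = new HashMap<>();
    static {
        BigDecimal kibi = BigDecimal.valueOf(1024);
        FACTORS.put("", BigDecimal.ONE);            // no suffix: bytes or whole cores
        FACTORS.put("n", new BigDecimal("1e-9"));   // nano
        FACTORS.put("u", new BigDecimal("1e-6"));   // micro
        FACTORS.put("m", new BigDecimal("1e-3"));   // milli, e.g. "100m" CPU
        FACTORS.put("k", new BigDecimal("1e3"));
        FACTORS.put("M", new BigDecimal("1e6"));
        FACTORS.put("G", new BigDecimal("1e9"));
        FACTORS.put("Ki", kibi);                    // binary suffixes, typical for memory
        FACTORS.put("Mi", kibi.pow(2));
        FACTORS.put("Gi", kibi.pow(3));
        FACTORS.put("Ti", kibi.pow(4));
    }

    // Splits the leading number from the suffix and scales it to base units.
    static BigDecimal parse(String quantity) {
        int i = 0;
        while (i < quantity.length()
                && (Character.isDigit(quantity.charAt(i)) || quantity.charAt(i) == '.')) {
            i++;
        }
        BigDecimal number = new BigDecimal(quantity.substring(0, i));
        String suffix = quantity.substring(i);
        BigDecimal factor = FACTORS.get(suffix);
        if (factor == null) {
            throw new IllegalArgumentException("Unsupported suffix: " + suffix);
        }
        return number.multiply(factor);
    }

    // Usage as a percentage of the request, rounded to one decimal place.
    static double percentUsed(String usage, String request) {
        return parse(usage)
            .multiply(BigDecimal.valueOf(100))
            .divide(parse(request), 1, RoundingMode.HALF_UP)
            .doubleValue();
    }

    public static void main(String[] args) {
        // Made-up sample values: reported usage versus a 400Mi request.
        System.out.println(percentUsed("123456Ki", "400Mi") + "% of the memory request in use");
    }
}
```

BigDecimal is used instead of double so that sub-unit CPU values such as "100m" and large byte counts are both handled without rounding surprises.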
