Kubernetes HPA cannot detect a successfully published custom metric from Stackdriver

Question

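I am trying to scale a Kubernetes Deployment with a HorizontalPodAutoscaler that watches a custom metric through Stackdriver.

I have a GKE cluster with the Stackdriver adapter enabled. I can publish the custom metric type to Stackdriver, and it shows up in Stackdriver's Metrics Explorer.

This is how I defined the HPA: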

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metricName: custom.googleapis.com|worker_pod_metrics|baz
      targetValue: 400
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app-group-1-1

example-hpa is created successfully, but kubectl get hpa example-hpa always shows TARGETS as <unknown> and never picks up the value from the custom metric.

NAME          REFERENCE                       TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
example-hpa   Deployment/test-app-group-1-1   <unknown>/400   1         10        1          18m

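I publish my custom metric with a Java client that runs locally. I set the appropriate resource labels as described in the documentation (hard-coded, so that the client can run in a local environment without problems). I followed the documentation to create the Java client.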

import java.util.HashMap;
import java.util.Map;

import com.google.api.MonitoredResource;

// Builds the monitored resource that the custom metric is written against.
// Label values are hard-coded so that the client can run locally.
private static MonitoredResource prepareMonitoredResourceDescriptor() {
    Map<String, String> resourceLabels = new HashMap<>();
    resourceLabels.put("project_id", "<<<my-project-id>>>");
    resourceLabels.put("pod_id", "<my pod UID>");
    resourceLabels.put("container_name", "");
    resourceLabels.put("zone", "asia-southeast1-b");
    resourceLabels.put("cluster_name", "my-cluster");
    resourceLabels.put("namespace_id", "mynamespace");
    resourceLabels.put("instance_id", "");

    return MonitoredResource.newBuilder()
            .setType("gke_container")
            .putAllLabels(resourceLabels)
            .build();
}

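What am I doing wrong in the steps above? Thanks in advance for any answers!

EDIT [SOLVED]: I think I had some misconfiguration, because kubectl describe hpa [NAME] --v=9 showed me some 403 status codes, and I was also using type: External instead of type: Pods (thanks MWZ for your answer pointing out this mistake).

I managed to fix it by creating a new project, a new service account, and a new GKE cluster (basically everything from scratch). Then, following the documentation, I changed the YAML file as follows.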

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: test-app-group-1-1
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: test-app-group-1-1
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods                 # Earlier this was type: External
    pods:                      # Earlier this was external:
      metricName: baz          # Earlier this was metricName: custom.googleapis.com|worker_pod_metrics|baz
      targetAverageValue: 20

I now export the metric as custom.googleapis.com/baz rather than custom.googleapis.com/worker_pod_metrics/baz. I also now specify the namespace explicitly in the YAML for my HPA.

Answer 1

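Since you can see the custom metric in the Stackdriver GUI, I assume the metric is being exported correctly. Based on the "Autoscaling Deployments with Custom Metrics" guide, I believe you defined the metric that the HPA uses to scale the Deployment incorrectly.

Please try this YAML: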

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: baz
      targetAverageValue: 400
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app-group-1-1

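Keep in mind:

The HPA uses the metric to compute an average and compares it with the target average value. In the application-to-Stackdriver export example, the Deployment contains Pods that export the metric. The following manifest file describes a HorizontalPodAutoscaler object that scales the Deployment based on the target average value of the metric.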
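To make the comparison against the target average concrete, here is a minimal sketch of the scaling rule from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The class and method names are hypothetical, and the sketch deliberately ignores the controller's tolerance band and stabilization behavior, so treat it as an approximation rather than the controller's actual code.

```java
// Hypothetical helper; not part of any Kubernetes client library.
class HpaSketch {

    /**
     * Approximates the documented HPA rule:
     *   desired = ceil(currentReplicas * currentAverage / targetAverage)
     * and clamps the result to [minReplicas, maxReplicas].
     * Tolerance and stabilization windows are intentionally left out.
     */
    public static int desiredReplicas(int currentReplicas, double currentAverage,
                                      double targetAverage, int minReplicas, int maxReplicas) {
        if (targetAverage <= 0) {
            throw new IllegalArgumentException("targetAverage must be positive");
        }
        int desired = (int) Math.ceil(currentReplicas * (currentAverage / targetAverage));
        return Math.max(minReplicas, Math.min(maxReplicas, desired));
    }

    public static void main(String[] args) {
        // 2 pods averaging 30 against targetAverageValue 20, bounds 1..5
        System.out.println(desiredReplicas(2, 30.0, 20.0, 1, 5));
        // 4 pods averaging 100 against the same target: the result is capped at maxReplicas
        System.out.println(desiredReplicas(4, 100.0, 20.0, 1, 5));
    }
}
```

With targetAverageValue, the current value is the per-pod average of the metric, which is why a Pods-type metric fits a per-pod value such as baz.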

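The troubleshooting steps described on the page above may also be useful.

Side note: because the HPA above uses the beta API autoscaling/v2beta1, I got an error when running kubectl describe hpa [DEPLOYMENT_NAME]. I then ran kubectl describe hpa [DEPLOYMENT_NAME] --v=9 and got a JSON response.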

Answer 2

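It is a good idea to attach some unique labels so that you can target your metric. Right now, judging by the labels set in the Java client, only pod_id looks unique, and it cannot be used because pods are stateless and their IDs change.

So I suggest you try introducing an identifier that is scoped to the deployment or to the metric.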

resourceLabels.put("<identifier>", "<could-be-deployment-name>");

After that, you can try modifying the HPA to something like the following:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metricName: custom.googleapis.com|worker_pod_metrics|baz
      metricSelector:
        matchLabels:
          # define labels to target
          metric.labels.identifier: <deployment-name>
      # scale +1 whenever it crosses multiples of mentioned value
      targetAverageValue: "400"
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app-group-1-1

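Other than that, there is nothing wrong with this setup, and it should work smoothly.

Helper command to see which metrics are exposed to the HPA: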

 kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/custom.googleapis.com|worker_pod_metrics|baz" | jq
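If you need that path for other metrics, the following minimal sketch builds it from a Stackdriver metric type. It assumes, as the metric names in this thread suggest, that the adapter expects every "/" in the metric type to be written as "|". The class and method names are hypothetical.

```java
// Hypothetical helper for building the raw External Metrics API path.
class ExternalMetricPath {

    /** Rewrites a Stackdriver metric type into the pipe-separated name used in this thread. */
    public static String toAdapterMetricName(String stackdriverType) {
        return stackdriverType.replace('/', '|');
    }

    /** Builds the path to pass to kubectl get --raw for the given namespace and metric type. */
    public static String rawPath(String namespace, String stackdriverType) {
        return "/apis/external.metrics.k8s.io/v1beta1/namespaces/"
                + namespace + "/" + toAdapterMetricName(stackdriverType);
    }

    public static void main(String[] args) {
        // Prints the same path that the kubectl command above queries.
        System.out.println(rawPath("default", "custom.googleapis.com/worker_pod_metrics/baz"));
    }
}
```

Remember to quote the resulting path in the shell, since "|" would otherwise be interpreted as a pipe.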
