Horizontal Pod Autoscaler (HPA) on Google Kubernetes Engine (GKE) using the Ingress LoadBalancer's backend latency via Stackdriver external metrics

Question

I'm trying to configure a Horizontal Pod Autoscaler (HPA) on Google Kubernetes Engine (GKE) using external metrics from an Ingress LoadBalancer, basing the configuration on the instructions at

https://cloud.google.com/kubernetes-engine/docs/tutorials/external-metrics-autoscaling and https://blog.doit-intl.com/autoscaling-k8s-hpa-with-google-http-s-load-balancer-rps-stackdriver-metric-92db0a28e1ea

with an HPA like this:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: my-api
  namespace: production
spec:
  minReplicas: 1
  maxReplicas: 20
  metrics:
  - external:
      metricName: loadbalancing.googleapis.com|https|request_count
      metricSelector:
        matchLabels:
          resource.labels.forwarding_rule_name: k8s-fws-production-lb-my-api--63e2a8ddaae70
      targetAverageValue: "1"
    type: External
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api

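The autoscaler kicks in when the request count rises. However, placing heavy load on the service, such as 100 simultaneous requests per second, doesn't push the external metric request_count much above 6 RPS, while the observed backend_latencies metric does increase significantly. So I'd like to use that metric by adding the following to the HPA configuration: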

  - external:
      metricName: loadbalancing.googleapis.com|https|backend_latencies
      metricSelector:
        matchLabels:
          resource.labels.forwarding_rule_name: k8s-fws-production-lb-my-api--63e2a8ddaae70
      targetValue: "3000"
    type: External

But this results in an error:

...unable to fetch metrics from external metrics API: googleapi: Error 400: Field aggregation.perSeriesAligner had an invalid value of "ALIGN_RATE": The aligner cannot be applied to metrics with kind DELTA and value type DISTRIBUTION., badRequest

The error can be observed with the command

$ kubectl describe hpa -n production

or by visiting

http://localhost:8080/apis/external.metrics.k8s.io/v1beta1/namespaces/default/loadbalancing.googleapis.com%7Chttps%7Cbackend_latencies

after setting up a proxy with

$ kubectl proxy --port=8080
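
The %7C in the URL above is the percent-encoded form of the | separator in the metric name. As a minimal Python sketch of how that path can be built (the helper name external_metric_path is hypothetical, and the namespace is whatever you pass in):

```python
from urllib.parse import quote


def external_metric_path(namespace: str, metric_name: str) -> str:
    """Build the external metrics API path for a Stackdriver metric name."""
    # safe="" makes quote() encode every reserved character, including "|".
    encoded = quote(metric_name, safe="")
    return f"/apis/external.metrics.k8s.io/v1beta1/namespaces/{namespace}/{encoded}"


path = external_metric_path(
    "default", "loadbalancing.googleapis.com|https|backend_latencies"
)
# Assumes kubectl proxy is listening on port 8080, as in the command above.
print("http://localhost:8080" + path)
```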

Is using https/backend_latencies or https/total_latencies as an external Stackdriver metric not supported in an HPA configuration on GKE?

Answer 1

Maybe someone will find this helpful, even though the question is quite old.

My working configuration looks like this:

  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 95
  - type: External
    external:
      metric:
        name: loadbalancing.googleapis.com|https|backend_latencies
        selector:
          matchLabels:
            resource.labels.backend_name: frontend
            metric.labels.proxy_continent: Europe
            reducer: REDUCE_PERCENTILE_95
      target:
        type: Value
        value: "79.5"

type: Value is used because it is the only way to keep the metric value from being divided by the number of replicas.
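
To make the difference concrete, here is a rough Python sketch of the replica calculation described in the Kubernetes HPA documentation. It is a simplification that ignores the tolerance band, stabilization windows, and the minReplicas/maxReplicas clamp. The function names and the sample numbers (4 replicas, a 159 ms latency reading) are made up for illustration.

```python
import math


def desired_replicas_value(current_replicas: int, metric_value: float,
                           target_value: float) -> int:
    """target.type: Value. The raw metric is compared with the target directly."""
    return math.ceil(current_replicas * metric_value / target_value)


def desired_replicas_average_value(metric_value: float,
                                   target_average_value: float) -> int:
    """target.type: AverageValue. The metric is first divided by the pod count,
    so the current replica count cancels out of the formula."""
    return math.ceil(metric_value / target_average_value)


# Hypothetical reading: latency is twice the 79.5 target while 4 pods are running.
print(desired_replicas_value(4, 159.0, 79.5))
print(desired_replicas_average_value(159.0, 79.5))
```

With the latency at twice the target, Value asks for twice the current replicas (8), whereas AverageValue treats the latency as a total to be shared across pods and asks for only 2, which makes little sense for a latency metric.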

reducer: REDUCE_PERCENTILE_95 is used to take just a single value from the distribution (source).
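
backend_latencies is a DISTRIBUTION metric, that is, a histogram of latencies rather than a single number, so it has to be reduced to one value before the HPA can compare it with a target. The sketch below shows one common way to estimate a percentile from histogram buckets by linear interpolation. It only illustrates the idea and is not claimed to be the algorithm Cloud Monitoring uses; the function name, bucket bounds, and counts are hypothetical.

```python
def percentile_from_buckets(bounds, counts, pct):
    """Estimate a percentile from histogram buckets.

    bounds[i] is the upper edge of bucket i (bucket 0 starts at 0).
    counts[i] is the number of samples that fell into bucket i.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("empty distribution")
    rank = pct * total / 100  # number of samples at or below the percentile
    seen = 0
    lower = 0.0
    for upper, count in zip(bounds, counts):
        if count and seen + count >= rank:
            # Interpolate linearly inside the bucket that contains the rank.
            fraction = (rank - seen) / count
            return lower + fraction * (upper - lower)
        seen += count
        lower = upper
    return float(bounds[-1])


# Hypothetical latency histogram in milliseconds.
p95 = percentile_from_buckets([50, 100, 200, 400], [60, 30, 8, 2], 95)
print(p95)
```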

In addition, I edited the custom-metrics-stackdriver-adapter deployment to look like this:

  - image: gcr.io/gke-release/custom-metrics-stackdriver-adapter:v0.12.2-gke.0
    imagePullPolicy: Always
    name: pod-custom-metrics-stackdriver-adapter
    command:
    - /adapter
    - --use-new-resource-model=true
    - --fallback-for-container-metrics=true
    - --enable-distribution-support=true

The key part is the --enable-distribution-support=true flag, which makes it possible to work with distribution-type metrics.
