kubernetes / prometheus custom metric for horizontal autoscaling
I'm wondering about an approach to take for our server setup. We have pods that are short-lived. They are started with a minimum of 3 pods, and each server waits on a single request that it handles - then the pod is destroyed. I'm not sure of the mechanism by which the pod is destroyed, but my question is not about that part anyway.
There is an "active session count" metric that I am envisioning. Each of these pod resources could make a REST call to some "metrics" pod that we would create for our cluster. The metrics pod would expose a sessionStarted and a sessionEnded endpoint, which would increment/decrement the activeSessions metric. That metric would be what is used for horizontal autoscaling of the number of pods needed.

Since having a pod "up" counts as zero active sessions, the custom event that starts a session would update the metric server's session count with a REST call, and then decrement it again on session end (a pod being up does not indicate whether or not it has an active session).
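To make the idea concrete, here is a minimal sketch of what such a metrics pod could look like, using only the Python standard library. The endpoint paths (/sessionStarted, /sessionEnded) and the metric name (active_sessions) are assumptions taken from the description above, not an existing API; a real deployment would likely use an official Prometheus client library instead of hand-writing the exposition format.

```python
# Minimal sketch of the hypothetical "metrics" pod: two POST endpoints that
# increment/decrement a gauge, and a /metrics endpoint in the Prometheus
# text exposition format so Prometheus can scrape it.
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

active_sessions = 0      # the gauge the HPA would eventually scale on
lock = threading.Lock()  # handlers may run concurrently

class MetricsHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        global active_sessions
        with lock:
            if self.path == "/sessionStarted":
                active_sessions += 1
            elif self.path == "/sessionEnded":
                active_sessions = max(0, active_sessions - 1)
            else:
                self.send_response(404)
                self.end_headers()
                return
        self.send_response(204)  # accepted, no body
        self.end_headers()

    def do_GET(self):
        if self.path == "/metrics":
            with lock:
                body = ("# TYPE active_sessions gauge\n"
                        "active_sessions %d\n" % active_sessions).encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

def serve(port=8000):
    """Start the metrics server on a background thread and return it."""
    server = HTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Each short-lived pod would POST to /sessionStarted before handling its request and to /sessionEnded when done, while Prometheus scrapes /metrics on its normal interval.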
Is it correct to think that I need this metric server (and to write it myself)? Or does Prometheus already support this type of metric - with REST clients and client libraries for various languages that could modify it?
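On the second part of the question: the official Prometheus client libraries (Go, Java, Python, and others) already provide a Gauge type with increment/decrement operations, so the counting logic itself does not need to be written from scratch - only the service that hosts the gauge does. For the HPA to consume a Prometheus metric, an adapter such as prometheus-adapter must expose it through the custom metrics API. A hedged sketch of the HPA side, assuming the gauge is published as active_sessions and fronted by a Service named metrics-service (all names below are hypothetical):

```yaml
# Sketch only: assumes prometheus-adapter (or similar) maps the Prometheus
# metric "active_sessions" into the custom.metrics.k8s.io API.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker              # hypothetical short-lived-pod deployment
  minReplicas: 3              # the 3-pod minimum from the question
  maxReplicas: 20             # assumed ceiling
  metrics:
    - type: Object
      object:
        metric:
          name: active_sessions
        describedObject:
          apiVersion: v1
          kind: Service
          name: metrics-service   # Service in front of the metrics pod
        target:
          type: Value
          value: "5"          # assumed target session count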
Looking for guidance and confirmation that I'm on the right track. Thanks!
It's impossible to give only one way to solve this, and your question is somewhat opinion-based. However, there is a useful similar question on StackOverflow; please check the comments there, which can give you some tips. If nothing works, you should probably write the script yourself. There is no exact solution from Kubernetes's side.
Please also take Apache Flink into consideration. It has a Reactive Mode that works in combination with Kubernetes:

Reactive Mode allows running Flink in a mode where the Application Cluster continuously adjusts the job parallelism to the available resources. In combination with Kubernetes, the replica count of the TaskManager deployment determines the available resources. Increasing the replica count will scale the job up; reducing it will trigger a scale-down. This can also be done automatically by using a Horizontal Pod Autoscaler.