Prometheus: find the average of the minimum at each point over time

We have multiple "up" metrics from the health checks of multiple services in Prometheus. The values are binary: 0 is down, 1 is up. Within Prometheus we would like to calculate the uptime of each service (aggregating in Prometheus). A service is up if all of its health checks are up. The association of checks to services is stored as a label in Prometheus, so that part is easy, but now we want to calculate the average of this over time.

How can we build the average over time from the minimum at every point in time? I think we need some min function that takes and returns a range vector. But min_over_time takes a range vector as input and returns an instant vector, whereas we need something more like a min_of_each_time.

If you need any further information, I can also provide an example, but maybe it's easy for somebody?!

BR, Marius

Meanwhile, after some chats with colleagues, I found a solution. We first aggregate the health checks per service (by service ID). This is done via Prometheus recording/aggregation rules. Then later on we can use the avg_over_time function on the recorded series to calculate the availability.

  # Aggregate health check availability per service
  - record: service_health_status_aggregated
    expr: min(consul_health_status) by (serviceName, serviceId, env, dc)
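
To illustrate the last step, a query over the recorded series could look roughly like the sketch below. The metric and label names are taken from the rule above; the 30d window and the 1m subquery resolution are only assumed example values. Because the recorded series is 1 only while every health check of a service is up, its average over time is the fraction of time the service was up.

```promql
# Availability per service as a ratio between 0 and 1.
# The 30d window is an assumption; pick whatever reporting period you need.
avg_over_time(service_health_status_aggregated[30d])

# Possible alternative without a recording rule, using a subquery
# (available since Prometheus 2.7). The 1m resolution is an assumption.
avg_over_time(
  (min by (serviceName, serviceId, env, dc) (consul_health_status))[30d:1m]
)
```

The subquery form evaluates the inner min at each resolution step and only then averages, which is essentially the "min_of_each_time" asked about above. The recording rule is usually cheaper for long windows, since the per-service minimum is precomputed.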
