
Prometheus / Grafana: count the downtime of a service

I have a service metric that returns either some positive value or 0 in case of failure. I want to count how many seconds my service was failing during some time period.

For example, the expression:

service_metric_name == 0

gives me a dashed line in Grafana:

[Image: line_of_downtime]

Is there any way to count how many seconds my service was down during the last 2 hours?

I assume the service metric is either 0 for being down or 1 for being up.

In this case you can calculate an average over a time range. If the result is 0.9, your service has been up for 90% of the time. If you calculated the average over an hour, that corresponds to 6 minutes of downtime out of 60 minutes.

avg_over_time(service_metric_name[1h])
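To get seconds of downtime for the last 2 hours, as the question asks, one option is to multiply the "down" fraction by the window length. The following is only a sketch. The first query assumes the metric is strictly 0 or 1. The second is a variant for a metric that returns arbitrary positive values when healthy; it uses a subquery with a 1m resolution, which is an arbitrary choice here and needs a Prometheus version that supports subqueries. Neither query counts periods where no samples were scraped at all, because `avg_over_time` only averages the samples that exist.

```promql
# Assumes service_metric_name is exactly 0 (down) or 1 (up).
# (1 - availability) * window length in seconds (2h = 7200s).
(1 - avg_over_time(service_metric_name[2h])) * 7200

# Variant for a metric that is 0 on failure and any positive value otherwise:
# "> bool 0" maps every sample to 1 (up) or 0 (down) before averaging.
(1 - avg_over_time((service_metric_name > bool 0)[2h:1m])) * 7200
```

With the earlier example of an average of 0.9 over one hour, the same arithmetic gives (1 - 0.9) * 3600 = 360 seconds, i.e. the 6 minutes mentioned above.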

This will be a moving average, that is: when your service is down, the value will slowly decrease. When your service is up again, it will slowly increase.
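If a single number for the whole dashboard time range is wanted instead of a moving line, Grafana's built-in `$__range` and `$__range_s` variables can stand in for the fixed window. This is a sketch under the same 0/1 assumption about the metric, and it is meant to be run as an instant query in a single-value panel such as Stat:

```promql
# $__range expands to the dashboard time range (e.g. 2h),
# $__range_s to the same range in seconds (e.g. 7200).
(1 - avg_over_time(service_metric_name[$__range])) * $__range_s
```

The result is the estimated number of seconds the metric was 0 within the selected range.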

