We have multiple "up" metrics from the health checks of several services in Prometheus. The values are binary: 0 is down, 1 is up. We would like to calculate the uptime of each service inside Prometheus (as an aggregation). A service is up if all of its health checks are up. The association between health checks and services is stored as a label, so that part is easy, but now we want to calculate the average of this over time.
How can we build the average over time of the minimum at every point in time? I think we need some min function that takes and returns a range vector. But min_over_time takes a range vector as input and returns an instant vector, whereas we need something more like a min_of_each_time.
If you need any further information, I can also provide an example, but maybe it is easy for somebody?!
BR Marius
Meanwhile, after some chats with colleagues, I found a solution. We aggregate the health checks per service ahead of time with a Prometheus recording rule, which takes the minimum across all checks of a service at each rule evaluation. Later on we can apply the avg_over_time function to the recorded series to calculate the availability.
# Aggregate health check availability per service
- record: service_health_status_aggregated
  expr: min(consul_health_status) by (serviceName, serviceId, env, dc)
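For illustration, the queries on top of that rule could look roughly like the sketch below. The 30d window and the 1m subquery resolution are assumptions chosen for the example, not values from our setup, and it assumes consul_health_status really is 0 or 1 as described above. The last query is an alternative that skips the recording rule by using a subquery (available since Prometheus 2.7); it is usually more expensive to evaluate than reading the pre-recorded series.

```promql
# Each expression below is a separate query; run them one at a time.

# Availability per service over the last 30 days, as a fraction between 0 and 1.
# Uses the series written by the recording rule; the 30d window is an assumption.
avg_over_time(service_health_status_aggregated[30d])

# The same value expressed as a percentage.
avg_over_time(service_health_status_aggregated[30d]) * 100

# Alternative without a recording rule: a subquery evaluates the per-service
# minimum at every 1m step (assumed resolution) and averages those results.
avg_over_time(
  (min by (serviceName, serviceId, env, dc) (consul_health_status))[30d:1m]
)
```

Note that the recorded series only exists from the moment the rule was added, so the first variant cannot look further back than that, while the subquery variant works on whatever raw history is still retained.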