
Prometheus query to count unique label values over a period

I want to count the total number of distinct label values over a period of time.

For example, I have the following 4 samples:

Metrics Labels TimeStamp Values

cpu_usage{instance="192.168.100.10:20001",job="node2"}@1646225640 => 4
cpu_usage{instance="192.168.100.10:20001",job="node1"}@1646225700 => 5
cpu_usage{instance="192.168.100.10:20001",job="node3"}@1646225760 => 3
cpu_usage{instance="192.168.100.10:20001",job="node2"}@1646225820 => 4

So if I check startdate=1646225640, enddate=1646225700, I get 2 distinct jobs, which are node2 and node1.

If I check startdate=1646225640, enddate=1646225820, I get 3 distinct jobs, which are node1, node2, and node3.

Is there a way to do this in PromQL?

I found one way to achieve this:

count by (job_temp) (count_over_time(cpu_usage[1h]))

or

sum(count by (job) (count_over_time(cpu_usage[1h])))

PromQL is time-series based, so I find it more useful to illustrate with an image:

[image]

Say we want to check between 1646225640 and 1646225820.

Prometheus stores data as metric{labels} series, each holding samples at a series of timestamps, so count_over_time returns 3 records, one per series:

instance=192.168.100.10:20001,job=node1 values=[[1646225700, 1]]
instance=192.168.100.10:20001,job=node2 values=[[1646225640, 1], [1646225820, 1]]
instance=192.168.100.10:20001,job=node3 values=[[1646225760, 1]]

If we count these results by job, it works like a GROUP BY on job, so we get 3 results as well, one per job.

If we count by a label that doesn't exist on the metric (job_temp in the first query), Prometheus has no label value to group on, so all series fall into a single group and we get just 1 result: the total number of series. The second query reaches the same number by summing the per-job counts. Note that both queries count series, so the result equals the number of distinct jobs only when each job has a single series, as in the sample data.
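The grouping described above can be sketched in plain Python. This is only a simulation over the four samples from the question, not how Prometheus evaluates the query; the helper names `count_over_time` and `count_by_job` are hypothetical stand-ins, and the window is assumed to include both endpoints.

```python
from collections import defaultdict

# Sample data from the question: (instance, job, timestamp, value)
SAMPLES = [
    ("192.168.100.10:20001", "node2", 1646225640, 4),
    ("192.168.100.10:20001", "node1", 1646225700, 5),
    ("192.168.100.10:20001", "node3", 1646225760, 3),
    ("192.168.100.10:20001", "node2", 1646225820, 4),
]

def count_over_time(samples, start, end):
    """Count samples per series (instance, job) with start <= ts <= end.

    Assumption: both window endpoints are inclusive.
    """
    counts = defaultdict(int)
    for instance, job, ts, _value in samples:
        if start <= ts <= end:
            counts[(instance, job)] += 1
    return dict(counts)

def count_by_job(series_counts):
    """Group series by their job label and count the series in each group."""
    groups = defaultdict(int)
    for _instance, job in series_counts:
        groups[job] += 1
    return dict(groups)

per_series = count_over_time(SAMPLES, 1646225640, 1646225820)
per_job = count_by_job(per_series)
distinct_jobs = len(per_job)        # number of groups = distinct job values
total_series = sum(per_job.values())  # what sum(count by (job) (...)) adds up
```

With this sample data `distinct_jobs` and `total_series` agree, because every job has exactly one series; they would diverge if a job were scraped from several instances.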

The following PromQL query should return the unique job label values for the metric cpu_usage on a time range (t-d ... t]:

count(last_over_time(cpu_usage[d] @ t)) by (job)

This query uses the following PromQL features:

  • count() aggregate function, which counts the number of time series for each unique job value.
  • last_over_time() rollup function, which returns the last value of each time series on the lookbehind window d given in square brackets.
  • @ modifier, which executes the query at the given timestamp t.
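To apply this to the start and end timestamps from the question, the lookbehind window d and the evaluation timestamp t can be derived from them. The following is a minimal sketch that only builds the query string; the function name `unique_jobs_query` is hypothetical, and it assumes Unix timestamps in seconds.

```python
def unique_jobs_query(metric: str, start: int, end: int) -> str:
    """Build the PromQL query for unique job values between start and end.

    Assumptions: start and end are Unix timestamps in seconds, and the
    lookbehind window d is simply end - start.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    window = end - start  # lookbehind window d, in seconds
    return f"count(last_over_time({metric}[{window}s] @ {end})) by (job)"

query = unique_jobs_query("cpu_usage", 1646225640, 1646225820)
```

Whether a sample lying exactly on the start timestamp is included depends on how your Prometheus version treats the left boundary of the range; if it must be included, widening the window slightly is one option.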

The query can be simplified if it is executed in recording rules or alerting rules. For example, the following query, which drops the @ modifier, counts the number of time series for each unique job label value over the last hour:

count(last_over_time(cpu_usage[1h])) by (job)
