
Does the rate function really give the average over time in Prometheus?

Does the rate function really give an average over time?

I send 20 requests to an endpoint with:

ab -n 20 http://0.0.0.0:8001/

[Snapshot]

I then apply the rate function to the metric over 20s. It should give me 1, because there were 20 requests over the last 20s.

So 20 / 20 = 1, but it returns 2.

I believe scrape_interval and evaluation_interval have no bearing on the result; both of my intervals are 10s.

Usually rate(m[d]) returns the average per-second rate of change of the counter m over the previous time interval d. But sometimes Prometheus returns unexpected results from the rate() function because of extrapolation. See this issue for details. Some Prometheus-compatible query engines, such as MetricsQL, try to solve this issue - see this comment and this article for technical details.

Prometheus is going to solve this issue too - see this design doc.

If your scrape interval is 10s, then this is expected. Prometheus takes the 2 samples that fall inside your 20s range (they are 10s apart), calculates the difference between them (20), extrapolates that to the whole range (40), then divides by the length of the range (20), so you get 2.
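The arithmetic above can be sketched in a few lines of Python. This is only a simplified model of the steps just described, not Prometheus' actual code: the function name and the sample values are made up for illustration, counter resets are ignored, and it assumes the extrapolation always reaches the full range (the real implementation applies extra limits).

```python
def extrapolated_rate(samples, range_seconds):
    """Simplified model of rate(m[d]) for (timestamp, value) samples inside the range.

    Assumption: extrapolation always stretches to the full range; the real
    implementation has additional limits and handles counter resets.
    """
    if len(samples) < 2:
        return None  # at least two samples are needed to compute a difference

    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    increase = v_last - v_first   # difference between last and first sample
    covered = t_last - t_first    # seconds actually covered by the samples

    # Scale the observed increase up to the whole range, then divide by the range.
    extrapolated_increase = increase * range_seconds / covered
    return extrapolated_increase / range_seconds


# Hypothetical data: two scrapes 10s apart inside a 20s range, counter 100 -> 120.
print(extrapolated_rate([(5, 100), (15, 120)], 20))  # 2.0
```

Note that the range length cancels out: the result is simply the increase divided by the time between the first and last sample in the range, which is why 20 requests seen across one 10s scrape gap come out as 2 per second.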

I don't like this either, and I've been advocating for a better rate implementation, one that looks at the last sample before the range and the last sample in the range. In your case you would then get an increase of 20 over 20s, rather than an increase of 20 over 10s or possibly an increase of 0 over 10s, depending on when you happen to query. But so far that has gone nowhere. So for now at least, welcome to the club.
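A rough Python sketch of that idea, to make the difference concrete. It is not the proposed implementation itself: the function name, the handling of the range boundaries, and the sample values are assumptions made for illustration, and counter resets are ignored.

```python
def lookbehind_rate(samples, range_start, range_end):
    """Rate between the last sample at or before the range start and the last
    sample inside the range. `samples` is a time-sorted list of (timestamp, value).
    """
    before = [s for s in samples if s[0] <= range_start]
    inside = [s for s in samples if range_start < s[0] <= range_end]
    if not before or not inside:
        return None  # not enough data on one side of the range boundary

    t_prev, v_prev = before[-1]
    t_last, v_last = inside[-1]
    return (v_last - v_prev) / (t_last - t_prev)


# Hypothetical scrapes every 10s; the 20 requests land between t=5 and t=15.
samples = [(-5, 100), (5, 100), (15, 120)]
print(lookbehind_rate(samples, 0, 20))  # 1.0
```

With a 10s scrape interval and a 20s range, the two samples used here are always 20s apart, so the result is the same whichever side of the middle scrape the burst falls on.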

One exceedingly hacky way to work against Prometheus' implementation is to reverse engineer it. E.g. the expression that would give you the actual rate over 20 seconds in your case is:

rate(hello_worlds_total[30s]) / 30 * 20

I.e. Prometheus takes the rate over 20 seconds, extrapolates it to 30, then you undo that extrapolation. But it requires you to be aware of the scrape interval and do the math to undo Prometheus' extrapolation.
