Prometheus query to calculate percentage from two metrics with different set of labels

We have a service that accepts HTTP requests and responds with all the matching user data. It produces two metrics that should be available as graphs/charts in Grafana. The metrics are:

  1. Total requests received by the application in different data centre regions

    application_requests_total{data_center_region_id="1"} // 200
    application_requests_total{data_center_region_id="2"} // 100

  2. Response metrics: every request to the application tries to include all the matching users' data in the response

    application_response_total{user="user1", data_center_region_id="1"} // 100
    application_response_total{user="user1", data_center_region_id="2"} // 100
    application_response_total{user="user2", data_center_region_id="1"} // 50
    application_response_total{user="user2", data_center_region_id="2"} // 100

Quick observations on the metrics

  • The user label is only present in the response metric application_response_total.
  • data_center_region_id is the common label in the request and response metrics.
  • One response can contain more than one user's data, which is also reflected in the metric application_response_total.

I need to find, per user, the percentage of responses relative to the total requests made to the application in a specific data centre region.

E.g., based on the above data, the expected results would be:

For data_center_region_id=1

  • user1's data was returned 100/200 = 50% of the time
  • user2's data was returned 50/200 = 25% of the time

For data_center_region_id=2

  • user1's data was returned 100/100 = 100% of the time
  • user2's data was returned 100/100 = 100% of the time

I tried a couple of queries based on the Prometheus vector matching documentation but couldn't get the expected results. A sample query follows; I'm not sure, but I think I messed up the ON / IGNORING and GROUP_LEFT / GROUP_RIGHT keywords.

sum(rate(application_response_total{data_center_region_id=~"$region"}[5m])) by (user, data_center_region_id) / on(user) group_left(data_center_region_id) sum(rate(application_requests_total{data_center_region_id=~"$region"}[5m])) by (user, data_center_region_id)

I also took reference from the question here, but nothing is working for me.

Please guide me toward the expected result above.

Also, is this the only way to get the desired graphs?

Note that it may be a bad idea to put user into metric labels if the application is going to serve an unlimited number of users. This may result in high cardinality issues.

As for the original question, you need to put the labels common to the left-side and right-side series in the on(...) modifier, so that Prometheus can find pairs of series with the given labels on both sides of the / operator. The following query returns the per-(user, region) rps share:

rate(application_response_total[5m])
  / on(data_center_region_id) group_left()
rate(application_requests_total[5m])

The group_left() modifier tells Prometheus that the left side may contain multiple series with an identical data_center_region_id label value. In this case Prometheus independently divides each such series by the right-side series with the matching data_center_region_id label value. The resulting series have the same set of labels as the left-side series. See these docs for details.
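To make the matching rule concrete, here is a small Python sketch that mimics what `/ on(data_center_region_id) group_left()` does, using the sample values from the question in place of real rate() results. It is only a toy model, not how Prometheus is implemented; the function name divide_group_left and the (labels, value) representation are made up for illustration.

```python
# Toy model of PromQL many-to-one matching: left / on(...) group_left() right.
# Each series is a (labels, value) pair. The values are the sample numbers
# from the question, standing in for the output of rate(...[5m]).

def divide_group_left(left, right, on):
    """Divide every left series by the one right series sharing the `on` labels."""
    right_by_key = {}
    for labels, value in right:
        # A missing label is treated as an empty string, as in this model's assumption.
        key = tuple(labels.get(name, "") for name in on)
        if key in right_by_key:
            # The "one" side must have a single series per match key.
            raise ValueError(f"duplicate right-hand series for {key}")
        right_by_key[key] = value

    result = []
    for labels, value in left:
        key = tuple(labels.get(name, "") for name in on)
        if key in right_by_key:
            # The result keeps the left-side labels.
            # (This sketch assumes a non-zero denominator.)
            result.append((dict(labels), value / right_by_key[key]))
    return result


requests = [
    ({"data_center_region_id": "1"}, 200),
    ({"data_center_region_id": "2"}, 100),
]
responses = [
    ({"user": "user1", "data_center_region_id": "1"}, 100),
    ({"user": "user1", "data_center_region_id": "2"}, 100),
    ({"user": "user2", "data_center_region_id": "1"}, 50),
    ({"user": "user2", "data_center_region_id": "2"}, 100),
]

shares = divide_group_left(responses, requests, on=["data_center_region_id"])
for labels, share in shares:
    print(labels["data_center_region_id"], labels["user"], f"{share:.0%}")
```

With the sample data this gives 50% and 25% for user1 and user2 in region 1, and 100% for both users in region 2, which matches the expected results in the question. In the same model, matching with on=["user"] raises an error, because neither request series carries a user label and both therefore end up under the same key.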

