Missing labels in prometheus alerts

I'm having issues with Prometheus alerting rules. I have various cAdvisor specific alerts set up, for example:

- alert: ContainerCpuUsage
  expr: (sum(rate(container_cpu_usage_seconds_total[3m])) BY (instance, name) * 100) > 80
  for: 2m
  labels:
    severity: warning
  annotations:
    title: 'Container CPU usage (instance {{ $labels.instance }})'
    description: 'Container CPU usage is above 80%\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}'

When the condition is met, I can see the alert in the "Alerts" tab in Prometheus. However, some labels are missing, which prevents Alertmanager from sending a notification via Slack. To be specific, I attach a custom "env" label to each target:

 {
  "targets": [
   "localhost:8080"
  ],
  "labels": {
   "job": "cadvisor",
   "env": "production",
   "__metrics_path__": "/metrics"
  }
 }

But when an alert based on cAdvisor metrics fires, its only labels are alertname, instance and severity: there is no job label and no env label. Alerts from other exporters (e.g. node-exporter) work just fine and carry the label.

This is due to the sum function that you use: it takes all the matching time series and adds them up, grouping BY (instance, name), so every label that is not in the grouping list is dropped from the result. If you run the same query in Prometheus, you'll see that sum leaves only the grouping labels:

{instance="foo", name="bar"}    135.38819037447163

Other aggregation operators such as avg, max and min work in the same fashion. To bring the label back, simply add env to the grouping list: by (instance, name, env).
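
Applied to the rule from the question, the change might look like the sketch below. Adding job next to env is an assumption based on the question saying that both labels go missing; everything else is copied from the original rule.

- alert: ContainerCpuUsage
  # job and env are added to the grouping list so the aggregation keeps them;
  # including job is an assumption, the answer above only names env
  expr: (sum(rate(container_cpu_usage_seconds_total[3m])) BY (instance, name, job, env) * 100) > 80
  for: 2m
  labels:
    severity: warning
  annotations:
    title: 'Container CPU usage (instance {{ $labels.instance }})'
    description: 'Container CPU usage is above 80%\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}'

Every label listed after BY survives the aggregation and should then appear on the firing alert, where Alertmanager can use it for routing.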
