How to set prometheus rules in stable/prometheus chart values.yaml?

I am using the official Prometheus chart, stable/prometheus.

I customized its values.yaml file to set the alertmanager.yml file and the serverFiles section.

rules: {}

https://github.com/kubernetes/charts/blob/master/stable/prometheus/values.yaml#L598

By default this is {}. How do I write real alerting rules there in the official format?

For example, I tried:

  serverFiles:
    alerts: {}
    rules:
    # Alert for any instance that is unreachable for >5 minutes.
    - alert: InstanceDown
      expr: up == 0
      for: 5m
      labels:
        severity: page
      annotations:
        summary: "Instance {{ $labels.instance }} down"
        description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."

Then I ran $ helm install my_prometheus, and the pod reported this error:

PersistentVolumeClaim is not bound: "sweet-terrier-prometheus-server"
Back-off restarting failed container
Error syncing pod

serverFiles:
  alerts:
    groups:
    - name: NodeAlerts
      rules:
      - alert: NodeCPUUsage
        expr: (100 - (avg(irate(node_cpu{mode="idle"}[5m])) BY (instance) * 100)) > 75
        for: 2m
        labels:
          severity: alert
        annotations:
          description: '{{$labels.instance}}: CPU usage is above 75% (current value is:
            {{ $value }})'
          summary: '{{$labels.instance}}: High CPU usage detected'
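For comparison, the InstanceDown rule from the question could be written in the same grouped layout. This is only a sketch: the group name InstanceAlerts is a hypothetical label that does not appear in the original post, and it assumes the chart writes the contents of serverFiles.alerts to the alerts file unchanged.

```yaml
serverFiles:
  alerts:
    groups:
    - name: InstanceAlerts  # hypothetical group name, chosen for illustration
      rules:
      # Alert for any instance that is unreachable for >5 minutes.
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."
```

The main difference from the attempt in the question is that the rules sit in a named group under alerts.groups rather than directly under rules, and that summary and description are both nested under annotations.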

The rules key is for recording rules; the alerts key is for alerting rules.
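As a rough illustration of that split, a recording rule under serverFiles.rules might look like the sketch below. The group name and the recorded series name are hypothetical, and it assumes this chart version renders serverFiles.rules with the same groups layout it uses for alerts.

```yaml
serverFiles:
  rules:
    groups:
    - name: NodeRecordingRules  # hypothetical group name
      rules:
      # Precompute the per-instance idle CPU rate so alerts and dashboards can reuse it.
      - record: instance:node_cpu_idle:avg_irate5m  # hypothetical series name
        expr: avg(irate(node_cpu{mode="idle"}[5m])) BY (instance)
```

A recording rule uses record where an alerting rule uses alert, and it takes no for or annotations fields.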

https://prometheus.io/docs/practices/rules/
