
PromQL query to show pod label in alert
I have a default alert rule in the Prometheus Operator, shown below:
- alert: KubePodNotReady
  annotations:
    message: Pod {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.pod {{`}}`}} has been in a non-ready state for longer than 15 minutes.
    runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready
  expr: |-
    sum by (namespace, pod) (
      max by(namespace, pod) (
        kube_pod_status_phase{job="kube-state-metrics", namespace=~".*", phase=~"Pending|Unknown"}
      ) * on(namespace, pod) group_left(owner_kind) topk by(namespace, pod) (
        1, max by(namespace, pod, owner_kind) (kube_pod_owner{owner_kind!="Job"})
      )
    ) > 0
  for: 15m
  labels:
    severity: warning
I want the alert to show the pod's "teamname" label.
I can get the pod label with either of the following expressions:
kube_pod_info * on(namespace, pod) group_left kube_pod_labels{label_teamname="example"}
kube_pod_info * on(namespace, pod) group_left(label_teamname) kube_pod_labels
But I'm not sure how to update the alert rule so that it shows the label. I tried simply adding the label without editing the expression,
  labels:
    severity: warning
    teamname: '{{ $labels.label_teamname }}'
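but that did not work.
Does the expression need to change for the alert to include the team name? If so, please suggest how to change the following expression: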
  expr: |-
    sum by (namespace, pod) (
      max by(namespace, pod) (
        kube_pod_status_phase{job="kube-state-metrics", namespace=~".*", phase=~"Pending|Unknown"}
      ) * on(namespace, pod) group_left(owner_kind) topk by(namespace, pod) (
        1, max by(namespace, pod, owner_kind) (kube_pod_owner{owner_kind!="Job"})
      )
    ) > 0
This expression worked for me:
(sum by (namespace, pod) (
  max by(namespace, pod) (
    kube_pod_status_phase{job="kube-state-metrics", namespace=~".*", phase=~"Pending|Unknown"}
  ) * on(namespace, pod) group_left(owner_kind) topk by(namespace, pod) (
    1, max by(namespace, pod, owner_kind) (kube_pod_owner{owner_kind!="Job"})
  )
) > 0) * on(namespace, pod) group_left(label_teamname) kube_pod_labels
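
Adding the label alone probably had no effect because the outer sum by (namespace, pod) keeps only the namespace and pod labels, so $labels.label_teamname is empty by the time the rule's templates are rendered. Joining against kube_pod_labels with group_left(label_teamname) copies that label back onto the result, and because kube_pod_labels has the value 1, the multiplication leaves the sample value unchanged. As a rough sketch, the complete rule might then look like the following. The group name is hypothetical, the team mention in the message is an illustrative addition, and the file is assumed to be a plain Prometheus rule file; if the rule lives in a Helm template as in the question, the braces would need the same {{`{{`}} escaping as the original message.

```yaml
groups:
  - name: kubernetes-apps  # hypothetical group name, not from the original rule
    rules:
      - alert: KubePodNotReady
        annotations:
          # label_teamname is available here only because the join below adds it
          message: Pod {{ $labels.namespace }}/{{ $labels.pod }} (team {{ $labels.label_teamname }}) has been in a non-ready state for longer than 15 minutes.
          runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready
        expr: |-
          (
            sum by (namespace, pod) (
              max by(namespace, pod) (
                kube_pod_status_phase{job="kube-state-metrics", namespace=~".*", phase=~"Pending|Unknown"}
              ) * on(namespace, pod) group_left(owner_kind) topk by(namespace, pod) (
                1, max by(namespace, pod, owner_kind) (kube_pod_owner{owner_kind!="Job"})
              )
            ) > 0
          ) * on(namespace, pod) group_left(label_teamname) kube_pod_labels
        for: 15m
        labels:
          severity: warning
          # copies the joined label under a shorter name
          teamname: '{{ $labels.label_teamname }}'
```

With the join in place the alert already carries label_teamname as a label of its own, so the extra teamname entry is only a rename and can be dropped if the longer name is acceptable.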