
Prometheus AlertManager - Send Alerts to different clients based on routes

I have two services, A and B, that I want to monitor. I also have two notification channels, X and Y, defined as receivers in the Alertmanager config file.

I want to notify X if service A goes down and notify Y if service B goes down. How can I achieve this in my configuration?

My AlertManager YAML file is:

route:
  receiver: X

receivers:
  - name: X
    email_configs:

  - name: Y
    email_configs:

And my alert.rule file is:

groups:

- name: A
  rules:
    - alert: A_down
      expr: expression
      for: 1m
      labels:
        severity: critical
      annotations:
        summary: "A is down"

- name: B
  rules:
    - alert: B_down
      expr: expression
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: "B is down"

The config should roughly look like this (not tested):

route:
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 2h

  receiver: 'default-receiver'

  routes:
  - match:
      alertname: A_down
    receiver: X
  - match:
      alertname: B_down
    receiver: Y

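The idea is that each route can have its own routes field holding child routes with a different config. A child route takes effect when the alert's labels satisfy the conditions in its match block. Alerts that match no child route fall back to the parent route's receiver, so that receiver ('default-receiver' here) must also be defined under receivers.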
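As a rough sketch of how this fits the receivers from the question, the fragment below spells out the whole tree in one file. It assumes Alertmanager 0.22 or later, where the matchers list supersedes the deprecated match and match_re keys. The SMTP settings and the example.com addresses are placeholders, not values from the question.

```yaml
global:
  smtp_smarthost: 'localhost:25'          # placeholder SMTP relay
  smtp_from: 'alertmanager@example.com'   # placeholder sender address

route:
  receiver: default-receiver   # fallback when no child route matches
  routes:
    - matchers:
        - alertname="A_down"
      receiver: X
    - matchers:
        - alertname="B_down"
      receiver: Y

receivers:
  - name: default-receiver     # no notifier configured, so unmatched alerts are dropped
  - name: X
    email_configs:
      - to: 'team-x@example.com'   # placeholder recipient
  - name: Y
    email_configs:
      - to: 'team-y@example.com'   # placeholder recipient
```

A file like this can be validated before loading it with amtool check-config alertmanager.yml.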

To clarify, the general flow for handling an alert with Prometheus and Alertmanager looks like this:

SomeErrorHappenInYourConfiguredRule(Rule) -> RouteToDestination(Route) -> TriggeringAnEvent(Receiver) -> GetAMessageInSlack/PagerDuty/Mail/etc...

For example:

If my AWS machine cluster production-a1 is down, I want to trigger an event that sends a PagerDuty and a Slack notification to my team with the relevant error.
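A minimal sketch of that scenario on the Alertmanager side could look like the following. It assumes the alert carries a label named cluster (a hypothetical label that the alert rule would have to set), and the receiver names, Slack channel, and both tokens are placeholders.

```yaml
route:
  receiver: default-receiver
  routes:
    - match:
        cluster: production-a1   # hypothetical label attached by the alert rule
      receiver: team-oncall

receivers:
  - name: default-receiver       # empty receiver: unmatched alerts are not sent anywhere
  - name: team-oncall
    # One receiver may list several notifiers; each matching alert goes to all of them.
    pagerduty_configs:
      - routing_key: <myPagerDutyToken>
    slack_configs:
      - api_url: <mySlackWebhookUrl>
        channel: '#alerts'
        send_resolved: true
```

Putting both notifiers under one receiver keeps the route tree small. Separate receivers would only be needed if PagerDuty and Slack should be routed on different conditions.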

Three files matter when configuring alerts in your Prometheus system:

  1. alertmanager.yml - configuration of your routes (which pick up the triggered alerts) and receivers (how those alerts are delivered)
  2. rules.yml - contains all the thresholds and alert rules you define for your system.
  3. prometheus.yml - global configuration that loads your rule files and points Prometheus at Alertmanager, tying the two files above together.

Here is a dummy example to demonstrate the idea. In this example I watch for overload on my machine (using the node exporter installed on it). In /var/data/prometheus-stack/alertmanager/alertmanager.yml:

global:
  # The smarthost and SMTP sender used for mail notifications.
  smtp_smarthost: 'localhost:25'
  smtp_from: 'JohnDoe@gmail.com'

route:
  receiver: defaultTrigger
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 6h
  routes:
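  - match_re:
      # Every label listed here must be present on the alert for the route to match,
      # so use labels that the rule actually sets (service and dev_team).
      service: service_overload
      dev_team: myteam
    receiver: pagerDutyTrigger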

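receivers:
# The root route's receiver must be defined; a receiver with only a name sends nothing.
- name: 'defaultTrigger'
- name: 'pagerDutyTrigger'
  pagerduty_configs:
  - send_resolved: true
    routing_key: <myPagerDutyToken>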

Add a rule in /var/data/prometheus-stack/prometheus/yourRuleFile.yml:

groups:
- name: alerts
  rules:
  - alert: service_overload_more_than_5000
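    # rate() turns the cumulative byte counter into bytes per second; dividing by 1000 gives kB/s.
    expr: (rate(node_network_receive_bytes_total{job="someJobOrService"}[5m]) / 1000) >= 5000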
    for: 10m
    labels:
      service: service_overload
      severity: pager
      dev_team: myteam
    annotations:
      dev_team: myteam
      priority: Blocker
      identifier: '{{ $labels.name }}'
      description: 'service overflow'
      value: '{{ humanize $value }}'

In /var/data/prometheus-stack/prometheus/prometheus.yml, add this snippet to integrate Alertmanager:

global:

...

alerting:
  alertmanagers:
  - scheme: http
    static_configs:
    - targets:
      - "alertmanager:9093"

rule_files:
  - "yourRuleFile.yml"

...

Note that the key point of this example is the service label value service_overload: the rule attaches it to the alert and the route matches on it, which is what binds the rule to the right receiver.
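The same label-binding idea could be applied to the original question roughly as follows. This is an untested sketch: the service label is a hypothetical addition to the question's rules, and expression is the placeholder used there. The two parts belong in separate files and are shown together only for comparison.

```yaml
# --- alert.rule (Prometheus): attach a routing label to each alert ---
groups:
  - name: A
    rules:
      - alert: A_down
        expr: expression      # placeholder, as in the question
        for: 1m
        labels:
          severity: critical
          service: A          # hypothetical routing label
  - name: B
    rules:
      - alert: B_down
        expr: expression      # placeholder, as in the question
        for: 1m
        labels:
          severity: warning
          service: B          # hypothetical routing label

# --- alertmanager.yml (Alertmanager): route on that label ---
route:
  receiver: X                 # root receiver, also catches anything unmatched
  routes:
    - match:
        service: A
      receiver: X
    - match:
        service: B
      receiver: Y
```

Routing on a dedicated label rather than on alertname means new alerts for service B reach Y as soon as they carry service: B, without touching the Alertmanager config.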

Reload the config (restart the services, or stop and start your Docker containers) and test it. If everything is configured correctly, you can see the alerts at http://your-prometheus-url:9090/alerts
