How to send email alerts in Prometheus

Prometheus is an open-source system monitoring and alerting toolkit originally developed at SoundCloud. It scrapes metrics from the configured jobs, stores all scraped samples locally, and runs rules over this data to generate alerts. In this blog post, I will describe my own experience of setting up email alerts in Prometheus.

Define rules and include them in Prometheus

Open the Prometheus config file located at /etc/prometheus/prometheus.yml
and add the rule file location to it, as shown below. A relative path is resolved against the directory that contains prometheus.yml.

rule_files:
 - 'prometheus.rules.yml'

If you are not familiar with the Prometheus config file, please check the Prometheus demo config first.
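The rule file itself has to exist before Prometheus can load it. As a minimal sketch of what prometheus.rules.yml could contain, the rule below fires when a scrape target has been unreachable for five minutes. The group name, alert name, threshold, and label values are illustrative assumptions, not part of the original setup.

```yaml
# /etc/prometheus/prometheus.rules.yml (illustrative example, Prometheus 2.x format)
groups:
  - name: example            # assumed group name
    rules:
      - alert: InstanceDown  # assumed alert name
        # 'up' is 0 when Prometheus fails to scrape a target
        expr: up == 0
        # the condition must hold this long before the alert fires
        for: 5m
        labels:
          # Alertmanager routes can match on labels such as this one
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."
```

Labels set here (such as severity) are what the Alertmanager routes match on, so they should line up with the routing tree you configure later.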

Check Rules using promtool

After adding the rule file to prometheus.yml, use promtool to validate both files.

For the Prometheus config file

promtool check config prometheus.yml

For the Prometheus rules file

promtool check rules prometheus.rules.yml

Install Alert Manager and configure it as a service

The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integrations such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts.

There are several ways to install Alert Manager, but installing from the precompiled binaries is the recommended one.

Step-1: Get the precompiled binaries from the download section of prometheus.io

Step-2: Untar the downloaded file using the tar -xvf command

tar -xvf alertmanager-0.15.2.linux-amd64.tar.gz

Step-3: Move the alertmanager binary from the extracted directory to the /usr/local/bin/ folder

mv alertmanager-0.15.2.linux-amd64/alertmanager /usr/local/bin/

Step-4: Add a config file at /etc/alertmanager/config.yml (if the /etc/alertmanager folder does not exist, create it first and then create the config.yml file)

Example Alertmanager config

global:
  # The smarthost and SMTP sender used for mail notifications.
  smtp_smarthost: 'localhost:25'
  smtp_from: 'alertmanager@example.org'
  smtp_auth_username: 'alertmanager'
  smtp_auth_password: 'password'
  # The auth token for Hipchat.
  hipchat_auth_token: '1234556789'
  # Alternative host for Hipchat.
  hipchat_api_url: 'https://hipchat.foobar.org/'

# The directory from which notification templates are read.
templates: 
- '/etc/alertmanager/template/*.tmpl'

# The root route on which each incoming alert enters.
route:
  # The labels by which incoming alerts are grouped together. For example,
  # multiple alerts coming in for cluster=A and alertname=LatencyHigh would
  # be batched into a single group.
  #
  # To aggregate by all possible labels use '...' as the sole label name.
  # This effectively disables aggregation entirely, passing through all
  # alerts as-is. This is unlikely to be what you want, unless you have
  # a very low alert volume or your upstream notification system performs
  # its own grouping. Example: group_by: [...]
  group_by: ['alertname', 'cluster', 'service']

  # When a new group of alerts is created by an incoming alert, wait at
  # least 'group_wait' to send the initial notification.
  # This ensures that multiple alerts for the same group that start
  # firing shortly after one another are batched together in the first
  # notification.
  group_wait: 30s

  # When the first notification was sent, wait 'group_interval' to send a batch
  # of new alerts that started firing for that group.
  group_interval: 5m

  # If an alert has successfully been sent, wait 'repeat_interval' to
  # resend them.
  repeat_interval: 3h 

  # A default receiver
  receiver: team-X-mails

  # All the above attributes are inherited by all child routes and can
  # be overwritten on each.

  # The child route trees.
  routes:
  # This route performs a regular expression match on alert labels to
  # catch alerts that are related to a list of services.
  - match_re:
      service: ^(foo1|foo2|baz)$
    receiver: team-X-mails
    # The service has a sub-route for critical alerts, any alerts
    # that do not match, i.e. severity != critical, fall-back to the
    # parent node and are sent to 'team-X-mails'
    routes:
    - match:
        severity: critical
      receiver: team-X-pager
  - match:
      service: files
    receiver: team-Y-mails

    routes:
    - match:
        severity: critical
      receiver: team-Y-pager

  # This route handles all alerts coming from a database service. If there's
  # no team to handle it, it defaults to the DB team.
  - match:
      service: database
    receiver: team-DB-pager
    # Also group alerts by affected database.
    group_by: [alertname, cluster, database]
    routes:
    - match:
        owner: team-X
      receiver: team-X-pager
      continue: true
    - match:
        owner: team-Y
      receiver: team-Y-pager


# Inhibition rules allow to mute a set of alerts given that another alert is
# firing.
# We use this to mute any warning-level notifications if the same alert is 
# already critical.
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  # Apply inhibition if the alertname is the same.
  equal: ['alertname', 'cluster', 'service']


receivers:
- name: 'team-X-mails'
  email_configs:
  - to: 'team-X+alerts@example.org'

- name: 'team-X-pager'
  email_configs:
  - to: 'team-X+alerts-critical@example.org'
  pagerduty_configs:
  - service_key: <team-X-key>

- name: 'team-Y-mails'
  email_configs:
  - to: 'team-Y+alerts@example.org'

- name: 'team-Y-pager'
  pagerduty_configs:
  - service_key: <team-Y-key>

- name: 'team-DB-pager'
  pagerduty_configs:
  - service_key: <team-DB-key>
  
- name: 'team-X-hipchat'
  hipchat_configs:
  - auth_token: <auth_token>
    room_id: 85
    message_format: html
    notify: true
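The example above covers many receivers and routes. If all you need is email, a much smaller config may be enough. The following is a minimal sketch, assuming a local SMTP relay on port 25. The receiver name and both addresses are hypothetical placeholders that you would replace with your own.

```yaml
# /etc/alertmanager/config.yml (minimal email-only sketch)
global:
  # SMTP relay and sender address; both values are assumptions
  smtp_smarthost: 'localhost:25'
  smtp_from: 'alertmanager@example.org'
  # TLS is required by default; a plain local relay without STARTTLS
  # may need the line below uncommented (assumption about your relay)
  # smtp_require_tls: false

route:
  # every alert goes to the single receiver defined below
  receiver: email-alerts
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h

receivers:
  - name: email-alerts              # hypothetical receiver name
    email_configs:
      - to: 'oncall@example.org'    # hypothetical recipient
```

With a single receiver on the root route there is nothing to match on, so any alert Prometheus sends ends up in that mailbox.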

Define Alert Manager as a service

Go to /etc/systemd/system and create a file called alertmanager.service

Add the following code to alertmanager.service

[Unit]
Description=Alert Manager for Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=root
Group=root
Type=simple
ExecStart=/usr/local/bin/alertmanager --config.file=/etc/alertmanager/config.yml

[Install]
WantedBy=multi-user.target

Reload the systemd daemon for the new changes to take effect.

systemctl daemon-reload

Start Alert Manager service

sudo service alertmanager start
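Prometheus also needs to know where Alertmanager is listening before it can forward any alerts. A minimal sketch of the block to add to /etc/prometheus/prometheus.yml follows, assuming Alertmanager runs on the same host on its default port 9093. Adjust the target if your setup differs.

```yaml
# Addition to /etc/prometheus/prometheus.yml
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # assumed address: same host, Alertmanager default port
            - 'localhost:9093'
```

Without this block the rules are still evaluated and shown on the alerts page, but no notification leaves Prometheus.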

Restart Prometheus

sudo service prometheus restart

Check the status of both services, i.e. Prometheus and Alert Manager, using the commands

sudo service alertmanager status
sudo service prometheus status

If both services are running without any issue, all that remains is to verify your alerts.

To verify the alerts, visit the /alerts page of your Prometheus installation, for example prometheus.exampledomain.com/alerts

You should see the loaded alerts listed there.

Congratulations! You have successfully set up email alerts on your Prometheus server.

Pranav K

Pranav K is a software engineer and all-round computer geek. His interests include AWS, Ubuntu and WordPress.
