Prometheus is an open-source system monitoring and alerting toolkit, originally developed at SoundCloud. It scrapes metrics from its configured jobs, stores all scraped samples locally, and runs rules over this data to generate alerts. In this blog post, I will describe my own experience of setting up email alerts in Prometheus.
Define rules and include them in Prometheus
Open the Prometheus config file located at /etc/prometheus/prometheus.yml
Add the rule file location to it, as shown below.
rule_files:
  - 'prometheus.rules.yml'
If you are not familiar with the Prometheus config file, please check the Prometheus demo config here.
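For reference, a minimal prometheus.rules.yml could look something like the sketch below. It uses the Prometheus 2.x rule format. The group name, the alert name InstanceDown, the 5m duration, and the severity label are illustrative assumptions, so replace them with rules that match your own jobs.

```yaml
# prometheus.rules.yml (illustrative sketch; names and thresholds are assumptions)
groups:
  - name: example
    rules:
      - alert: InstanceDown
        # 'up' is 0 when Prometheus fails to scrape a target.
        expr: up == 0
        # Fire only after the condition has held for 5 minutes.
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."
```

The severity label set here is what an Alertmanager route can match on when deciding which receiver gets the notification.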
Check rules using promtool
After adding the rule file to prometheus.yml, use promtool to validate both files.
For the Prometheus config file
promtool check config prometheus.yml
For the Prometheus rules file
promtool check rules prometheus.rules.yml
Install Alert Manager and configure it as a service
The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integrations such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts.
There are several ways to install Alert Manager, though installing from the precompiled binaries is the recommended way.
Step 1: Get the precompiled binaries from the download section of prometheus.io
Step 2: Extract the archive using the tar -xvf command
tar -xvf alertmanager-0.15.2.linux-amd64.tar.gz
Step 3: Move the alertmanager binary from the extracted folder to /usr/local/bin/
mv alertmanager-0.15.2.linux-amd64/alertmanager /usr/local/bin
Step 4: Add a config file at /etc/alertmanager/config.yml (if the /etc/alertmanager folder does not exist, create it first and then create config.yml)
Example Alertmanager config
global:
  # The smarthost and SMTP sender used for mail notifications.
  smtp_smarthost: 'localhost:25'
  smtp_from: '[email protected]'
  smtp_auth_username: 'alertmanager'
  smtp_auth_password: 'password'
  # The auth token for Hipchat.
  hipchat_auth_token: '1234556789'
  # Alternative host for Hipchat.
  hipchat_api_url: 'https://hipchat.foobar.org/'
# The directory from which notification templates are read.
templates:
  - '/etc/alertmanager/template/*.tmpl'
# The root route on which each incoming alert enters.
route:
  # The labels by which incoming alerts are grouped together. For example,
  # multiple alerts coming in for cluster=A and alertname=LatencyHigh would
  # be batched into a single group.
  #
  # To aggregate by all possible labels use '...' as the sole label name.
  # This effectively disables aggregation entirely, passing through all
  # alerts as-is. This is unlikely to be what you want, unless you have
  # a very low alert volume or your upstream notification system performs
  # its own grouping. Example: group_by: [...]
  group_by: ['alertname', 'cluster', 'service']
  # When a new group of alerts is created by an incoming alert, wait at
  # least 'group_wait' to send the initial notification.
  # This way ensures that you get multiple alerts for the same group that start
  # firing shortly after another are batched together on the first
  # notification.
  group_wait: 30s
  # When the first notification was sent, wait 'group_interval' to send a batch
  # of new alerts that started firing for that group.
  group_interval: 5m
  # If an alert has successfully been sent, wait 'repeat_interval' to
  # resend them.
  repeat_interval: 3h
  # A default receiver
  receiver: team-X-mails
  # All the above attributes are inherited by all child routes and can
  # overwritten on each.
  # The child route trees.
  routes:
    # This routes performs a regular expression match on alert labels to
    # catch alerts that are related to a list of services.
    - match_re:
        service: ^(foo1|foo2|baz)$
      receiver: team-X-mails
      # The service has a sub-route for critical alerts, any alerts
      # that do not match, i.e. severity != critical, fall-back to the
      # parent node and are sent to 'team-X-mails'
      routes:
        - match:
            severity: critical
          receiver: team-X-pager
    - match:
        service: files
      receiver: team-Y-mails
      routes:
        - match:
            severity: critical
          receiver: team-Y-pager
    # This route handles all alerts coming from a database service. If there's
    # no team to handle it, it defaults to the DB team.
    - match:
        service: database
      receiver: team-DB-pager
      # Also group alerts by affected database.
      group_by: [alertname, cluster, database]
      routes:
        - match:
            owner: team-X
          receiver: team-X-pager
          continue: true
        - match:
            owner: team-Y
          receiver: team-Y-pager
# Inhibition rules allow to mute a set of alerts given that another alert is
# firing.
# We use this to mute any warning-level notifications if the same alert is
# already critical.
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    # Apply inhibition if the alertname is the same.
    equal: ['alertname', 'cluster', 'service']
receivers:
  - name: 'team-X-mails'
    email_configs:
      - to: '[email protected]'
  - name: 'team-X-pager'
    email_configs:
      - to: '[email protected]'
    pagerduty_configs:
      - service_key: <team-X-key>
  - name: 'team-Y-mails'
    email_configs:
      - to: '[email protected]'
  - name: 'team-Y-pager'
    pagerduty_configs:
      - service_key: <team-Y-key>
  - name: 'team-DB-pager'
    pagerduty_configs:
      - service_key: <team-DB-key>
  - name: 'team-X-hipchat'
    hipchat_configs:
      - auth_token: <auth_token>
        room_id: 85
        message_format: html
        notify: true
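The example above covers several teams and notification channels. If email is all you need, a much smaller config is probably enough. The sketch below is one possible starting point, not a tested production setup: the SMTP host and port, the addresses under example.org, and the receiver name email-alerts are all placeholders I made up for illustration.

```yaml
# /etc/alertmanager/config.yml (minimal email-only sketch; all values are placeholders)
global:
  smtp_smarthost: 'smtp.example.org:587'
  smtp_from: 'alertmanager@example.org'
  smtp_auth_username: 'alertmanager'
  smtp_auth_password: 'password'

route:
  # Every alert goes to the single email receiver defined below.
  receiver: 'email-alerts'
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h

receivers:
  - name: 'email-alerts'
    email_configs:
      - to: 'oncall@example.org'
        # Also send a mail when the alert stops firing.
        send_resolved: true
```

With a single receiver on the root route there are no child routes to maintain, which keeps the first setup easy to debug.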
Define Alert Manager as a service
Go to /etc/systemd/system and create a file called alertmanager.service
Add the following to alertmanager.service
[Unit]
Description=Alert Manager for Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=root
Group=root
Type=simple
ExecStart=/usr/local/bin/alertmanager --config.file /etc/alertmanager/config.yml

[Install]
WantedBy=multi-user.target
Reload the systemd daemon for the new changes to take effect.
sudo systemctl daemon-reload
Start Alert Manager service
sudo service alertmanager start
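Prometheus also needs to know where Alert Manager is listening, otherwise firing alerts never leave the Prometheus server. If your prometheus.yml does not already have an alerting section, something like the fragment below should work. It assumes Alert Manager runs on the same host and listens on its default port 9093; adjust the target if your setup differs.

```yaml
# /etc/prometheus/prometheus.yml (fragment)
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # Assumption: Alert Manager runs on this host on its default port.
            - 'localhost:9093'
```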
Restart Prometheus
sudo service prometheus restart
Check the status of both services, i.e. Prometheus and Alert Manager, using these commands
sudo service alertmanager status
sudo service prometheus status
If both services are running without issues, all that is left is to verify your alerts.
To verify the alerts, visit the /alerts page of your Prometheus installation, for example prometheus.exampledomain.com/alerts
and you should see your loaded alerts listed there.
Congratulations! You have successfully set up email alerts on your Prometheus server.