In this article, let’s see how to set up alerts and notifications in Grafana. Grafana is a data visualization tool that lets you monitor key system performance indicators. Timely alerts help you react quickly to errors and take the necessary corrective actions. To learn how to install Grafana, read the Enable monitoring parameters in BRIX Enterprise article.
Setting up notifications and alerts in Grafana consists of three steps:
Step 1. Configure how notifications are received
In Grafana settings, go to Alerting > Contact points. In the Integration field, select how you want to receive notifications, and then fill in the fields that appear.
To receive notifications via email, select the Email option. Then, in the Addresses field that appears, list the email addresses to which notifications will be sent.
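The same email contact point can also be described as data, which is handy if you keep Grafana settings in scripts. Below is a minimal sketch that only builds the request body. It assumes the contact point is created through Grafana's alerting provisioning HTTP API (a POST to /api/v1/provisioning/contact-points); the field names follow that API as far as we know and should be checked against your Grafana version. The contact point name and the addresses are hypothetical.

```python
import json


def email_contact_point(name, addresses):
    """Build a request body for an email contact point.

    Assumption: the body shape (name / type / settings.addresses) matches
    Grafana's alerting provisioning API; verify it for your version.
    """
    # Drop empty entries and surrounding whitespace.
    cleaned = [a.strip() for a in addresses if a.strip()]
    if not cleaned:
        raise ValueError("at least one email address is required")
    return {
        "name": name,
        "type": "email",
        # Several addresses go into one string, separated by ";".
        "settings": {"addresses": ";".join(cleaned)},
    }


if __name__ == "__main__":
    # Hypothetical name and addresses, for illustration only.
    body = email_contact_point("ops-email", ["ops@example.com", "oncall@example.com"])
    print(json.dumps(body, indent=2))
```

The function sends nothing. It only prepares the body, so you can review it before passing it to whatever HTTP client you use.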
To receive notifications via Telegram, select the Telegram option in the Integration field. The BOT API Token and Chat ID fields will appear. To fill them in, follow the steps below:
- Go to Telegram and create a bot.
- Copy the bot token and enter it in the BOT API Token field.
- Create a group in Telegram and add your bot to it.
- Obtain the group ID and enter it in the Chat ID field.
For details on creating a bot and a group and on obtaining the group ID, refer to the official Telegram documentation.
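One possible way to obtain the group ID is to ask the Telegram Bot API for the bot's recent updates and read the chat ID from them. The sketch below assumes that the bot has already been added to the group and that at least one message was sent there afterwards, so that the getUpdates method returns an update from that group. The function names are hypothetical.

```python
import json
import urllib.request


def fetch_updates(bot_token):
    """Call the Bot API getUpdates method and return the parsed JSON."""
    url = f"https://api.telegram.org/bot{bot_token}/getUpdates"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def extract_group_chats(updates):
    """Return {chat_id: title} for group chats found in a getUpdates response."""
    chats = {}
    for update in updates.get("result", []):
        # A group can show up in a regular message or in a membership update.
        source = update.get("message") or update.get("my_chat_member") or {}
        chat = source.get("chat", {})
        if chat.get("type") in ("group", "supergroup"):
            chats[chat["id"]] = chat.get("title", "")
    return chats


if __name__ == "__main__":
    token = input("BOT API Token: ").strip()
    for chat_id, title in extract_group_chats(fetch_updates(token)).items():
        print(chat_id, title)
```

Group IDs are negative numbers. Copy the value with the minus sign into the Chat ID field.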
Step 2. Create alert rules
You can create alert rules based on Prometheus metrics or on logs tracked with Loki.
For this example, let’s create a metrics-based alert rule using the overview dashboard and configure notifications for when the CPU load on a node exceeds 80% for 5 minutes. To do this, open the required section in the dashboard. In our case, it is Nodes info. Then click the three dots in the upper right corner and select More > New alert rule. In the window that opens, perform the following steps:
- In the Name field, enter the name of the alert rule, for example, CPU Usage. The rule script is displayed automatically in the second item. At the end of the script text, set the threshold value for triggering the rule to > 0.8.
- Next, in the same window, create a new evaluation group with a value of 5m and give it a name. The alert is then sent only if the CPU load stays above 80% for the whole specified time.
- In the same window, in the Folder field, select the folder where alerts for the created rule will be saved and displayed. In the Evaluation group field, select the 5m evaluation group you created.
Additionally, in the New alert rule window, you can add labels and annotations for your alert rule. Labels help you categorize and filter alerts, while annotations allow you to add a description of the problem or action to be taken.
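To make the "above 0.8 for 5 minutes" condition more concrete, here is a minimal sketch of that logic in plain Python. It is not Grafana's implementation. It only illustrates the idea that a single spike does not trigger the alert, while a load that stays above the threshold for the whole period does. The state names and the sample data are simplified assumptions.

```python
THRESHOLD = 0.8           # CPU load as a fraction, i.e. 80%
PENDING_SECONDS = 5 * 60  # the load must stay above the threshold this long


def alert_states(samples, threshold=THRESHOLD, pending_seconds=PENDING_SECONDS):
    """Return one state per sample: "Normal", "Pending" or "Firing".

    samples is a list of (timestamp_in_seconds, cpu_load) ordered by time.
    """
    states = []
    breach_started = None  # time when the load first went above the threshold
    for timestamp, load in samples:
        if load > threshold:
            if breach_started is None:
                breach_started = timestamp
            if timestamp - breach_started >= pending_seconds:
                states.append("Firing")
            else:
                states.append("Pending")
        else:
            # Any sample at or below the threshold resets the timer.
            breach_started = None
            states.append("Normal")
    return states


if __name__ == "__main__":
    # One sample per minute (hypothetical values).
    loads = [0.5, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.7]
    samples = [(i * 60, load) for i, load in enumerate(loads)]
    for (timestamp, load), state in zip(samples, alert_states(samples)):
        print(timestamp, load, state)
```

In this sample, the load rises above 0.8 at the 60-second mark, so the state becomes "Firing" only at 360 seconds, once it has stayed above the threshold for five full minutes.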
Step 3. Set up notification routing
Go to Alerting > Notification policies. In the window that opens:
- Fill in the Matching labels fields.
- In the Contact point field, select how you want to receive notifications.
Routing is now configured. If the CPU load stays above the 0.8 threshold for 5 minutes, a notification is sent through the selected contact point, for example, to the specified email address.
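The idea behind matching labels can be sketched in a few lines of Python: an alert goes to the contact point of the first policy whose labels all match the alert's labels, and to a default contact point otherwise. This is a simplified illustration rather than Grafana's actual routing code. The label names, policy structure, and contact point names are hypothetical.

```python
def pick_contact_point(alert_labels, policies, default="default-email"):
    """Return the contact point of the first policy matching the alert labels."""
    for policy in policies:
        matchers = policy["matchers"]
        # A policy matches only if every one of its labels equals the alert's.
        if all(alert_labels.get(key) == value for key, value in matchers.items()):
            return policy["contact_point"]
    return default


if __name__ == "__main__":
    # Hypothetical policies, for illustration only.
    policies = [
        {"matchers": {"severity": "critical", "team": "infra"}, "contact_point": "infra-telegram"},
        {"matchers": {"severity": "critical"}, "contact_point": "ops-email"},
    ]
    cpu_alert = {"alertname": "CPU Usage", "severity": "critical", "team": "infra"}
    print(pick_contact_point(cpu_alert, policies))
```

This is why the labels you add to the alert rule must match the ones entered in the Matching labels fields. If they do not, the notification falls back to the default contact point.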