Customizing Signoz Error Alert Descriptions

TLDR Sumanth needed help with making Signoz error alert descriptions more descriptive. Amol suggested using {{.Labels.serviceName}} and discussed limitations of adding labels in queries.

Sumanth
Mon, 26 Jun 2023 06:05:50 UTC

I'm a bit new to Signoz and haven't been able to find answers to some of my questions online, so please help if possible. I want to set up alerts on general exceptions received by Signoz. This is the existing alert description: ```This alert is fired when the error count (current value: {{$value}}) crosses the threshold ({{$threshold}})``` I want to make the description more informative by including what the error is. Which variables can I access in the description with {{ }}, and where can I find the list of them?

Sumanth
Mon, 26 Jun 2023 06:09:39 UTC

I'm also attaching the ClickHouse query used and the error alert message I received. I would love to hear any tips or improvements I could make.

```
SELECT
    count() AS value,
    toStartOfInterval(timestamp, toIntervalMinute(1)) AS interval,
    serviceName
FROM signoz_traces.distributed_signoz_error_index_v2
WHERE timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}}
GROUP BY serviceName, interval;

-- available variables:
--   {{.start_datetime}}
--   {{.end_datetime}}
-- required column alias:
--   value
--   interval
```

The example error alert is as follows:

```
Signoz-Alerts APP 12:06 PM
[FIRING:1] General Error Catch for (serviceName="Dev-L1Enricher", severity="error")
Alert: General Error Catch - error
Summary: The rule threshold is set to 2, and the observed metric value is 4
Description: This alert is fired when the error count (current value: 4) crosses the threshold (2)
Details:
• alertname: General Error Catch
• ruleId: 1
• ruleSource:
Show more

Signoz-Alerts APP 12:16 PM
[RESOLVED] General Error Catch for (serviceName="Dev-L1Enricher", severity="error")
Alert: General Error Catch - error
Summary: The rule threshold is set to 2, and the observed metric value is 4
Description: This alert is fired when the error count (current value: 4) crosses the threshold (2)
Details:
• alertname: General Error Catch
• ruleId: 1
• ruleSource:
• serviceName: Dev-L1Enricher
• severity: error
Show less
```

I am unable to see what the error is. Also, the message says the error was resolved without any action on my part.

Pranay
Mon, 26 Jun 2023 07:38:08 UTC

Amol, do you have more ideas on this?

Amol
Mon, 26 Jun 2023 08:06:59 UTC

Each column in the SELECT clause other than value and timestamp is treated as a label. So in your query, serviceName would be available as {{.Labels.serviceName}}. If you add this Go template expression to the description, the service name will appear in place of {{.Labels.serviceName}}.
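
For example, the existing description could be extended along these lines. This is only a sketch: it reuses {{$value}} and {{$threshold}} from the original description and the serviceName label from the query in this thread, and the exact wording is illustrative.

```
This alert is fired when the error count for service {{.Labels.serviceName}} (current value: {{$value}}) crosses the threshold ({{$threshold}})
```

For the sample alert posted earlier, the label shown in the details is serviceName: Dev-L1Enricher, so that is the value the placeholder would be expected to pick up.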

Amol
Mon, 26 Jun 2023 08:09:17 UTC

The caveat is that labels are also used to uniquely identify each row, so there is a limit on what you can pick in the SELECT clause. Say you want to group by serviceName and apply the alert condition to the resulting rows: if you then add the error description to the SELECT, the result of the query changes as well. So make sure your query result has the same number of rows after you include the new labels.
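
To make the caveat concrete, here is a sketch of what happens when an error column is added to the query from this thread. It assumes the error index table exposes an exceptionType column, which is an assumption to verify against your own schema. Because a non-aggregated column in the SELECT also has to appear in the GROUP BY, the result no longer has one row per service and interval. It has one row per service, exception type and interval, and the threshold is then compared against each of those smaller counts.

```
-- Sketch only: exceptionType is an assumed column name; verify it in your schema.
SELECT
    count() AS value,
    toStartOfInterval(timestamp, toIntervalMinute(1)) AS interval,
    serviceName,
    exceptionType  -- as a label, this would be referenced as {{.Labels.exceptionType}}
FROM signoz_traces.distributed_signoz_error_index_v2
WHERE timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}}
GROUP BY serviceName, exceptionType, interval;
```

If the alert is meant to fire on the total error count per service, this variant changes what the alert measures, which is exactly the situation described above.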