Trace-based alerts
A Trace-based alert in SigNoz allows you to define conditions based on trace data, triggering alerts when these conditions are met. Here's a breakdown of the various sections and options available when configuring a Trace-based alert:
Step 1: Define the Trace Metric
In this step, you use the Traces Query Builder to define the metric that the alert evaluates from your trace data. The Traces Query Builder includes the following fields:
Traces: A field to filter the trace data to monitor.
Aggregate Attribute: Lets you choose how the trace data is aggregated, for example with a function such as "Count".
Group by: Lets you group trace data by different span/trace attributes, like "serviceName", "Status" or other custom attributes.
Legend Format: An optional field to define the format for the legend in the visual representation of the alert.
Having: Applies conditions that filter the results further based on the aggregate value.

Step 2: Define Alert Conditions
In this step, you define the specific conditions for triggering the alert, as well as the frequency of checking those conditions. The condition configuration of an alert in SigNoz consists of 5 core parts:
Query
An alert can contain one or more queries and formulas that fetch the data you want to evaluate. However, only one of them can be used as the evaluation target for the alert condition.
For example:
- A = Total request count
- B = Total error count
- C = B / A (Error rate)
You can use query C as the evaluation target to trigger alerts based on error rate.
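The A, B, and C example can be sketched in a few lines of Python. This is only an illustration of the arithmetic behind the formula: the sample numbers are made up, the `error_rate` helper is hypothetical, and skipping buckets with zero requests is an assumption rather than documented SigNoz behavior.

```python
# Hypothetical per-minute values for queries A and B (not real SigNoz output).
total_requests = [120, 100, 0, 80]  # A: total request count
total_errors = [6, 10, 0, 20]       # B: total error count


def error_rate(errors, requests):
    """Formula C = B / A per time bucket.

    Buckets where A is zero yield None (an assumption made for this sketch).
    """
    return [e / r if r else None for e, r in zip(errors, requests)]


c = error_rate(total_errors, total_requests)
print(c)  # [0.05, 0.1, None, 0.25]
```

Only the series produced by C would then be compared against the threshold.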
Condition
This defines the logical condition to check against the selected query’s value.
Operator | Description | Example Usage |
---|---|---|
Above | Triggers if the value is greater than | CPU usage Above 90 (%) |
Below | Triggers if the value is less than | Apdex score Below 0.8 |
Equal to | Triggers if the value is exactly equal | Request count Equal to 0 |
Not equal to | Triggers if the value is not equal | Instance status Not equal to 1 |
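As a rough mental model, each operator in the table corresponds to an ordinary comparison between the query value and the threshold. The sketch below expresses that mapping in Python; the `CONDITIONS` dictionary and the `condition_holds` function are hypothetical names used for illustration, not part of SigNoz.

```python
import operator

# Condition names from the table mapped to standard Python comparisons.
CONDITIONS = {
    "above": operator.gt,
    "below": operator.lt,
    "equal to": operator.eq,
    "not equal to": operator.ne,
}


def condition_holds(value, condition, threshold):
    """Return True if `value <condition> threshold` is satisfied."""
    return CONDITIONS[condition.lower()](value, threshold)


print(condition_holds(95, "Above", 90))     # CPU usage Above 90 (%)
print(condition_holds(0.75, "Below", 0.8))  # Apdex score Below 0.8
```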
Match Type
Specifies how the condition must hold over the evaluation window. This allows for flexible evaluation logic.
Match Type | Description | Example Use Case |
---|---|---|
at least once | Trigger if condition matches even once in the window | Detect spikes or brief failures |
all the times | Trigger only if condition matches at all points in the window | Ensure stable violations before alerting |
on average | Evaluate the average value in the window | Average latency Above 500ms |
in total | Evaluate the total sum over the window | Total errors Above 100 |
last | Only the last data point is evaluated | Used when only latest status matters |
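The difference between the match types is easiest to see on a concrete window of values. The following sketch applies each match type from the table to the same list of data points. The `window_matches` function and the sample latencies are hypothetical, and the sketch reflects the table's descriptions rather than the actual SigNoz implementation.

```python
def window_matches(values, match_type, predicate):
    """Apply a match type to the data points of one evaluation window.

    `predicate` is the condition check, e.g. lambda v: v > 500.
    """
    if not values:
        return False
    if match_type == "at least once":
        return any(predicate(v) for v in values)
    if match_type == "all the times":
        return all(predicate(v) for v in values)
    if match_type == "on average":
        return predicate(sum(values) / len(values))
    if match_type == "in total":
        return predicate(sum(values))
    if match_type == "last":
        return predicate(values[-1])
    raise ValueError(f"unknown match type: {match_type}")


# Hypothetical latency values in ms for a 5-point window.
latencies_ms = [420, 610, 480, 530, 450]
above_500 = lambda v: v > 500

for match_type in ("at least once", "all the times", "on average", "in total", "last"):
    print(match_type, window_matches(latencies_ms, match_type, above_500))
```

With this data, "at least once" and "in total" match, while "all the times", "on average" (the mean is 498), and "last" do not.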
Evaluation Window (For)
Specifies how long the condition must be true before the alert is triggered.
e.g. For 5 minutes = The condition must remain true continuously for 5 minutes before the alert is triggered.
This helps reduce false positives due to short-lived spikes.
Threshold
This is the value you are comparing the query result against.
e.g. If you choose Condition = Above and set Threshold = 500, the alert will fire when the query result exceeds 500.
Threshold Unit
Specifies the unit of the threshold, such as:
- ms (milliseconds) for latency
- % for CPU usage
- Count for request totals
The unit helps interpret the threshold in the correct context and ensures that the threshold and the query result are scaled to the same unit before they are compared.
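To see why the unit matters for scaling, consider a query that returns nanoseconds while the threshold is entered in seconds or milliseconds. The sketch below converts the threshold to nanoseconds before comparing. The `NANOS_PER_UNIT` table, the `to_nanos` helper, and the sample value are assumptions made for illustration.

```python
# Conversion factors to nanoseconds (illustrative subset of time units).
NANOS_PER_UNIT = {
    "ns": 1,
    "us": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
}


def to_nanos(value, unit):
    """Scale a threshold expressed in `unit` to nanoseconds."""
    return value * NANOS_PER_UNIT[unit]


query_result_ns = 1_250_000_000  # hypothetical query result in nanoseconds
threshold_ns = to_nanos(1, "s")  # threshold of 1 second

print(query_result_ns > threshold_ns)  # True: 1.25 s exceeds 1 s
```

Comparing the raw numbers without scaling (1,250,000,000 against 1) would make almost every data point look like a violation.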
Advanced Options
In addition, there are 3 more advanced options:
Alert Frequency
- How frequently SigNoz evaluates the alert condition.
- Default is 1 min.
- e.g. If set to 1 min, the alert will run once every minute.
Notification for missing data points
- Triggers an alert if no data is received for the configured time period.
- Useful for services where consistent data is expected.
- E.g. If set to 5 minutes, and no data is received during that period, the alert will fire.
Minimum Data Points in Result Group
- Ensures the alert condition is evaluated only when there's enough data for statistical significance.
- Helps avoid false alerts due to missing or sparse data points.
- E.g. If set to 3, the query must return at least 3 data points in the evaluation window for the alert to be considered.
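The missing-data and minimum-data-points options can be pictured as two guards that run before the condition itself is checked. The sketch below is a simplified model under stated assumptions: the `pre_check` function, its return strings, and the order in which the guards run are all hypothetical and are not taken from SigNoz.

```python
def pre_check(points, now_s, min_points, missing_after_s):
    """Decide what to do with one result group before evaluating the condition.

    points: list of (unix_timestamp_seconds, value) tuples, oldest first.
    Returns "missing_data", "skipped", or "evaluate" (labels invented here).
    """
    # No data at all, or the newest point is older than the configured period.
    if not points or now_s - points[-1][0] >= missing_after_s:
        return "missing_data"
    # Too few points for the condition to be considered.
    if len(points) < min_points:
        return "skipped"
    return "evaluate"


now = 1_000
print(pre_check([(940, 1), (970, 2), (995, 3)], now, min_points=3, missing_after_s=300))
print(pre_check([(995, 3)], now, min_points=3, missing_after_s=300))
print(pre_check([(600, 1)], now, min_points=3, missing_after_s=300))
```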

Step 3: Alert Configuration
In this step, you set the alert's metadata, including severity, name, and description:
Severity
Set the severity level for the alert (e.g., "Warning" or "Critical").
Alert Name
A field to name the alert for easy identification.
Alert Description
Add a detailed description for the alert, explaining its purpose and trigger conditions.
You can incorporate result attributes in the alert descriptions to make the alerts more informative:
Syntax: Use $<attribute-name> to insert attribute values. Attribute values can be any attribute used in group by.
Example: If your query has the attribute service.name in the group by clause, use $service.name to include it in the alert description.
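The effect of the $<attribute-name> syntax can be illustrated with a small substitution routine. This is a sketch of the idea only: the `render_description` function and the sample labels are hypothetical, and leaving unknown placeholders untouched is an assumption, not documented SigNoz behavior.

```python
import re

# Matches $name where name may contain dots, e.g. $service.name.
# A trailing sentence period is not consumed because a dot must be
# followed by at least one name character.
PLACEHOLDER = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z0-9_]+)*)")


def render_description(template, labels):
    """Replace $attribute placeholders with values from the group-by labels."""
    return PLACEHOLDER.sub(
        lambda m: str(labels.get(m.group(1), m.group(0))), template
    )


labels = {"service.name": "checkout"}  # hypothetical group-by result
text = render_description("High P90 latency on $service.name.", labels)
print(text)  # High P90 latency on checkout.
```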
Slack alert format
Advanced Slack formatting is supported if you use Slack as a notification channel.
Test Notification
A button to test the alert to ensure that it works as expected.

Examples
1. Alert when external API latency (P90) is over 1 second for the last 5 mins
Step 1: Write Query Builder query to define alert metric

Using the externalHttpUrl attribute, we can filter for a specific external API endpoint, then set the aggregate attribute to durationNano with the P90 aggregation operation to plot a chart of the 90th percentile latency. You can also choose Avg or any other operation as the aggregate operation, depending on your needs.

Remember to select nanoseconds as the y-axis unit, because the aggregate key is durationNano.
Step 2: Set alert conditions

The condition is set to trigger a notification if the per-minute external API latency (P90) exceeds the threshold of 1 second all the time in the last five minutes.
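The logic of this alert can be approximated offline as follows: compute a P90 per minute from span durations in nanoseconds, then require every minute in the five-minute window to exceed 1 second. The sketch uses the nearest-rank percentile method and made-up span data. Both are assumptions, and the percentile that SigNoz computes may differ slightly. The `p90` and `alert_fires` names are hypothetical.

```python
ONE_SECOND_NS = 1_000_000_000


def p90(durations_ns):
    """90th percentile using the nearest-rank method (an assumption)."""
    ordered = sorted(durations_ns)
    rank = (9 * len(ordered) + 9) // 10  # integer form of ceil(0.9 * n)
    return ordered[rank - 1]


def alert_fires(per_minute_durations, threshold_ns=ONE_SECOND_NS):
    """True if the per-minute P90 exceeds the threshold in every minute."""
    series = [p90(bucket) for bucket in per_minute_durations if bucket]
    return bool(series) and all(value > threshold_ns for value in series)


# Hypothetical minutes with 10 spans each (durationNano values).
slow_minute = [200_000_000] * 8 + [1_500_000_000] * 2  # P90 = 1.5 s
fast_minute = [200_000_000] * 9 + [1_500_000_000]      # P90 = 0.2 s

print(alert_fires([slow_minute] * 5))                  # True
print(alert_fires([slow_minute] * 4 + [fast_minute]))  # False
```

A single minute below the threshold is enough to keep the alert from firing, which is what distinguishes this match type from "at least once".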
Step 3: Set alert configuration

Finally, configure the alert as Warning, then add a name and a notification channel.
Last updated: June 6, 2024