Overview
When monitoring EC2 instances and viewing CPU utilization metrics in SigNoz, you may notice discrepancies between what you see in SigNoz and AWS CloudWatch. This guide explains how to configure CPU metrics for EC2 instance-level monitoring and how aggregation intervals affect the values you see.
Configuring CPU metrics for EC2 instance-level monitoring
By default, CPU metrics may be grouped by individual CPU cores, which shows utilization per core rather than overall EC2 instance utilization. To view the maximum CPU utilization for an entire EC2 instance:
- Remove the "Group by CPU" option from your metric configuration
- Keep only the EC2 instance identifier (such as ec2.tag.Name) in your grouping
This will aggregate CPU utilization across all cores and show the overall EC2 instance performance.
Example
When monitoring an EC2 instance with multiple CPU cores:
- Per-core view: Shows individual utilization for each CPU core (e.g., CPU0: 30%, CPU1: 45%, CPU2: 20%)
- EC2 instance-level view: Shows maximum or average utilization across all cores (e.g., for the cores above, a maximum of 45% or an average of about 32%)
To get the EC2 instance-level view, ensure your query groups only by the EC2 instance identifier and not by individual CPU attributes.
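The difference between the two views can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how SigNoz evaluates queries; the instance name `web-1`, the core labels, and the helper `aggregate_by_instance` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-core samples: (instance name, core label, utilization %).
samples = [
    ("web-1", "cpu0", 30.0),
    ("web-1", "cpu1", 45.0),
    ("web-1", "cpu2", 20.0),
]


def aggregate_by_instance(rows, reducer):
    """Collapse per-core values into one value per instance.

    Grouping only by the instance identifier (and not by core) is what
    turns the per-core view into the instance-level view.
    """
    by_instance = defaultdict(list)
    for instance, _core, value in rows:
        by_instance[instance].append(value)
    return {instance: reducer(values) for instance, values in by_instance.items()}


instance_max = aggregate_by_instance(samples, max)   # busiest core per instance
instance_avg = aggregate_by_instance(samples, mean)  # mean across cores

print(instance_max)
print(instance_avg)
```

Whether the maximum or the average is the right reducer depends on the question: the maximum highlights a single saturated core, while the average is closer to an overall instance utilization figure.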
Understanding aggregation intervals for EC2 monitoring
Differences in EC2 CPU utilization values between SigNoz and AWS CloudWatch often stem from different aggregation intervals:
- Short time periods: When viewing recent EC2 data (hours to 1 day), metrics use shorter aggregation intervals that capture more granular spikes
- Longer time periods: When viewing EC2 data over several days, the aggregation interval automatically increases to 15 minutes to manage data volume
For comparison, AWS CloudWatch reports EC2 metrics at 5-minute intervals with basic monitoring (1-minute intervals if detailed monitoring is enabled), while SigNoz may use 15-minute intervals for longer time ranges.
How aggregation intervals work
Aggregation intervals determine how raw metric data points are grouped and averaged over time:
- Raw data collection: Metrics are collected at regular intervals (e.g., every 10-60 seconds depending on your configuration)
- Temporal aggregation: For display purposes, these raw data points are aggregated into larger time windows
- Automatic adjustment: The aggregation interval increases as you view longer time ranges to keep the number of data points manageable
This means:
- Viewing last 1 hour: May use 1-minute aggregation intervals
- Viewing last 24 hours: May use 5-minute aggregation intervals
- Viewing last 7 days: May use 15-minute aggregation intervals
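To see why the interval grows with the time range, it helps to count how many points each combination produces. The sketch below uses the illustrative range and interval pairs from the list above; they are examples, not guaranteed SigNoz values, and `point_count` is a hypothetical helper.

```python
HOUR = 3600
DAY = 24 * HOUR


def point_count(range_seconds, step_seconds):
    """Number of aggregated points drawn for a time range at a given step."""
    return range_seconds // step_seconds


# Illustrative (time range, aggregation interval) pairs, both in seconds.
examples = {
    "last 1 hour": (HOUR, 60),
    "last 24 hours": (DAY, 5 * 60),
    "last 7 days": (7 * DAY, 15 * 60),
}

counts = {label: point_count(rng, step) for label, (rng, step) in examples.items()}

# For contrast: 7 days at a fixed 1-minute interval.
seven_days_at_one_minute = point_count(7 * DAY, 60)

print(counts)
print(seven_days_at_one_minute)
```

With these example values every view stays in the range of tens to hundreds of points per series, whereas a fixed 1-minute interval over 7 days would produce more than ten thousand.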
Impact on metrics
Longer aggregation intervals can smooth out short-lived spikes:
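- A 2-minute CPU spike to 90% may appear as only about 60% when averaged over a 15-minute window in which the remaining 13 minutes sit near 55%; with a lower baseline, the averaged value drops further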
- The same spike would be more visible with a 1-minute aggregation interval
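The smoothing effect is plain averaging, which a short Python sketch makes concrete. It assumes one sample per minute and a flat baseline outside the spike; both are simplifications, and `window_average` is a hypothetical helper rather than anything SigNoz exposes.

```python
def window_average(baseline, spike, spike_minutes, window_minutes):
    """Average of per-minute samples when `spike_minutes` of the window
    sit at `spike` and the remaining minutes sit at `baseline`."""
    samples = [spike] * spike_minutes + [baseline] * (window_minutes - spike_minutes)
    return sum(samples) / len(samples)


# A 2-minute spike to 90% on an otherwise quiet (20%) instance,
# averaged over a 15-minute window.
avg_15 = window_average(baseline=20.0, spike=90.0, spike_minutes=2, window_minutes=15)

# The same spike seen through a 1-minute window that falls inside it.
avg_1 = window_average(baseline=20.0, spike=90.0, spike_minutes=1, window_minutes=1)

print(round(avg_15, 1), avg_1)
```

The lower the baseline and the longer the window, the more a short spike is diluted, which is why the same event can look very different in a 7-day view and a 1-hour view.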
Viewing detailed EC2 CPU spikes
To see more detailed EC2 CPU spikes that may be averaged out in longer time periods:
- Zoom into specific time ranges where you expect to see EC2 CPU spikes
- Use shorter time periods (hours instead of days) for more granular EC2 data
- Consider the collection interval of your EC2 metrics (configured in the hostmetrics receiver)
This will reveal EC2 CPU spikes that may be smoothed out when viewing longer time periods with larger aggregation intervals.
Current limitations
Aggregation intervals are currently not configurable and automatically adjust based on the selected time range to optimize performance and data visualization.
Related resources
- Hostmetrics Configuration - Configure CPU metric collection
- Metric Types and Aggregation - Learn about temporal and spatial aggregation
- EC2 Infrastructure Metrics - Monitor EC2 instances with SigNoz
- Infrastructure Monitoring Overview - Explore infrastructure monitoring features