This dashboard monitors AWS ElastiCache Redis performance and resource utilization, with visibility into cache hits and misses, CPU usage, memory consumption, and network traffic.
To use this dashboard, you need to set up the data source and send telemetry to SigNoz. Follow the AWS ElastiCache Redis Integration guide to get started.
To import the dashboard, go to Dashboards → + New dashboard → Import JSON.
What This Dashboard Monitors
This dashboard tracks essential AWS ElastiCache Redis metrics to support:
- Performance Monitoring: Track CPU utilization, memory usage, and cache hit rates
- Capacity Planning: Monitor memory usage percentages and capacity utilization
- Network Analysis: Analyze network traffic patterns and bandwidth utilization
- Cache Efficiency: Track cache hit rates, evictions, and memory fragmentation
- Replication Health: Monitor replication lag and connection metrics
- Resource Optimization: Identify performance bottlenecks and optimize configurations
Metrics Included
CPU and Engine Performance
CPU Utilization
- Description: Host-level CPU utilization percentage for the ElastiCache node
- Use Case: Monitor overall system performance and identify CPU bottlenecks
- Grouping: By cache cluster ID for multi-cluster monitoring
Engine CPU Utilization
- Description: Redis engine-specific CPU utilization percentage
- Use Case: Track Redis process CPU consumption separate from system overhead
- Grouping: By cache cluster ID to compare engine performance across clusters
Memory Management
Database Memory Usage Percentage
- Description: Percentage of allocated memory currently used by Redis
- Use Case: Monitor memory consumption and plan for capacity needs
- Critical Thresholds: High values (>80%) may indicate a need for scaling
Database Capacity Usage Percentage
- Description: Percentage of total available memory capacity in use
- Use Case: Track capacity utilization for scaling decisions
- Planning: Essential for understanding growth patterns
Memory Fragmentation Ratio
- Description: Ratio of memory allocated to Redis by the operating system (resident set size) to memory Redis uses for its data
- Use Case: Identify memory fragmentation issues that can impact performance
- Optimization: Values significantly above 1.0 may indicate fragmentation
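As a rough illustration of how this ratio behaves, the sketch below divides resident memory by the memory Redis reports as used. It assumes the metric mirrors the `mem_fragmentation_ratio` field of Redis `INFO` (computed from `used_memory_rss` and `used_memory`); the function name and the sample values are hypothetical.

```python
def fragmentation_ratio(used_memory_rss: int, used_memory: int) -> float:
    """Return resident memory divided by memory used by Redis data.

    Assumption: both inputs are byte counts taken from the Redis INFO
    fields used_memory_rss and used_memory.
    """
    if used_memory <= 0:
        raise ValueError("used_memory must be positive")
    return used_memory_rss / used_memory


# Hypothetical sample: 1.5 GiB resident for 1 GiB of stored data.
ratio = fragmentation_ratio(1_610_612_736, 1_073_741_824)
print(f"{ratio:.2f}")  # 1.50
```

A value near 1.0 means resident memory closely matches the data size; the sample above shows 50% overhead.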
Freeable Memory
- Description: Amount of memory available for allocation (in bytes)
- Use Case: Monitor available memory before hitting limits
- Unit: Displayed in decimal bytes for precise capacity tracking
Cache Performance
Cache Hit Rate
- Description: Percentage of cache lookups served from the cache (hits out of hits plus misses)
- Use Case: Measure cache effectiveness and application performance
- Optimization: Higher hit rates indicate better cache utilization
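A hit rate can be derived from raw hit and miss counts. The sketch below is a minimal example under the assumption that you have both counts for the same time window; the function name and the sample numbers are hypothetical and not taken from the dashboard definition.

```python
def cache_hit_rate(hits: int, misses: int) -> float:
    """Return the hit rate as a percentage of all lookups.

    Returns 0.0 when there were no lookups, to avoid dividing by zero.
    """
    total = hits + misses
    if total == 0:
        return 0.0
    return 100.0 * hits / total


# Hypothetical window: 9,200 hits and 800 misses.
print(cache_hit_rate(9_200, 800))  # 92.0
```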
Evictions
- Description: Number of items evicted from cache due to memory pressure
- Use Case: Monitor memory pressure and optimize cache policies
- Troubleshooting: High eviction rates may indicate insufficient memory
Network Activity
Network Bytes In
- Description: Number of bytes received by the ElastiCache node
- Use Case: Monitor inbound network traffic and bandwidth utilization
- Unit: Displayed in decimal bytes for traffic analysis
Network Bytes Out
- Description: Number of bytes sent from the ElastiCache node
- Use Case: Track outbound network traffic and response data volume
- Capacity Planning: Essential for understanding network requirements
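For bandwidth planning it can help to convert a byte count into a throughput figure. The sketch below assumes each datapoint is the total number of bytes transferred during one aggregation period (for example, a 60-second sum); the function name and the sample values are hypothetical.

```python
def bytes_to_mbps(total_bytes: float, period_seconds: float) -> float:
    """Convert bytes transferred in a period to decimal megabits per second."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    bits_per_second = total_bytes * 8 / period_seconds
    return bits_per_second / 1_000_000


# Hypothetical datapoint: 450 MB sent during a 60-second period.
print(bytes_to_mbps(450_000_000, 60))  # 60.0
```

Comparing this figure with the network baseline of the node type gives a sense of remaining headroom.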
Connection and System Metrics
Current Connections
- Description: Number of active client connections to the Redis instance
- Use Case: Monitor connection usage and identify connection leaks
- Capacity Planning: Track connection patterns for scaling decisions
Swap Usage
- Description: Amount of swap space used by the ElastiCache node
- Use Case: Identify memory pressure that forces swapping
- Performance Impact: High swap usage can significantly degrade performance
Replication Lag
- Description: Maximum lag time between primary and replica nodes (in seconds)
- Use Case: Monitor replication health in Redis clusters
- Reliability: Essential for ensuring data consistency across replicas
Dashboard Variables
This dashboard includes one filter variable, cache_cluster_id, which filters metrics by ElastiCache cluster ID:
- Multi-select: Monitor multiple clusters simultaneously
- All option: View aggregate metrics across all clusters
- Dynamic: Automatically populates from available cluster IDs
Monitoring Best Practices
Performance Optimization
- Monitor Cache Hit Rate to ensure efficient cache utilization
- Track CPU Utilization metrics to identify processing bottlenecks
- Watch Memory Fragmentation Ratio for memory efficiency
Capacity Planning
- Use Database Memory Usage Percentage for scaling decisions
- Monitor Freeable Memory to prevent out-of-memory conditions
- Track Network Bytes for bandwidth planning
Troubleshooting
- High Evictions may indicate insufficient memory allocation
- Elevated Swap Usage suggests memory pressure
- Increased Replication Lag indicates replication issues
Alerting Recommendations
- CPU Utilization > 80% - Consider scaling or optimization
- Database Memory Usage Percentage > 85% - Plan for memory scaling
- Cache Hit Rate < 90% - Review cache strategy
- Replication Lag > 5 seconds - Investigate replication health
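The thresholds above can be expressed as a small rule table. The following is a minimal sketch rather than SigNoz alert configuration: the metric keys, the `Rule` class, and the sample values are all hypothetical names chosen for illustration, and only the thresholds and suggested actions come from the list above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    metric: str                        # hypothetical key for the metric value
    breached: Callable[[float], bool]  # returns True when the threshold is crossed
    action: str                        # suggested follow-up from the list above


RULES = [
    Rule("cpu_utilization_pct", lambda v: v > 80, "Consider scaling or optimization"),
    Rule("memory_usage_pct", lambda v: v > 85, "Plan for memory scaling"),
    Rule("cache_hit_rate_pct", lambda v: v < 90, "Review cache strategy"),
    Rule("replication_lag_seconds", lambda v: v > 5, "Investigate replication health"),
]


def evaluate(sample: dict[str, float]) -> list[str]:
    """Return one message per rule whose threshold the sample crosses."""
    return [
        f"{rule.metric}: {rule.action}"
        for rule in RULES
        if rule.metric in sample and rule.breached(sample[rule.metric])
    ]


# Hypothetical readings for one cluster.
sample = {
    "cpu_utilization_pct": 72.0,
    "memory_usage_pct": 88.5,
    "cache_hit_rate_pct": 93.0,
    "replication_lag_seconds": 7.2,
}
for message in evaluate(sample):
    print(message)
# memory_usage_pct: Plan for memory scaling
# replication_lag_seconds: Investigate replication health
```

Metrics missing from a sample are skipped rather than treated as breaches, so a partial reading does not raise false alerts.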