AWS SQS Prometheus Dashboard

This dashboard monitors AWS Simple Queue Service (SQS) using Prometheus metrics collected by a third-party exporter. It shows how many messages in each queue are visible, delayed, or in flight, which helps you tune queue performance and troubleshoot message processing issues.

To import this dashboard, go to Dashboards → + New dashboard → Import JSON.

What This Dashboard Monitors

This dashboard tracks essential AWS SQS metrics to help you:

  • Queue Health Monitoring: Track message states across different queue phases
  • Message Flow Analysis: Monitor visible, delayed, and in-flight message counts
  • Processing Efficiency: Identify bottlenecks and processing delays
  • Capacity Planning: Understand queue utilization patterns and scaling needs
  • Multi-Environment Support: Monitor queues across different deployment environments
  • Regional Monitoring: Track SQS performance across AWS regions

Prerequisites

This dashboard is built using metrics from the third-party SQS Prometheus exporter:

  • Repository: https://github.com/jmal98/sqs-exporter
  • Setup Required: Deploy the SQS exporter to collect metrics from your AWS SQS queues
  • Prometheus Integration: Ensure your Prometheus instance is configured to scrape the exporter's metrics endpoint
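
Once the exporter is running, it can be useful to confirm that its output contains the three metrics this dashboard relies on. The following is a minimal sketch, assuming you have already fetched the exporter's Prometheus text-format output as a string. The helper names and the sample text are hypothetical and only illustrate the check.

```python
# Metric names this dashboard relies on (taken from the sections below).
REQUIRED_METRICS = (
    "sqs_approximatenumberofmessages",
    "sqs_approximatenumberofmessagesdelayed",
    "sqs_approximatenumberofmessagesnotvisible",
)


def metric_names(exposition_text: str) -> set[str]:
    """Collect metric names from Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line looks like `name{labels} value` or `name value`.
        names.add(line.split("{", 1)[0].split()[0])
    return names


def missing_metrics(exposition_text: str) -> list[str]:
    """Return the required metrics that do not appear in the output."""
    present = metric_names(exposition_text)
    return [name for name in REQUIRED_METRICS if name not in present]


# Hypothetical sample; the label name and values are made up for illustration.
SAMPLE = """\
# HELP sqs_approximatenumberofmessages Illustrative help text
sqs_approximatenumberofmessages{queue="orders"} 12
sqs_approximatenumberofmessagesdelayed{queue="orders"} 0
"""

print(missing_metrics(SAMPLE))
```

An empty result means all three metrics are present. The host, port, and path of the metrics endpoint depend on how you deployed the exporter.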

Metrics Included

Queue Message Status Overview

Status of Messages in Queue

  • Description: Comprehensive view of all message states within selected queues
  • Metrics Tracked:
    • Delayed Messages: Messages waiting to be available for processing
    • Visible Messages: Messages ready for immediate consumption
    • Not Visible Messages: Messages currently being processed (in-flight)
  • Use Case: Get a complete picture of queue processing state at a glance
  • Grouping: By queue name for multi-queue monitoring
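
The per-queue grouping described above could be expressed as one PromQL aggregation per message state. This is only a sketch and not the dashboard's actual panel definition. The label name `queue_name` is an assumption, so substitute whatever label your setup attaches to the queue name.

```python
# Message states shown in the panel, mapped to the metrics listed in this document.
STATE_METRICS = {
    "Visible": "sqs_approximatenumberofmessages",
    "Delayed": "sqs_approximatenumberofmessagesdelayed",
    "Not Visible": "sqs_approximatenumberofmessagesnotvisible",
}


def grouped_queries(group_label: str = "queue_name") -> dict[str, str]:
    """Build one `sum by (...)` PromQL expression per message state.

    `group_label` is an assumed label name, not one confirmed by the exporter.
    """
    return {
        state: f"sum by ({group_label}) ({metric})"
        for state, metric in STATE_METRICS.items()
    }


for state, query in grouped_queries().items():
    print(f"{state}: {query}")
```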

Individual Message State Metrics

Approximate Number of Messages

  • Description: Count of messages available for retrieval from the queue
  • Metric: sqs_approximatenumberofmessages
  • Use Case: Monitor queue depth and consumer demand
  • Critical for: Understanding processing backlog and scaling decisions

Approximate Number of Messages Delayed

  • Description: Count of messages that are delayed and not yet available for reading
  • Metric: sqs_approximatenumberofmessagesdelayed
  • Use Case: Track scheduled or delayed message processing
  • Planning: Essential for understanding future processing load

Approximate Number of Messages Not Visible

  • Description: Count of in-flight messages, meaning messages that a consumer has received but not yet deleted and whose visibility timeout has not yet expired
  • Metric: sqs_approximatenumberofmessagesnotvisible
  • Use Case: Monitor concurrent message processing and consumer activity
  • Performance: High values may indicate slow processing or consumer issues
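
To make the relationship between the three gauges concrete, the sketch below combines them into an approximate total and an in-flight share. The class and method names are hypothetical, and the numbers are invented for illustration. What counts as a "high" share is a judgment call for your workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueSnapshot:
    """One reading of the three gauges for a single queue (hypothetical helper)."""

    visible: float      # sqs_approximatenumberofmessages
    delayed: float      # sqs_approximatenumberofmessagesdelayed
    not_visible: float  # sqs_approximatenumberofmessagesnotvisible

    @property
    def total(self) -> float:
        """Approximate number of messages in the queue across all three states."""
        return self.visible + self.delayed + self.not_visible

    def in_flight_share(self) -> float:
        """Fraction of messages currently in flight; 0.0 for an empty queue."""
        return self.not_visible / self.total if self.total else 0.0


snapshot = QueueSnapshot(visible=120, delayed=30, not_visible=50)
print(snapshot.total)              # 200
print(snapshot.in_flight_share())  # 0.25
```

Because all three values are approximations, the derived numbers are best read as trends over time and not as exact counts.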

Dashboard Variables

The dashboard provides three variables for filtering:

Region Selection

  • Variable: region
  • Description: Select specific AWS region for monitoring
  • Type: Single selection
  • Dynamic: Populated from available SQS metrics

Queue Selection

  • Variable: queue.name
  • Description: Select one or multiple SQS queues to monitor
  • Type: Multi-select with "All" option
  • Flexibility: Monitor specific queues or all queues simultaneously

Environment Selection

  • Variable: deployment.environment
  • Description: Filter by deployment environment (staging, production, etc.)
  • Type: Single selection
  • Use Case: Environment-specific monitoring and troubleshooting
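
The three variables can be thought of as label matchers applied to each metric. The sketch below shows one way such a selector might be assembled. The label names `region`, `queue_name`, and `deployment_environment` are assumptions and may differ in your setup, and the function itself is hypothetical and not part of the dashboard.

```python
from typing import Optional, Sequence


def build_selector(
    metric: str,
    region: str,
    environment: str,
    queues: Optional[Sequence[str]] = None,
) -> str:
    """Build a PromQL selector from the dashboard's variable selections.

    Label names here are assumed. Queue names are joined as-is into a regex,
    so escape any regex metacharacters your queue names contain.
    """
    matchers = [
        f'region="{region}"',
        f'deployment_environment="{environment}"',
    ]
    if queues:  # None or an empty selection stands for "All": no queue matcher
        alternatives = "|".join(queues)
        matchers.append(f'queue_name=~"{alternatives}"')
    joined = ", ".join(matchers)
    return f"{metric}{{{joined}}}"


print(build_selector(
    "sqs_approximatenumberofmessages",
    region="us-east-1",
    environment="production",
    queues=["orders", "billing"],
))
```

Using a regex matcher for the multi-select variable lets one query cover any number of selected queues, while the single-select variables map to plain equality matchers.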

Message State Definitions

Visible Messages

  • State: Ready for immediate processing
  • Consumer Action: Can be received and processed
  • Monitoring: Track for queue backlog and consumer capacity

Delayed Messages

  • State: Scheduled for future availability
  • Consumer Action: Not yet available for processing
  • Monitoring: Track scheduled workload and timing patterns

Not Visible (In-Flight) Messages

  • State: Currently being processed by consumers
  • Consumer Action: Received but not yet deleted
  • Monitoring: Indicates active processing and consumer performance
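
The three states above form a simple lifecycle, which the following sketch models as a transition table. It is a simplified illustration: the event names are hypothetical, and paths such as retention expiry or queue purges are left out.

```python
from enum import Enum


class State(Enum):
    DELAYED = "delayed"          # counted by ...messagesdelayed
    VISIBLE = "visible"          # counted by ...numberofmessages
    NOT_VISIBLE = "not_visible"  # counted by ...messagesnotvisible (in flight)
    DELETED = "deleted"          # no longer counted by any of the three metrics


# (current state, event) -> next state. Event names are made up for illustration.
TRANSITIONS = {
    (State.DELAYED, "delay_elapsed"): State.VISIBLE,
    (State.VISIBLE, "received"): State.NOT_VISIBLE,
    (State.NOT_VISIBLE, "deleted"): State.DELETED,
    (State.NOT_VISIBLE, "visibility_timeout_expired"): State.VISIBLE,
}


def step(state: State, event: str) -> State:
    """Apply one event to a message state, rejecting transitions not modeled here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.name} on {event!r}") from None


# A message that is delayed, received, times out, and is then processed successfully.
state = State.DELAYED
for event in ("delay_elapsed", "received", "visibility_timeout_expired",
              "received", "deleted"):
    state = step(state, event)
print(state.name)  # DELETED
```

The timeout transition is why a message whose consumer fails before deleting it shows up again in the visible count.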

Last updated: January 3, 2025
