
Incident Specific Dashboard Spin-Up

SigNoz Cloud - This page applies to SigNoz Cloud editions.
Self-Host - This page applies to self-hosted SigNoz editions.

The checkout service is throwing errors. Your error rate alert fired and you need visibility into what's happening right now. Instead of manually sifting through all your telemetry data, you want a focused dashboard that shows exactly what's broken.

You open your AI assistant, connected to SigNoz via the MCP server, and ask it to build an incident-specific dashboard.

Prerequisites

  • Connect your AI assistant to SigNoz using the MCP Server guide.

Step 1: Create the Incident Dashboard from a Natural Language Prompt

Our checkout service error alert is firing. Create an incident dashboard for the last 1 hour with:

Checkout error rate, latency (p95/p99), and throughput vs last week's baseline
Top error messages and failing traces
Health of key dependency services
Infrastructure: CPU, memory, pod restarts

Your assistant will analyze the checkout service, identify its dependencies, gather the relevant metrics, and build a comprehensive incident dashboard:

✓ Service confirmation: Checkout service detected
  - Error rate: 75% (27 errors / 36 calls in last hour)
  - Dependencies identified: payment, cart, product-catalog, currency, shipping, email

✓ Root cause identified: Payment service
  - Error rate: 52% (26 errors / 50 calls)
  - Top failing operations:
    - oteldemo.CheckoutService/PlaceOrder: 27 failures
    - oteldemo.PaymentService/Charge: 26 failures
  - Pattern: Nearly identical error counts suggest every checkout that reaches payment fails

✓ Dashboard created: "Checkout Service — Incident Response (Last 1h)"
  - Dashboard ID: 019dad99-26b3-75aa-a76d-9060062cbadb

Dashboard organized into four sections:

Section 1: Golden Signals vs 7-day Baseline
- Error Rate % (current vs same window 7 days ago)
- Throughput (req/s) with baseline overlay
- P95 Latency with baseline comparison
- P99 Latency with baseline comparison

Section 2: Top Errors & Failing Traces
- Table of failing operations grouped by span name
- List of recent error traces with traceID and status messages for one-click drill-down

Section 3: Dependency Health
- Error rate % for each dependency: payment, cart, product-catalog, currency, shipping, email
- P95 latency broken out by each downstream service

Section 4: Infrastructure
- CPU utilization for checkout container (grouped by container.id)
- Memory usage for checkout container (grouped by container.id)
- Note: Pod restart metrics not available (requires k8s.pod.* metrics from k8s-infra collector)

Open the dashboard in SigNoz under Dashboards → "Checkout Service — Incident Response (Last 1h)"

The dashboard is now live and provides a complete incident view.
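
The "current vs same window 7 days ago" comparison in Section 1 comes down to two time ranges of equal length: one ending now and one shifted back by a week. Here is a minimal Python sketch of that arithmetic. The helper `incident_windows` is hypothetical (it is not part of SigNoz or the MCP server), and the way the assistant actually builds its queries may differ.

```python
from datetime import datetime, timedelta, timezone


def incident_windows(now, lookback=timedelta(hours=1), baseline_offset=timedelta(days=7)):
    """Return ((start, end) of the current window, (start, end) of the baseline window).

    Hypothetical helper for illustration only; not a SigNoz API.
    """
    current = (now - lookback, now)
    # The baseline is the same window shifted back by the offset (7 days by default).
    baseline = (current[0] - baseline_offset, current[1] - baseline_offset)
    return current, baseline


now = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)  # example timestamp only
current, baseline = incident_windows(now)
print("current :", current[0].isoformat(), "->", current[1].isoformat())
print("baseline:", baseline[0].isoformat(), "->", baseline[1].isoformat())
```

Keeping both windows the same length is what makes the overlay meaningful: each point on the baseline series lines up with the same time of day and day of week as the current series.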

Final Summary

You now have a fully functional incident dashboard, created from a single natural language prompt.

Figure: Incident Service Dashboard, overview and detailed views.

The dashboard shows that the payment service is the likely root cause, with an elevated error rate and high latency.
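
The root-cause reasoning in the assistant's summary rests on simple arithmetic over the reported counts. The short Python sketch below reproduces the percentages from the figures in the assistant output. The function name and the tolerance of two failures are assumptions made for illustration, not how the assistant or SigNoz computes them.

```python
def error_rate_pct(errors: int, calls: int) -> float:
    """Error rate as a percentage; 0.0 when there were no calls."""
    return 100.0 * errors / calls if calls else 0.0


# Counts taken from the assistant output above.
checkout_rate = error_rate_pct(27, 36)
payment_rate = error_rate_pct(26, 50)

# Failure counts of the two top failing operations.
place_order_failures = 27  # oteldemo.CheckoutService/PlaceOrder
charge_failures = 26       # oteldemo.PaymentService/Charge

# Assumed heuristic: if the two counts differ by at most `tolerance`,
# the checkout failures probably originate in the payment call.
tolerance = 2
payment_is_suspect = abs(place_order_failures - charge_failures) <= tolerance

print(f"checkout error rate: {checkout_rate:.0f}%")  # 75%
print(f"payment error rate: {payment_rate:.0f}%")    # 52%
print("payment is the likely root cause:", payment_is_suspect)  # True
```

Matching counts are a hint rather than proof. Opening one of the failing traces from Section 2 confirms whether the checkout error is actually propagated from the payment span.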

Under the Hood

During this workflow, the MCP server called these tools:

| Step | MCP Tool | What It Did |
| --- | --- | --- |
| 1 | signoz_list_services | Verified the checkout service exists and retrieved initial error rate statistics |
| 1 | signoz_get_service_top_operations | Identified checkout service dependencies (payment, cart, product-catalog, currency, shipping, email) and top failing operations |
| 1 | signoz_aggregate_traces | Retrieved error rates, latency percentiles (p95/p99), throughput metrics, and compared against 7-day baseline |
| 1 | signoz_create_dashboard | Created the incident dashboard with four sections covering golden signals, errors, dependency health, and infrastructure |
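
MCP clients invoke tools through JSON-RPC 2.0 `tools/call` requests. The sketch below builds such requests for the four tools listed above, in the same order. The tool names come from this page. Every argument name and value (`service`, `timeRange`, `title`, and so on) is a placeholder assumption, because this page does not show the tools' input schemas. Check the schemas returned by the server's `tools/list` response before relying on any of them.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Tool names are from this page; all arguments below are hypothetical.
calls = [
    tool_call("signoz_list_services", {"timeRange": "1h"}),
    tool_call("signoz_get_service_top_operations", {"service": "checkout", "timeRange": "1h"}),
    tool_call("signoz_aggregate_traces", {"service": "checkout", "timeRange": "1h"}),
    tool_call("signoz_create_dashboard", {"title": "Checkout incident dashboard"}),
]

for call in calls:
    print(json.dumps(call))
```

In practice your AI assistant's MCP client builds and sends these messages for you. The sketch only illustrates the shape of what travels over the wire.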

Related Use Cases

  • Dashboard Creation from Natural Language - Create custom dashboards by describing what you want to visualize in plain English.
  • Alert Correlation Analysis - When multiple services alert simultaneously, identify whether it's a cascade from one failure or separate incidents.
  • On-Call Handoff Brief - Generate a handoff summary of recent incidents and ongoing issues for the next on-call engineer.

If you need help with the steps in this topic, please reach out to us on SigNoz Community Slack.

If you are a SigNoz Cloud user, please use the in-product chat support located at the bottom-right corner of your SigNoz instance, or contact us at cloud-support@signoz.io.

Last updated: May 27, 2026
