Dify Dashboard

Applies to: SigNoz Cloud and self-hosted SigNoz editions.

Before using this dashboard, instrument your Dify applications with OpenTelemetry and configure export to SigNoz. See the Dify observability guide for complete setup instructions.

This dashboard provides a comprehensive view of the langgenius/dify service using trace data. It covers RED metrics (Rate, Errors, Duration), HTTP endpoint performance, LLM/GenAI activity (token usage, model latency, and provider breakdown), Dify app activity, streaming mode distribution, and Postgres/Redis dependency health.

Dashboard Preview

[Figure: Dify Dashboard]

To use the Dify Dashboard Template, import its JSON in SigNoz: Dashboards → + New dashboard → Import JSON.

What This Dashboard Monitors

This dashboard tracks critical performance metrics for your Dify LLM service using OpenTelemetry trace data to help you:

  • Measure Traffic and Errors: See the overall request rate (spans/min) and error span count at a glance so you can spot incidents quickly.
  • Track End-to-End Latency: Monitor p95 latency on root spans to surface regressions and ensure consistent user-facing responsiveness.
  • Optimize LLM Cost: Break down input and output token consumption per model to understand cost drivers and track usage trends.
  • Identify Slow Operations: Drill into per-operation request counts and p95 latency to pinpoint the slowest HTTP endpoints.
  • Monitor LLM Providers: Compare call volumes and response latency across every AI model in use to guide provider selection.
  • Understand App Activity: See which Dify apps and streaming configurations are generating the most traffic.
  • Watch Dependencies: Track Postgres query latency and Redis operation counts to catch database bottlenecks early.

Panels Included

RED Metrics (Summary Row)

| Panel | Type | What It Shows |
|---|---|---|
| Request Rate (spans/min) | Stat | Average spans per minute, showing the overall traffic level for the service |
| Error Spans | Stat | Count of spans with has_error = true; turns red when any errors are present |
| p95 Latency (root spans) | Stat | 95th-percentile latency for root spans (parent_span_id = '') |
| Total LLM Tokens | Stat | Sum of gen_ai.usage.total_tokens across all LLM spans in the selected window |
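
To make the four summary values concrete, here is a minimal Python sketch that computes them from a list of span dictionaries. The span shape (`parent_span_id`, `duration_ms`, `has_error`, `attributes`) and the function names are assumptions made for illustration, and the nearest-rank percentile is a stand-in: SigNoz computes these values in its own query engine and may use a different percentile method.

```python
def nearest_rank_p95(values):
    """Nearest-rank 95th percentile; returns None for an empty list."""
    if not values:
        return None
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n), 1-based
    return ordered[rank - 1]


def red_summary(spans, window_minutes):
    """Summarize hypothetical span dicts the way the four stat panels describe."""
    # Root spans are the ones with an empty parent_span_id.
    root_durations = [
        s["duration_ms"] for s in spans if s.get("parent_span_id", "") == ""
    ]
    return {
        "spans_per_min": len(spans) / window_minutes,
        "error_spans": sum(1 for s in spans if s.get("has_error")),
        "p95_root_ms": nearest_rank_p95(root_durations),
        "total_llm_tokens": sum(
            s.get("attributes", {}).get("gen_ai.usage.total_tokens", 0)
            for s in spans
        ),
    }
```

Restricting the percentile to root spans keeps child spans (database calls, LLM calls) from pulling the figure down, which is why the panel filters on parent_span_id = ''.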

HTTP Operation Breakdown

  • Requests by Operation: Table showing span count per operation name to reveal which endpoints receive the most traffic.
  • p95 Latency by Operation: Table of p95 duration per operation name to quickly identify the slowest operations.

LLM / GenAI Activity

  • LLM p95 Latency by Model: Time-series graph of p95 latency broken down by gen_ai.request.model for comparing response time across providers.
  • LLM Calls by Model: Bar chart of LLM invocation counts per model over time, showing model adoption and call volume trends.
  • Token Usage: Input vs Output (by model): Table of total input and output tokens per model, useful for per-model cost tracking.
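
The per-model token table can be pictured as a simple group-by over LLM spans. The sketch below assumes the input and output counts live in `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens`; those two attribute names are an assumption (this page only names `gen_ai.usage.total_tokens` and `gen_ai.request.model`), so check the attributes your Dify spans actually carry.

```python
from collections import defaultdict


def tokens_by_model(spans):
    """Sum input/output tokens and count calls per gen_ai.request.model."""
    totals = defaultdict(lambda: {"input": 0, "output": 0, "calls": 0})
    for span in spans:
        attrs = span.get("attributes", {})
        model = attrs.get("gen_ai.request.model")
        if model is None:
            continue  # not an LLM span
        row = totals[model]
        # Assumed attribute names; adjust to match your telemetry.
        row["input"] += attrs.get("gen_ai.usage.input_tokens", 0)
        row["output"] += attrs.get("gen_ai.usage.output_tokens", 0)
        row["calls"] += 1
    return dict(totals)
```

Multiplying each column by your provider's per-token price would turn this table into a rough cost estimate; pricing is not part of the dashboard itself.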

Dify App Activity

  • Activity by Dify App: Table of request counts grouped by dify.app_id, showing which Dify applications are most active.
  • Streaming vs Non-Streaming Requests: Pie chart of the dify.streaming attribute, showing the proportion of streaming vs. non-streaming traffic.
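
As a rough illustration of what these two panels aggregate, the sketch below counts spans per `dify.app_id` and computes the share of each `dify.streaming` value. The span dictionary shape is hypothetical, and so is the handling of missing values: spans without `dify.streaming` are bucketed as "unset" here, which may not match how the dashboard treats them.

```python
from collections import Counter


def activity_by_app(spans):
    """Count spans per dify.app_id, skipping spans without the attribute."""
    return Counter(
        s["attributes"]["dify.app_id"]
        for s in spans
        if "dify.app_id" in s.get("attributes", {})
    )


def streaming_share(spans):
    """Return the fraction of spans for each dify.streaming value."""
    counts = Counter(
        # Normalize booleans and strings to lowercase text; "unset" is an assumption.
        str(s.get("attributes", {}).get("dify.streaming", "unset")).lower()
        for s in spans
    )
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()} if total else {}
```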

Dependency Health

  • Postgres Query p95 Latency: Line graph of p95 latency for PostgreSQL operations (db.system = 'postgresql'), grouped by db.operation, to help catch slow queries.
  • Redis Operations by Type: Line graph of Redis operation counts (db.system = 'redis'), grouped by span name, revealing cache and queue activity patterns.
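
Both dependency panels are time series, so each point is an aggregate over one time bucket. The sketch below shows that bucketing under stated assumptions: spans are plain dictionaries with a hypothetical `start_s` (epoch seconds), `duration_ms`, and `name`; the 60-second bucket width is arbitrary; and the nearest-rank percentile stands in for whatever method SigNoz applies.

```python
from collections import defaultdict


def nearest_rank_p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    return ordered[(95 * len(ordered) + 99) // 100 - 1]


def bucketed(spans, db_system, group_key, bucket_s=60):
    """Collect durations for one db.system, keyed by (bucket start, group)."""
    series = defaultdict(list)
    for span in spans:
        if span.get("attributes", {}).get("db.system") != db_system:
            continue
        bucket = int(span["start_s"] // bucket_s) * bucket_s
        series[(bucket, group_key(span))].append(span["duration_ms"])
    return series


def postgres_p95_series(spans):
    """p95 latency per (bucket, db.operation) for PostgreSQL spans."""
    groups = bucketed(
        spans, "postgresql",
        lambda s: s["attributes"].get("db.operation", "unknown"),
    )
    return {key: nearest_rank_p95(d) for key, d in groups.items()}


def redis_count_series(spans):
    """Operation count per (bucket, span name) for Redis spans."""
    groups = bucketed(spans, "redis", lambda s: s["name"])
    return {key: len(d) for key, d in groups.items()}
```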

Endpoint Performance Table

  • Slowest Endpoints (by p95): Full-width table listing HTTP endpoints filtered by http.method EXISTS, showing p95 latency and call count side by side for quick performance triage.
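
The triage table combines an existence filter, a group-by, and a sort. A minimal sketch of that logic follows, assuming hypothetical span dictionaries with `name`, `duration_ms`, and `attributes`; the `limit` parameter and the nearest-rank percentile are illustrative choices, not the dashboard's actual query.

```python
from collections import defaultdict


def nearest_rank_p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    return ordered[(95 * len(ordered) + 99) // 100 - 1]


def slowest_endpoints(spans, limit=10):
    """Return (name, p95_ms, call_count) rows, slowest first."""
    durations = defaultdict(list)
    for span in spans:
        # Equivalent of the "http.method EXISTS" filter: any value counts.
        if "http.method" in span.get("attributes", {}):
            durations[span["name"]].append(span["duration_ms"])
    rows = [(name, nearest_rank_p95(d), len(d)) for name, d in durations.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]
```

Keeping the call count next to the p95 helps separate an endpoint that is slow under real load from one whose p95 comes from a handful of requests.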

Last updated: May 20, 2026
