SigNoz Cloud - This page is relevant for SigNoz Cloud editions.
Self-Host - This page is relevant for self-hosted SigNoz editions.

LiteLLM SDK Dashboard

This dashboard offers a clear view into LiteLLM SDK usage and performance. It highlights key metrics such as token consumption, model distribution, error rates, request volumes, and latency trends. Teams can also track which services and languages are leveraging the LiteLLM SDK, along with detailed records of errors, to better understand adoption patterns and optimize reliability and efficiency.

Dashboard Preview

Figure: LiteLLM SDK Dashboard Template

To use the template, import its JSON in SigNoz: Dashboards → + New dashboard → Import JSON

What This Dashboard Monitors

This dashboard uses OpenTelemetry data to track key performance metrics for your LiteLLM SDK usage, helping you:

  • Monitor Token Consumption: Track input tokens (user prompts) and output tokens (model responses) to gauge system load, efficiency trends, and consumption across different workloads.
  • Track Reliability: Monitor error rates to identify reliability issues and ensure applications maintain a smooth, dependable experience.
  • Analyze Model Adoption: Understand which models are called most often through LiteLLM to track preferences and measure adoption of newer releases.
  • Monitor Usage Patterns: Observe token consumption and request volume trends over time to spot adoption curves, peak cycles, and unusual spikes.
  • Ensure Responsiveness: Track P95 latency to surface potential slowdowns, spikes, or regressions and maintain consistent user experience.
  • Understand Service Distribution: See which services and programming languages are leveraging the LiteLLM SDK across your stack.

Metrics Included

Token Usage Metrics

  • Total Token Usage (Input & Output): Displays the split between input tokens (user prompts) and output tokens (model responses), showing exactly how much work the system is doing over time.
  • Token Usage Over Time: Time series visualization showing token consumption trends to identify adoption patterns, peak cycles, and baseline activity.
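
As a rough illustration of what these two panels aggregate, the sketch below sums input and output tokens and then buckets total tokens by time window. The record layout (`ts`, `input_tokens`, `output_tokens`) and the sample values are hypothetical and chosen only for this example; they are not the schema SigNoz or LiteLLM uses.

```python
from collections import defaultdict

# Hypothetical per-request records; field names are illustrative only.
spans = [
    {"ts": 0, "input_tokens": 120, "output_tokens": 40},
    {"ts": 30, "input_tokens": 80, "output_tokens": 60},
    {"ts": 75, "input_tokens": 200, "output_tokens": 100},
]


def total_token_usage(records):
    """Return the input/output token split across all records."""
    return {
        "input": sum(r["input_tokens"] for r in records),
        "output": sum(r["output_tokens"] for r in records),
    }


def token_usage_over_time(records, bucket_seconds=60):
    """Sum input + output tokens into fixed-width time buckets."""
    buckets = defaultdict(int)
    for r in records:
        # Align each timestamp to the start of its bucket.
        start = (r["ts"] // bucket_seconds) * bucket_seconds
        buckets[start] += r["input_tokens"] + r["output_tokens"]
    return dict(sorted(buckets.items()))


print(total_token_usage(spans))
print(token_usage_over_time(spans))
```

The first function corresponds to the input/output split, the second to the time series view; the bucket width is an arbitrary choice here.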

Performance & Reliability

  • Total Error Rate: Tracks the percentage of LiteLLM SDK calls that return errors, providing a quick way to identify reliability issues.
  • Latency (P95 Over Time): Measures the 95th percentile latency of requests over time to surface potential slowdowns and ensure consistent responsiveness.
  • HTTP Request Duration: Monitors the duration of outbound HTTP requests made during LLM calls, helping identify network bottlenecks and API response time patterns that impact overall LiteLLM SDK performance.
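
To make the error rate and P95 definitions concrete, here is a minimal sketch that computes both from a list of call records. The record fields (`latency_ms`, `is_error`) are assumptions for this example, and the percentile uses the nearest-rank method; the dashboard's query engine may use a different percentile estimator.

```python
def error_rate_percent(calls):
    """Percentage of calls flagged as errors (0.0 for an empty list)."""
    if not calls:
        return 0.0
    errors = sum(1 for c in calls if c["is_error"])
    return 100.0 * errors / len(calls)


def p95(values):
    """95th percentile using the nearest-rank method."""
    if not values:
        raise ValueError("p95 requires at least one value")
    ordered = sorted(values)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical sample: 20 calls with latencies 100..2000 ms, one of them failed.
calls = [{"latency_ms": 100 * i, "is_error": i == 20} for i in range(1, 21)]

print(error_rate_percent(calls))
print(p95([c["latency_ms"] for c in calls]))
```

P95 is less sensitive to a single extreme outlier than the maximum, while still reflecting the slow tail that averages tend to hide.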

Usage Analysis

  • Model Distribution: Shows which models are called most often through LiteLLM, helping track preferences and measure adoption across different models.
  • Requests Over Time: Captures the volume of requests sent to LiteLLM SDK over time, revealing demand patterns and high-traffic windows.
  • Services and Languages Using LiteLLM: Breakdown showing where the SDK is being adopted across different services and programming languages in your stack.
  • LiteLLM SDK Request Logs: Comprehensive list of all generated logs for LiteLLM SDK requests associated with the given service name, providing detailed visibility into API call patterns.
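
The distribution panels above are essentially grouped counts. The sketch below shows the idea with Python's `collections.Counter`; the request records, service names, and model names are hypothetical placeholders, not values the dashboard requires.

```python
from collections import Counter

# Hypothetical request records; names are placeholders for illustration.
requests = [
    {"service": "chat-api", "language": "python", "model": "model-a"},
    {"service": "chat-api", "language": "python", "model": "model-a"},
    {"service": "chat-api", "language": "python", "model": "model-b"},
    {"service": "batch-worker", "language": "python", "model": "model-a"},
]


def model_distribution(records):
    """Count requests per model, most frequently called first."""
    return Counter(r["model"] for r in records).most_common()


def services_and_languages(records):
    """Count requests per (service, language) pair."""
    return Counter((r["service"], r["language"]) for r in records)


print(model_distribution(requests))
print(services_and_languages(requests))
```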

Error Tracking

  • Error Records: Table of all recorded errors. Each record is clickable and links to the originating trace for detailed investigation.

Last updated: October 29, 2025
