This page is relevant for both SigNoz Cloud and self-hosted SigNoz editions.

Baseten Dashboard

This dashboard offers a clear view into Baseten usage and performance. It highlights key metrics such as token consumption, model distribution, error rates, request volumes, and latency trends. Teams can also drill into detailed error records to understand adoption patterns and optimize for reliability and efficiency.

Info

Before using this dashboard, instrument your Baseten applications with OpenTelemetry and configure export to SigNoz. See the Baseten observability guide for complete setup instructions.

Dashboard Preview

Baseten Dashboard preview (Baseten Dashboard Template JSON available for import)

To import the dashboard, navigate to Dashboards → + New dashboard → Import JSON and paste in the template.

What This Dashboard Monitors

This dashboard uses OpenTelemetry data to track critical performance metrics for your Baseten usage, helping you:

  • Track Reliability: Monitor error rates to identify reliability issues and ensure applications maintain a smooth, dependable experience.
  • Analyze Model Adoption: Understand which Baseten models are being used to track preferences and measure adoption of different models.
  • Monitor Usage Patterns: Observe token consumption and request volume trends over time to spot adoption curves, peak cycles, and unusual spikes.
  • Ensure Responsiveness: Track P95 latency to surface potential slowdowns, spikes, or regressions and maintain consistent user experience.
  • Understand Service Distribution: See which services and programming languages are leveraging Baseten across your stack.

Metrics Included

Token Usage Metrics

  • Total Token Usage (Input & Output): Displays the split between input tokens (user prompts) and output tokens (model responses), showing exactly how much work the system is doing over time.
  • Token Usage Over Time: Time series visualization showing token consumption trends to identify adoption patterns, peak cycles, and baseline activity.

Performance & Reliability

  • Total Error Rate: Tracks the percentage of Baseten calls that return errors, providing a quick way to identify reliability issues.
  • Latency (P95 Over Time): Measures the 95th percentile latency of requests over time to surface potential slowdowns and ensure consistent responsiveness.
  • Request Duration: Monitors the duration of LLM requests from beginning to end.
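
To make the P95 panel concrete: the 95th percentile is the latency that 95% of requests stay under, so a single slow outlier moves it far more than it moves an average. A small, self-contained sketch of the calculation (sample latencies are invented for illustration):

```python
# Illustrative only: 95th percentile of request durations (ms) using
# linear interpolation between the two nearest ranks.
def p95(durations_ms: list[float]) -> float:
    xs = sorted(durations_ms)
    rank = 0.95 * (len(xs) - 1)        # fractional index of the 95th percentile
    lo, frac = int(rank), rank - int(rank)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + frac * (xs[hi] - xs[lo])

# Nine fast requests and one slow outlier: the outlier dominates the P95.
latencies = [120, 95, 110, 480, 105, 98, 130, 102, 115, 125]
print(round(p95(latencies), 1))  # → 322.5
```

This is why the dashboard tracks P95 rather than the mean: it surfaces tail slowdowns that averages smooth over.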

Usage Analysis

  • Model Distribution: Shows which Baseten model variants are being called, helping track preferences and measure adoption across different models.
  • Token Distribution by Model: Shows the total token usage grouped by model to determine which model is using the most tokens.
  • Requests Over Time: Captures the volume of requests sent to Baseten over time, revealing demand patterns and high-traffic windows.
  • Services and Languages Using Baseten: Breakdown showing where Baseten is being adopted across different services and programming languages in your stack.
  • Baseten Logs: Comprehensive list of all generated logs for Baseten applications associated with the given service name.

Error Tracking

  • Error Records: Table logging all recorded errors with clickable records that link to the originating trace for detailed error investigation.

Last updated: April 8, 2026
