Google Gemini Dashboard

This dashboard offers a clear view into Google Gemini API usage and performance. It highlights key metrics such as token consumption, model distribution, error rates, request volumes, and latency trends. Teams can also track which services and languages are leveraging the Gemini API, along with detailed records of errors, to better understand adoption patterns and optimize reliability and efficiency.

Dashboard Preview

[Image: Google Gemini Dashboard]
To import the Google Gemini Dashboard Template, go to Dashboards → + New dashboard → Import JSON.

What This Dashboard Monitors

This dashboard uses OpenTelemetry data to track critical performance metrics for your Google Gemini API usage, helping you:

  • Monitor Token Consumption: Track input tokens (user prompts) and output tokens (model responses) to gauge system load, efficiency trends, and consumption across different workloads.
  • Track Reliability: Monitor error rates to identify reliability issues and ensure applications maintain a smooth, dependable experience.
  • Analyze Model Adoption: Understand which Gemini model variants are being used most often to track preferences and measure adoption of newer releases.
  • Monitor Usage Patterns: Observe token consumption and request volume trends over time to spot adoption curves, peak cycles, and unusual spikes.
  • Ensure Responsiveness: Track P95 latency to surface potential slowdowns, spikes, or regressions and maintain consistent user experience.
  • Understand Service Distribution: See which services and programming languages are leveraging the Gemini API across your stack.

Metrics Included

Token Usage Metrics

  • Total Token Usage (Input & Output): Displays the split between input tokens (user prompts) and output tokens (model responses), showing exactly how much work the system is doing over time.
  • Token Usage Over Time: Time series visualization showing token consumption trends to identify adoption patterns, peak cycles, and baseline activity.
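
To make the two token panels concrete, here is a minimal sketch of the aggregation they imply: a total split by input and output, and a per-minute time series. The span records are made-up sample data, and the attribute names (`gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`) assume instrumentation that follows the OpenTelemetry GenAI semantic conventions; the dashboard's actual queries may use different names.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical span records. Attribute names are an assumption based on the
# OpenTelemetry GenAI semantic conventions, not taken from the dashboard JSON.
spans = [
    {"ts": "2025-09-05T10:00:12+00:00",
     "gen_ai.usage.input_tokens": 120, "gen_ai.usage.output_tokens": 340},
    {"ts": "2025-09-05T10:00:48+00:00",
     "gen_ai.usage.input_tokens": 80, "gen_ai.usage.output_tokens": 210},
    {"ts": "2025-09-05T10:01:05+00:00",
     "gen_ai.usage.input_tokens": 200, "gen_ai.usage.output_tokens": 500},
]

IN_KEY = "gen_ai.usage.input_tokens"
OUT_KEY = "gen_ai.usage.output_tokens"


def token_totals(records):
    """Sum input and output tokens separately across all records."""
    return {
        "input": sum(r.get(IN_KEY, 0) for r in records),
        "output": sum(r.get(OUT_KEY, 0) for r in records),
    }


def tokens_per_minute(records):
    """Bucket combined token counts by the minute in which the call started."""
    buckets = defaultdict(int)
    for r in records:
        minute = datetime.fromisoformat(r["ts"]).replace(second=0, microsecond=0)
        buckets[minute.isoformat()] += r.get(IN_KEY, 0) + r.get(OUT_KEY, 0)
    return dict(sorted(buckets.items()))


print(token_totals(spans))       # {'input': 400, 'output': 1050}
print(tokens_per_minute(spans))
```

A one-minute bucket is only an illustrative choice; the dashboard's time series uses whatever aggregation interval the selected time range applies.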

Performance & Reliability

  • Total Error Rate: Tracks the percentage of Gemini API calls that return errors, providing a quick way to identify reliability issues.
  • Latency (P95 Over Time): Measures the 95th percentile latency of requests over time to surface potential slowdowns and ensure consistent responsiveness.
  • HTTP Request Duration: Monitors the duration of outbound HTTP requests made during LLM calls, helping identify network bottlenecks and API response time patterns that impact overall Gemini API performance.
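
As a rough illustration of what the error-rate and P95 panels compute, the sketch below derives both values from a list of call records. The records and field names (`duration_ms`, `error`) are hypothetical, and the nearest-rank percentile used here is one common definition; a metrics backend may interpolate or approximate percentiles differently.

```python
# Hypothetical call records: 20 calls with durations 100, 150, ..., 1050 ms,
# two of which failed. Field names are illustrative only.
calls = [
    {"duration_ms": 100 + 50 * i, "error": i in (3, 17)}
    for i in range(20)
]


def error_rate(records):
    """Percentage of calls that ended in an error (0.0 for an empty list)."""
    if not records:
        return 0.0
    failed = sum(1 for r in records if r["error"])
    return 100.0 * failed / len(records)


def p95(values):
    """95th percentile using the nearest-rank method."""
    if not values:
        raise ValueError("p95 requires at least one value")
    ordered = sorted(values)
    # ceil(0.95 * n) computed with integers to avoid floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


print(error_rate(calls))                        # 10.0
print(p95([c["duration_ms"] for c in calls]))   # 1000
```

P95 is useful next to the error rate because it exposes slow outliers that an average would hide: here 19 of the 20 calls complete within 1000 ms, even though the slowest takes 1050 ms.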

Usage Analysis

  • Model Distribution: Shows which Gemini model variants are being called most often, helping track preferences and measure adoption across different models.
  • Requests Over Time: Captures the volume of requests sent to Gemini API over time, revealing demand patterns and high-traffic windows.
  • Services and Languages Using Gemini: Breakdown showing where the API is being adopted across different services and programming languages in your stack.
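
The breakdown panels amount to grouping requests by an attribute and counting. The sketch below shows that idea on made-up records; `service.name` and `telemetry.sdk.language` are standard OpenTelemetry resource attributes, while `gen_ai.request.model` and the sample values are assumptions about how your Gemini calls are instrumented.

```python
from collections import Counter

# Hypothetical request records; all values are sample data.
gemini_calls = [
    {"gen_ai.request.model": "gemini-2.5-flash",
     "service.name": "chat-api", "telemetry.sdk.language": "python"},
    {"gen_ai.request.model": "gemini-2.5-flash",
     "service.name": "chat-api", "telemetry.sdk.language": "python"},
    {"gen_ai.request.model": "gemini-2.5-flash",
     "service.name": "search-worker", "telemetry.sdk.language": "go"},
    {"gen_ai.request.model": "gemini-2.5-pro",
     "service.name": "search-worker", "telemetry.sdk.language": "go"},
]


def distribution(records, key):
    """Share of requests per distinct value of `key`, as percentages."""
    counts = Counter(r.get(key, "unknown") for r in records)
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 1) for name, n in counts.most_common()}


def adopters(records):
    """Request count per (service, language) pair."""
    return Counter(
        (r.get("service.name", "unknown"), r.get("telemetry.sdk.language", "unknown"))
        for r in records
    )


print(distribution(gemini_calls, "gen_ai.request.model"))
# {'gemini-2.5-flash': 75.0, 'gemini-2.5-pro': 25.0}
print(adopters(gemini_calls))
```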

Error Tracking

  • Error Records: A table of all recorded errors. Each row links to the originating trace for detailed investigation.

Last updated: September 5, 2025
