
LlamaIndex Observability with SigNoz

Overview

This guide walks you through enabling observability and monitoring for your Python-based LlamaIndex application and streaming telemetry data to SigNoz Cloud using OpenTelemetry. By the end of this setup, you'll be able to monitor AI-specific operations such as document ingestion, document retrieval, user querying, text generation, and user feedback within LlamaIndex, with detailed spans capturing request durations, node and query inputs, model outputs, retrieval scores, metadata, and intermediate steps throughout the pipeline.

Instrumenting your RAG workflows with telemetry enables full observability across the retrieval and generation pipeline. This is especially valuable when building production-grade developer-facing tools, where insight into model behavior, latency bottlenecks, and retrieval accuracy is essential. With SigNoz, you can trace each user question end-to-end, from prompt to response, and continuously improve performance and reliability.

To get started, check out our example LlamaIndex RAG Q&A bot, complete with OpenTelemetry-based monitoring (via OpenInference). View the full repository here.

Prerequisites

  • A Python application (Python 3.8 or newer)
  • LlamaIndex integrated into your app, with document ingestion and query interfaces set up
  • Basic understanding of RAG (Retrieval-Augmented Generation) workflows
  • A SigNoz setup (choose one): SigNoz Cloud or a self-hosted SigNoz instance
  • pip installed for managing Python packages
  • Internet access to send telemetry data to SigNoz Cloud
  • (Optional but recommended) A Python virtual environment to isolate dependencies

Instrument your LlamaIndex application

To capture detailed telemetry from LlamaIndex without modifying your core application logic, we use OpenInference, a community-driven standard built on top of OpenTelemetry that provides pre-built instrumentation for popular AI frameworks like LlamaIndex. This allows you to trace your LlamaIndex application with minimal configuration.
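
The no-code route below requires none of this, but if you prefer to set up the instrumentor programmatically, a minimal sketch (using the packages installed in Step 1 below) looks like the following:

from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Export spans over OTLP; the endpoint and headers are read from the
# OTEL_EXPORTER_OTLP_* environment variables shown in Step 5.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))

# From here on, LlamaIndex operations emit OpenInference spans.
LlamaIndexInstrumentor().instrument(tracer_provider=provider)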

Check out detailed instructions on how to set up OpenInference instrumentation in your LlamaIndex application over here.

No-code auto-instrumentation is recommended for quick setup with minimal code changes. It's ideal when you want to get observability up and running without modifying your application code and are leveraging standard instrumentor libraries.

Step 1: Install the necessary packages in your Python environment.

pip install \
  opentelemetry-distro \
  opentelemetry-exporter-otlp \
  opentelemetry-instrumentation-httpx \
  opentelemetry-instrumentation-system-metrics \
  llama-index \
  openinference-instrumentation-llama-index

Step 2: Add Automatic Instrumentation

opentelemetry-bootstrap --action=install
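
This command scans your installed packages and installs the matching OpenTelemetry instrumentation libraries for them.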

Step 3: Configure logging level

To ensure logs are properly captured and exported, configure the root logger to emit logs at the INFO level or higher:

import logging

logging.getLogger().setLevel(logging.INFO)

This sets the minimum log level for the root logger to INFO, which ensures that logger.info() calls and higher severity logs (WARNING, ERROR, CRITICAL) are captured by the OpenTelemetry logging auto-instrumentation and sent to SigNoz.
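
As a quick illustration (assuming the environment variables from Step 5 are set when the app runs), any standard-library logger then flows through to SigNoz:

import logging

logging.getLogger().setLevel(logging.INFO)
logger = logging.getLogger(__name__)

# With OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true (Step 5),
# this record is exported via OTLP and correlated with the active trace.
logger.info("Starting document ingestion")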

Step 4: Run an example

from llama_index.llms.openai import OpenAI
from opentelemetry import trace

llm = OpenAI(model="gpt-4o")

# Wrap the call in a parent span so all auto-instrumented child spans
# are grouped under one named trace.
tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("llama-index-trace"):
    response = llm.complete("Hello, world!")
    print(response)

📌 Note: Ensure that the OPENAI_API_KEY environment variable is properly defined with your API key before running the code.
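
To exercise the full retrieval pipeline rather than a single completion, you can also run a minimal RAG sketch like the one below (assuming a local ./data directory containing your documents):

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Document ingestion: load files and build an in-memory vector index.
# Both steps emit spans automatically once instrumented.
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Querying: each call produces retrieval and synthesis spans with
# inputs, outputs, and retrieval scores attached.
query_engine = index.as_query_engine()
response = query_engine.query("What do these documents cover?")
print(response)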

Step 5: Run your application with auto-instrumentation

OTEL_RESOURCE_ATTRIBUTES="service.name=<service_name>" \
OTEL_EXPORTER_OTLP_ENDPOINT="https://ingest.<region>.signoz.cloud:443" \
OTEL_EXPORTER_OTLP_HEADERS="signoz-ingestion-key=<your_ingestion_key>" \
OTEL_EXPORTER_OTLP_PROTOCOL=grpc \
OTEL_TRACES_EXPORTER=otlp \
OTEL_METRICS_EXPORTER=otlp \
OTEL_LOGS_EXPORTER=otlp \
OTEL_PYTHON_LOG_CORRELATION=true \
OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true \
opentelemetry-instrument <your_run_command>
  • <service_name> is the name of your service
  • Set the <region> to match your SigNoz Cloud region
  • Replace <your_ingestion_key> with your SigNoz ingestion key
  • Replace <your_run_command> with the actual command you would use to run your application. For example: python main.py
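
For example, with hypothetical values filled in (a service named llamaindex-rag-bot in the us region, run with python main.py):

OTEL_RESOURCE_ATTRIBUTES="service.name=llamaindex-rag-bot" \
OTEL_EXPORTER_OTLP_ENDPOINT="https://ingest.us.signoz.cloud:443" \
OTEL_EXPORTER_OTLP_HEADERS="signoz-ingestion-key=<your_ingestion_key>" \
OTEL_EXPORTER_OTLP_PROTOCOL=grpc \
OTEL_TRACES_EXPORTER=otlp \
OTEL_METRICS_EXPORTER=otlp \
OTEL_LOGS_EXPORTER=otlp \
OTEL_PYTHON_LOG_CORRELATION=true \
OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true \
opentelemetry-instrument python main.py
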
✅ Info

Using self-hosted SigNoz? Most steps are identical. To adapt this guide, point OTEL_EXPORTER_OTLP_ENDPOINT at your self-hosted SigNoz OTel Collector (http://localhost:4317 by default) and remove the OTEL_EXPORTER_OTLP_HEADERS line, since no ingestion key is required.

Your LlamaIndex application should now automatically emit traces, spans, and attributes.

Finally, you should be able to view this data in SigNoz Cloud under the Traces tab:

Traces View
Traces of your LlamaIndex Application

When you click on a trace ID in SigNoz, you'll see a detailed view of the trace, including all associated spans, along with their events and attributes.

Detailed Traces View
Detailed traces view of your LlamaIndex Application
