Understanding OpenTelemetry Demo: A Hands-on Guide
Everyone knows that debugging is twice as hard as writing a program in the first place. So, if you’re as clever as you can be when you write it, how will you ever debug it?
— Brian W. Kernighan and P. J. Plauger, The Elements of Programming Style, 2nd ed.
As developers, we often understand a concept or framework best when we can experiment with real systems built on top of it, by visualizing data flows, reading the code and documentation, and gradually forming a mental model of how everything fits together.
This is especially true for OpenTelemetry, where those new to observability tend to be confused by its many moving parts and fundamental "entities" like traces, metrics, and logs (known as the three pillars of observability).
Beyond just learning about OpenTelemetry, the demo application also showcases how observability provides deep insights into the behaviour of distributed systems and is an ideal example of what real OpenTelemetry-based implementations look like.
The OpenTelemetry community recognized that developers around the world shared this need for a realistic, hands-on reference system; to fill that gap, the community has developed, and actively maintains, the OpenTelemetry Demo Application.
What is the OpenTelemetry Demo Application?
The source code serves as a great reference as it provides examples for a set of core microservices written in Go, Python, .NET, Java, etc., talking to each other over gRPC and HTTP, and their idiomatic OpenTelemetry instrumentation methods. Developers can use it to understand the rationale behind the instrumentation process on a per-service level, and apply the same patterns in their codebases.
What appears to be a simple online astronomy shop for stargazing tools is in fact a full-fledged project built to let developers explore a production-lite deployment, complete with feature flags for simulating failures and a load generator that fakes user traffic.
It comes fully instrumented with OpenTelemetry client libraries and an OpenTelemetry Collector to collect all the telemetry data, aggregate it, process it, and forward it to a backend of your choice. Here, we will be using SigNoz as our backend for visualizing the telemetry generated by the demo.
OpenTelemetry (OTel), the open-source observability standard, aims to standardize how telemetry is generated and exported, reducing the engineering bandwidth required to maintain telemetry logic across complex application systems. Since it makes it trivial to define the target for your application telemetry data, OpenTelemetry eliminates vendor lock-in and forces observability vendors to compete on features to retain customers.
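To make that vendor neutrality concrete: in the simplest case, pointing an OTel SDK or Collector at a different backend is just a change of two standard OTLP environment variables. The endpoint and key below are placeholders, not real credentials.

```shell
# Vendor neutrality in practice: the same two spec-defined variables work
# regardless of which OTLP-capable backend sits behind the endpoint.
export OTEL_EXPORTER_OTLP_ENDPOINT="https://ingest.us.signoz.cloud:443"
export OTEL_EXPORTER_OTLP_HEADERS="signoz-ingestion-key=YOUR-KEY"
```

Because both variables are part of the OpenTelemetry specification, swapping backends means editing configuration, not application code.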
If you wish to refresh your knowledge, or want a companion guide as we walk you through the OpenTelemetry Demo, feel free to check out our detailed write-up on OpenTelemetry.
Architecture Overview for OpenTelemetry Demo
If you want to jump directly to the implementation, skip to the Send OpenTelemetry Demo App Telemetry to SigNoz section.
Before we start sending data, it helps to understand the architecture we are dealing with. Understanding the service dependencies is key when using this project as an OpenTelemetry example for debugging.
The application simulates an e-commerce store with a complex microservices-based architecture. For detailed information, refer to the official services documentation and architecture overview.
Application Services
| Service | Language | Description |
|---|---|---|
| Frontend | TypeScript | Serves the web UI and handles user requests via HTTP and gRPC. |
| Frontend Proxy | Envoy (C++) | Manages incoming HTTP traffic and routes to frontend services. |
| Checkout | Go | Orchestrates the order process across payment, shipping, and email services. |
| Cart | .NET | Stores and retrieves shopping cart items using cache. |
| Payment | JavaScript | Processes credit card payments and returns transaction IDs. |
| Product Catalog | Go | Provides a searchable list of products from a JSON file. |
| Product Reviews | Python | Returns reviews and AI-powered answers to product questions. |
| Recommendation | Python | Suggests products based on cart contents. |
| Ad | Java | Provides text ads based on context tags. |
| Shipping | Rust | Calculates shipping costs and provides delivery estimates. |
| Quote | PHP | Calculates shipping costs based on item count. |
| Email | Ruby | Sends order confirmation emails. |
| Currency | C++ | Converts currency using European Central Bank rates (highest QPS service). |
| Accounting | .NET | Processes and tracks incoming orders. |
| Fraud Detection | Kotlin | Analyzes orders to detect fraudulent activity. |
| Flagd-UI | TypeScript | Provides UI for managing feature flags. |
| Load Generator | Python | Simulates realistic user traffic using Locust. |
Infrastructure Services
| Component | Technology | Role |
|---|---|---|
| Cache | Valkey | Stores session and shopping cart data. |
| Queue | Kafka | Enables asynchronous messaging between services. |
| Database | PostgreSQL | Provides persistent storage for application data. |
| Image Provider | nginx | Serves product images to the frontend. |
| Flagd | Go | Feature flag engine backend. |
If the "Place Order" button fails, the error could originate in the Go checkout service, the .NET cart service, or a database lock in the product catalog. Observability tools allow you to pinpoint exactly where the system fails.
This polyglot nature ensures you can see how trace context propagates across different languages and protocols, making it a realistic OpenTelemetry example application.
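Under the hood, that cross-language propagation rides on a single W3C `traceparent` header attached to every outgoing HTTP/gRPC call. A quick sketch of its anatomy (the IDs below are made up for illustration):

```shell
# Format: version - trace-id (32 hex chars) - parent-span-id (16 hex chars) - flags
TRACEPARENT="00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
TRACE_ID=$(echo "$TRACEPARENT" | cut -d- -f2)  # shared by every span in the request
SPAN_ID=$(echo "$TRACEPARENT" | cut -d- -f3)   # the parent span for the next hop
echo "trace=$TRACE_ID span=$SPAN_ID"
```

Whether the hop is Go to .NET or Python to Java, each service extracts the trace ID from this header and reuses it for its own spans, which is what lets the backend stitch one end-to-end trace together.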
Visualizing OpenTelemetry Demo's Telemetry
We will need an observability backend to ingest, process, and visualize the telemetry data generated by the OpenTelemetry Demo application. For this guide, we will use SigNoz as the backend of choice.
The easiest way to send data to SigNoz is via SigNoz Cloud. This is a quick guide for sending data to SigNoz Cloud from a Docker deployment of the OTel Demo App. If you wish to send data to a self-hosted version of SigNoz (Docker/Kubernetes) instead, check out the docs here.
Get your SigNoz Cloud Endpoint
- Sign up or log in to SigNoz Cloud
- Generate a new ingestion key within the ingestion settings. This key will serve as your authentication method for transmitting telemetry data.
Clone Github Repo for OTel Demo App
Clone the OTel demo app to any folder of your choice.
```bash
# Clone the OpenTelemetry Demo repository
git clone https://github.com/open-telemetry/opentelemetry-demo.git
cd opentelemetry-demo
```
Configuring OTel Collector
By default, the collector in the demo application will merge the configuration from two files:
- otelcol-config.yml [we don't touch this]
- otelcol-config-extras.yml [we modify this]
To add SigNoz as the backend, open the file src/otel-collector/otelcol-config-extras.yml and add the following:
```yaml
exporters:
  otlp:
    endpoint: "https://ingest.{your-region}.signoz.cloud:443"
    tls:
      insecure: false
    headers:
      signoz-ingestion-key: <SIGNOZ-INGESTION-KEY>
  debug:
    verbosity: detailed

service:
  pipelines:
    metrics:
      exporters: [otlp]
    traces:
      exporters: [spanmetrics, otlp]
    logs:
      exporters: [otlp]
```
Remember to replace the region and ingestion key with the proper values obtained from your account.
When the extras file is merged with the base collector config (src/otel-collector/otelcol-config.yml), objects are merged but arrays are replaced, so any pipeline you redefine overrides the original. That is why the spanmetrics exporter must be kept in the traces pipeline's exporters array: omitting it will cause the collector to error out.
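As a simplified sketch of what the merge does (illustrative, not the demo's complete base config): the base file wires the traces pipeline to two exporters, and any `exporters` array you declare in the extras file replaces that array outright.

```yaml
# Base config (otelcol-config.yml), simplified:
service:
  pipelines:
    traces:
      exporters: [spanmetrics, otlp]
---
# An extras file declaring only [otlp] would REPLACE the array above,
# dropping spanmetrics and causing the collector to fail at startup.
# Hence the extras file must repeat both:
service:
  pipelines:
    traces:
      exporters: [spanmetrics, otlp]
```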
Start the OTel Demo App
Get the OTel Demo App running with the following command:

```bash
docker compose up -d
```

This spins up multiple microservices with OpenTelemetry instrumentation enabled. You can verify that the containers are up with:

```bash
docker compose ps -a
```

The output should look similar to this:

Monitor with SigNoz
- Open your SigNoz account
- Navigate to Services to see multiple services listed, as shown in the snapshot below.

When you navigate to the Service Map tab, you see an automatically generated visualisation of your service architecture, as shown below.

Debugging Real-life Failure Scenarios
As developers, we are often paged for a bug and spend hours drowning in logs and metrics trying to figure out the root cause, only to realise after seven hours of effort that it was a missing semicolon. At SigNoz, we’ve made an honest attempt to make your debugging and monitoring a tad easier (or a whole lot easier).
Let’s use the OTel Demo App to simulate some very common bugs, understand what caused them, and learn how to diagnose them in our observability backend.
For this, navigate to http://localhost:8080/feature, which will look like the snapshot below. We will enable the appropriate flags to simulate errors in each of the following scenarios.

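Behind this page sits flagd, the demo's feature-flag engine. A flag definition looks roughly like the sketch below (field names follow flagd's flag-definition schema; the variant values are illustrative, and the authoritative file lives in the repo under src/flagd/):

```json
{
  "flags": {
    "cartFailure": {
      "state": "ENABLED",
      "variants": { "on": true, "off": false },
      "defaultVariant": "off"
    }
  }
}
```

Toggling a flag in the UI simply switches which variant flagd serves, so services pick up the failure behaviour without a restart.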
With the three scenarios below, we will look at how SigNoz implements traces, logs, and metrics, along with dashboards, alerts, and exceptions, using this robust OpenTelemetry example.
Monitoring a Kafka Consumer Lag
We are using a custom Kafka Monitoring dashboard here, which helps monitor various metrics, including Kafka I/O rate, consumer lag, fetch size, etc. You can find the JSON here. Read this to learn about SigNoz dashboards in detail. Make sure to enable the kafkaQueueProblems flag as well. After a few minutes of waiting, we will observe upward spikes and downward tails in our dashboard, which point to anomalies needing inspection.
On closer inspection, the metrics going haywire indicate Kafka consumer lag, driven by increased time between polls. The downward tails on the Kafka IO Wait panel indicate near-zero wait time, and the increasing fetch size on the Consumer Fetch Size panel indicates Kafka queue overload.
So, we have diagnosed two key anomalies: consumer lag and queue overload, which in a real-life scenario would require immediate debugging of the consumer/producer code.

For our use case, the service map [check Service Map tab] shows that two services are consuming from Kafka, fraud detection and accounting.
Going to the Logs Tab and filtering for logs from the fraud detection service is a good move at this point. From the logs, we see a particular statement: “FeatureFlag—kafkaQueueProblems is enabled, sleeping 1 second.”

This is the missing piece of our puzzle: the consumer delay caused by the service sleeping for one second, which creates the bottleneck.
SigNoz also has a dedicated tab for Messaging Queues, where we can monitor the consumers of the Kafka topics and their stats. Currently, we support both Celery and Kafka Queues.

Investigating a Sporadic Service Failure
For this scenario, we have to enable the cartFailure flag, which generates an error whenever we attempt to empty the cart. A few minutes after enabling it, we will observe that attempts to empty the cart fail, as expected.
In a real production environment, this type of issue represents a critical flaw in business logic that could potentially result in financial losses. The challenge is to identify and address the root cause rapidly.
The Exceptions tab is a life-saver in such a situation. We immediately notice an exception statement referring to the cart, which is a good starting point for us.

Diving deeper, we notice that the cartService is unable to connect to the Redis cache.

If you are a developer paged for this bug, the immediate next step is to restore the Redis connection and check if any other services have been similarly impacted.
But to explore SigNoz further (this is purely for exploration and not a suggested or ideal debugging flow), click on 'See the error in trace graph'. This takes us to the corresponding trace. Since traces always carry context, they often reduce debugging time. By clicking on 'related logs', we can see OpenTelemetry log correlation in action, which works by connecting logs with traces through shared context identifiers.

Thus, we have not only debugged the issue quickly, but we have also drawn sufficient context around the issue.
When OpenTelemetry instruments your application, it creates unique trace and span IDs for tracking request flows. These same identifiers can be automatically injected into your logs, creating a direct link between your traces and related log entries.
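Conceptually, a correlated log record carries the same identifiers as the span that was active when it was emitted. A sketch loosely following the OTLP log data model (the field names are from that model; the IDs and message are illustrative):

```json
{
  "body": "failed to empty cart for user",
  "severityText": "ERROR",
  "traceId": "4bf92f3577b34da6a3ce929d0e0e4736",
  "spanId": "00f067aa0ba902b7",
  "attributes": { "service.name": "cart" }
}
```

The backend only needs to match `traceId` and `spanId` against stored spans to jump from a log line to the exact request that produced it.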
Getting to the bottom of a ProductCatalogue Error
We are using a custom dashboard called the Spanmetrics dashboard here, which helps monitor trace span calls. You can find the JSON here. Read this to learn about SigNoz dashboards in detail. Also, make sure to enable the productCatalogFailure flag to simulate this scenario. A few minutes after enabling the flag, if you browse to the astronomy shop UI, you will see a lot of red in the network calls.

When you filter for errors in the 'Span metrics by service' panel of the dashboard, you'll notice that the error rate for several services starts to creep up, especially for the frontend. [In the Span Metrics dashboard]

Go to the Traces tab and filter for only the ERROR traces. Notice that the filter populates the query builder above as well!

The output indicates that ProductCatalogService could be a service of interest. If you explore all of the traces that call ProductCatalogService/GetProduct, you may notice that the errors have something in common: they happen only when a particular attribute, the product ID, holds a specific value.

Automatic instrumentation doesn’t know the domain-specific logic and metadata that matter to your service, like the product ID in this context. You have to add that yourself by extending the instrumentation. In this case, the product catalog service uses gRPC instrumentation, and the OTel Demo App already attaches useful context attributes to the spans it generates, which is what we leveraged in this scenario.
This is indeed a very interesting observation, but it doesn’t point us to the root cause. Let’s check the Events tab for the traces that error out.

This is an aha moment! The above snapshot clearly shows that the reason for the error is the productCatalogFailure flag being enabled. If you are a front-end developer, you can breathe easy.
Visualizing OpenTelemetry Data Beyond the Demo
By this point, we expect you've played around with the demo and begun to develop a good understanding of how OpenTelemetry works and how it enables developers to build robust observability pipelines for their application systems. Soon you'll be ready to instrument your own services and adopt OTel as a core part of your architecture.
When you begin the move, you will need to choose a reliable observability backend that handles your telemetry data efficiently, and helps you derive the right insights from this data.
SigNoz, being OpenTelemetry-native and built on top of ClickHouse, can be an ideal observability backend for your needs. This native compatibility ensures that SigNoz can fully leverage the rich, standardised data produced by OpenTelemetry without requiring complex workarounds or additional translation layers. This holistic approach simplifies debugging and performance analysis by allowing developers to correlate logs, traces, and metrics effortlessly.
SigNoz gives developers full visibility into their applications with seamless correlation of metrics, logs, and traces on a single, open-source platform. We have listed the features that make SigNoz a powerful observability platform:
- Traces Explorer: Powerful trace explorer to drill down into requests, analyze spans, and debug performance bottlenecks.
- Exceptions Tab: Automatically captures exceptions and errors from your applications to help identify failure points quickly.
- Messaging Queue Visibility: Gain deep observability into asynchronous systems like Kafka, RabbitMQ, and more with support for messaging semantic conventions.
- Infrastructure Monitoring: Monitor host-level metrics like CPU, memory, disk, and network usage alongside application telemetry.
- Smart Alerts: Define alerts based on static thresholds or use anomaly detection to catch issues early.
- Self-Host or Use SigNoz Cloud: Choose between managing your own deployment or using our fully managed cloud.
FAQs
Is there an online OpenTelemetry demo?
There is no officially hosted, permanently available online OpenTelemetry demo provided by the OpenTelemetry project. Instead, the community maintains a demo application designed to run locally on your machine or in environments such as Docker and Kubernetes. This approach allows developers to explore a production-like application system and manage its configuration.
What is the Astronomy Shop demo?
The Astronomy Shop is the name of the official OpenTelemetry demo application. It simulates a microservices-based e-commerce system where users browse products, add items to a cart, and place orders. The demo showcases OpenTelemetry implementations for multiple services, across multiple programming languages and using distributed system communication patterns.
What kind of failures does the OpenTelemetry demo simulate?
The demo includes built-in failure scenarios that can be configured to emulate real-life production failure issues, such as:
- intermittent service errors
- downstream dependency failures
- increased latency in specific services
These scenarios are intentionally included so users can practice debugging with telemetry signals (traces, metrics, and logs), rather than only monitoring certain metrics.
