A Fully Open Source APM Built For OpenTelemetry | SigNoz

Updated Feb 10, 2026 | 11 min read

OpenTelemetry is a Cloud Native Computing Foundation (CNCF) project aimed at standardizing the way we instrument applications for generating telemetry data (logs, metrics, and traces). However, OpenTelemetry does not provide storage and visualization for the collected telemetry data. That's where an OpenTelemetry APM is needed.

APM stands for Application Performance Monitoring or Application Performance Management. APM tools help engineering teams effectively monitor their applications by analyzing and visualizing key metrics for application performance. With time, the way applications are built and deployed has changed. With containerization technologies, software systems have become distributed and are more dynamic than ever in production environments.

APM tools have also evolved to newer ways of reporting metrics for container-based applications. For newer age architectures like microservices and serverless, it’s difficult for engineering teams to have a central overview of how their applications are performing. But monitoring technology also evolved and gave birth to distributed tracing. With distributed tracing, you can trace user requests across services and protocols.

But setting up a robust monitoring and observability stack is challenging in distributed applications. The first step for setting up observability is to instrument your applications for generating telemetry data. OpenTelemetry provides a consistent instrumentation layer for your entire application stack, including open source frameworks and libraries. Let’s learn more about OpenTelemetry.

What is OpenTelemetry?

OpenTelemetry is an open-source collection of tools, APIs, and SDKs that aims to standardize the way we generate and collect telemetry data. The OpenTelemetry specification has design and implementation guidelines on how the instrumentation libraries should be implemented. In addition, it provides client libraries in all the major programming languages which follow the specification.

The specification is organized into distinct types of telemetry known as signals. Presently, OpenTelemetry has specifications for these three signals:

Metrics: Numerical measurements of system performance
Traces: Detailed records of request flows through distributed systems
Logs: Time-stamped records of events within your applications

Together these three signals form the three pillars of observability. OpenTelemetry is the bedrock for setting up an observability framework. The application code is instrumented using OpenTelemetry client libraries, which enables the generation of telemetry data. Once the telemetry data is generated and collected, OpenTelemetry needs a backend analysis tool to which it can send the data.

Things to keep in mind while choosing an OpenTelemetry APM

Below is the list of factors that should be taken into consideration before selecting an OpenTelemetry APM:

Support for all distinct signals of OpenTelemetry
Currently, OpenTelemetry collects telemetry data in three distinct signals, namely, logs, metrics, and traces. Setting up a robust observability framework requires the use of all three signals. An OpenTelemetry backend should be able to ingest and visualize all three signals.
Moreover, the frontend of the OpenTelemetry backend should also provide features to easily correlate the signals. This enables users to understand scenarios like: how a request led to certain events, which in turn led to an increase in request latency over the past 30 minutes.
OpenTelemetry signal correlation enables users to check span-level logs within Traces view.
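
Correlation of this kind is possible because log records can carry the trace ID and span ID of the request that produced them. The following is a minimal sketch of the idea, not how any particular backend implements it. The `Span` and `LogRecord` classes and the sample data are hypothetical stand-ins for real telemetry.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Span:
    trace_id: str
    span_id: str
    name: str
    duration_ms: float


@dataclass
class LogRecord:
    trace_id: str
    span_id: str
    body: str


def logs_by_span(spans, logs):
    """Group log bodies under the (trace_id, span_id) of the span that emitted them."""
    known = {(s.trace_id, s.span_id) for s in spans}
    grouped = defaultdict(list)
    for record in logs:
        key = (record.trace_id, record.span_id)
        if key in known:  # skip logs that belong to no known span
            grouped[key].append(record.body)
    return dict(grouped)


# Hypothetical sample data: one slow request with a slow database call.
spans = [
    Span("t1", "s1", "GET /checkout", 840.0),
    Span("t1", "s2", "SELECT orders", 790.0),
]
logs = [
    LogRecord("t1", "s2", "slow query: full table scan on orders"),
    LogRecord("t1", "s1", "checkout started"),
    LogRecord("t9", "s9", "unrelated request"),
]

correlated = logs_by_span(spans, logs)
for span in spans:
    span_logs = correlated.get((span.trace_id, span.span_id), [])
    print(span.name, span.duration_ms, span_logs)
```

With the logs attached to the slow span, the reason for the latency is visible in the same view as the trace.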

Native support for OpenTelemetry semantic conventions
In OpenTelemetry, the components of a distributed system are described using attributes, which are simply key-value pairs. These attributes describe the "entity" they are attached to, such as a span for a web request, and their names are standardized by the OpenTelemetry specification as semantic conventions. For example, here is a glimpse of what the HTTP conventions look like:

Attribute	Description	Example
http.method	HTTP request method	GET; POST; HEAD
http.target	The full request target as passed in an HTTP request line or equivalent	/blog/june/
http.scheme	The URI scheme that identifies the used protocol	http; https

An OpenTelemetry backend should have native support for storing data with OpenTelemetry semantic conventions. Existing observability vendors usually transform the data collected using OpenTelemetry semantic conventions into their proprietary formats. But OpenTelemetry has a huge list of semantic conventions, which might not be fully utilized in such scenarios.
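
To make the table concrete, here is a small sketch of the attributes a span for a web request might carry. It uses only the three keys listed above. The helper `http_span_attributes` is hypothetical, and in practice instrumentation libraries populate these attributes for you. Newer releases of the semantic conventions rename some HTTP keys, so check which version your SDK emits.

```python
from urllib.parse import urlsplit


def http_span_attributes(method: str, url: str) -> dict:
    """Build span attributes using the HTTP keys from the table above."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return {
        "http.method": method.upper(),
        "http.target": target,
        "http.scheme": parts.scheme,
    }


attrs = http_span_attributes("get", "https://example.com/blog/june/")
print(attrs)
# {'http.method': 'GET', 'http.target': '/blog/june/', 'http.scheme': 'https'}
```

A backend that stores these keys as-is can let you filter and group on them directly, without a translation step.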

Should allow aggregates on trace data
Running aggregates on trace data enables you to create service-centric views. OpenTelemetry also provides you the ability to create custom tags. Custom tags combined with aggregated trace data give you a powerful magnifying glass to surface performance issues in your services. For example, you can get the error rate and 99th percentile latency of customer_type: gold or deployment_version: v2 or external_call: paypal.
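
As a rough illustration of what such an aggregate computes, the sketch below groups spans by a custom tag and reports the error rate and 99th percentile latency per group. The span dictionaries and their field names are assumptions made for this example. A real backend would run an equivalent query in its datastore rather than in application code.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def aggregate_by_tag(spans, tag):
    """Return {tag_value: {"count", "error_rate", "p99_ms"}} for spans carrying `tag`."""
    groups = defaultdict(list)
    for span in spans:
        value = span["attributes"].get(tag)
        if value is not None:
            groups[value].append(span)

    result = {}
    for value, members in groups.items():
        errors = sum(1 for s in members if s["error"])
        result[value] = {
            "count": len(members),
            "error_rate": errors / len(members),
            "p99_ms": percentile([s["duration_ms"] for s in members], 99),
        }
    return result


# Hypothetical spans tagged with the customer_type custom tag.
spans = [
    {"duration_ms": 120, "error": False, "attributes": {"customer_type": "gold"}},
    {"duration_ms": 450, "error": True, "attributes": {"customer_type": "gold"}},
    {"duration_ms": 95, "error": False, "attributes": {"customer_type": "silver"}},
    {"duration_ms": 110, "error": False, "attributes": {"customer_type": "gold"}},
]

stats = aggregate_by_tag(spans, "customer_type")
print(stats["gold"])
```

Swapping the tag name for deployment_version or external_call gives the other views mentioned above.
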
Open Source
OpenTelemetry is an open source standard with huge community backing. It is a testament to the fact that community-driven projects can solve large, complex engineering problems. It is not necessary for the OpenTelemetry APM to be open source.

But having an open-source OpenTelemetry APM can enable you to have a full-stack open-source solution. Open-source solutions have more flexibility, and if you self-host, you have complete control over your data.

Most SaaS APMs now claim to be 100% compatible with OpenTelemetry. But it’s difficult to move away from legacy systems. A solution built natively for OpenTelemetry can be a good choice for OpenTelemetry APM, and that’s where SigNoz comes into the picture.

SigNoz - An APM built natively for OpenTelemetry

SigNoz is a full-stack open source APM built natively to support OpenTelemetry. It serves as a backend for storing telemetry data (logs, metrics, and traces), and leverages the power of ClickHouse, a columnar database, for highly effective log analytics. At SigNoz, we believe that OpenTelemetry is going to be the world standard for instrumenting cloud-native applications.

SigNoz supports OpenTelemetry semantic conventions and provides visualization for all three distinct types of signals supported by OpenTelemetry.

The steps to send telemetry data to SigNoz are:

Instrument your application's code with language-specific OpenTelemetry libraries
Configure OpenTelemetry Exporters to send data to SigNoz
Visualize and analyze telemetry data using SigNoz dashboards and signal-specific views

Here’s a picture depicting how OpenTelemetry fits within an application and SigNoz.

*How OpenTelemetry fits within a microservice-based application and an observability backend - SigNoz*

SigNoz cloud is the easiest way to run SigNoz. You can sign up here for a free account and get 30 days of unlimited access to all features.

You can also install and self-host SigNoz yourself. Check out the installation documentation for detailed instructions.

Once your application is instrumented with OpenTelemetry client libraries, you can send the data to the SigNoz backend by pointing the exporter at the port exposed on the machine where SigNoz is installed, or at the SigNoz ingestion URL if you are using the hosted version.
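
One common way to point an exporter at a backend is through the standard OpenTelemetry environment variables, which keeps the destination out of your application code. The sketch below builds those variables for a self-hosted and a hosted setup. The helper `otlp_env`, the service name, the host, and the angle-bracket placeholders are all assumptions for illustration. Take the real endpoint and authentication header from the SigNoz documentation for your setup.

```python
def otlp_env(service_name, endpoint, headers=None):
    """Return environment variables read by OpenTelemetry SDK exporters."""
    env = {
        "OTEL_SERVICE_NAME": service_name,
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
    }
    if headers:
        # Headers are passed as a comma-separated list of key=value pairs.
        env["OTEL_EXPORTER_OTLP_HEADERS"] = ",".join(
            f"{key}={value}" for key, value in headers.items()
        )
    return env


# Self-hosted: 4317 is the conventional OTLP/gRPC port; the host is a placeholder.
self_hosted = otlp_env("inventory-service", "http://localhost:4317")

# Hosted: the URL, header name, and key are placeholders, not real values.
hosted = otlp_env(
    "inventory-service",
    "https://<your-ingestion-url>",
    {"<auth-header-name>": "<your-ingestion-key>"},
)

for name, value in self_hosted.items():
    print(f"{name}={value}")
```

Because only the environment changes between the two setups, the instrumentation code itself stays the same when you move from one to the other.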

You can then use it to monitor application metrics with out-of-the-box charts and visualizations.

*SigNoz dashboard showing popular RED metrics. An OpenTelemetry backend built natively for OpenTelemetry, SigNoz provides out-of-box charts for application metrics*

The tracing signal from OpenTelemetry instrumentation helps you correlate events across services. With SigNoz, you can visualize your tracing data using Flamegraphs and Gantt charts. It shows you a complete breakdown of the request along with every bit of data collected with OpenTelemetry semantic conventions.

*Detailed Flamegraphs and Gantt charts. Tracing data collected by OpenTelemetry can be visualized with the help of Flamegraphs and Gantt charts on the SigNoz dashboard*

SigNoz also lets you run aggregates on your tracing data. Running aggregates on tracing data enables you to create service-centric views, providing insights to debug applications at the service level. It also makes sense for engineering teams as they own specific microservices.

*Running aggregates on your tracing data enables you to create service-centric views.*

Best Practices for OpenTelemetry Instrumentation

To get the most out of OpenTelemetry APM, consider these best practices:

Balance automatic and manual instrumentation

Auto-instrumentation libraries are quick to set up, and broadly cover the events happening within your applications. However, since auto-instrumentation agents capture events from all compatible libraries in a codebase, they can lead to excess spans that are often unneeded, and that you must tone down eventually.

For example, when instrumenting applications handling long-running background tasks, you might wish to disable DB instrumentation to prevent generating a large number of spans for database calls made by the application.

Manual instrumentation allows you to manage this "instrumentation scope" better. Not only that, it is necessary when you want to capture details for custom business logic that is not covered as part of auto-instrumentation, such as a function performing specific item-availability checks for an inventory microservice.
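
The shape of manual instrumentation can be sketched without any dependencies. In the example below, `start_span` is a simplified stand-in written for this article, not the OpenTelemetry API. With the real Python API you would typically use `tracer.start_as_current_span(...)` and `span.set_attribute(...)` instead. The inventory data and attribute names are hypothetical.

```python
import time
from contextlib import contextmanager

finished_spans = []  # stand-in for an exporter: collects completed spans


@contextmanager
def start_span(name, **attributes):
    """Record a span around a block of code, including duration and error status."""
    span = {"name": name, "attributes": dict(attributes), "error": False}
    started = time.perf_counter()
    try:
        yield span
    except Exception:
        span["error"] = True
        raise
    finally:
        span["duration_ms"] = (time.perf_counter() - started) * 1000
        finished_spans.append(span)


STOCK = {"sku-123": 4, "sku-456": 0}  # hypothetical inventory data


def check_item_availability(sku, quantity):
    """Custom business logic that auto-instrumentation would not see."""
    with start_span("inventory.check_availability", **{"item.sku": sku}) as span:
        available = STOCK.get(sku, 0) >= quantity
        span["attributes"]["item.requested_quantity"] = quantity
        span["attributes"]["item.available"] = available
        return available


print(check_item_availability("sku-123", 2))  # True
print(finished_spans[0]["name"], finished_spans[0]["attributes"])
```

The point is that you decide exactly which operation becomes a span and which business details are attached to it.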

Follow semantic conventions and consistent naming

As discussed above, OpenTelemetry defines semantic conventions that dictate how telemetry data describes operations and "entities", such as the application generating telemetry data (defined via the service.name resource attribute). By following these conventions, you ensure that your data can be correctly processed by observability backends, and that it remains consistent across OpenTelemetry-compatible backends.

Further, you should ensure teams in your organization use a consistent naming schema, to prevent metadata "drift" where different teams use different keys for the same thing. Inconsistency with naming conventions can lead to silent failures and increase difficulty in maintaining dashboards, configuring alerts, and so on.
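
One lightweight way to catch drift is to check attribute keys against an agreed schema, for example in a CI job or a shared helper library. The following sketch assumes a hypothetical team schema. The canonical keys and aliases are illustrative, with `service.name` being the only one taken from this article.

```python
# Hypothetical team schema: canonical keys and the aliases that have crept in.
CANONICAL_KEYS = {"service.name", "deployment.environment", "customer_type"}
ALIASES = {
    "svc_name": "service.name",
    "serviceName": "service.name",
    "env": "deployment.environment",
    "customerType": "customer_type",
}


def normalize_attributes(attributes):
    """Map known aliases to canonical keys and report keys outside the schema."""
    normalized, unknown = {}, []
    for key, value in attributes.items():
        canonical = ALIASES.get(key, key)
        if canonical not in CANONICAL_KEYS:
            unknown.append(key)
        normalized[canonical] = value
    return normalized, sorted(unknown)


normalized, unknown = normalize_attributes(
    {"svc_name": "checkout", "env": "prod", "team": "payments"}
)
print(normalized)
# {'service.name': 'checkout', 'deployment.environment': 'prod', 'team': 'payments'}
print(unknown)
# ['team']
```

Reporting unknown keys rather than dropping them makes the drift visible, so the team can decide whether to extend the schema or fix the emitting service.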

We have done a detailed write-up on the topic that you can read here.

Use the OTel Collector for centralized data processing

When instrumenting data from multiple applications, you might need to manage exported telemetry data by modifying attributes, or scrubbing PII from specific datasets. Although you can manage these configurations for each service individually, this approach requires updating application code for every configuration change, and can lead to significant code duplication.

The OpenTelemetry Collector allows you to manage telemetry in a centralized manner, and define how telemetry data should be handled before it's forwarded to your observability backend. It even supports exporting to multiple backends if you wish.

As a practical example, many organizations export application logs to S3 for "cold storage" to meet business compliance requirements. This is important because some observability backends might not support data retention for long periods of time, such as six months or a year.
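
Conceptually, the Collector runs each piece of telemetry through a chain of processors and then hands the result to every configured exporter. The sketch below models that flow in plain Python to show why centralizing it removes duplication. It is not the Collector's configuration format or API, and the processor functions, PII keys, and exporter names are hypothetical.

```python
import copy

PII_KEYS = {"user.email", "credit_card.number"}  # hypothetical sensitive keys


def scrub_pii(record):
    """Processor: redact sensitive attribute values."""
    cleaned = copy.deepcopy(record)
    for key in PII_KEYS:
        if key in cleaned["attributes"]:
            cleaned["attributes"][key] = "[REDACTED]"
    return cleaned


def add_environment(record):
    """Processor: attach an environment attribute if the service did not set one."""
    enriched = copy.deepcopy(record)
    enriched["attributes"].setdefault("deployment.environment", "production")
    return enriched


class ListExporter:
    """Stand-in exporter that stores what it receives."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def export(self, record):
        self.received.append(record)


def run_pipeline(records, processors, exporters):
    """Apply every processor in order, then fan out to every exporter."""
    for record in records:
        for process in processors:
            record = process(record)
        for exporter in exporters:
            exporter.export(record)


observability = ListExporter("observability-backend")
cold_storage = ListExporter("s3-cold-storage")
logs = [{"body": "payment accepted", "attributes": {"user.email": "a@example.com"}}]

run_pipeline(logs, [scrub_pii, add_environment], [observability, cold_storage])
print(cold_storage.received[0]["attributes"])
# {'user.email': '[REDACTED]', 'deployment.environment': 'production'}
```

Changing the scrubbing rules or adding a destination touches only the pipeline definition, and none of the services that emit the telemetry.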

Key Takeaways

OpenTelemetry APM standardizes telemetry data collection across your entire stack
It offers vendor-neutral, portable observability that prevents lock-in
Implementing OpenTelemetry APM improves visibility in complex, distributed systems
Native OpenTelemetry tools like SigNoz enhance the APM experience with tailored features

FAQs

What are the main differences between OpenTelemetry and traditional APM tools?

OpenTelemetry provides a standardized, vendor-neutral approach to instrumentation and data collection. Traditional APM tools often use proprietary agents and data formats, leading to vendor lock-in. OpenTelemetry allows you to switch between different APM backends without changing your instrumentation code.

How does OpenTelemetry APM improve application performance?

OpenTelemetry APM doesn't directly improve performance, but it provides the detailed insights needed to identify and resolve performance issues. By offering comprehensive tracing, metrics, and logging capabilities, it enables developers to pinpoint bottlenecks, optimize resource usage, and enhance overall application efficiency.

Can OpenTelemetry be used with existing monitoring solutions?

Yes, most existing monitoring solutions now support OpenTelemetry data. You can often use OpenTelemetry alongside your current tools, gradually transitioning as you become more comfortable with the new approach. Some providers offer OpenTelemetry exporters or collectors to facilitate this integration within observability pipelines.

What skills are needed to implement OpenTelemetry APM effectively?

To implement OpenTelemetry APM effectively, you should have:

Familiarity with distributed systems concepts
Knowledge of your application's programming language(s)
Understanding of observability principles (traces, metrics, logs)
Basic DevOps skills for configuring collectors and exporters
Data analysis capabilities to interpret and act on the collected telemetry

As you gain experience with OpenTelemetry, you'll develop a deeper understanding of its capabilities and best practices for application monitoring.

Remember, you can start small: auto-instrumentation at the code level and general APM dashboards alone can help you maintain reliable application systems. They might even be enough for your needs for a considerable period of time until your systems grow in complexity, and require more investment into observability!