OpenTelemetry API vs SDK - Key Differences Explained

OpenTelemetry is revolutionizing observability in modern software systems by providing a standardized method for capturing distributed tracing, metrics, and logs. As observability becomes more important in managing complex, distributed applications, understanding the differences between OpenTelemetry's API and SDK will help you select the best tool for your project. This article discusses the important differences that can help you plan your implementation strategy.

Quick Guide: API vs SDK

Aspect	OpenTelemetry API	OpenTelemetry SDK
Purpose	Provides an interface to generate telemetry data (e.g., traces, metrics, logs).	Implements the collection, processing, and exporting of telemetry data.
Audience	Application developers looking to instrument their applications.	DevOps, SREs, and developers configuring telemetry collection and exporting.
Stability	More stable, with less frequent updates.	Subject to changes, with updates to optimize telemetry processing.
Implementation	No actual data collection or export logic.	Provides the actual logic for collecting and exporting data.
Dependency	Lightweight, minimal dependencies.	Typically heavier, includes actual exporters and processors.

What is the OpenTelemetry API?

The OpenTelemetry API is a collection of language-specific APIs (designed to be language-agnostic) that standardize the way telemetry data is captured and processed. Consider it a contract between your application code and the actual implementation (SDK). The API enables developers to consistently instrument their code, while the SDK handles the actual collection and export of telemetry data.

The key components of the API are:

Tracers: It builds and manages spans for distributed tracing.
Meters: It is used to track application metrics (such as latency and failures).
Loggers: It is for creating structured log entries.

The OpenTelemetry API plays a crucial role in application development by:

Standardizing telemetry data collection across different languages and frameworks.
Enabling portability by providing instrumentation that is compatible with a variety of backend systems.
Providing consistency for developers by providing a reliable and standard interface regardless of the backend being used.

When to Use the OpenTelemetry API

Here are several areas where you might like to use the OpenTelemetry API. Let's take a look at them:

Developing libraries or frameworks: If you're developing libraries or frameworks, the OpenTelemetry API allows you to instrument your code in a way that works with any OpenTelemetry SDK. For example, if you create a Python library for database interactions and use the OpenTelemetry API to add tracing, users of your library can choose different SDKs—like the official OpenTelemetry SDK or a third-party one (e.g., SigNoz)—to collect and export telemetry data. This flexibility ensures that your library integrates seamlessly with various observability solutions without being tied to a specific SDK.
Creating portable instrumentation: The API allows you to write instrumentation once and then reuse it across several projects or backend systems. For example, if you instrument an HTTP client in your application using the OpenTelemetry API, you can reuse that same instrumentation code in another project or switch backend observability platforms (e.g., from Jaeger to Prometheus) without changing the instrumentation. This portability ensures that your telemetry code remains consistent and adaptable, regardless of the underlying systems.
Dependency minimization: The OpenTelemetry API is lightweight because it primarily provides interfaces for instrumentation, without including the additional logic for collecting, processing, or exporting telemetry data. On the other hand, the SDK is 'heavier' because it contains the full implementation needed for telemetry operations, including data buffering, exporting traces or metrics, and integrations with backends. This increases the size of the SDK, may affect the application's speed (due to processing overhead), and often requires more time to integrate due to additional configuration. The API, therefore, allows for minimal dependencies and faster integration, especially when the full telemetry stack isn’t required.
Ensuring future compatibility: By coding against the API, you ensure that your instrumentation remains compatible even if you switch or change your SDK or backend systems. This is because the API follows versioning standards and provides backward compatibility, meaning that even if newer versions of SDKs or backend systems are introduced, your instrumentation won’t break. The API is typically 'version-locked,' ensuring stability and long-term compatibility, whereas SDKs or backend tools may evolve more rapidly. This approach helps future-proof your instrumentation code against breaking changes.

Code Snippet: Basic OpenTelemetry API Usage

Here’s a simple example of how you might use the OpenTelemetry API for tracing:

// Import the OpenTelemetry API
const { trace } = require('@opentelemetry/api');

// Get a tracer instance
const tracer = trace.getTracer('example-tracer');

// Start a span
const span = tracer.startSpan('operation-name');

// Perform some operations
doWork();

// End the span when work is done
span.end();

function doWork() {
    // Simulate work
    for (let i = 0; i < 1000000; i++) {}
}

Explanation:

In this example, we utilize the OpenTelemetry API to build a tracer and start a span. The span measures the duration of the doWork function, which represents a unit of work in distributed tracing.

Understanding OpenTelemetry SDK

The OpenTelemetry SDK is a real API implementation that includes mechanisms for collecting, analyzing, and exporting telemetry data. While the API specifies "what" to track, the SDK handles "how" the data is collected and delivered to observability platforms, serving as the engine that drives your observability pipeline.

Let's look at the main features of SDK:

Data processing and aggregation: Determines how telemetry data is processed and transferred to the backend.
Sampling and filtering: Allows you to control which data is collected using configurable criteria.
Data export: Telemetry data is delivered to a variety of backends, including Prometheus, Jaeger, and SigNoz.

Key Features of the OpenTelemetry SDK

The SDK has several additional features that improve observability. Let's take a look at them:

Configurable Data Pipelines: You can create pipelines to tailor how telemetry data is handled and transmitted to your specific needs.
Built-in Exporters: Offers out-of-the-box support for numerous common observability backends, including SigNoz, Jaeger, and Zipkin.
Sampling Mechanisms: Adjust the amount of telemetry data captured to balance performance and cost. In a high-throughput application, you could, for example, gather every tenth trace.
Context Propagation: Maintains trace and span context across distributed services, maintaining tracing consistency.
Extensibility: You can customize the SDK's processors, samplers, and exporters to meet your requirements.

Let us now take a look at some of the key components of SDK:

Resource Providers: Gather and deliver metadata about the environment (e.g., cloud provider, region) in which your application operates.
Samplers: Determine the percentage of traces to sample to control the amount of trace data recorded.
Exporters are responsible for providing telemetry data (traces, metrics, and logs) to the desired observability platforms.

Code Snippet: Configuring OpenTelemetry SDK

Here’s a simple code snippet demonstrating how to configure the OpenTelemetry SDK for trace exporting:

const { NodeTracerProvider } = require('@opentelemetry/sdk-trace-node');
const { SimpleSpanProcessor } = require('@opentelemetry/sdk-trace-base');
const { ConsoleSpanExporter } = require('@opentelemetry/sdk-trace-base');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');

// Set up a resource to provide service name metadata
const resource = new Resource({
  [SemanticResourceAttributes.SERVICE_NAME]: 'example-service',
});

// Create a tracer provider and associate it with the resource
const provider = new NodeTracerProvider({
  resource: resource,
});

// Set up an exporter (ConsoleSpanExporter for debugging purposes)
const exporter = new ConsoleSpanExporter();

// Add a span processor (SimpleSpanProcessor for direct exporting of spans)
provider.addSpanProcessor(new SimpleSpanProcessor(exporter));

// Register the provider to begin tracing
provider.register();

// Example tracer usage
const tracer = provider.getTracer('example-tracer');

// Start a span and simulate work
const span = tracer.startSpan('example-operation');
span.setAttribute('key', 'value');  // Add attributes to the span
setTimeout(() => {
  span.end();  // End the span
  console.log('Trace data sent to the console!');
}, 1000);

In this example, we use the OpenTelemetry SDK to create a tracer provider and configure it to export trace data to Jaeger. The SDK handles the actual trace processing and exporting.

API vs SDK: Main Differences

Understanding the differences between the OpenTelemetry API and SDK is essential for proper observability implementation. Here are the main distinctions:

Abstraction Level:
- API: Provides abstract APIs for capturing telemetry data.
- SDK: Implements these interfaces, including capabilities for processing and exporting telemetry data.
Dependency Management:
- API: Lightweight and minimal dependencies. It serves as the instrumentation layer without relying on large libraries.
- SDK: Includes entire implementations and other dependencies for telemetry data processing and exporting.
Functionality:
- API: Defines "what" telemetry data is gathered (e.g., beginning spans, recording metrics).
- SDK: Implements "how" the telemetry data is handled, exported, and sampled.
Use Case:
- API: Suitable for library or framework developers looking to add instrumentation without requiring a specific SDK.
- SDK: Suitable for applications requiring complete telemetry features, such as data processing and exporting.

Comparison Table of OpenTelemetry API vs SDK

Feature	OpenTelemetry API	OpenTelemetry SDK
Abstraction Level	Abstract interfaces and definitions	Concrete implementations
Dependency Management	Lightweight, minimal dependencies	Full implementation, larger footprint
Flexibility	Highly portable	Environment-specific
Functionality	Defines possible operations	Implements how operations are performed
Use Case	Ideal for libraries and frameworks	Suited for applications with full telemetry

How to Choose Between OpenTelemetry API and SDK

The nature of your project and your observability requirements will determine whether you use the OpenTelemetry API or the SDK. Let’s compare them through a table so that it can be easier to decide:

Factor	OpenTelemetry API	OpenTelemetry SDK
Project Type	Use the API to instrument code and ensure interoperability with various SDKs or backends.	Use the SDK for complete telemetry processing and exporting.
Instrumentation Needs	Suitable for lightweight instrumentation only.	Offers full trace gathering, sampling, and export options.
Performance and Resource Constraints	Minimizes your application's footprint, ideal for resource-constrained environments.	Requires more resources (memory, compute), but provides additional capabilities.
Compatibility Requirements	Maintains flexibility and portability across different observability backends.	Optimized exporters for seamless integration with specific backends (e.g., Jaeger, SigNoz).

Implementing OpenTelemetry in Your Project

To effectively implement OpenTelemetry, you can follow these steps:

Integrating the API for Basic Instrumentation:
- Add the OpenTelemetry API dependency to your project based on the language you're using (e.g., @opentelemetry/api for Node.js). For example, using npm:
```
npm install @opentelemetry/api
```
- Use tracers, meters, and loggers to instrument key areas of your code. For example, instrument-critical paths like HTTP request handling or database queries capture meaningful telemetry data.
- Keep instrumentation code independent of the SDK. This ensures that your application remains portable and can be used with different OpenTelemetry backends.
- Example:
```
const { trace } = require('@opentelemetry/api');

const tracer = trace.getTracer('example-tracer');

const span = tracer.startSpan('operation-name');
    // Add your logic here
span.end();
```
To set up the SDK for complete telemetry capabilities, include the OpenTelemetry SDK in your project to manage the entire telemetry pipeline.
- Configure resource providers, samplers, and exporters to meet your observability requirements. Set up exporters to deliver data to the appropriate backend (such as Jaeger or SigNoz).
- Create data processing pipelines to control how telemetry data is handled, sampled, and exported.
- Example:
```
const { NodeTracerProvider } = require('@opentelemetry/sdk-trace-node');
const { ConsoleSpanExporter, SimpleSpanProcessor } = require('@opentelemetry/sdk-trace-base');

const provider = new NodeTracerProvider();
provider.addSpanProcessor(new SimpleSpanProcessor(new ConsoleSpanExporter()));
provider.register();
```

Best Practices for Using API and SDK Together

Use the API but keep it distinct from the SDK for flexibility and modularity.
Configure the SDK centrally, ideally at the application's entry point, to ensure a clear separation of concerns.
Use SDK capabilities like sampling and filtering to handle telemetry data collecting. This avoids overloading your backend with data and improves performance.

Avoiding Common Pitfalls

Prioritize critical pathways (e.g., request/response cycles, database calls) over instrumenting all functions.
Make sure to appropriately propagate trace context, especially in distributed systems, to ensure trace consistency across services.
Implement robust error handling in your instrumentation logic. For example, address probable concerns during telemetry export to avoid disrupting application performance.

Enhancing Observability with SigNoz

SigNoz is an open-source Application Performance Monitoring (APM) application that complements OpenTelemetry by providing a reliable framework for storing, analyzing, and visualizing telemetry data. It works perfectly with OpenTelemetry, making it an excellent option for improving observability in distributed systems.

Key Advantages of Using SigNoz with OpenTelemetry

Native Support for OpenTelemetry Data Formats: SigNoz can ingest OpenTelemetry traces, metrics, and logs directly without the need for sophisticated integrations.
Customizable Dashboards: You can easily design and customize dashboards to display your application's metrics and traces. This provides instant insights into your application's performance and health.
Advanced Query Capabilities: SigNoz provides robust query capabilities for deep dive analysis, allowing you to investigate telemetry data to identify performance bottlenecks and problems.
Alerting and Anomaly Detection: Set up alerts and anomaly detection to be notified of issues in real-time, helping you maintain proactive monitoring.

SigNoz cloud is the easiest way to run SigNoz. Sign up for a free account and get 30 days of unlimited access to all features.

You can also install and self-host SigNoz yourself since it is open-source. With 20,000+ GitHub stars, open-source SigNoz is loved by developers. Find the instructions to self-host SigNoz.

Key Takeaways

The OpenTelemetry API provides instrumentation interfaces, whereas the SDK includes concrete implementations for collecting, processing, and exporting telemetry data.
Use the API for libraries and portable instrumentation, and the SDK for applications that require all telemetry features and capabilities.
When picking between the API and SDK, take into account your project type, instrumentation requirements, performance limits, and compatibility needs.
Follow best practices to prevent frequent OpenTelemetry adoption issues, such as over-instrumentation and failure to consider context propagation.
Use tools such as SigNoz to improve your observability approach, delivering more insights into system performance and reliability.

FAQs

Can I use the OpenTelemetry API without the SDK?

Yes, you can use the OpenTelemetry API without the SDK, but it has limitations. The API provides interfaces for instrumenting your code, allowing you to generate telemetry data. However, without the SDK, there is no implementation for processing, collecting, or exporting that data. To fully leverage OpenTelemetry’s capabilities, including exporting metrics or traces, you will need to use the SDK alongside the API.

How does the OpenTelemetry API ensure compatibility across different implementations?

The OpenTelemetry API maintains interoperability by defining a standard set of interfaces that all complying SDKs must support. This abstraction layer enables you to develop instrumentation code once and then reuse it across any OpenTelemetry-compatible backend. The API defines common ideas and procedures for collecting telemetry data, which SDKs follow consistently.

Is it possible to move between SDKs while retaining the same API instrumentation?

Yes, it is possible to swap between SDK implementations while leaving your API instrumentation code unmodified. One of the most significant advantages of the OpenTelemetry API is its flexibility, which provides a uniform interface regardless of the underlying technology. To switch SDKs, you normally only need to change the setup code where you set up the OpenTelemetry SDK, leaving your application's instrumentation code untouched.

How can I ensure that my telemetry data is accurate and complete?

To ensure accurate and full telemetry data, adhere to best practices such as focusing on critical metrics, ensuring effective context propagation, and implementing robust error management. In addition, review and update your instrumentation code regularly to accommodate changes in your application design or telemetry requirements.

What are the performance implications of using the SDK versus just the API?

Using the SDK generally incurs a higher performance overhead compared to using just the API. This is because the SDK includes functionalities for data processing, sampling, and exporting, which require more resources. The API alone does not provide actual telemetry capabilities; it only defines interfaces. If performance is a concern, configure your SDK with appropriate samplers and processors to minimize resource usage.