Why Datadog Costs More on Kubernetes (and What Teams Switch To)

Last Updated: July 13, 2026 · 12 min read

TL;DR

SigNoz Cloud: Standard OpenTelemetry config on your pods, no vendor SDK in your application code, priced on data volume regardless of how many nodes or containers you run.
Datadog: Three Kubernetes components to operate (Agent, Cluster Agent, Operator), Datadog-specific env vars and labels on every Deployment, and per-host plus per-container plus custom-metric charges that all scale with cluster size.

Datadog is one of the most widely used platforms for monitoring Kubernetes clusters today, in large part because the install path is straightforward. A Helm chart brings up the Agent as a DaemonSet, the Cluster Agent surfaces Kubernetes events and cluster-level metrics, and pods, deployments, and traces appear in a single UI within minutes. The cost and complexity of running Datadog on Kubernetes long-term show up in three places that are not obvious at install time:

A bill that scales with cluster shape rather than telemetry volume
Vendor-specific env vars and labels on every pod template
A migration path that touches application code, workload manifests, and cluster RBAC at the same time

This article walks through what Datadog actually deploys to your cluster, the structural reasons the cost and complexity grow with cluster size, and what changes when you replace the Datadog Agent with the OpenTelemetry Collector and send the same data to SigNoz Cloud instead.

Comparison of Datadog and OpenTelemetry Kubernetes architectures — *Datadog versus OpenTelemetry on Kubernetes*

For the step-by-step replacement path, our Datadog to SigNoz migration guide has the commands for swapping the Datadog Agent and Operator out for the OpenTelemetry Collector and pointing your data at SigNoz Cloud.

What Datadog deploys to your Kubernetes cluster

A standard Datadog install on Kubernetes is three components plus a configuration surface, and all four are Datadog-specific:

Datadog Agent (DaemonSet): One Agent pod on every node. It scrapes kubelet for container resource metrics, tails container logs from /var/log/pods, receives APM traces from your application pods over port 8126, and accepts custom metrics over DogStatsD on UDP 8125. If a node doesn't have a Datadog Agent pod, that node is invisible to Datadog.
Datadog Cluster Agent (Deployment): A single, centrally deployed component that watches the Kubernetes API, runs cluster-wide integrations like kube-state-metrics, distributes metadata to the node Agents so they can tag workloads correctly, and exposes the External Metrics API so Horizontal Pod Autoscalers can scale on Datadog data. If the Cluster Agent becomes unhealthy, autoscaling and much of the Kubernetes-aware tagging stop working until it recovers.
Datadog Operator (or the datadog/datadog Helm chart): Reconciles the Agent DaemonSet and the Cluster Agent Deployment through a DatadogAgent Custom Resource. The Operator is the newer, Datadog-recommended path. The Helm chart is older but still maintained, and a lot of clusters are still on it.
Pod annotations and labels: Every workload Deployment usually ends up carrying tags.datadoghq.com/env, tags.datadoghq.com/service, and tags.datadoghq.com/version labels on the pod template, plus ad.datadoghq.com/<container>.* annotations for any container that uses Datadog Autodiscovery. Changes to these apply through a normal rollout, not a hot-update.
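
As a rough illustration of the annotation schema described in the last item, a pod template that uses Autodiscovery might carry annotations like the ones below. This is a sketch under assumptions, not a recommended config: the container name booking-api is reused from this article's examples, while the log source, the /metrics path, the booking namespace, and the catch-all metric filter are hypothetical values chosen for illustration.

```yaml
# Hypothetical pod template fragment; all values are illustrative assumptions.
template:
  metadata:
    annotations:
      # Log collection settings for the container named "booking-api"
      ad.datadoghq.com/booking-api.logs: '[{"source": "python", "service": "booking-api"}]'
      # Check configuration; %%host%% and %%port%% are Autodiscovery template variables
      ad.datadoghq.com/booking-api.checks: |
        {
          "openmetrics": {
            "init_config": {},
            "instances": [
              {
                "openmetrics_endpoint": "http://%%host%%:%%port%%/metrics",
                "namespace": "booking",
                "metrics": [".*"]
              }
            ]
          }
        }
```

The container name is part of the annotation key, so renaming a container means updating its annotations in the same rollout.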

Together with the Datadog libraries inside your application code (ddtrace, dogstatsd), this is the full surface area your platform team has to operate. The Agent is the part everyone interacts with in the UI, but the Cluster Agent, the Operator, and the annotation schema are usually what make a migration off Datadog harder than teams expect.
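
For a sense of what the Operator side of that surface area looks like, here is a minimal sketch of a DatadogAgent custom resource. The resource name, cluster name, and Secret name and key are assumptions made up for this example, and a production resource would typically enable more features than shown here.

```yaml
# Minimal DatadogAgent sketch; names below are hypothetical placeholders.
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
  namespace: datadog
spec:
  global:
    clusterName: example-cluster      # assumption: your cluster's name
    site: datadoghq.com
    credentials:
      apiSecret:
        secretName: datadog-secret    # assumption: a Secret you created beforehand
        keyName: api-key
  features:
    apm:
      enabled: true                   # node Agents accept traces on port 8126
    logCollection:
      enabled: true
      containerCollectAll: true       # tail logs from every container
```

The Operator reconciles this one resource into the Agent DaemonSet and the Cluster Agent Deployment, which is why removing Datadog also means removing the CRD and the RBAC that came with it.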

Where Kubernetes makes Datadog harder than it looks

The Datadog install above works as documented. The friction usually shows up only after you have been running it for a while.

Per-host pricing on a per-node platform

Datadog metrics summary view with queryable tags and host-based pricing context — *Datadog Kubernetes metrics summary and tag cardinality view*

Datadog APM is priced per host, Infrastructure Monitoring is charged per infra host, and per-container charges sit on top. On Kubernetes, a host is a node, so the bill grows every time the Cluster Autoscaler adds one, regardless of how much telemetry actually changed.

The model rewards packing workloads onto fewer larger nodes, which is the opposite of how Kubernetes-native architectures are designed to scale.

Cardinality from default Kubernetes tags

Datadog metric tag keys and container tag counts for container.cpu.throttled — *Datadog metric tag keys for Kubernetes container metrics*

Datadog can automatically attach pod_name, container_id, kube_deployment, and similar tags to your metrics. Billing keys off unique combinations of metric name and tag values, so every pod restart, rollout, or replica-set change produces new billable series.

One cleanly named application metric can become tens of thousands of unique time series on the invoice once Kubernetes tagging is applied.

Vendor-specific config on every pod

A Datadog-instrumented Deployment typically carries DD_* env vars on every container (DD_SERVICE, DD_ENV, DD_VERSION, DD_LOGS_INJECTION, DD_AGENT_HOST, DD_TRACE_AGENT_PORT, DD_DOGSTATSD_HOST) and tags.datadoghq.com/* labels on the pod template.

template:
  metadata:
    labels:
      app: booking-api
      tags.datadoghq.com/env: production
      tags.datadoghq.com/service: booking-api
      tags.datadoghq.com/version: "1.0.0"
  spec:
    containers:
      - name: booking-api
        env:
          - name: DD_SERVICE
            value: booking-api
          - name: DD_ENV
            value: production
          - name: DD_VERSION
            value: "1.0.0"
          - name: DD_LOGS_INJECTION
            value: "true"
          - name: DD_AGENT_HOST
            valueFrom:
              fieldRef:
                fieldPath: status.hostIP
          - name: DD_TRACE_AGENT_PORT
            value: "8126"
          - name: DD_DOGSTATSD_HOST
            valueFrom:
              fieldRef:
                fieldPath: status.hostIP

Every workload Deployment in a Datadog-instrumented cluster carries this same block. None of these env vars or labels work outside Datadog.

That means moving off the Agent later requires editing every workload manifest in the cluster, not just the Helm release that installed the Agent.

Logs billed twice

Datadog Logs charges $0.10/GB for ingestion and $1.06 to $2.50 per million events for indexing, with retention multipliers on top.

Kubernetes naturally produces high log volume from sidecars, init containers, and platform components, and the indexing line on the invoice is usually larger than teams estimate going in.

Agent overhead on every node

A 200-node cluster runs 200 Datadog Agent pods, plus the Cluster Agent, all consuming CPU and memory you pay your cloud provider for. The Cluster Agent is a single Deployment, which becomes an availability concern when Horizontal Pod Autoscalers depend on its External Metrics API.

$ kubectl get pods -n datadog -o wide

NAME                                       READY   STATUS    NODE
datadog-agent-7jdpx                        4/4     Running   node-1
datadog-agent-9xnqv                        4/4     Running   node-2
datadog-agent-rkb2c                        4/4     Running   node-3
datadog-cluster-agent-7c9d5b6f48-zxqkw     1/1     Running   node-2

One Agent pod per node (the DaemonSet), plus a single Cluster Agent Deployment. The pattern is identical on a 3-node cluster and a 300-node cluster, which is also exactly how the bill scales.
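
To make the External Metrics dependency concrete, here is a sketch of a HorizontalPodAutoscaler that scales on a Datadog-served metric. The target Deployment reuses the booking-api name from this article, while the metric name, replica bounds, and target value are assumptions picked for illustration.

```yaml
# Hypothetical HPA; it can only scale while the External Metrics API is being served.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: booking-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: booking-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: nginx.net.request_per_s   # assumption: a metric already reported to Datadog
        target:
          type: AverageValue
          averageValue: "9"               # assumption: desired average value per pod
```

If the Cluster Agent stops serving the External Metrics API, an HPA like this one cannot fetch its metric and stops making scaling decisions until the API is back.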

These are not bugs. They are the natural shape of an Agent-based architecture that was designed for the per-VM era and then extended onto Kubernetes, so the friction is structural rather than something a configuration change can resolve.

Running Datadog on Kubernetes and watching the bill grow faster than the cluster? SigNoz Cloud prices on data volume, not nodes or containers. OpenTelemetry-native, with a Datadog dashboard translator built for this migration.


Common scenarios where Datadog and SigNoz behave differently

Four situations that most teams running Kubernetes hit sooner or later, and what each path looks like.

Adding a new service

A new Deployment on the Datadog path needs the same block from the manifest snippet above (seven DD_* env vars and three tags.datadoghq.com/* labels), plus ddtrace and DogStatsD in the application code. On the SigNoz path, the same Deployment needs three standard OpenTelemetry env vars and no vendor-specific library in the application code, only vendor-neutral OpenTelemetry instrumentation.

template:
  spec:
    containers:
      - name: booking-api
        env:
          - name: OTEL_SERVICE_NAME
            value: booking-api
          - name: OTEL_EXPORTER_OTLP_ENDPOINT
            value: http://otel-collector:4317
          - name: OTEL_RESOURCE_ATTRIBUTES
            value: deployment.environment=production,service.version=1.0.0

Three env vars, vendor-neutral. The same Deployment works against any OTLP backend.
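
One way to keep instrumentation out of the application code is the OpenTelemetry Operator's auto-instrumentation injection. The sketch below assumes the Operator is already installed in the cluster and that booking-api is a Python service; the resource name and the Collector address are hypothetical. Port 4318 is used because the injected Python instrumentation exports over OTLP/HTTP, and other languages may need a different port or protocol.

```yaml
# Hypothetical Instrumentation resource; requires the OpenTelemetry Operator.
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: default-instrumentation
spec:
  exporter:
    endpoint: http://otel-collector:4318   # assumption: Collector Service, OTLP/HTTP port
  propagators:
    - tracecontext
    - baggage
---
# Pod template fragment: opt this workload in to Python auto-instrumentation.
template:
  metadata:
    annotations:
      instrumentation.opentelemetry.io/inject-python: "true"
```

With this approach the opt-in is a single vendor-neutral annotation, and the backend address lives in one Instrumentation resource rather than in every Deployment.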

A custom metric with a high-cardinality tag

Both products will record bookings_total{user_id="..."}, but the difference shows up on the invoice. Datadog counts every unique combination of metric name and tag values as a billable custom metric, while SigNoz prices by sample volume regardless of tag count.

A metric tagged with user_id on a cluster with 100,000 distinct users produces 100,000 billable series on Datadog. On SigNoz, the cost depends on how many samples those series emit, with no separate charge per unique tag combination.

A cluster that autoscales

On Datadog, scaling from 20 nodes to 200 multiplies the per-host APM and Infrastructure Monitoring charges by ten, even when telemetry volume does not change. On SigNoz, the bill follows data volume, so horizontal scaling stays cheap.

Migrating to a different observability backend

Moving off ddtrace plus the Datadog Agent means rewriting application code, updating every Deployment manifest, removing the Helm release, and rebuilding dashboards. Moving an OpenTelemetry setup to a different OTLP backend means changing one exporter block in the Collector config.

exporters:
  otlp/signoz:
    endpoint: ingest.us.signoz.cloud:443
    headers:
      signoz-ingestion-key: ${env:SIGNOZ_INGESTION_KEY}

One config block changes. Application code and workload manifests stay untouched.

Sending data from Kubernetes to SigNoz Cloud

The SigNoz path on Kubernetes is two pieces:

An OpenTelemetry Collector Deployment with a ConfigMap pointing at your SigNoz Cloud OTLP endpoint and an ingestion key from SigNoz Cloud > Settings > Ingestion. It receives OTLP traffic from application pods on port 4317 (gRPC) or 4318 (HTTP) and ships it to SigNoz Cloud.
Application Deployments with three standard OpenTelemetry env vars: OTEL_SERVICE_NAME, OTEL_EXPORTER_OTLP_ENDPOINT, and OTEL_RESOURCE_ATTRIBUTES. These are vendor-neutral, so the same Deployment works against any OTLP backend.

For Kubernetes node, pod, and container metrics plus Kubernetes events, install SigNoz K8s Infra as a separate Helm release. It runs an OpenTelemetry Collector with the right Kubernetes receivers and the cluster-wide RBAC for the API server scrape. Application telemetry and cluster telemetry stay in separate Helm releases because they are different concerns and easier to operate independently.

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

exporters:
  otlp/signoz:
    endpoint: ${env:SIGNOZ_OTLP_GRPC_ENDPOINT}
    headers:
      signoz-ingestion-key: ${env:SIGNOZ_INGESTION_KEY}

service:
  pipelines:
    traces:  { receivers: [otlp], exporters: [otlp/signoz] }
    metrics: { receivers: [otlp], exporters: [otlp/signoz] }
    logs:    { receivers: [otlp], exporters: [otlp/signoz] }

The Collector receives OTLP from your apps and forwards to SigNoz Cloud over the same protocol. No vendor SDK in your application code, no vendor-specific labels on pods.
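
The config above reads two environment variables, so the Collector Deployment has to supply them. Below is a minimal sketch of that wiring using standard Kubernetes objects. The Secret name and key, the ConfigMap name, the mount path, and the image tag placeholder are assumptions for illustration, and the endpoint value is the US region address shown earlier in this article.

```yaml
# Hypothetical Collector Deployment; object names and paths are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
spec:
  replicas: 1
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      containers:
        - name: otel-collector
          # Replace the placeholder with a pinned release tag.
          image: "otel/opentelemetry-collector-contrib:<version>"
          args: ["--config=/conf/config.yaml"]
          env:
            - name: SIGNOZ_OTLP_GRPC_ENDPOINT
              value: ingest.us.signoz.cloud:443
            - name: SIGNOZ_INGESTION_KEY
              valueFrom:
                secretKeyRef:
                  name: signoz-ingestion   # assumption: Secret holding the ingestion key
                  key: ingestion-key
          ports:
            - containerPort: 4317   # OTLP gRPC
            - containerPort: 4318   # OTLP HTTP
          volumeMounts:
            - name: config
              mountPath: /conf
      volumes:
        - name: config
          configMap:
            name: otel-collector-config   # assumption: ConfigMap with a config.yaml key
```

Keeping the ingestion key in a Secret rather than in the ConfigMap means the Collector config can be committed to version control without exposing credentials.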

Why SigNoz Cloud specifically

The OpenTelemetry community has many backends. SigNoz Cloud is the most direct landing spot from a Datadog Kubernetes setup for the reasons below.

Familiar UI for Datadog users

SigNoz Cloud's primitives map closely to Datadog's: service maps, trace waterfalls, the log explorer, and Kubernetes resource views. Engineers who already know Datadog can usually find their way around SigNoz in hours, not weeks.

SigNoz trace view showing flame graph and waterfall visualization for distributed requests — *SigNoz distributed trace view with flame graph and waterfall*

Migration tooling built for this move

SigNoz ships a Datadog dashboard JSON translator that converts Datadog dashboard exports into SigNoz dashboards, and the OpenTelemetry Collector's Datadog Receiver accepts traffic from your existing Datadog Agents and forwards it to SigNoz Cloud over OTLP. That means you can stage the migration without rewriting SDKs on day one.

receivers:
  datadog:
    endpoint: 0.0.0.0:8126

exporters:
  otlp/signoz:
    endpoint: ${env:SIGNOZ_OTLP_GRPC_ENDPOINT}
    headers:
      signoz-ingestion-key: ${env:SIGNOZ_INGESTION_KEY}

service:
  pipelines:
    traces:
      receivers: [datadog]
      exporters: [otlp/signoz]

The OpenTelemetry Collector listens on the Datadog Agent's port (8126), accepts traffic from your ddtrace-instrumented apps, and forwards it to SigNoz Cloud over OTLP. No SDK swap required to start.
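
On the workload side, a staged migration can be as small as repointing the tracer at the Collector instead of the node-local Agent. The sketch below assumes the Collector runs behind a Service named otel-collector in the same namespace as the application; that name and the selector label are hypothetical.

```yaml
# Hypothetical Service exposing the Collector's Datadog receiver port.
apiVersion: v1
kind: Service
metadata:
  name: otel-collector
spec:
  selector:
    app: otel-collector
  ports:
    - name: datadog-traces
      port: 8126
      targetPort: 8126
---
# Container env fragment: send ddtrace traffic to the Collector Service
# instead of the node's Datadog Agent (previously status.hostIP).
env:
  - name: DD_AGENT_HOST
    value: otel-collector
  - name: DD_TRACE_AGENT_PORT
    value: "8126"
```

The remaining DD_* variables and labels can stay in place during this stage and be removed later, when each service moves to OpenTelemetry instrumentation.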

Pricing follows data, not infrastructure

SigNoz Cloud prices traces and logs at $0.30/GB and metrics at $0.10 per million samples. No per-host charge, no per-container charge, no separate ingestion-versus-indexing tier, and no surcharge for high-cardinality Kubernetes tags. The bill follows the data your workloads actually produce.

SigNoz pricing with usage-based observability costs and no host-based pricing — *Usage-based pricing for observability, not host-based billing*

Portability built into the choice

Because SigNoz Cloud is OpenTelemetry-native end to end, the instrumentation you write for SigNoz works unchanged against any OTLP-compatible backend. If you ever want to switch destinations again, only the Collector's exporter endpoint changes, and the application code and workload manifests stay the same.

Ready to replace the Datadog Agent on Kubernetes? SigNoz Cloud is OpenTelemetry-native, ships a Datadog dashboard translator, and runs without per-host or per-container charges.


FAQs

How does Datadog monitor Kubernetes?

Three components: Datadog Agent (DaemonSet), Datadog Cluster Agent (Deployment), and the Datadog Operator or Helm chart. SigNoz Cloud collapses these into one OpenTelemetry Collector.

What does the Datadog Agent need on every pod?

Seven DD_* env vars on every container plus tags.datadoghq.com/* labels on the pod template. SigNoz Cloud needs three standard OpenTelemetry env vars, vendor-neutral.

Does Datadog charge per Kubernetes node?

Yes. Infrastructure Monitoring starts at $15/host and APM at $31/host (or $36/host standalone), plus per-container and custom-metric charges. SigNoz Cloud prices on data volume, so node count doesn't move the bill.

Why does my Datadog bill grow when my cluster auto-scales?

Per-host pricing scales linearly with node count. SigNoz Cloud's volume-based pricing doesn't.

Can I send data to SigNoz from Kubernetes without changing application code?

Yes. If you already use OpenTelemetry, swap the exporter endpoint in your Collector config. If you are on ddtrace, the Collector's Datadog Receiver accepts Agent traffic and forwards it to SigNoz Cloud over OTLP while you swap SDKs at your own pace.

Do I need the Datadog Operator with SigNoz?

No. Use the OpenTelemetry Operator, a CNCF project that manages OpenTelemetry Collectors the same Kubernetes-native way, without per-host billing.

Can I keep ddtrace on Kubernetes and use SigNoz?

Yes. The OpenTelemetry Collector's Datadog Receiver forwards existing ddtrace traffic to SigNoz Cloud over OTLP, so you can stage the migration without rewriting SDKs.

Does SigNoz collect Kubernetes events and node metrics?

Yes, via SigNoz K8s Infra (a separate Helm release with the right Kubernetes receivers and cluster-wide RBAC).