# k8s-infra otelDeployment and otelAgent configuration

The k8s-infra chart runs two kinds of OpenTelemetry collectors:

- `otelAgent`: deployed as a DaemonSet; collects metrics and logs from the pods on each node. One agent runs per node.
- `otelDeployment`: deployed as a Deployment; collects cluster-level metrics and logs. One instance runs per cluster.
## otelDeployment

The `otelDeployment` collector is configured to collect cluster metrics and events by default. In some cases you may want to collect metrics from components beyond the defaults, for example a Redis server or a PostgreSQL database. This page describes how to override the default configuration of the `otelDeployment` collector to collect metrics from additional components. The `override-values.yaml` file contains the configuration that you can override.
### Example 1: Configuring a Redis receiver running in the `redis-ns` namespace with the name `redis-server` on port 6379

The following configuration shows how to configure the `otelDeployment` collector to collect metrics from a Redis server running in the `redis-ns` namespace with the service name `redis-server` on port 6379.

```yaml
otelDeployment:
  config:
    receivers:
      redis:
        endpoint: redis-server.redis-ns.svc.cluster.local:6379
    service:
      pipelines:
        metrics/internal:
          receivers: [redis]
          processors: [batch]
          exporters: []
```
### Example 2: Configuring a PostgreSQL receiver running in the `postgres-ns` namespace with the name `postgres-server` on port 5432

```yaml
otelDeployment:
  config:
    receivers:
      postgresql:
        endpoint: postgres-server.postgres-ns.svc.cluster.local:5432
    service:
      pipelines:
        metrics/internal:
          receivers: [postgresql]
          processors: [batch]
          exporters: []
```
### Example 3: Configuring both the Redis receiver and the PostgreSQL receiver from the previous examples

```yaml
otelDeployment:
  config:
    receivers:
      redis:
        endpoint: redis-server.redis-ns.svc.cluster.local:6379
      postgresql:
        endpoint: postgres-server.postgres-ns.svc.cluster.local:5432
    service:
      pipelines:
        metrics/internal:
          receivers: [redis, postgresql]
          processors: [batch]
          exporters: []
```
The examples above are intentionally simple and show how to point the `otelDeployment` collector at individual components. In a real-world scenario, you will often also need to configure the receiver with a username, password, and other settings.
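For example, a hedged sketch of authenticated Redis and PostgreSQL receivers might look like the following. The `monitoring` user, the environment-variable references, and the TLS settings are illustrative assumptions; the environment variables must actually be available in the `otelDeployment` collector pod, and the full set of options is described in the respective receiver documentation.

```yaml
otelDeployment:
  config:
    receivers:
      redis:
        endpoint: redis-server.redis-ns.svc.cluster.local:6379
        password: ${env:REDIS_PASSWORD}      # assumes this env var exists in the pod
      postgresql:
        endpoint: postgres-server.postgres-ns.svc.cluster.local:5432
        username: monitoring                 # hypothetical read-only user
        password: ${env:POSTGRES_PASSWORD}   # assumes this env var exists in the pod
        tls:
          insecure: true                     # tighten TLS settings for production
    service:
      pipelines:
        metrics/internal:
          receivers: [redis, postgresql]
          processors: [batch]
          exporters: []
```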
## otelAgent

Because the `otelAgent` collector runs as a DaemonSet (one instance per node), configuring a receiver such as Redis or PostgreSQL on it would cause every node's agent to scrape the same endpoint, resulting in duplicate metrics. The `otelDeployment` collector solves this problem by collecting these metrics once at the cluster level rather than on each node.
## Essential Presets

This guide explains how to configure the OpenTelemetry Collector presets in the SigNoz k8s-infra chart. These presets control what telemetry data you collect and how you collect it from your Kubernetes cluster.
### 1. OTLP Exporter

```yaml
presets:
  otlpExporter:
    enabled: true  # Must be enabled for data to reach SigNoz
```

What it does: Enables sending telemetry data to the SigNoz backend. This preset is required for SigNoz to receive any data.
### 2. Host Metrics Collection

```yaml
presets:
  hostMetrics:
    enabled: true
    collectionInterval: 30s  # Adjust based on your monitoring needs
    scrapers:
      cpu: {}     # Collects CPU usage, time, and utilization metrics
      load: {}    # Collects system load averages
      memory: {}  # Collects memory usage metrics
      disk:       # Collects disk I/O metrics
        exclude:
          devices:  # Regex patterns to exclude specific devices
            - ^ram\d+$
            - ^loop\d+$
            # Add custom device exclusions here
      filesystem:  # Collects filesystem metrics
        exclude_fs_types:
          fs_types:
            - tmpfs
            - squashfs
            # Add filesystem types to exclude
      network:  # Collects network metrics
        exclude:
          interfaces:
            - ^veth.*$    # Excludes virtual ethernet devices
            - ^docker.*$  # Excludes docker interfaces
            # Add network interfaces to exclude
```
Key points:
- Adjust `collectionInterval` based on your monitoring granularity needs (see the sketch below)
- Customize device exclusions to reduce noise from irrelevant storage devices
- Configure network interface exclusions to focus on relevant network metrics
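For example, a lower-resolution sketch for a quieter cluster might look like this; the 60s interval and the extra `^dm-\d+$` pattern are illustrative choices, not recommendations. Note that Helm merges maps (scrapers you omit keep their chart defaults) but replaces lists wholesale, so restate the default exclusions when you add to them.

```yaml
presets:
  hostMetrics:
    enabled: true
    collectionInterval: 60s    # coarser granularity, lower overhead
    scrapers:
      disk:
        exclude:
          devices:
            - ^ram\d+$
            - ^loop\d+$
            - ^dm-\d+$         # illustrative: also ignore device-mapper devices
```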
### 3. Container Log Collection

```yaml
presets:
  logsCollection:
    enabled: true
    startAt: beginning      # Options: beginning or end
    includeFilePath: true   # Adds file path as metadata
    includeFileName: false  # Adds file name as metadata
    # Exclusion configuration
    blacklist:
      enabled: true
      signozLogs: true  # Excludes SigNoz's own logs
      namespaces:
        - kube-system   # Add namespaces to exclude
      pods:
        - hotrod        # Add pod names to exclude
        - locust
      containers: []    # Add container names to exclude
    # Inclusion configuration (overrides blacklist if enabled)
    whitelist:
      enabled: false
      signozLogs: true
      namespaces: []    # Only collect logs from these namespaces
      pods: []          # Only collect logs from these pods
      containers: []    # Only collect logs from these containers
```
Best practices:
- Use blacklist for excluding noisy system pods
- Enable whitelist when you need to monitor specific applications only (see the sketch below)
- Consider disk usage implications when enabling file path/name inclusion
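As an illustration, a minimal whitelist-only setup might look like the following sketch; the `payments` and `checkout` namespaces are placeholders for your own application namespaces.

```yaml
presets:
  logsCollection:
    enabled: true
    whitelist:
      enabled: true       # takes precedence over the blacklist when enabled
      namespaces:
        - payments        # placeholder: your application namespace
        - checkout        # placeholder: your application namespace
      pods: []
      containers: []
```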
### 4. Kubernetes Metrics Collection

```yaml
presets:
  kubeletMetrics:
    enabled: true
    collectionInterval: 30s
    authType: serviceAccount  # Authentication method
    endpoint: ${env:K8S_HOST_IP}:10250
    insecureSkipVerify: true  # Set to false in production
    # Metrics configuration
    metrics:
      k8s.pod.cpu_limit_utilization:
        enabled: true
      k8s.pod.memory_limit_utilization:
        enabled: true
      # Enable other metrics as needed
```
Important settings:
- Set an appropriate `collectionInterval` based on cluster size (see the sketch below)
- Configure `insecureSkipVerify: false` in production environments
- Enable specific metrics based on monitoring requirements
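A production-leaning sketch might look like this; the 60s interval and the additional container-level metric are illustrative, and the exact set of available metric names depends on the kubeletstats receiver version bundled with your chart.

```yaml
presets:
  kubeletMetrics:
    enabled: true
    collectionInterval: 60s      # coarser interval for a larger cluster
    authType: serviceAccount
    endpoint: ${env:K8S_HOST_IP}:10250
    insecureSkipVerify: false    # verify the kubelet certificate in production
    metrics:
      k8s.pod.cpu_limit_utilization:
        enabled: true
      k8s.pod.memory_limit_utilization:
        enabled: true
      k8s.container.memory_limit_utilization:
        enabled: true            # illustrative optional metric
```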
### 5. Kubernetes Metadata Enrichment

```yaml
presets:
  kubernetesAttributes:
    enabled: true
    passthrough: false  # Set true to disable k8s API calls
    # Control which metadata to collect
    extractMetadatas:
      - k8s.namespace.name
      - k8s.deployment.name
      - k8s.pod.name
      - k8s.node.name
      # Add or remove metadata fields
    # Pod association configuration
    podAssociation:
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip
```
Configuration tips:
- Enable only required metadata fields to optimize performance (see the sketch below)
- Use `passthrough: true` in very large clusters to reduce API load
- Configure pod association based on your networking setup
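As an illustration, a trimmed configuration that keeps only commonly used fields could look like the following sketch; which fields you keep depends on how you query and group your telemetry.

```yaml
presets:
  kubernetesAttributes:
    enabled: true
    passthrough: false
    # Keep only the fields your dashboards and alerts actually use.
    extractMetadatas:
      - k8s.namespace.name
      - k8s.pod.name
      - k8s.node.name
    podAssociation:
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip
```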
### 6. Cluster-level Metrics

```yaml
presets:
  clusterMetrics:
    enabled: true
    collectionInterval: 30s
    # Node conditions to monitor
    nodeConditionsToReport:
      - Ready
      - MemoryPressure
      - DiskPressure
      # Add conditions based on monitoring needs
    # Resource types to monitor
    allocatableTypesToReport:
      - cpu
      - memory
      # - storage  # Uncomment if needed
```
When to use:
- Enable for cluster health monitoring
- Useful for capacity planning and resource optimization
- Essential for multi-node cluster monitoring
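For example, a sketch that reports two additional standard node conditions and allocatable storage might look like this; adjust both lists to your own monitoring needs.

```yaml
presets:
  clusterMetrics:
    enabled: true
    collectionInterval: 60s
    nodeConditionsToReport:
      - Ready
      - MemoryPressure
      - DiskPressure
      - PIDPressure           # standard Kubernetes node condition
      - NetworkUnavailable    # standard Kubernetes node condition
    allocatableTypesToReport:
      - cpu
      - memory
      - storage
```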
### 7. Kubernetes Events

```yaml
presets:
  k8sEvents:
    enabled: true
    authType: serviceAccount
    namespaces: []  # Empty for all namespaces, or specify a list
```
Usage scenarios:
- Enable for debugging and audit trails
- Monitor specific namespaces by listing them
- Useful for compliance and security monitoring
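For instance, restricting event collection to a few namespaces might look like the following sketch; `production` and `staging` are placeholder names.

```yaml
presets:
  k8sEvents:
    enabled: true
    authType: serviceAccount
    namespaces:
      - production   # placeholder namespace
      - staging      # placeholder namespace
```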
## Common Deployment Scenarios

### Production Cluster with Full Monitoring

```yaml
presets:
  otlpExporter:
    enabled: true
  hostMetrics:
    enabled: true
    collectionInterval: 30s
  logsCollection:
    enabled: true
    blacklist:
      enabled: true
      namespaces:
        - kube-system
  kubeletMetrics:
    enabled: true
    insecureSkipVerify: false
  kubernetesAttributes:
    enabled: true
  clusterMetrics:
    enabled: true
  k8sEvents:
    enabled: true
```
### Resource-Constrained Environment

```yaml
presets:
  otlpExporter:
    enabled: true
  hostMetrics:
    enabled: true
    collectionInterval: 60s
  logsCollection:
    enabled: true
    whitelist:
      enabled: true
      namespaces:
        - production
  kubeletMetrics:
    enabled: true
    collectionInterval: 60s
  kubernetesAttributes:
    enabled: true
    passthrough: true
```
### Development Environment

```yaml
presets:
  otlpExporter:
    enabled: true
  hostMetrics:
    enabled: true
    collectionInterval: 30s
  logsCollection:
    enabled: true
  kubeletMetrics:
    enabled: true
  kubernetesAttributes:
    enabled: true
```
## Resource Requirements
Recommended resource allocations based on preset combinations:
| Configuration | CPU Request | Memory Request | CPU Limit | Memory Limit |
|---|---|---|---|---|
| Minimal | 100m | 256Mi | 200m | 512Mi |
| Standard | 200m | 512Mi | 500m | 1Gi |
| Full | 500m | 1Gi | 1000m | 2Gi |
Adjust these values based on your cluster size and monitoring requirements.
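As a rough sketch, applying the "Standard" row to both collectors could look like the following, assuming your chart version exposes standard Kubernetes `resources` blocks under `otelAgent` and `otelDeployment` (verify the exact value paths against the chart's values.yaml before using them).

```yaml
otelAgent:
  resources:
    requests:
      cpu: 200m
      memory: 512Mi
    limits:
      cpu: 500m
      memory: 1Gi
otelDeployment:
  resources:
    requests:
      cpu: 200m
      memory: 512Mi
    limits:
      cpu: 500m
      memory: 1Gi
```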