Kubernetes Monitoring Configuration Guide
This guide explains how to set up monitoring for your Kubernetes cluster using two essential OpenTelemetry receivers:
- kubeletstats: Collects metrics from individual Kubernetes nodes
- k8scluster: Gathers cluster-wide metrics
Setting Up Kubelet Stats Monitoring
Overview
The Kubelet Stats Receiver collects four types of metrics:
- Node metrics (CPU, memory, etc.)
- Pod metrics
- Container metrics
- Volume metrics
These metrics are collected directly from the Kubelet API server running on each node.
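If you only need a subset of these groups, the receiver's metric_groups option (shown in full in the manual configuration below) lets you choose which ones to collect. A minimal sketch, assuming you only want node and pod metrics:

receivers:
  kubeletstats:
    # Collect only node- and pod-level metrics; the container and volume groups are skipped
    metric_groups:
      - node
      - pod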
Quick Start with k8s-infra Helm Chart
The simplest way to get started is using the k8s-infra Helm chart, which automatically:
- Deploys agents to all cluster nodes
- Configures secure API access with appropriate RBAC permissions
- Enables essential metrics by default
Basic configuration example:
kubeletMetrics:
  collectionInterval: 60s # Collect metrics every minute
  metrics:
    k8s.pod.cpu_request_utilization:
      enabled: true # Enable specific optional metrics
Advanced Metrics Configuration
The receiver supports many optional metrics that provide deeper insights into cluster efficiency. Key metrics include:
- k8s.node.cpu.usage: Tracks CPU usage on the node
- k8s.node.uptime: Tracks node uptime
- k8s.pod.cpu.usage: Tracks CPU usage on the pod
- k8s.pod.cpu_limit_utilization: Tracks pod CPU usage relative to configured limits
- k8s.pod.cpu_request_utilization: Measures actual pod CPU usage against requested resources
- k8s.pod.memory_limit_utilization: Tracks pod memory usage relative to configured limits
- k8s.pod.memory_request_utilization: Monitors pod memory usage versus requests
- k8s.pod.uptime: Tracks pod stability
- container.cpu.usage: Tracks CPU usage on the container
- k8s.container.cpu_limit_utilization: Tracks container CPU usage relative to configured limits
- k8s.container.cpu_request_utilization: Measures actual container CPU usage against requested resources
- k8s.container.memory_limit_utilization: Tracks container memory usage relative to configured limits
- k8s.container.memory_request_utilization: Monitors container memory usage versus requests
- container.uptime: Tracks container uptime
These are enabled by default in the k8s-infra Helm chart. For a complete list of available optional metrics, see the OpenTelemetry documentation.
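If you are running an older chart version where some of these are still off (see Troubleshooting below), they can be enabled individually through the same kubeletMetrics.metrics values shown in the quick-start example. A sketch:

kubeletMetrics:
  metrics:
    # Opt in to pod limit-utilization metrics individually
    k8s.pod.cpu_limit_utilization:
      enabled: true
    k8s.pod.memory_limit_utilization:
      enabled: true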
Manual Configuration Guide
If you're not using the k8s-infra Helm chart, follow these steps for manual setup:
1. Configure Pod Access to Kubelet API
Add the node name to your pod specification:
env:
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
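The configuration in the next step uses auth_type: serviceAccount, so the collector's ServiceAccount must also be allowed to read stats from the kubelet. A minimal ClusterRole sketch, based on the upstream kubeletstats receiver documentation (the name is illustrative, exact rules may vary by receiver version; bind it to the collector's ServiceAccount with a ClusterRoleBinding):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector-kubeletstats # illustrative name
rules:
  # Read stats from the kubelet's stats API
  - apiGroups: [""]
    resources: ["nodes/stats"]
    verbs: ["get"]
  # Needed when using extra_metadata_labels or the request/limit utilization metrics
  - apiGroups: [""]
    resources: ["nodes/proxy"]
    verbs: ["get"]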
2. Configure the Kubelet Stats Receiver
Add this configuration to your collector:
receivers:
  kubeletstats:
    auth_type: serviceAccount
    collection_interval: 30s
    endpoint: ${env:K8S_NODE_NAME}:10250
    extra_metadata_labels:
      - container.id
      - k8s.volume.type
    insecure_skip_verify: true
    metric_groups:
      - container
      - pod
      - node
      - volume
    metrics:
      container.cpu.usage:
        enabled: true
      container.uptime:
        enabled: true
      k8s.container.cpu_limit_utilization:
        enabled: true
      k8s.container.cpu_request_utilization:
        enabled: true
      k8s.container.memory_limit_utilization:
        enabled: true
      k8s.container.memory_request_utilization:
        enabled: true
      k8s.node.cpu.usage:
        enabled: true
      k8s.node.uptime:
        enabled: true
      k8s.pod.cpu.usage:
        enabled: true
      k8s.pod.cpu_limit_utilization:
        enabled: true
      k8s.pod.cpu_request_utilization:
        enabled: true
      k8s.pod.memory_limit_utilization:
        enabled: true
      k8s.pod.memory_request_utilization:
        enabled: true
      k8s.pod.uptime:
        enabled: true
    node: ${env:K8S_NODE_NAME}
3. Enable Kubernetes Metadata
Configure the k8sattributes processor to enrich your metrics with Kubernetes context:
processors:
  k8sattributes:
    extract:
      metadata:
        - k8s.namespace.name
        - k8s.deployment.name
        - k8s.statefulset.name
        - k8s.daemonset.name
        - k8s.cronjob.name
        - k8s.job.name
        - k8s.node.name
        - k8s.node.uid
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.pod.start_time
        - k8s.container.name
    filter:
      node_from_env_var: K8S_NODE_NAME
    passthrough: false
    pod_association:
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip
      - sources:
          - from: resource_attribute
            name: k8s.pod.uid
      - sources:
          - from: connection
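The k8sattributes processor looks this metadata up from the Kubernetes API, so the collector's ServiceAccount needs read access to pods, namespaces, and (to resolve deployment names) replicasets. A sketch of the required ClusterRole, assuming a dedicated ServiceAccount (the name is illustrative):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector-k8sattributes # illustrative name
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces"]
    verbs: ["get", "watch", "list"]
  # Needed to resolve k8s.deployment.name via pod owner references
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["get", "watch", "list"]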
4. Pipeline Configuration
Ensure your pipeline includes both the kubeletstats receiver and k8sattributes processor:
service:
  pipelines:
    metrics:
      receivers: [kubeletstats]
      processors: [k8sattributes]
      exporters: [otlp] # Add your exporters here
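The otlp exporter above is a placeholder; point it at whatever backend you use. A hedged sketch (the endpoint is hypothetical):

exporters:
  otlp:
    endpoint: my-backend.example.com:4317 # hypothetical endpoint
    tls:
      insecure: true # illustration only; enable TLS verification in production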
Read more about the kubeletstats receiver and k8sattributes processor.
Setting Up K8s Cluster Monitoring
Overview
The k8scluster receiver provides cluster-wide metrics by collecting data directly from the Kubernetes API server. Unlike the kubeletstats receiver which runs on each node, you only need a single instance of this receiver per cluster.
Quick Start with k8s-infra Helm Chart
The k8s-infra Helm chart provides a production-ready configuration by:
- Deploying a single, properly configured k8scluster receiver
- Setting up secure API access with appropriate RBAC permissions
- Enabling commonly needed metrics by default
Basic configuration example:
clusterMetrics:
  collectionInterval: 60s # Collect metrics every minute
  metrics:
    k8s.node.condition:
      enabled: true # Enable condition monitoring
Core Monitoring Features
1. Node Conditions
Monitor the health and status of your nodes with these key conditions:
- Ready: Node's ability to accept pods (enabled by default)
- MemoryPressure: Memory resource constraints
- DiskPressure: Storage resource constraints
- PIDPressure: Process ID availability
- NetworkUnavailable: Network configuration status
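Only Ready is reported by default. When configuring the receiver directly, the other conditions are opted into with node_conditions_to_report, the same option used in the manual configuration below:

k8scluster:
  # Report all five conditions instead of just Ready
  node_conditions_to_report: [Ready, MemoryPressure, DiskPressure, PIDPressure, NetworkUnavailable]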
2. Resource Allocation Tracking
Monitor resource allocation across your cluster:
- cpu: CPU resource allocation (enabled by default)
- memory: Memory resource allocation (enabled by default)
- ephemeral-storage: Temporary storage allocation
- storage: Persistent storage allocation
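Similarly, the set of allocatable resources to report is controlled by allocatable_types_to_report when you configure the receiver yourself:

k8scluster:
  # cpu and memory are reported by default; the storage types are opt-in
  allocatable_types_to_report: [cpu, memory, ephemeral-storage, storage]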
3. Optional Metrics
Key optional metrics for enhanced monitoring:
- k8s.node.condition: Detailed node health status
- k8s.pod.status_reason: Root cause analysis for pod states
For a comprehensive list of available metrics, conditions, and resources, refer to the k8scluster receiver documentation.
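With the k8s-infra chart, both can be enabled through the same clusterMetrics values shown in the quick-start example. A sketch, assuming the chart exposes k8s.pod.status_reason the same way it exposes k8s.node.condition:

clusterMetrics:
  metrics:
    k8s.node.condition:
      enabled: true
    k8s.pod.status_reason:
      enabled: true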
Troubleshooting
- I am using the k8s-infra Helm chart, but still not seeing cluster metrics
  Check the logs of the collector pod to see if there are any errors.
- I am using the latest version of the k8s-infra Helm chart, and there are no errors in the logs, but still not seeing cluster metrics
  Verify:
  - The collector has proper RBAC permissions
  - The node name is correctly configured
  - The Kubernetes API server is reachable from the collector
Manual Configuration Guide
If you're not using the k8s-infra Helm chart, follow this example:
1. Configure the K8s Cluster Receiver
receivers:
  k8scluster:
    collection_interval: 30s
    node_conditions_to_report: [Ready, DiskPressure, MemoryPressure, PIDPressure, NetworkUnavailable]
    allocatable_types_to_report: [cpu, memory, ephemeral-storage]
    distribution: kubernetes
    metrics:
      k8s.node.condition:
        enabled: true
      k8s.pod.status_reason:
        enabled: true
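This receiver reads objects cluster-wide from the API server, so its ServiceAccount needs broad read-only access. An abridged ClusterRole sketch (the name is illustrative; the full rule set, including events and autoscaling resources, is in the upstream receiver documentation):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector-k8scluster # illustrative name
rules:
  - apiGroups: [""]
    resources: ["nodes", "namespaces", "pods", "replicationcontrollers", "resourcequotas", "services"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["daemonsets", "deployments", "replicasets", "statefulsets"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["batch"]
    resources: ["jobs", "cronjobs"]
    verbs: ["get", "list", "watch"]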
2. Enable Kubernetes Metadata
Configure the k8sattributes processor to add Kubernetes context:
processors:
  k8sattributes:
    extract:
      metadata:
        - k8s.namespace.name
        - k8s.deployment.name
        - k8s.statefulset.name
        - k8s.daemonset.name
        - k8s.cronjob.name
        - k8s.job.name
        - k8s.node.name
        - k8s.node.uid
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.pod.start_time
        - k8s.container.name
    passthrough: false
    pod_association:
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip
      - sources:
          - from: resource_attribute
            name: k8s.pod.uid
      - sources:
          - from: connection
3. Pipeline Configuration
Set up the metrics pipeline:
service:
  pipelines:
    metrics:
      receivers: [k8scluster]
      processors: [k8sattributes]
      exporters: [otlp] # Add your exporters here
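Putting the three pieces together, a minimal single-file configuration for the cluster-level collector might look like this (the exporter endpoint is hypothetical):

receivers:
  k8scluster:
    collection_interval: 30s
processors:
  k8sattributes:
    passthrough: false
exporters:
  otlp:
    endpoint: my-backend.example.com:4317 # hypothetical endpoint
service:
  pipelines:
    metrics:
      receivers: [k8scluster]
      processors: [k8sattributes]
      exporters: [otlp]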
Best Practices
- Start with the k8s-infra Helm chart if possible; it provides a tested, sensible default configuration
- Enable optional metrics for deeper visibility into your cluster
- Use the k8sattributes processor to add context to your metrics
Troubleshooting
- I am using the k8s-infra Helm chart, but still not seeing cpu/memory requests/limits
  Check the version of the k8s-infra Helm chart you are using. The optional metrics are enabled by default in the latest version of the chart; prior to version 0.11.10, they were not enabled by default.
- I am using the latest version of the k8s-infra Helm chart, but still not seeing cpu/memory requests/limits
  Verify:
  - The collector has proper RBAC permissions
  - The node name is correctly configured
  - The kubelet API endpoint is accessible
- I am not seeing available/desired counts for deployments, statefulsets, and daemonsets, or succeeded/failed counts for cronjobs and jobs, in the metrics
  Verify:
  - The otel-deployment collector is running
  - The otel-deployment collector logs show no errors
- Metric metadata is missing
  By default, the kubeletstats and k8scluster receivers do not attach rich Kubernetes metadata to the metrics they emit. To get it, enable the k8sattributes processor.