Distributed ClickHouse with Kubernetes
Running ClickHouse in a distributed mode on Kubernetes allows you to scale horizontally and achieve high availability for your data workloads. By leveraging Kubernetes orchestration, you can easily manage multiple ClickHouse nodes, shards, and replicas, as well as integrate with Zookeeper for cluster coordination. This guide walks you through configuring a distributed ClickHouse cluster using Helm charts, with recommended settings for production environments.
To set up ClickHouse cluster of 2 shards with 2 replicas each and 3 nodes Zookeeper cluster, include the following in override-values.yaml
:
clickhouse:
layout:
shardsCount: 2
replicasCount: 2
zookeeper:
replicaCount: 3
schemaMigrator:
enableReplication: true
In case of single replica in distributed ClickHouse cluster, you can use replicasCount: 1
and disable replication by either removing enableReplication
or setting enableReplication: false
in schemaMigrator
.
Followed by helm upgrade
command:
helm --namespace platform upgrade my-release signoz/signoz -f override-values.yaml
To spread ClickHouse instances across multiple nodes in desired order, update clickhouse.podDistribution
in values.yaml
.
Examples:
- All instances in unique nodes:
clickhouse: podDistribution: - type: ClickHouseAntiAffinity topologyKey: kubernetes.io/hostname
- Distribute shards of replicas across nodes:
clickhouse: podDistribution: - type: ReplicaAntiAffinity topologyKey: kubernetes.io/hostname
- Distribute replicas of shards across nodes:
clickhouse: podDistribution: - type: ShardAntiAffinity topologyKey: kubernetes.io/hostname
For detailed instructions on the Pod Distribution, see here.
Replace my-release
and platform
from above with appropriate release name and SigNoz namespace respectively.