
How to Monitor PostgreSQL metrics with OpenTelemetry

Deepam Kapur · 18 min read

PostgreSQL metrics monitoring is important to ensure that PostgreSQL is performing as expected and to identify and resolve problems quickly. In this tutorial, you will install OpenTelemetry Collector to collect PostgreSQL metrics and then send the collected data to SigNoz for monitoring and visualization.


If you want to jump straight into implementation, start with the prerequisites section.

A Brief Overview of PostgreSQL

PostgreSQL is a sophisticated open-source relational database management system catering to everyone from coding rookies to seasoned tech experts. Often hailed as the "world's most advanced open-source relational database," it is a robust and versatile system that serves a broad spectrum of developer communities.

Renowned for its extensibility, it supports a myriad of data types and offers advanced indexing, making it adaptable to diverse applications. One of its standout features is ACID compliance, ensuring transactions are reliably processed. With a focus on extensibility, it allows users to define their data types, operators, and functions.

PostgreSQL has the tools to make managing and retrieving information a breeze. Whether you're a novice or an expert in the field, PostgreSQL stands as a reliable and feature-rich database solution.

A Brief Overview of OpenTelemetry

OpenTelemetry is a set of APIs, SDKs, libraries, and integrations aiming to standardize the generation, collection, and management of telemetry data (logs, metrics, and traces). It is backed by the Cloud Native Computing Foundation and is the leading open-source project in the observability domain.

The data you collect with OpenTelemetry is vendor-agnostic and can be exported in many formats. Telemetry data has become critical in observing the state of distributed systems. With microservices and polyglot architectures, there was a need to have a global standard. OpenTelemetry aims to fill that space and is doing a great job at it thus far.

In this tutorial, you will use OpenTelemetry Collector to collect PostgreSQL metrics for performance monitoring.

What is OpenTelemetry Collector?

OpenTelemetry Collector is a stand-alone service provided by OpenTelemetry. It can be used as a telemetry-processing system with a lot of flexible configurations to collect and manage telemetry data.

It can understand different data formats and send them to different backends, making it a versatile tool for building observability solutions.

Read our complete guide on OpenTelemetry Collector

How does OpenTelemetry Collector collect data?

A receiver is how data gets into the OpenTelemetry Collector. Receivers are configured via YAML under the top-level receivers tag. There must be at least one enabled receiver for a configuration to be considered valid.

Here’s an example of an otlp receiver:

receivers:
  otlp:
    protocols:
      grpc:
      http:

An OTLP receiver can receive data via gRPC or HTTP using the OTLP format. There are advanced configurations that you can enable via the YAML file.

Here’s a sample configuration for an otlp receiver.

receivers:
  otlp:
    protocols:
      http:
        endpoint: "localhost:4318"
        cors:
          allowed_origins:
            - http://test.com
            # Origins can have wildcards with *, use * by itself to match any origin.
            - https://*.example.com
          allowed_headers:
            - Example-Header
          max_age: 7200

You can find more details on advanced configurations here.

After configuring a receiver, you must enable it. Receivers are enabled via pipelines within the service section. A pipeline consists of a set of receivers, processors, and exporters.

The following is an example pipeline configuration:

service:
  pipelines:
    metrics:
      receivers: [otlp, prometheus]
      exporters: [otlp, prometheus]
    traces:
      receivers: [otlp, jaeger]
      processors: [batch]
      exporters: [otlp, zipkin]

Now that you understand how the OpenTelemetry Collector collects data, let’s see how you can collect PostgreSQL metrics with OpenTelemetry. You can find the list of metrics supported by OpenTelemetry here.

Prerequisites

This tutorial assumes that the OpenTelemetry Collector is installed on the same host as the PostgreSQL instance.

User for OpenTelemetry - The OpenTelemetry Collector for PostgreSQL needs to have permission for SELECT on pg_stat_database. You can create a dedicated user for OpenTelemetry and grant it the necessary permissions. To create a user & grant access for OpenTelemetry, you can use the following commands:

-- Create a new user with a password
CREATE USER your_username WITH PASSWORD 'your_password';

-- Grant SELECT privilege on pg_stat_database to the new user
GRANT SELECT ON pg_stat_database TO your_username;
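To confirm that the new user can actually read the statistics view, you can run a quick query as that user. Here is a sketch using the psql CLI; the host, port, database, and credentials are placeholders you should replace with your own:

psql "host=localhost port=5432 dbname=postgres user=your_username password=your_password" \
  -c "SELECT datname, numbackends, xact_commit FROM pg_stat_database LIMIT 5;"

If the query returns rows for your databases, the permissions are sufficient for the collector.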

If PostgreSQL is not on the same server as the OpenTelemetry Collector

danger

It is strongly advised not to open this port to the public. Open it only for specific IPs or keep it restricted to your private network.

This article assumes that your PostgreSQL server and your OpenTelemetry Collector run on the same machine. If that is not the case, you need an extra step: opening your PostgreSQL service port (5432 by default).

Open TCP port 5432 - The OpenTelemetry Collector for PostgreSQL communicates with the database over a TCP port, so you need to open it.

To open TCP port 5432 on a Linux server, you can use the following command:

sudo ufw allow 5432/tcp
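Since the warning above advises against exposing the port publicly, a safer pattern is to allow connections only from the host running the collector. Here is a sketch using ufw; the IP address is a placeholder you should replace with your collector host's address:

# Allow PostgreSQL connections only from the collector host (replace the example IP)
sudo ufw allow from 203.0.113.10 to any port 5432 proto tcp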

If you use a cloud firewall or security group that blocks inbound traffic to your VMs, you will need to allow port 5432 there as well.

Once you have completed these preparations, you can install and run the OpenTelemetry Collector for PostgreSQL.

Setting up SigNoz

You need a backend to which you can send the collected data for monitoring and visualization. SigNoz is an OpenTelemetry-native APM that is well-suited for visualizing OpenTelemetry data.

SigNoz cloud is the easiest way to run SigNoz. You can sign up here for a free account and get 30 days of unlimited access to all features.

You can also install and self-host SigNoz yourself. Check out the docs for installing self-hosted SigNoz.

Setting Up OpenTelemetry Collector

The OpenTelemetry Collector offers various deployment options to suit different environments and preferences. It can be deployed using Docker, Kubernetes, Nomad, or directly on Linux systems. You can find all the installation options here.

We are going to discuss the manual installation here and resolve any hiccups that come up along the way.

Step 1 - Downloading OpenTelemetry Collector

Download the appropriate binary package for your Linux or macOS distribution from the OpenTelemetry Collector releases page. We are using the latest version available at the time of writing this tutorial.

For macOS (arm64):

curl --proto '=https' --tlsv1.2 -fOL https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.88.0/otelcol-contrib_0.88.0_darwin_arm64.tar.gz
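If you are on a Linux (amd64) machine instead, the equivalent asset from the same v0.88.0 release should be the one below; verify the exact filename for your architecture on the releases page:

curl --proto '=https' --tlsv1.2 -fOL https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.88.0/otelcol-contrib_0.88.0_linux_amd64.tar.gz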

Step 2 - Extracting the package

Create a new directory named otelcol-contrib and then extract the contents of the otelcol-contrib_0.88.0_darwin_arm64.tar.gz archive into this newly created directory with the following command:

mkdir otelcol-contrib && tar xvzf otelcol-contrib_0.88.0_darwin_arm64.tar.gz -C otelcol-contrib
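To confirm the binary extracted correctly, you can optionally print its version; this assumes the extraction above succeeded and you are still in the parent directory:

./otelcol-contrib/otelcol-contrib --version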

Step 3 - Setting up the Configuration file

Create a config.yaml file in the otelcol-contrib folder. This configuration file will enable the collector to connect with PostgreSQL and holds other settings, such as the frequency at which you want to monitor the instance.

The config file also enables a hostmetrics receiver, which lets you collect host-level metrics (CPU, memory, disk, and network) from the machine running PostgreSQL.

Go into the directory:

cd otelcol-contrib

Create config.yaml in the otelcol-contrib folder with the content below.

receivers:
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu: {}
      disk: {}
      load: {}
      filesystem: {}
      memory: {}
      network: {}
      paging: {}
      process:
        mute_process_name_error: true
        mute_process_exe_error: true
        mute_process_io_error: true
      processes: {}
  postgresql:
    endpoint: <POSTGRESQL_URL>:<POSTGRESQL_PORT>
    username: <POSTGRESQL_USERNAME>
    password: <POSTGRESQL_PASSWORD>
  postgresql/all:
    endpoint: <POSTGRESQL_URL>:<POSTGRESQL_PORT>
    transport: tcp
    username: <POSTGRESQL_USERNAME>
    password: <POSTGRESQL_PASSWORD>
    databases:
      - <YOUR DATABASE 1>
      - <YOUR DATABASE 2>
      # ... All databases you want to monitor
    collection_interval: 10s
    tls:
      insecure: true
      insecure_skip_verify: true
      # If your postgresql is connected by ssl you can use following settings
      # insecure: false
      # insecure_skip_verify: false
      # ca_file: /home/otel/authorities.crt
      # cert_file: /home/otel/mypostgrescert.crt
      # key_file: /home/otel/mypostgreskey.key
processors:
  batch:
    send_batch_size: 1000
    timeout: 10s
  # Ref: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/resourcedetectionprocessor/README.md
  resourcedetection:
    detectors: [env, system, ec2] # include ec2 for AWS, gcp for GCP and azure for Azure.
    # Using OTEL_RESOURCE_ATTRIBUTES envvar, env detector adds custom labels.
    timeout: 2s
    override: false
    system:
      hostname_sources: [os] # alternatively, use [dns,os] for setting FQDN as host.name and os as fallback
exporters:
  otlp:
    endpoint: "ingest.{region}.signoz.cloud:443" # replace {region} with your region
    tls:
      insecure: false
    headers:
      "signoz-access-token": <SIGNOZ_ACCESS_TOKEN>
  logging:
    verbosity: detailed
service:
  telemetry:
    metrics:
      address: localhost:8888
  pipelines:
    metrics:
      receivers: [postgresql, postgresql/all]
      processors: [batch]
      exporters: [otlp]
    metrics/hostmetrics:
      receivers: [hostmetrics]
      processors: [resourcedetection, batch]
      exporters: [otlp]

You would need to replace the following details for the config to work properly:

  • The endpoint for your PostgreSQL instance: replace <POSTGRESQL_URL> and <POSTGRESQL_PORT> with your host and port.
  • The credentials of the monitoring user created in the prerequisites section: replace <POSTGRESQL_USERNAME> and <POSTGRESQL_PASSWORD>.
  • All the database names you want to monitor, in place of <YOUR DATABASE 1>, <YOUR DATABASE 2>.
  • Under exporters, the endpoint for SigNoz cloud along with the ingestion key under signoz-access-token. You can find these settings in the SigNoz dashboard.
You can find ingestion details in the SigNoz dashboard.

Also note how the pipelines are set up in the service section of the config: postgresql and postgresql/all are listed as receivers for the metrics pipeline, and the exporter is set to otlp.
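Before starting the collector, you can optionally sanity-check the file. Recent collector releases include a validate subcommand; assuming you are inside the otelcol-contrib directory, a quick check would be:

./otelcol-contrib validate --config ./config.yaml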

Step 4 - Running the collector service

Every Collector release includes an otelcol executable that you can run. Since we’re done with configurations, we can now run the collector service with the following command.

From the otelcol-contrib directory, run the following command:

./otelcol-contrib --config ./config.yaml

If you want to run it in the background:

./otelcol-contrib --config ./config.yaml &> otelcol-output.log & echo "$!" > otel-pid

Step 5 - Debugging the output

If you want to see the log output of the background process we just set up, you can view it with:

tail -f -n 50 otelcol-output.log

tail -f -n 50 prints the last 50 lines of otelcol-output.log and keeps following the file for new output.

You can check whether the service is running successfully by looking for a final log line similar to:

2023-11-17T19:33:20.161Z  info  service@v0.88.0/service.go:169  Everything is ready. Begin running and processing data.
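Because the config also exposes the collector's own telemetry on localhost:8888, another way to confirm that PostgreSQL data is being scraped is to query that endpoint and look for receiver counters; a quick sketch (the grep pattern is just a convenience):

curl -s http://localhost:8888/metrics | grep -i postgresql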

You can stop the collector service with the following command:

kill "$(< otel-pid)"

Monitoring with SigNoz Dashboard

Once the above setup is done, you will be able to access the metrics in the SigNoz dashboard. You can go to the Dashboards tab and try adding a new panel. You can learn how to create dashboards in SigNoz here.

PostgreSQL Metrics

Signoz Panel PostgreSQL Metrics to add on Dashboard

If you want to get started quickly with PostgreSQL monitoring, you can load this JSON in SigNoz dashboard and get started.

PostgreSQL Panel

Signoz Panel PostgreSQL Metrics

PostgreSQL basic metrics dashboard

Signoz PostgreSQL Dashboard

You can also create alerts on any metric. Learn how to create alerts here.

Metrics & Attributes for PostgreSQL supported by OpenTelemetry

The following metrics and resource attributes for PostgreSQL can be collected by the OpenTelemetry Collector.

Metrics

These metrics are enabled by default. The collector provides many metrics that you can use to monitor how your PostgreSQL server is performing and whether something is wrong.

Key Terms for Metrics & Attributes

Name: The name of the metric is a unique identifier that distinguishes it from other metrics. It helps in referencing and organizing various metrics on SigNoz as well.

Type: The type of metric defines the kind of instrument it represents, such as SUM or GAUGE. The value type (INT or DOUBLE) describes the numeric representation of the reported values.

Unit: The unit specifies the measurement unit associated with the metric. It helps in interpreting and comparing metric values, including Bytes, NONE, etc.

Temporality: It describes how a metric's value relates to the time window it covers. A Cumulative metric reports a running total since a fixed start time, whereas a Delta metric reports only the change since the last measurement. Temporality is crucial for analyzing trends and interpreting counter values correctly.

Monotonicity: It refers to the cumulative nature of a metric, indicating that its values never decrease over time. This property is crucial for tracking cumulative values, such as the total count of events or occurrences.

| Metric | Description | Name | Type | Value Type | Unit | Temporality | Monotonic |
|---|---|---|---|---|---|---|---|
| Backends | The number of backends | postgresql.backends | SUM | INT | NONE | Cumulative | false |
| Maximum Connections | Configured maximum number of client connections allowed | postgresql.connection.max | GAUGE | INT | NONE | N/A | N/A |
| Database Counts | Number of user databases | postgresql.database.count | SUM | INT | NONE | Cumulative | false |
| Database Size | The database disk usage | postgresql.db_size | SUM | INT | Bytes | Cumulative | false |
| Index Scans | The number of index scans on a table | postgresql.index.scans | SUM | INT | NONE | Cumulative | true |
| Index Size | The size of the index on disk | postgresql.index.size | GAUGE | INT | Bytes | N/A | N/A |
| Database Operations | The number of db row operations | postgresql.operations | SUM | INT | NONE | Cumulative | true |
| Delay in Replication | The amount of data delayed in replication | postgresql.replication.data_delay | GAUGE | INT | Bytes | N/A | N/A |
| Rollbacks | The number of rollbacks | postgresql.rollbacks | SUM | INT | NONE | Cumulative | true |
| Total Rows | The number of rows in the database | postgresql.rows | SUM | INT | NONE | Cumulative | false |
| Total Tables | Number of user tables in a database | postgresql.table.count | SUM | INT | NONE | Cumulative | false |
| Table Sizes | Disk space used by a table | postgresql.table.size | SUM | INT | Bytes | Cumulative | false |
| Table Vacuum | Number of times a table has manually been vacuumed | postgresql.table.vacuum.count | SUM | INT | NONE | Cumulative | true |
| WAL File Age | Age of the oldest WAL file. This metric requires WAL to be enabled with at least one replica. | postgresql.wal.age | GAUGE | INT | seconds | N/A | N/A |
| WAL File Lag | Time between flushing recent WAL locally and receiving notification that the standby server has completed an operation with it. This metric requires WAL to be enabled with at least one replica. | postgresql.wal.lag | GAUGE | INT | seconds | N/A | N/A |
| Background Writer Buffer Allocated | Number of buffers allocated | postgresql.bgwriter.buffers.allocated | SUM | INT | NONE | Cumulative | true |
| Background Writer Buffer Writes | Number of buffers written | postgresql.bgwriter.buffers.writes | SUM | INT | NONE | Cumulative | true |
| Background Writer Checkpoint Count | The number of checkpoints performed | postgresql.bgwriter.checkpoint.count | SUM | INT | NONE | Cumulative | true |
| Background Writer Duration | Total time spent writing and syncing files to disk by checkpoints | postgresql.bgwriter.duration | SUM | DOUBLE | Milliseconds | Cumulative | true |
| Background Writer Maximum Written | Number of times the background writer stopped a cleaning scan because it had written too many buffers | postgresql.bgwriter.maxwritten | SUM | INT | NONE | Cumulative | true |
| Blocks Read | The number of blocks read | postgresql.blocks_read | SUM | INT | NONE | Cumulative | true |
| Commits | The number of commits | postgresql.commits | SUM | INT | NONE | Cumulative | true |

Optional Metrics

These metrics are not emitted by default. Each of them can be enabled by applying the following configuration:

metrics:
  <metric_name>:
    enabled: true
| Metric | Description | Name | Type | Value Type | Unit | Temporality | Monotonic |
|---|---|---|---|---|---|---|---|
| Database Locks | The number of database locks | postgresql.database.locks | GAUGE | INT | NONE | N/A | N/A |
| Deadlocks | The number of deadlocks | postgresql.deadlocks | SUM | INT | NONE | Cumulative | true |
| Sequential Scans | The number of sequential scans | postgresql.sequential_scans | SUM | INT | NONE | Cumulative | true |
| Temp Files | The number of temp files | postgresql.temp_files | SUM | INT | NONE | Cumulative | true |
| WAL Delay | The time between flushing recent WAL locally and receiving notification that the standby server has completed an operation with it. This metric requires WAL to be enabled with at least one replica. | postgresql.wal.delay | GAUGE | DOUBLE | seconds | N/A | N/A |
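For example, to enable the deadlocks metric on the postgresql receiver from the earlier config, the metrics block sits under that receiver; a sketch using the same placeholders as before:

receivers:
  postgresql:
    endpoint: <POSTGRESQL_URL>:<POSTGRESQL_PORT>
    username: <POSTGRESQL_USERNAME>
    password: <POSTGRESQL_PASSWORD>
    metrics:
      postgresql.deadlocks:
        enabled: true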

Attributes

| Metric | Attribute | Description | Value Type | Values |
|---|---|---|---|---|
| Background Buffer Writes | source | The source of a buffer write | STRING | backend, backend_fsync, checkpoints, bgwriter |
| Background Checkpoint Count | type | The type of checkpoint state | STRING | requested, scheduled |
| Background Writer Duration | type | The type of time spent during the checkpoint | STRING | sync, write |
| Blocks Read | source | The block read source type | STRING | heap_read, heap_hit, idx_read, idx_hit, toast_read, toast_hit, tidx_read, tidx_hit |
| Database Operations | operation | The database operation | STRING | ins, upd, del, hot_upd |
| Delay in Replication | replication_client | The IP address of the client connected to this backend. If this field is "unix", it indicates that the client is connected via a Unix socket | STRING | ANY |
| Total Rows | state | The tuple (row) state | STRING | dead, live |
| WAL File Lag | operation | The operation which is responsible for the lag | STRING | flush, replay, write |
| WAL File Lag | replication_client | The IP address of the client connected to this backend. If this field is "unix", it indicates that the client is connected via a Unix socket | STRING | ANY |
| Database Locks | relation | OID of the relation targeted by the lock, or null if the target is not a relation or part of a relation | STRING | ANY |
| Database Locks | mode | Name of the lock mode held or desired by the process | STRING | ANY |
| Database Locks | lock_type | Type of the lockable object | STRING | ANY |
| WAL Delay | operation | The operation which is responsible for the lag | STRING | flush, replay, write |
| WAL Delay | replication_client | The IP address of the client connected to this backend. If this field is "unix", it indicates that the client is connected via a Unix socket | STRING | ANY |

Resource Attributes

These attributes will be enabled by default:

| Attribute | Description | Value Type |
|---|---|---|
| postgresql.database.name | The name of the database | STRING |
| postgresql.index.name | The name of the index on a table | STRING |
| postgresql.table.name | The schema name followed by the table name | STRING |

These are the metrics supported at the time of writing. Check the OpenTelemetry Collector Contrib repo for any updates.

Conclusion

In this tutorial, you installed an OpenTelemetry Collector to collect PostgreSQL metrics and sent the collected data to SigNoz for monitoring and alerts.

Visit our complete guide on OpenTelemetry Collector to learn more about it. OpenTelemetry is quietly becoming the world standard for open-source observability, and by using it, you can have advantages like a single standard for all telemetry signals, no vendor lock-in, etc.

SigNoz is an open-source OpenTelemetry-native APM that can be used as a single backend for all your observability needs.


Further Reading

Complete Guide on OpenTelemetry Collector

An OpenTelemetry-native APM