SigNoz Cloud - This page is relevant for SigNoz Cloud editions.
Self-Host - This page is relevant for self-hosted SigNoz editions.

SLURM metrics

This document explains how to monitor a SLURM cluster using SigNoz. You'll run a Prometheus-compatible SLURM exporter and scrape it with the OpenTelemetry Collector.

Prerequisites

  • SLURM cluster running and accessible
  • Access to SLURM CLI commands (sinfo, squeue, sdiag) on the exporter host

Setup

Step 1: Run a SLURM Prometheus Exporter

A commonly used option is prometheus-slurm-exporter, which extracts metrics from SLURM CLI commands and exposes them on a /metrics endpoint (default port :8080).

Run the exporter on a node that has access to SLURM commands.
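
Before configuring the collector, it can help to confirm that the exporter is actually serving metrics. The following is a minimal sketch, not part of the exporter itself: the helper names `fetch_metrics` and `parse_metric_names` are hypothetical, and the URL you pass in depends on where you ran the exporter.

```python
import urllib.request


def fetch_metrics(url, timeout=10):
    """Download the raw Prometheus text output from the exporter."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_metric_names(text):
    """Return the sorted, de-duplicated metric names found in the text output."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like `name{label="v"} 1.0` or `name 1.0`.
        names.add(line.split("{", 1)[0].split()[0])
    return sorted(names)
```

For example, on the exporter host, `parse_metric_names(fetch_metrics("http://localhost:8080/metrics"))` should return a non-empty list if the exporter is listening on its default port. The exact names depend on the exporter.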

Step 2: Set Up the OTel Collector

Refer to this documentation to set up the collector.

Step 3: Configure the Prometheus Receiver

Add a scrape job for the SLURM exporter in your OTel Collector config:

config.yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: "slurm-exporter"
          scrape_interval: 30s
          scrape_timeout: 30s
          static_configs:
            - targets: ["<slurm-exporter-host>:8080"]

Configuration parameters:

  • <slurm-exporter-host>: Hostname or IP of the node running the SLURM exporter
  • scrape_interval/scrape_timeout: 30s is recommended to avoid overloading the SLURM controller (slurmctld), because the exporter gathers its metrics by running SLURM CLI commands
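
If you tune these values, note that Prometheus-style scrape configs reject a scrape_timeout that is larger than the scrape_interval, so keep the timeout at or below the interval. A minimal sketch of that check follows. It assumes simple single-unit durations such as 30s or 1m, the helper names are hypothetical, and it is not how the collector itself parses durations.

```python
import re

# Seconds per supported unit (a simplification of the Prometheus duration format).
_UNITS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600}


def to_seconds(duration):
    """Convert a single-unit duration string such as '30s' or '1m' to seconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h)", duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    value, unit = match.groups()
    return int(value) * _UNITS[unit]


def timeout_fits_interval(scrape_interval, scrape_timeout):
    """True when the timeout does not exceed the interval."""
    return to_seconds(scrape_timeout) <= to_seconds(scrape_interval)
```

With the values used above, `timeout_fits_interval("30s", "30s")` is true, since an equal timeout and interval is allowed.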

Step 4: Enable the Pipeline

Add the receiver to your metrics pipeline:

config.yaml
service:
  pipelines:
    metrics:
      receivers: [prometheus]  # append prometheus to your existing receivers list
      processors: [batch]
      exporters: [otlp]

Visualizing SLURM Metrics

Once configured, verify ingestion in the Metrics Explorer. Search for SLURM-related metrics (exact names depend on the exporter).

You can use the pre-configured SLURM dashboard to monitor your cluster. To import its JSON, go to:

Dashboards → + New dashboard → Import JSON

Troubleshooting

Common Issues

  1. No metrics appearing in SigNoz

    • Verify that the SLURM exporter is running and that its /metrics endpoint is reachable from the collector host
    • Ensure the firewall allows access to the exporter port (8080 by default)

  2. Metrics showing stale or zero values

    • Confirm that the exporter host can run the SLURM CLI commands (sinfo, squeue, sdiag)
    • Check that the SLURM services are running correctly
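
The checks above can be partly scripted. The following is a rough sketch under stated assumptions: the helper names `missing_commands` and `port_is_open` are hypothetical, and finding a command on PATH only shows that it is installed, not that it can reach the SLURM controller.

```python
import shutil
import socket

# CLI commands the exporter host is expected to provide (from the prerequisites).
SLURM_COMMANDS = ("sinfo", "squeue", "sdiag")


def missing_commands(commands=SLURM_COMMANDS, which=shutil.which):
    """Return the commands that cannot be found on PATH."""
    return [cmd for cmd in commands if which(cmd) is None]


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run `missing_commands()` on the exporter host. An empty list means all three commands were found. Run `port_is_open("<slurm-exporter-host>", 8080)` from the collector host to check for firewall or network problems.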

Last updated: March 3, 2026
