Prometheus is a powerful monitoring tool, and PromQL (Prometheus Query Language) is its heart. However, one thing you may quickly realize is that PromQL doesn’t support traditional if-else statements. So, how do you implement conditional logic? In this guide, we’ll dive into the ways PromQL can simulate if-else behavior using its powerful set of operators and show you how to build dynamic queries to monitor your systems effectively.

Understanding PromQL and Its Importance in Prometheus Monitoring

PromQL is the query language used to retrieve and manipulate time-series data stored by Prometheus. Its primary function is to extract useful insights from metric data and drive alerting and monitoring in dynamic environments.

Key features of PromQL include:

  1. Time-series focus: Optimized for querying and manipulating time-stamped data.
  2. Dimensional data model: Uses labels to create multi-dimensional metrics.
  3. Functional approach: Treats data as vectors and applies functions to transform them.
  4. Built-in aggregations: Offers various methods to combine and summarize data.

Compared to SQL or InfluxQL, PromQL shines in its ability to handle high-cardinality data and perform complex time-based calculations efficiently. This makes it ideal for monitoring microservices, containerized environments, and large-scale distributed systems.

Common use cases for PromQL include:

  • Calculating error rates across services
  • Monitoring resource utilization trends
  • Setting up dynamic alerting thresholds
  • Creating detailed performance dashboards

The Concept of If-Else Like Expressions in PromQL

In traditional programming languages, you rely on if-else statements for decision-making. PromQL, however, doesn't have native if-else statements, which can be limiting when trying to implement conditional logic like:

  • Combining multiple logical operators can result in complex and hard-to-read queries.
  • Performance issues and increased error rates

However, PromQL offers operators and vector matching that help mimic if-else-like behavior. By using logical operators like and, or, and unless, you can effectively achieve conditional querying.

Why are if-else-like expressions needed in monitoring queries? Consider these scenarios:

  1. You want to alert on high CPU usage, but only during business hours.
  2. You need to calculate different SLO thresholds based on the service tier.
  3. You want to adjust error rate calculations based on traffic volume.

In each case, you need to apply different logic or calculations based on certain conditions — the essence of if-else logic.

Implementing If-Else Logic Using PromQL Operators

Let's dive into the core of implementing conditional logic. We'll rely on the logical operators available in PromQL to simulate the behavior of if-else statements.

Using the and Operator

The and operator is handy for filtering metrics based on conditions. For example, if you want to alert only when CPU usage is above 90% and the service is running, you can write:

up{job="prometheus"} == 1 and scrape_duration_seconds < 0.5

The query will return results only when both conditions are true (mimicking the if statement)

  • The Prometheus service is up (up == 1).
  • The scrape duration is less than 0.5 seconds.
Using the and Operator
Using the and Operator

Using the or Operator

The or operator provides a fallback option. For example, if you want to monitor the number of HTTP 500 errors or the service being unavailable, you can write:

http_requests_total{status="500"} or up{job="prometheus"} == 0

This returns results if either of the conditions is true, like an "else if" clause. It checks for HTTP requests with a status of 500 (indicating server errors) or checks if the Prometheus service is down (up == 0).

Using the or Operator
Using the or Operator

The query http_requests_total{status="500"} or up == 0 checks for the total number of HTTP requests with a 500 status code, or if any targets are down (up == 0), returning results if either condition is true.

Using the or Operator in PromQL query
Using the or Operator in PromQL query

Combining and and or for Complex Logic

By combining these operators, you can handle more sophisticated conditions. For example:

(up{job="python-job"} == 0 and scrape_duration_seconds{job="python-job"} < 0.5) or (up{job="prometheus"} == 0 and scrape_duration_seconds{job="prometheus"} < 0.5)
  • The query will return true if either of the following is true:
    • The python-job is down and its scrape duration was fast (less than 0.5 seconds), OR
    • The prometheus job is down and its scrape duration was fast (less than 0.5 seconds).
Combining and and or for Complex Logic
Combining and and or for Complex Logic

In summary, this query helps detect when either of the two jobs (python-job or prometheus) is down, while highlighting that their last scrape duration was quick, possibly indicating that the job went down right after a successful scrape.

Example: Conditional Alerting Based on Metric Values

A common use case for if-else logic in PromQL is setting up conditional alerts. Let’s create an alert that triggers if either the Prometheus service or the Python job is down:

up{job="prometheus"} == 1 or up{job="python-job"} == 1

The above query returns the value 2 if both the services are operational, so you can set the threshold as a value below 2 and trigger alerts. If any one of the services is non-operational, the count will go below 2.

Steps for setting up alerts in Grafana:

  • Add Prometheus as the data source in your Grafana server.
  • Go to Alerting → Alert Rules → New alert rule button.
  • Provide details like:
    • Alert rule name: Any meaningful name of your choice.

    • Select the data source

    • Add in the query based on the result of which alert needs to be triggered. Set the threshold for the same.

      Setting alert rule in Grafana
      Setting alert rule in Grafana
    • Click on Set as alert condition

  • Next, save the rule and exit.

Your alerts will get fired if the specified condition is met.

Firing alerts using Grafana
Firing alerts using Grafana

Handling Multiple Conditions

Multiple conditions can be handled using nested and and or operators:

(sum(prometheus_http_requests_total{job="python-job"}) > 1000 and up{job="prometheus"} == 1) or
(sum(prometheus_http_requests_total{job="python-job"}) <= 1000 and up{job="python-job"} == 1) or
(up{job="prometheus"} == 0) or
(up{job="python-job"} == 0)

This query evaluates multiple conditions related to the HTTP requests for the python-job and the status of the prometheus job:

  1. Condition 1: (sum(prometheus_http_requests_total{job="python-job"}) > 1000 and up{job="prometheus"} == 1)
    • This checks if the total HTTP requests for the python-job exceed 1000 and the prometheus job is up. If both conditions are true, you may want to take action or trigger an alert.
  2. Condition 2: (sum(prometheus_http_requests_total{job="python-job"}) <= 1000 and up{job="python-job"} == 1)
    • This checks if the total HTTP requests for the python-job are 1000 or fewer while ensuring that the python-job is operational. This may indicate that the service is underutilized.
  3. Condition 3: (up{job="prometheus"} == 0)
    • This condition checks if the prometheus job is down. If true, it may trigger alerts regardless of request counts.
  4. Condition 4: (up{job="python-job"} == 0)
    • This checks if the python-job is down, indicating an issue that should trigger an alert.

Dealing with Edge Cases and Potential Pitfalls

To handle edge cases, always consider:

  1. What happens if a metric is missing?
  2. How do you deal with label mismatches?
  3. Are your time ranges consistent across conditions?

Testing conditional queries thoroughly is crucial. Use Prometheus's expression browser or tools like Grafana to visualize your query results over different time ranges and scenarios.

Advanced Techniques for If-Else-Like Behavior in PromQL

While basic operators cover a lot, PromQL offers advanced techniques for more complex logic.

Using the unless Operator

The unless operator is like a "negative if." It returns data only if one condition is not true:

sum(prometheus_http_requests_total) unless up{job="prometheus"} == 0

This query returns the total number of HTTP requests unless the Prometheus service is down.

Using the unless Operator
Using the unless Operator

Ternary-Like Operations with Vector Matching

In a traditional ternary operator, you would write something like:

condition ? value_if_true : value_if_false

PromQL can handle ternary-like behavior using vector matching. For example, to get different results based on conditions, you can use a query like:

sum(prometheus_http_requests_total{job="prometheus"}) > 1000 or vector(0)
Ternary-like behavior using vector matching- Satisfying if condition
Ternary-like behavior using vector matching- Satisfying if condition
Ternary-like behavior using vector matching - Satisfying else condition
Ternary-like behavior using vector matching - Satisfying else condition

This allows you to simulate ternary logic with metrics, returning different values depending on the condition.

In this case, the expression behaves similarly:

  • If sum(prometheus_http_requests_total{job="prometheus"}) < 1000, it evaluates to true, resulting in a value of 1.
  • If sum(prometheus_http_requests_total{job="prometheus"}) >= 1000, it evaluates to false, resulting in a value of 0.

So, the PromQL expression acts like the following pseudo-ternary:

if sum(prometheus_http_requests_total{job="prometheus"}) < 1000 then 1
else 0

Subqueries for Time-Based Logic

Subqueries can simulate time-dependent if-else logic. For example, if you want to check a condition over a period of time:

avg_over_time(metric[5m]) > 100

This query checks if the average value of metric over the last 5 minutes exceeds 100, simulating a time-based condition.

Reusable "Functions" with Recording Rules

Recording rules allow you to precompute complex conditions and reuse them in queries, which is useful for simulating complex if-else logic across different queries. Check out the detailed documentation here.

Real-World Scenarios: Applying If-Else Logic in Prometheus Monitoring

Let's explore some practical applications of conditional logic in PromQL:

Holiday-Aware Alerting

In certain environments, you may want to avoid triggering alerts on specific holidays when systems are expected to operate under different conditions.

high_cpu_usage and on(day) (day_of_week != 6 and day_of_week != 0) 
unless on(date) holiday_calendar == 1

This query adjusts CPU usage alerts to account for weekends and holidays.

Explanation:

  • high_cpu_usage: This metric represents high CPU usage.
  • and on(day): You are joining on the day label. If the day_of_week is labeled by day, you can join them accordingly.
  • (day_of_week != 6 and day_of_week != 0): This part filters out Saturdays (6) and Sundays (0). You should use the numeric values instead of string literals.
  • unless on(date) holiday_calendar == 1: This excludes holidays as per the holiday_calendar time series, assuming that holidays are marked by 1.

Conditional Rate Calculations

You can perform rate calculations conditionally, depending on whether metrics cross specific thresholds. The following query calculates the request rate but only for time series where the total number of requests exceeds 1000:

rate(http_requests_total[1m]) and http_requests_total > 1000

Explanation:

  • rate(http_requests_total[1m]): This calculates the per-second rate of increase of http_requests_total over the last minute.
  • and http_requests_total > 1000: This filters out time series where the value of http_requests_total is less than or equal to 1000. The and operator only retains time series where both conditions are true.

Service-Specific Monitoring

You can target specific services or instances by matching labels within your query. For example, to monitor the uptime of a particular service, you can use:

up{service="web",job="api"}

Optimizing Performance of Conditional PromQL Queries

Complex conditional queries can impact Prometheus’ performance. Here are some optimization techniques:

  1. Simplify expressions: Break down complex queries into smaller, reusable components.
  2. Use pre-computed metrics: Calculate frequently used conditions in advance and store them as new metrics.
  3. Optimize label matching: Use specific label matches instead of broad regex patterns.
  4. Balance query frequency: Adjust scrape intervals and evaluation periods based on the metric's volatility and importance.

Enhancing Monitoring with SigNoz: A PromQL-Compatible Alternative

If your monitoring setup needs more advanced conditional logic, consider using SigNoz. SigNoz is a modern observability platform that extends Prometheus’ capabilities and allows for better handling of complex queries. With built-in support for OpenTelemetry and a PromQL-compatible query system, SigNoz makes managing and optimizing complex queries easier.

SigNoz extends PromQL capabilities by:

  1. Offering a more user-friendly interface for building complex queries

  2. Providing advanced visualization options for conditional data

  3. Integrating traces and logs with metrics for comprehensive observability

    Integrating traces and logs with metrics for comprehensive observability
    Integrating traces and logs with metrics for comprehensive observability

SigNoz cloud is the easiest way to run SigNoz. Sign up for a free account and get 30 days of unlimited access to all features.

Get Started - Free CTA

You can also install and self-host SigNoz yourself since it is open-source. With 19,000+ GitHub stars, open-source SigNoz is loved by developers. Find the instructions to self-host SigNoz.

Key Takeaways

  • PromQL enables powerful conditional logic through the creative use of operators and vector matching.
  • Complex conditions require careful structuring and optimization for performance.
  • Real-world scenarios often involve combining multiple PromQL techniques.
  • Advanced platforms like SigNoz can enhance your monitoring capabilities while maintaining PromQL compatibility.

FAQs

How does PromQL handle null or missing values in conditional queries?

PromQL typically excludes null or missing values from calculations. Use the absent() function to explicitly check for missing metrics in your conditions.

Can I use regular expressions in PromQL conditional statements?

Yes, PromQL supports regex matching for label values. Use the =~ operator for regex matches and !~ for regex exclusions.

What are the performance implications of using complex conditional logic in PromQL?

Complex conditions can increase query execution time and resource usage. Optimize by simplifying expressions, using pre-computed metrics, and balancing query frequency with your monitoring needs.

How can I debug complex conditional queries in PromQL?

Use the Prometheus web interface to test your queries and visualize the results. Break down complex queries into smaller components and test each part individually for better debugging.

Was this page helpful?