Recording and Alerting Rules

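Recording rules. These pre-compute expensive or frequently used PromQL expressions at every evaluation interval and store the result as a new time series. Dashboards can then query the precomputed series instead of re-running the full expression, which makes them faster and reduces query load, especially for complex aggregations that are used often.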
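As a rough sketch of the kind of expression worth precomputing, the rule below stores a per-job request rate. The metric http_requests_total, the group name, and the rule name job:http_requests:rate5m are assumptions for illustration and are not part of this topic's setup; the rule name follows the level:metric:operations naming convention recommended in the Prometheus documentation.

```yaml
groups:
  - name: example_recording_rules   # hypothetical group name
    rules:
      # Precompute the per-job request rate over the last 5 minutes.
      # http_requests_total is an assumed counter metric.
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))
```

A dashboard panel can then query job:http_requests:rate5m directly instead of evaluating the rate and aggregation on every refresh.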

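Alerting rules. These define conditions that should trigger an alert. Example: if a target is down (up == 0), the alert first becomes pending; once the condition has held for 5 minutes, it fires. Firing alerts are sent to Alertmanager (if configured), which handles routing and notifications.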
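For alerts to reach Alertmanager, prometheus.yml needs an alerting section that points at it. A minimal sketch, assuming a single Alertmanager instance running locally on its default port 9093 (adjust the target for your environment):

```yaml
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]   # assumed Alertmanager address
```

Without this section, alerts still appear on the Alerts page of the Prometheus web UI, but no notifications are sent.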

Rules live in separate files. Prometheus reads rule files listed under rule_files in prometheus.yml. This keeps config cleaner and makes rules easier to manage.
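Entries under rule_files may also be glob patterns, which helps once rules are split across several files. A small sketch, assuming hypothetical rules/ and alerts/ directories next to prometheus.yml:

```yaml
rule_files:
  - "rules/*.yml"     # every .yml file in the assumed rules/ directory
  - "alerts/*.yml"    # hypothetical second directory for alerting rules
```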

Validate rules before applying. Use promtool to check syntax and catch mistakes early:

promtool check rules rules.yml

Minimal prometheus.yml referencing rules

global:
  scrape_interval: 15s
  evaluation_interval: 15s   # how often Prometheus evaluates rules

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

rule_files:
  - "rules.yml"

Basic rules.yml (recording + alerting)

groups:
  - name: basic_rules
    interval: 15s   # optional; overrides the global evaluation_interval for this group

    rules:
      # Recording rule: create a new time series with a simpler name
      - record: job:up:avg
        expr: avg by (job) (up)

      # Alerting rule: fire if any target is down for 5 minutes
      - alert: TargetDown
        expr: up == 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Target down"
          description: "{{ $labels.instance }} (job={{ $labels.job }}) is down"
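Beyond syntax checking, promtool can also unit-test rules against synthetic series. The file below is a sketch of such a test for the TargetDown alert in rules.yml; the file name rules_test.yml and the input series are assumptions chosen for illustration. It would be run with: promtool test rules rules_test.yml

```yaml
# rules_test.yml (hypothetical file name)
rule_files:
  - rules.yml

evaluation_interval: 1m

tests:
  - interval: 1m   # spacing between the synthetic samples below
    input_series:
      # Simulated target that reports up == 0 for the whole test window.
      - series: 'up{job="prometheus", instance="localhost:9090"}'
        values: '0+0x10'   # 11 samples, all 0
    alert_rule_test:
      # The rule uses "for: 5m", so the alert should be firing by 6m.
      - eval_time: 6m
        alertname: TargetDown
        exp_alerts:
          - exp_labels:
              severity: warning
              job: prometheus
              instance: localhost:9090
            exp_annotations:
              summary: "Target down"
              description: "localhost:9090 (job=prometheus) is down"
```

Checking the rendered annotations in the test also catches template mistakes that a syntax check alone would not reveal.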

Recording rules and alerting rules are each covered in depth in a dedicated training topic.