Prometheus

22 Apr 2018

To reload the configuration without a restart, either start Prometheus with --web.enable-lifecycle and then run
curl -X POST localhost:9090/-/reload
or send the process a SIGHUP:
kill -1 $prometheus_PID
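
The same HTTP reload can be scripted. Below is a minimal Python sketch, assuming Prometheus listens on localhost:9090 and was started with --web.enable-lifecycle; the function names build_reload_request and reload_prometheus are made up for this example.

```python
import urllib.request


def build_reload_request(base_url="http://localhost:9090"):
    """Build the POST request for the /-/reload endpoint (nothing is sent yet)."""
    url = base_url.rstrip("/") + "/-/reload"
    # An empty body is enough; the endpoint only cares about the method.
    return urllib.request.Request(url, data=b"", method="POST")


def reload_prometheus(base_url="http://localhost:9090", timeout=5):
    """Send the reload request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on a non-2xx response and
    urllib.error.URLError if the server cannot be reached.
    """
    with urllib.request.urlopen(build_reload_request(base_url), timeout=timeout) as resp:
        return resp.status
```

Calling reload_prometheus() after editing the YAML file should have the same effect as the curl command above.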

Configuration

Relabel instances (e.g. to drop the port from the instance label), in the main Prometheus YAML file:

scrape_configs:
    - job_name: 'prometheus'
      static_configs:
          - targets: ['host:port']
      relabel_configs:
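          # localhost:9100 -> instance="mac-localhost"
          # (the regex is anchored at both ends, so no ^ or $ is needed)
          - source_labels: [__address__]
            target_label: instance
            regex: (localhost):9100
            replacement: mac-${1}
          # anything on port 9091 -> instance="pushgw"
          - source_labels: [__address__]
            target_label: instance
            regex: .*:9091
            replacement: pushgw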
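
To see what these rules do to a given target address, here is a rough Python emulation of the replace action for a single source label. It is a sketch under assumptions: Python's re module stands in for the RE2 engine Prometheus uses, only the ${N} replacement syntax is handled, the patterns are written without the redundant leading ^, and relabel_replace is a hypothetical helper, not a Prometheus API.

```python
import re


def relabel_replace(address, regex, replacement):
    """Return the new label value, or None when the regex does not match.

    Prometheus anchors relabel regexes at both ends, which re.fullmatch
    imitates. On no match the target label is left untouched.
    """
    match = re.fullmatch(regex, address)
    if match is None:
        return None
    # Translate Prometheus-style ${1} into Python's \g<1> group reference.
    template = re.sub(r"\$\{(\w+)\}", r"\\g<\1>", replacement)
    return match.expand(template)


print(relabel_replace("localhost:9100", r"(localhost):9100", "mac-${1}"))
print(relabel_replace("gateway:9091", r".*:9091", "pushgw"))
print(relabel_replace("localhost:9090", r"(localhost):9100", "mac-${1}"))
```

The first two calls yield the rewritten instance values; the third returns None because the address does not match, so that target keeps its default instance label.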

Alerting (for version 2.x)

Alert rules configuration - in main Prometheus YAML file

rule_files:
    - "rules/test.rules"

The rules/test.rules file:

groups:
- name: host
  rules:

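  - alert: idle_below_30pct
    # CPU usage above 70% is the same as idle time below 30%;
    # aggregating by instance keeps that label available to the annotations
    expr: 100 * (1 - avg by (instance) (irate(node_cpu{mode='idle'}[1m]))) > 70
    annotations:
      summary: "Instance {{ $labels.instance }} CPU usage is dangerously high"
      description: "{{ $labels.instance }} is using a LOT of CPU. CPU usage is {{ $value }}%."
    labels:
      severity: warning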

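  - alert: low_disk_space_root
    # less than 50 GB available on the root filesystem
    expr: node_filesystem_avail{mountpoint="/"} < 50e9
    annotations:
      summary: "Instance {{ $labels.instance }} disk {{ $labels.mountpoint }} is low on space"
      description: "Disk {{ $labels.mountpoint }} is at {{ $value }}B"
    labels:
      group: storage
      severity: critical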

Alert configuration in Alertmanager

route:
    receiver: default
    routes:
        - match:
            alertname: ALL
          repeat_interval: 1m
          receiver: ALL
receivers:
- name: default
- name: ALL
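
How an alert finds its receiver in a route tree like this one can be sketched in a few lines of Python. This is a simplified model, not Alertmanager's implementation: it assumes equality matchers only, lets the first matching child route win (no continue), and lets a child without its own receiver inherit the parent's. The ROUTE dict mirrors the tree above with the child route pointing at the defined receiver ALL; pick_receiver is a hypothetical name.

```python
def pick_receiver(labels, route, inherited=None):
    """Return the receiver name that an alert with these labels ends up at."""
    receiver = route.get("receiver", inherited)
    for child in route.get("routes", []):
        matchers = child.get("match", {})
        # Every matcher must equal the alert's label value for the child to apply.
        if all(labels.get(name) == value for name, value in matchers.items()):
            return pick_receiver(labels, child, receiver)
    # No child matched: the alert stays at this level.
    return receiver


ROUTE = {
    "receiver": "default",
    "routes": [
        {"match": {"alertname": "ALL"}, "repeat_interval": "1m", "receiver": "ALL"},
    ],
}

print(pick_receiver({"alertname": "ALL"}, ROUTE))
print(pick_receiver({"alertname": "low_disk_space_root", "severity": "critical"}, ROUTE))
```

Under this model only an alert literally named ALL reaches the ALL receiver; the two rules defined earlier fall through to default.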
