shashankm / sensu-plugins-prometheus-checks

This project forked from schubergphilis/sensu-plugins-prometheus-checks

License: MIT License

Sensu Kubernetes Prometheus Plugin

Description

A Sensu plugin that queries Prometheus for the metrics exported by node-exporter.
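
As a rough illustration of what querying Prometheus involves, the sketch below builds an instant-query URL for the Prometheus HTTP API (/api/v1/query) and extracts label/value pairs from a response body. The helper names `prometheus_query_uri` and `parse_vector` are hypothetical and are not taken from the plugin's source.

```ruby
require 'json'
require 'uri'

# Build the instant-query URL for a Prometheus server.
# `endpoint` is an address:port string such as "localhost:9090".
def prometheus_query_uri(endpoint, query)
  host, port = endpoint.split(':')
  URI::HTTP.build(
    host: host,
    port: Integer(port),
    path: '/api/v1/query',
    query: URI.encode_www_form(query: query)
  )
end

# Turn an instant-query JSON response into [labels, value] pairs.
# Sample values arrive as strings in the API response, hence the to_f.
def parse_vector(body)
  payload = JSON.parse(body)
  raise "query failed: #{payload['error']}" unless payload['status'] == 'success'

  payload.dig('data', 'result').map do |sample|
    [sample['metric'], sample['value'][1].to_f]
  end
end
```

The response body could be fetched with `Net::HTTP.get(uri)` from Ruby's standard library and then passed to `parse_vector`.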

Usage

check_prometheus.rb /path/to/config.yml

# Debug mode: print all JSON and blacklisted checks instead of sending checks to Sensu
PROM_DEBUG=true check_prometheus.rb /path/to/config.yml

Development and testing

Dependencies: docker, docker-compose

To spin up a development stack and run the integration tests:

ruby test.rb

Afterwards, you can run rspec directly to rerun the tests.

To run the dockerized version (the one GitLab CI uses):

bash test.sh

Environment variables

Name                  Example        Default         Description
PROM_DEBUG            true           false           Print debug output instead of sending checks to Sensu
PROMETHEUS_ENDPOINT   hostname:9090  localhost:9090  Connection string in the format address:port
SENSU_SOCKET_ADDRESS  hostname       localhost       Address used to connect to the Sensu socket
SENSU_SOCKET_PORT     1234           3030            Port used to connect to the Sensu socket
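
A minimal sketch of how these variables and their defaults could be read in Ruby. The `plugin_settings` helper and the rule that only the literal string `true` enables debug mode are assumptions for illustration, not the plugin's actual code.

```ruby
# Read the documented environment variables, falling back to the defaults
# listed in the table above. `env` defaults to the process environment but
# accepts any Hash-like object, which keeps the helper easy to test.
def plugin_settings(env = ENV)
  {
    # Assumption: only the literal string "true" switches debug mode on.
    debug: env.fetch('PROM_DEBUG', 'false') == 'true',
    prometheus_endpoint: env.fetch('PROMETHEUS_ENDPOINT', 'localhost:9090'),
    sensu_socket_address: env.fetch('SENSU_SOCKET_ADDRESS', 'localhost'),
    sensu_socket_port: Integer(env.fetch('SENSU_SOCKET_PORT', '3030'))
  }
end
```

For example, `plugin_settings('PROM_DEBUG' => 'true')` enables debug mode and keeps every other value at its default.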

Config.yml

Global options are defined in the config.yml file under the key config, built-in checks under the key checks, and checks based on custom Prometheus queries under the key custom. Example:

config:
  reported_by: sbppapik8s
  occurrences: 3
  domain: example.com
  whitelist: sbppapik8s.*
  use_default_source: false
checks:
  - service:
    name: kube-controller-manager.service
  - check: load_per_cluster
    host: sbppapik8s
    cfg:
      cluster: prometheus
      warn: 1.0
      crit: 2.0
      source: sbppapik8s
custom:
  - name: heartbeat
    query: up
    check:
      type: equals
      value: 1
    msg:
      0: 'OK: Endpoint is alive and kicking'
      2: 'CRIT: Endpoints not reachable!'
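
To show how the three top-level keys fit together, here is a small sketch that parses such a file with Ruby's standard YAML library. The `load_sections` helper is hypothetical; the plugin may load its configuration differently.

```ruby
require 'yaml'

# Parse a config.yml document and return its three top-level sections.
# Missing sections fall back to an empty Hash or Array.
def load_sections(yaml_text)
  doc = YAML.safe_load(yaml_text) || {}
  {
    config: doc.fetch('config', {}),
    checks: doc.fetch('checks', []),
    custom: doc.fetch('custom', [])
  }
end
```

Note that the `msg` keys `0` and `2` are parsed as integers, so a message is looked up with `msg[0]` rather than `msg['0']`. A file can be loaded with `load_sections(File.read('/path/to/config.yml'))`.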

Checks

Name                      Description
service                   Checks if a systemd service is active
memory                    Checks memory usage as a percentage
load_per_cpu              Checks CPU load divided by the number of CPUs
load_per_cluster          Checks CPU load of the entire cluster divided by the total number of CPUs
load_per_cluster_minus_n  Checks CPU load of the entire cluster divided by the total number of CPUs minus n failures
inode                     Checks inode usage as a percentage per mountpoint
disk                      Checks filesystem usage as a percentage per mountpoint
disk_all                  Checks filesystem and inode usage of all mountpoints
predict_disk_all          Predicts if any of the disks in Prometheus will be full in x days
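
The arithmetic behind the three load checks can be sketched as follows. The `Node` struct is a hypothetical data shape, and the way `load_per_cluster_minus_n` discounts failed members (each failure removes an average node's share of CPUs) is an assumption; the plugin's actual query may differ.

```ruby
# Hypothetical shape: one entry per node with its load average and CPU count.
Node = Struct.new(:load, :cpus)

# Load of a single node divided by its CPU count.
def load_per_cpu(node)
  node.load / node.cpus.to_f
end

# Summed load of all nodes divided by the summed CPU count.
def load_per_cluster(nodes)
  nodes.sum(&:load) / nodes.sum(&:cpus).to_f
end

# Same as load_per_cluster, but with the capacity of n failed members removed.
# Assumption: every failed member removes an average node's share of CPUs.
def load_per_cluster_minus_n(nodes, n)
  total_cpus = nodes.sum(&:cpus).to_f
  remaining_cpus = total_cpus - n * (total_cpus / nodes.size)
  nodes.sum(&:load) / remaining_cpus
end
```

With three 4-CPU nodes carrying loads of 2.0, 4.0 and 0.0, the cluster load is 6.0 / 12 = 0.5, and with one failure tolerated it becomes 6.0 / 8 = 0.75.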

Custom

Name         Example                      Description
name         heartbeat                    Custom check's name
query        up                           Prometheus query
check.type   (equals|below|above)         Type of evaluation applied against value. Available: `equals`, `below` and `above`
check.value  1                            Value compared against the query results, using the `check.type` evaluation
cfg.warn     33.00                        Warning threshold level
cfg.crit     37.00                        Critical threshold level
msg.0        OK: heartbeat is up          Message used when the `value` evaluation is successful
msg.2        CRITICAL: heartbeat is down  Message used when the evaluation is not successful
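
A minimal sketch of how a custom check with `check.type` and `check.value` could be evaluated. It assumes that `below` and `above` succeed when the query result is strictly below or above `check.value`, and it ignores the `cfg.warn` and `cfg.crit` thresholds; the helper names are hypothetical.

```ruby
# Compare a query result with check.value using the check.type evaluation.
def satisfied?(type, expected, actual)
  case type
  when 'equals' then actual == expected
  when 'below'  then actual < expected
  when 'above'  then actual > expected
  else raise ArgumentError, "unknown check.type: #{type}"
  end
end

# Return [status, message]: status 0 with msg.0 when the evaluation succeeds,
# status 2 with msg.2 otherwise. `check` is one parsed entry of the custom list.
def evaluate_custom(check, actual)
  ok = satisfied?(check['check']['type'], check['check']['value'], actual)
  status = ok ? 0 : 2
  [status, check['msg'][status]]
end
```

For the heartbeat example, a query result of 1 yields status 0 with the `msg.0` text, and any other result yields status 2 with the `msg.2` text.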

Global Configuration Options

Name                Example       Description
reported_by         sbppapik8s    Hostname that shows up in the Sensu reported_by field
occurrences         3             Number of failures before Sensu sends an alert
whitelist           sbppapik8s.*  Regex used as a safety whitelist to make sure the source names are correct
ttl                 300           Override the Sensu TTL in seconds
ttl_status          1             Override the status code for an expiring Sensu TTL
use_default_source  false         When `true`, the source of the events will be the Sensu client's
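
The sketch below shows one way these options could shape an event before it is sent to Sensu. Both helpers are hypothetical, and raising on a source that fails the whitelist is an assumption; the plugin may skip or log such events instead.

```ruby
# True when the event source matches the whitelist regex (unanchored match).
def source_allowed?(whitelist, source)
  Regexp.new(whitelist).match?(source)
end

# Build a check-result hash from the global options (hypothetical helper).
# Assumption: a source that fails the whitelist raises instead of being sent.
def build_event(config, name:, output:, status:, source:)
  unless source_allowed?(config['whitelist'], source)
    raise ArgumentError, "source #{source.inspect} does not match whitelist"
  end

  event = {
    'name' => name,
    'output' => output,
    'status' => status,
    'reported_by' => config['reported_by'],
    'occurrences' => config['occurrences']
  }
  # With use_default_source, the source is left out so the Sensu client's is used.
  event['source'] = source unless config['use_default_source']
  event['ttl'] = config['ttl'] if config['ttl']
  event['ttl_status'] = config['ttl_status'] if config['ttl_status']
  event
end
```

Because the match is unanchored, a whitelist such as `sbppapik8s.*` also accepts names that merely contain `sbppapik8s`; anchoring the pattern with `\A` and `\z` makes it stricter.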

Check Configuration Options

Name                      Config                                                                    Example
service                   name: servicename                                                         name: test-service.service
                          state: active|deactivating|failed|inactive (default: active)
                          state_required: 0|1 (default: 1)
memory                    warn: warning percentage                                                  warn: 90
                          crit: critical percentage                                                 crit: 95
load_per_cpu              warn: warning percentage                                                  warn: 90
                          crit: critical percentage                                                 crit: 95
load_per_cluster          cluster: cluster name                                                     cluster: nodes
                          warn: warning percentage                                                  warn: 90
                          crit: critical percentage                                                 crit: 95
                          source: name that shows in Sensu                                          source: sbppapik8s
load_per_cluster_minus_n  cluster: cluster name                                                     cluster: nodes
                          minus_n: number of member failures                                        minus_n: 1
                          warn: warning percentage                                                  warn: 90
                          crit: critical percentage                                                 crit: 95
                          source: name that shows in Sensu                                          source: sbppapik8s
inode                     mount: mountpoint                                                         mount: /var/lib/docker
                          name: human readable name                                                 name: docker
                          warn: warning percentage                                                  warn: 90
                          crit: critical percentage                                                 crit: 95
disk                      mount: mountpoint                                                         mount: /var/lib/docker
                          name: human readable name                                                 name: docker
                          warn: warning percentage                                                  warn: 90
                          crit: critical percentage                                                 crit: 95
disk_all                  ignore_fs: regex of filesystems                                           ignore_fs: tmpfs
                          warn: warning percentage                                                  warn: 90
                          crit: critical percentage                                                 crit: 95
predict_disk_all          range_vector: Prometheus range vector used for sample size of prediction  range_vector: 24h
                          filter: Prometheus filter to include/exclude disks                        filter: {mountpoint="/"}
                          days: prediction days                                                     days: 14
                          source: Sensu name                                                        source: sbppapik8s
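
Most of the checks above take `warn` and `crit` percentages. A minimal sketch of how such thresholds could map to Sensu status codes follows; the `threshold_status` helper is hypothetical, and treating a value equal to a threshold as a breach is an assumption.

```ruby
# Map a measured percentage to a Sensu status code:
# 0 = OK, 1 = WARNING, 2 = CRITICAL.
# Assumption: a value equal to a threshold already counts as a breach.
def threshold_status(value, warn:, crit:)
  return 2 if value >= crit
  return 1 if value >= warn

  0
end
```

With `warn: 90` and `crit: 95`, a disk at 85% usage is OK, one at 92% raises a warning, and one at 95% or more is critical.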

