Comments (5)
Oh! I see. 😲
I had this trouble in the past as well.
Let me think about that. I will get back to you in a few days. 🤔
from awesome-prometheus-alerts.
@BenjaminHerbert I just discovered a metric: prometheus_rule_evaluation_failures_total. This counter tells you when a rule fails to evaluate, which leads to ignored alerts.
expr: increase(prometheus_rule_evaluation_failures_total[3m]) > 0
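As a rough sketch, the expression above could sit in a rule file like the one below. The group name, alert name, `for` duration, severity, and the `[3m]` window are all assumptions chosen for illustration; `increase()` needs a range vector, so some window has to be picked.

```yaml
groups:
  - name: prometheus-internals          # hypothetical group name
    rules:
      - alert: PrometheusRuleEvaluationFailures   # hypothetical alert name
        # Fires if the failure counter grew at all over the last 3 minutes (assumed window).
        expr: increase(prometheus_rule_evaluation_failures_total[3m]) > 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "Prometheus rule evaluation failures (instance {{ $labels.instance }})"
          description: "At least one rule failed to evaluate, so some alerts may be ignored."
```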
Using this query, you detect a job that is absent from prometheus.yml, not a killed exporter.
What is your use case?
My question concerns the absence of particular metrics, and whether it might be helpful for users to see an example of this. up was just an example metric. My use case was creating alerts for failing or missing backup attempts with Velero.
The alerts did not fire because the metric I was using had been renamed. Adding a check for absent(metric{job="myname"}) keeps such alerts from silently breaking when a metric goes missing or when the alert references a metric that does not exist.
What do you think?
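For illustration, such a check might look like the sketch below. The metric name `my_backup_success_total` is a placeholder rather than a real Velero metric, and the group name, alert name, `for` duration, and severity are likewise assumptions.

```yaml
groups:
  - name: missing-metrics               # hypothetical group name
    rules:
      - alert: BackupMetricAbsent       # hypothetical alert name
        # absent() returns a single series with value 1 when no series
        # matches the selector, so the alert fires once the metric disappears.
        # my_backup_success_total is a placeholder; use the metric your alerts rely on.
        expr: absent(my_backup_success_total{job="myname"})
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Backup metric is missing"
          description: "No series matches the selector; the exporter may be down or the metric may have been renamed."
```

The `for: 10m` delay is there so that a brief scrape gap does not fire the alert; the right value depends on the scrape interval.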
Sorry for the delay.
This is a chicken and egg problem: how to monitor the monitoring platform? 😱
I think there are 3 things you can do:
- add an alert that detects killed exporters (expr: up == 0) => https://awesome-prometheus-alerts.grep.to/rules#prometheus-internals
- do not let your infrastructure update exporters automatically, to avoid breaking changes in metric names
- upgrade exporters manually after reading the changelogs
If there is no changelog, just pray... 🙏 🤣
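The first point could be sketched as a rule like this one. Only the `up == 0` expression comes from the list above; the group name, alert name, `for` duration, and severity are assumptions.

```yaml
groups:
  - name: exporter-availability         # hypothetical group name
    rules:
      - alert: ExporterDown             # hypothetical alert name
        # up is set to 0 by Prometheus when a scrape of the target fails.
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Target down (instance {{ $labels.instance }})"
          description: "Scraping job {{ $labels.job }} on {{ $labels.instance }} is failing."
```

This only covers targets that are still listed in prometheus.yml; a job removed from the configuration produces no `up` series at all.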
Adding an alert for each metric used by your alerting system (such as absent(node_memory_MemAvailable_bytes)) is probably over-engineering.
Closing this issue. Feel free to open it again if necessary ;)