Comments (13)
I could help with setting up Prometheus alerting.
from sysadmin.
We currently have a Munin-based monitoring system. How much more complex is it to use something like Prometheus?
from sysadmin.
Prometheus takes a little bit of up-front work, but we have Ansible scripts to deal with most of it.
But good alerting is the primary goal of Prometheus, so the long-term gains are worth it.
from sysadmin.
@SuperQ have you used blackbox_exporter to do active probing as well? Can you mention any pitfalls with Prometheus?
from sysadmin.
I can talk about Prometheus at length.
Yes, I have done a number of active blackbox probe setups.
I think the biggest pitfall of Prometheus is that it is a general-use monitoring system with a high degree of flexibility. That flexibility comes with a bit of a learning curve.
The only other thing that requires a bit of planning is that Prometheus collects data over a number of different HTTP ports, so how you lay this out depends on the network design. Prometheus targets are simple HTTP GET endpoints with no writeable API. Because of this simple design, security was left "up to the user". The target design is for use on private networks, not on servers hosted on public IPs.
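To make "simple HTTP GET endpoints with no writeable API" concrete, here is a minimal sketch of such a target using only the Python standard library. The metric name, help text, sample value, and the `serve` helper are made up for illustration; real exporters are separate programs and expose far more.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples):
    """Render {name: (help_text, value)} in the Prometheus text format."""
    lines = []
    for name, (help_text, value) in sorted(samples.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical sample data; a real exporter would read this from the system.
SAMPLES = {"demo_queue_length": ("Items waiting in the demo queue.", 3)}


class MetricsHandler(BaseHTTPRequestHandler):
    # Only do_GET is defined, so POST/PUT/DELETE are rejected by the
    # base class: the endpoint is read-only by construction.
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(SAMPLES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=9999):
    """Blocking helper; the host and port are arbitrary example values."""
    HTTPServer((host, port), MetricsHandler).serve_forever()
```

Binding to a loopback or private address, as in the `serve` default, matches the "private networks" design point: the endpoint itself has no authentication.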
from sysadmin.
Some notes related to issues with sending email notifications:
- Having a relay for @infra.ooni.io is okayish; having one for @openobservatory.org is probably not a good idea right now.
- The connection to the SMTP server for sending emails can be unreliable (e.g. if the SMTP server is in AMS and Airflow is sending emails from HK), so it's maybe better to have a local email queue (or one inside the same location) that handles delivering the email.
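The local-queue idea in the second note can be sketched roughly as below. This is only an illustration of the retry behaviour: `LocalMailQueue`, its `send` callback, and `max_attempts` are hypothetical names, and in practice a local MTA would normally play this role rather than custom code.

```python
from collections import deque


class LocalMailQueue:
    """In-memory queue that retries delivery when the remote side is down."""

    def __init__(self, send, max_attempts=5):
        self._send = send              # callable that delivers one message
        self._max_attempts = max_attempts
        self._pending = deque()        # (message, failed_attempts) pairs
        self.dead = []                 # messages that exhausted their retries

    def enqueue(self, message):
        self._pending.append((message, 0))

    def flush(self):
        """Try each pending message once; return how many were delivered."""
        delivered = 0
        for _ in range(len(self._pending)):
            message, attempts = self._pending.popleft()
            try:
                self._send(message)
                delivered += 1
            except OSError:
                # Covers connection errors and smtplib.SMTPException,
                # which is a subclass of OSError.
                attempts += 1
                if attempts >= self._max_attempts:
                    self.dead.append(message)
                else:
                    self._pending.append((message, attempts))
        return delivered

    def __len__(self):
        return len(self._pending)
```

The sender enqueues locally and returns immediately; a periodic `flush()` absorbs the cross-datacenter connection failures.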
from sysadmin.
@SuperQ what is the recommended way of implementing authentication and encryption for scraping targets? https://prometheus.io/docs/introduction/faq/#why-don-t-the-prometheus-server-components-support-tls-or-authentication-can-i-add-those links to a post about putting an nginx proxy with HTTP basic auth in front. Is that how you would do it?
from sysadmin.
That's one way to do it. But, in reality, it's easier to just firewall the metrics ports off from the internet. For the most part there is no need to authenticate or encrypt the metrics traffic. The metrics endpoints are extremely simple, read-only, and very lightweight.
from sysadmin.
My opinion is that our network is not 100% trusted, as the subnet is shared with other projects and Docker may unexpectedly mess with iptables rules. So I think that both a firewall and some frontend with a PSK are useful if we want to hide the surface from a possible attacker.
from sysadmin.
Yes, I agree with @darkk that we probably want some extra protection layer, as the local network is not really trusted. We also have machines that we care to monitor residing in different datacenters, and that traffic would be traversing the public internet.
@SuperQ would you be up for helping us set up a Prometheus instance?
from sysadmin.
For the case of multiple datacenters, the typical design is to have a local installation of Prometheus in each one. This way you are only tunneling one pipe of traffic to/from each Prometheus server, and not exposing all targets to a remote server.
Frankly, the attack surface of nginx + OpenSSL is far greater than what an exporter target provides. Simple firewalling is generally enough. But if you really want to go down that route, I recommend using client cert auth and not basic auth. It's easier to generate and secure.
Prometheus is very easy to set up, and I'm happy to help with this. I already have Ansible roles to deal with it.
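For reference, the client cert auth mentioned above boils down to the TLS server refusing any peer that does not present a certificate signed by a trusted CA. Here is a minimal sketch with Python's standard `ssl` module; the helper names and file-path parameters are assumptions for illustration, and in the setup discussed here the same thing would more likely be configured in a proxy such as nginx.

```python
import ssl


def require_client_certs(ctx):
    """Make a server-side context reject peers without a valid client cert."""
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def make_server_context(certfile, keyfile, client_ca_file):
    """Build a mutual-TLS server context; all three paths are placeholders."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # The server's own certificate and private key.
    ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    # The CA whose signature a scraping client's certificate must carry.
    ctx.load_verify_locations(cafile=client_ca_file)
    return require_client_certs(ctx)
```

A listening socket would then be wrapped with `ctx.wrap_socket(sock, server_side=True)`, so the handshake fails before any HTTP request is read if the client has no acceptable certificate.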
from sysadmin.
Are you on Jabber? I am as [email protected].
Edit: we also have an OONI IRC channel and Slack (https://slack.openobservatory.org/)
from sysadmin.
The basic deployment of Prometheus was done a long time ago, and we're quite happy with Prometheus.
But lots of leftovers are still there, so further cleanup is needed and many more signals have to be scraped.
from sysadmin.