
Comments (4)

wojas commented on August 16, 2024

Exporting a metric with the last time a snapshot was written during the lifetime of a process would not be hard to add.

Exporting it for repositories that have not seen a backup since the last rest-server restart would be harder, as it would require scanning the disk for all repositories on startup. We have so far avoided that. This part is probably not necessary to make this useful for monitoring.

from rest-server.

wojas commented on August 16, 2024

> Additional metrics that may be useful; some of which I suspect would need the repository's credentials. I am not sure if the REST server would have the capabilities to hook into that. Maybe it could generate metrics during the running of the actual backup command and then store them for the metrics export later, since it can't very well open the repository for each metrics request?

The only way to implement this would be for the restic client to store such a report in the repo, but then there is also the consideration of how much information is too much for a report that rest-server can read. Things like "affected files/dirs" would give information about the backup contents that restic is currently making sure to encrypt.

> HDD/SSD/Whatever storage device metrics (per repository, as we store our repos on separate volumes for better isolation) - total size, free size, used size in bytes, maybe optionally some health data like device errors if present? Useful for obvious reasons such as alerting on low disk space.

This could be useful, but doing this per repository would require scanning all repositories, which rest-server currently does not do.

> Last backup metrics such as duration, affected files/dirs, maybe things like delta sizes or total files/bytes represented by a snapshot to monitor for suspicious changes in usage patterns such as encryption malware on the client system.

The rest-server cannot read this data.

> Date and results of last forget/prune/check commands such as runtime, deleted snapshots, recovered bytes, repacked bytes and so on.

The rest-server cannot tell when specific commands were run. Creating a new snapshot has the side effect of creating a new file in the snapshots directory, which makes this easy, but this is not true for these commands.
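As a rough illustration of that side effect, the newest file in a repository's snapshots directory can be read from outside rest-server. This is only a sketch: the function name `latest_snapshot_time` is made up for this example, it assumes GNU find (for `-printf`), and it treats the file's modification time as the snapshot time, which is an approximation.

```shell
#!/bin/sh
# Hypothetical helper, not part of rest-server.
# Prints the modification time (Unix seconds) of the newest file in
# <repo>/snapshots, or prints nothing if the directory has no files.
# Assumes GNU find, because -printf is not in POSIX.
latest_snapshot_time() {
    repo=$1
    find "${repo}/snapshots" -maxdepth 1 -type f -printf '%T@\n' 2>/dev/null \
        | sort -n \
        | tail -n 1 \
        | cut -d . -f 1   # drop the fractional seconds
}
```

Called as `latest_snapshot_time /mnt/restic/myrepo`, where the path is a placeholder for a real repository directory.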


This issue is related to #50.


schoentoon commented on August 16, 2024

I actually have metrics for most of these things already, using a simple shell script, a systemd timer, and prometheus-node-exporter picking up the textfile the script produces.

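#!/bin/sh
# Prints restic repository metrics in Prometheus text format on stdout.
# Relies on GNU find, du, ls and date.

# Keep going on errors and do not trace commands (both are shell defaults).
set +e
set +x

NAMESPACE=restic

# Each immediate subdirectory is assumed to be one repository.
BACKUP_FOLDER=/mnt/restic

OUTPUT=""

# Word splitting here breaks on directory names that contain whitespace.
for dir in $(find "${BACKUP_FOLDER}" -maxdepth 1 -mindepth 1 -type d); do
   total_size=$(du -bs "${dir}" | cut -f 1)

   # Long listing, newest first; sed drops the leading "total" line.
   snapshots_raw=$(ls -t -l --full-time "${dir}/snapshots" | sed 1d)
   # Count entries directly so that an empty directory yields 0, not 1.
   snapshots_count=$(ls -1 "${dir}/snapshots" | wc -l)
   lock_count=$(ls -1 "${dir}/locks" | wc -l)

   OUTPUT="${OUTPUT}${NAMESPACE}_repository_size_bytes{repository=\"${dir}\"} ${total_size}\n"
   OUTPUT="${OUTPUT}${NAMESPACE}_snapshots_count{repository=\"${dir}\"} ${snapshots_count}\n"
   OUTPUT="${OUTPUT}${NAMESPACE}_lock_count{repository=\"${dir}\"} ${lock_count}\n"

   # Only report a timestamp when there is at least one snapshot file.
   if [ "${snapshots_count}" -gt 0 ]; then
      # Fields 6 and 7 of the listing are the date and time of the newest file.
      latest_snapshot=$(echo "${snapshots_raw}" | head -n 1 | awk '{ print $6 " " $7 }')
      latest_snapshot_unix=$(date -d "${latest_snapshot}" +"%s")
      OUTPUT="${OUTPUT}${NAMESPACE}_latest_snapshot_time_seconds{repository=\"${dir}\"} ${latest_snapshot_unix}\n"
   fi
done

# %b expands the \n escapes no matter which shell provides /bin/sh.
printf '%b' "${OUTPUT}" | sort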

This does make a fair number of assumptions, however: it won't work with restic repositories in subdirectories, for example. But it has served me very well so far.
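One possible way around the subdirectory limitation is to look for repositories at any depth instead of only one level down. The sketch below is an assumption-laden illustration rather than part of the script above: `find_repositories` is a made-up name, and it treats any directory containing both a `config` file and a `snapshots` directory as a restic repository.

```shell
#!/bin/sh
# Hypothetical helper: prints every directory below $1 that looks like a
# restic repository, i.e. that holds a "config" file and a "snapshots"
# directory. Paths containing newlines are not supported.
find_repositories() {
    find "$1" -type f -name config | while IFS= read -r cfg; do
        repo=$(dirname "${cfg}")
        if [ -d "${repo}/snapshots" ]; then
            printf '%s\n' "${repo}"
        fi
    done
}
```

The output could replace the `find ... -maxdepth 1` call in the `for` loop. Note that this walks the whole tree, including the data directories, so it can be slow on large repositories; adding a `-maxdepth` limit that matches the actual layout would reduce that cost.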


Gaibhne commented on August 16, 2024

> > Additional metrics that may be useful; some of which I suspect would need the repository's credentials. I am not sure if the REST server would have the capabilities to hook into that. Maybe it could generate metrics during the running of the actual backup command and then store them for the metrics export later, since it can't very well open the repository for each metrics request?
>
> The only way to implement this would be for the restic client to store such a report in the repo, but then there is also the consideration of how much information is too much for a report that rest-server can read. Things like "affected files/dirs" would give information about the backup contents that restic is currently making sure to encrypt.

The REST server only supplies the 'protocol' and can't tap into the commands themselves, is that correct? I can see how that would be problematic and probably make such statistics severely out of scope. Would there be any way for a client script or similar to communicate such data to the server (optionally), or even interest in a solution like that? I was thinking of trying to bridge or include https://github.com/ngosang/restic-exporter with this project, but if there is no real way for the server to hold that data, each client would have to run its own exporter, which doesn't really seem desirable.

> > HDD/SSD/Whatever storage device metrics (per repository, as we store our repos on separate volumes for better isolation) - total size, free size, used size in bytes, maybe optionally some health data like device errors if present? Useful for obvious reasons such as alerting on low disk space.
>
> This could be useful, but doing this per repository would require scanning all repositories, which rest-server currently does not do.

It would solve a large part of our metric/alerting needs, so I would be very much in favor of that. It could be optional, probably even opt-in, since I would think the majority of people don't use metrics.
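Until something like this exists in rest-server, the volume-level numbers can be collected outside of it. The following is a minimal sketch, assuming one volume per repository path as described above; the function name `volume_metrics` and the `restic_volume_*` metric names are invented for this example and are not existing rest-server metrics.

```shell
#!/bin/sh
# Hypothetical helper: prints size, used and free bytes of the filesystem
# that holds $1, in Prometheus text format. Uses POSIX "df -P -k", so the
# values are multiples of 1024 bytes. Device names containing spaces would
# shift the columns and are not handled.
volume_metrics() {
    path=$1
    df -P -k "${path}" | awk -v path="${path}" 'NR == 2 {
        # Columns: filesystem, 1024-blocks, used, available, capacity, mount.
        printf "restic_volume_size_bytes{path=\"%s\"} %.0f\n", path, $2 * 1024
        printf "restic_volume_used_bytes{path=\"%s\"} %.0f\n", path, $3 * 1024
        printf "restic_volume_free_bytes{path=\"%s\"} %.0f\n", path, $4 * 1024
    }'
}
```

The `%.0f` format is used instead of `%d` because some awk implementations clamp `%d` to 32-bit integers, which would truncate sizes above 2 GiB.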

> > Date and results of last forget/prune/check commands such as runtime, deleted snapshots, recovered bytes, repacked bytes and so on.
>
> The rest-server cannot tell when specific commands were run. Creating a new snapshot has the side effect of creating a new file in the snapshots directory, which makes this easy, but this is not true for these commands.

The existing metrics for that could still be improved: a latest snapshot timestamp, for example, would be very helpful. Currently, if I run automated forgetting, it is hard to distinguish between no backup having run and a backup plus one expired snapshot, since both result in the same reported snapshot count and change (±0), right?
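To illustrate why a timestamp resolves that ambiguity: a check based on the age of the newest snapshot is unaffected by how many snapshots were forgotten. A minimal sketch, with the function name `snapshot_is_stale` and its arguments chosen only for this example:

```shell
#!/bin/sh
# Hypothetical check: succeeds (exit status 0) when the newest snapshot is
# older than max_age seconds.
#   $1  Unix time of the newest snapshot
#   $2  maximum acceptable age in seconds
#   $3  current Unix time (optional, defaults to now)
snapshot_is_stale() {
    latest=$1
    max_age=$2
    now=${3:-$(date +%s)}
    [ $((now - latest)) -gt "${max_age}" ]
}
```

For example, `snapshot_is_stale "${latest_snapshot_unix}" 86400` would flag a repository whose newest snapshot is more than a day old, regardless of whether the snapshot count changed.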

