Comments (4)
I would assume increment size is a function of the kind of data you are backing up. For instance, if you host some websites and keep frequent incremental backups of the source code stored in /var/www, your increments are likely to stay lean and grow consistently. However, if your websites see a decent amount of traffic and you frequently back up /var/log/httpd, the odds of developing a case of "bloaty destination directory" are pretty high.
Consider the data you are backing up and how frequently it changes and turns over. Even a single log file or temporary folder in your backup source can introduce high turnover into the set. If you want to exclude a folder containing unneeded files like bloated logs, the --exclude feature and related CLI options are brilliant for this exact use case.
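As a rough sketch of what that looks like (the paths and patterns here are hypothetical, not taken from this thread):

```shell
# Back up the web root but keep high-churn files out of the repository.
# Paths and glob patterns are illustrative; adjust to your own layout.
rdiff-backup \
    --exclude '/var/www/**/cache' \
    --exclude '**/*.log' \
    /var/www /mnt/backup/www
```

Excluded files never enter the mirror, so they never generate increments either.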
from rdiff-backup.
Shouldn’t this be accounted for in the increment size number? I wouldn’t mind if the increments were huge, because I understand that those are my backed-up data. What I was wondering is why the cumulative size of all increments is lower by an order of magnitude than the total size of the rdiff-backup-data/ directory where they are stored.
(Just by the way: the first example is a directory with my source code; the second is my e-mail stored in Maildir format, i.e., one file per e-mail message. Even the old mbox format was fine in terms of bloat, though that was a separate rdiff-backup repository. Both cover almost a year of backups taken every 30 minutes.)
This is based on a mere high-level understanding of rdiff-backup from several years of use in a variety of systems, so it might be entirely incorrect.
I don't believe rdiff-backup stores any unnecessary data in its archives that would bloat things beyond what the tool's fundamental backup strategy implies. And while I am not totally clear on what "Cumulative Size" is, I think it's the total size of your current increment plus the actual differential metadata describing changes since the previous increment.
So you have a current data set of 765 MB, and rdiff-backup needs 112 MB for increment metadata/diffs. You have 2139 increments in the entire set, and while per-increment metadata varies, with that many increments it is reasonable to wind up with an 11 GB archive set.
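If you want to see where the space actually goes, you can compare the increment sizes rdiff-backup itself reports against the on-disk footprint of the repository (the /backup path below is a placeholder for your repository):

```shell
# Per-increment and cumulative sizes as rdiff-backup accounts for them
rdiff-backup --list-increment-sizes /backup

# Actual on-disk footprint of the repository's metadata directory
du -sh /backup/rdiff-backup-data
```

The gap between the two figures is, presumably, the per-increment bookkeeping (mirror_metadata, file_statistics, and similar files stored alongside the increments) that the cumulative-size listing does not count.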
I admire the ambitious 30-minute backup schedule, but rdiff-backup cannot selectively prune within the archive set (for example, dropping the half-hourlies while keeping daily copies for anything older than one month). If disk space becomes an issue, or if trimming increments older than a certain date or restoring an increment becomes excessively time-consuming and resource-intensive, have a look at some of the other mature backup solutions out there. This is a fantastic tool, but no solution fits every backup strategy.
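For the trimming rdiff-backup does support, removing everything older than a cutoff is a single command (the repository path and retention window here are placeholders):

```shell
# Drop all increments older than one month; the current mirror is untouched.
# Note this removes *every* increment past the cutoff -- rdiff-backup cannot
# keep, say, one daily copy while discarding the intervening half-hourlies.
rdiff-backup --remove-older-than 1M /backup
```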
I hope that is helpful :)
With the explanations from marshallstokes, I think we can close this issue. Feel free to re-open it if you think that it's still an issue.