
zfs-tools's Introduction

ZFS tools

Donate to support this free software
1Cw9nZu9ygknussPofMWCzmSMveusTbQvN

The ZFS backup tools will help you graft an entire ZFS pool as a filesystem into a backup machine, without having to screw around with snapshot names, complicated shell commands, or crontabs.

The utilities let you do this:

  1. zfs-shell:
    a shell that allows remote ZFS administration and nothing more

  2. zsnap:
    a command that snapshots a dataset or pool, then deletes old snapshots

  3. zreplicate:
    a command that replicates an entire dataset tree using ZFS replication streams. Best used in combination with zsnap, as in:

    • zsnap on the local machine
    • zreplicate from the local machine to the destination machine

    Obsolete snapshots deleted by zsnap will be automatically purged on the destination machine by zreplicate, as a side effect of using replication streams. To inhibit this, use the --no-replication-stream option.

    Run zreplicate --help for a compendium of options you may use.

  4. zbackup: a command to snapshot and replicate filesystems according to their user properties. This uses zsnap and zreplicate to do the work, which is all driven by properties. For details, see this further description of zbackup.

  5. zflock: a command to lock a filesystem against replication by zbackup. For details, see this further description of zbackup.

The repository, bug tracker and Web site for this tool are at http://github.com/Rudd-O/zfs-tools. Send comments to me through [email protected].

Setting up

Setup is rather complicated. It assumes that you already have ZFS up and running, with pools, on both the machine you're going to back up and the machine that will receive the backup.

On the machine to back up

  • Install the zfs-shell command
    cp zfs-shell /usr/local/sbin
    chmod 755 /usr/local/sbin/zfs-shell
chown root:root /usr/local/sbin/zfs-shell

  • Create a user with a home directory and shell zfs-shell
    useradd -rUm -b /var/lib -s /usr/local/sbin/zfs-shell zfs

  • Let sudo know that the new user can run the zfs command
    zfs ALL = NOPASSWD: /usr/local/sbin/zfs
    (ensure you remove the requiretty default from /etc/sudoers; see contrib/sudoers.zfs-tools for an example)

  • Set up a cron job to run zsnap as frequently as you want to, snapshotting the dataset you intend to replicate.
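    Putting the snapshot step on a schedule might look like the following sketch (the install path, schedule, and dataset name are illustrative assumptions; the -k/-p flags are zsnap's keep/prefix options):

    ```shell
    # /etc/cron.d/zsnap -- illustrative: snapshot senderpool hourly,
    # keeping only the most recent 24 auto-hourly- snapshots
    0 * * * * root /usr/local/bin/zsnap -k 24 -p auto-hourly- senderpool
    ```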

On the backup machine

  • Set up public key authentication for SSH so the backup machine can log in as the user zfs (created above) on the machine to be backed up.

  • Create a dataset to receive the backup stream.

  • Set up a cron job to fetch the dataset snapshotted by zsnap from the remote machine into the newly created dataset. You will use zreplicate for that (see below for examples).

  • After the first replication, you may want to set the mountpoint attributes on the received datasets so they do not automount on the backup machine.
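Put together, the receiving side might look like this sketch (hostnames, dataset names, the schedule, and the use of canmount=noauto are illustrative assumptions, not prescribed by the tools):

```shell
# One-time setup on the backup machine (paul), pulling from peter:
ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519   # if no key exists yet
ssh-copy-id zfs@peter                              # authorize login as the zfs user
zfs create receiverpool/senderpool                 # dataset that receives the stream

# /etc/cron.d/zreplicate -- pull half an hour after each zsnap run
30 * * * * root /usr/local/bin/zreplicate -o zfs@peter:senderpool receiverpool/senderpool

# After the first replication: keep received filesystems from automounting
zfs set canmount=noauto receiverpool/senderpool
```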

Test

If all went well, you should be able to do this without issue:

(on the machine to back up)

[root@peter]
zsnap senderpool

(on the machine to receive)

[root@paul]
zfs create receiverpool/senderpool # <--- run this ONLY ONCE
zreplicate -o zfs@peter:senderpool receiverpool/senderpool
# this should send the entire senderpool with all snapshots
# over from peter to paul, placing it in receiverpool/senderpool

(on the machine to back up)

[root@peter]
zsnap senderpool

(on the machine to receive)

[root@paul]
zreplicate -o zfs@peter:senderpool receiverpool/senderpool
# this should send an incremental stream of senderpool
# into receiverpool/senderpool

And that's it, really.

zfs-tools's People

Contributors

awapf, blag, javascriptdude, mdcurtis, michaelhierweck, ph84172, rudd-o, tesujimath


zfs-tools's Issues

Have you lost interest in zfs-tools?

zfs-tools is a critical component of the HPC system I maintain. I have submitted a couple of PRs here which remain unmerged for a looong time. And recently, I have done further work advancing the functionality in my fork, at https://github.com/tesujimath/zfs-tools, namely, support for resumable transfers introduced in zfs 0.7.0.

Observing the lack of progress here, and not wanting to spend time justifying the changes I have been making, which are useful for me, and in production use here, I am inclined to rename my repo something like zfs-tools2, and start versioning and releasing independently. (I can't justify spending much time negotiating about PRs.)

However, I want to acknowledge the good basis this repo was for my work on zbackup, so don't want to be seen as hostile here. That's not my intention. This is about making progress independently.

zreplicate doesn't replicate the files inside datasets

Hi, I'm trying to use zreplicate between two machines.

I create the snapshots with:

zsnap --keep=7 --prefix SAS- -t %Y-%m-%d-%H%M%S storage/TESTE

origin machine:

NAME USED AVAIL REFER MOUNTPOINT
storage/TESTE@SAS-2019-05-24-163013 0B - 236K -
storage/TESTE@SAS-2019-05-24-163014 0B - 236K -
storage/TESTE@SAS-2019-05-24-163015 0B - 236K -
storage/TESTE@SAS-2019-05-24-163017 0B - 236K -
storage/TESTE@SAS-2019-05-24-163018 0B - 236K -
storage/TESTE@SAS-2019-05-24-163019 0B - 236K -
storage/TESTE@SAS-2019-05-24-163021 0B - 236K -

After creating all the snapshots, I run this on my backup machine:

zreplicate sourcestorage:storage/TESTE storage/TESTE

After running it, the snapshots are replicated, but the files I had put in 'storage/TESTE' are missing: the received filesystem is empty.

On the backup machine:

storage/TESTE# zfs list -t snapshot
NAME USED AVAIL REFER MOUNTPOINT
storage/TESTE@SAS-2019-05-24-163013 0B - 236K -
storage/TESTE@SAS-2019-05-24-163014 0B - 236K -
storage/TESTE@SAS-2019-05-24-163015 0B - 236K -
storage/TESTE@SAS-2019-05-24-163017 0B - 236K -
storage/TESTE@SAS-2019-05-24-163018 0B - 236K -
storage/TESTE@SAS-2019-05-24-163019 0B - 236K -
storage/TESTE@SAS-2019-05-24-163021 0B - 236K

/storage/TESTE# ls -la
total 1
drwxr-xr-x 2 root root 2 May 24 16:28 .
drwxr-xr-x 3 root root 3 May 24 16:28 ..

KeyError [zreplicate thing]

I'm wondering why I get the error below.
On localhost, ./zreplicate works as expected:

lap bin # ./zreplicate -vo --create-destination --no-replication-stream data/test data/backup
Replicating dataset localhost:data/test into localhost:data/backup...
Assessing that the source dataset exists...

Assessing that the destination dataset exists...

Replicating (full) data/test to data/backup
Base snapshot available in destination: None
Target snapshot available in source: <Snapshot: data/test@autosnapshot-2016-10-21-014702>
send from @ to data/test@autosnapshot-2016-10-21-014702 estimated size is 9,50K
total estimated size is 9,50K
TIME SENT SNAPSHOT
40,7KiB 0:00:00 [1,29GiB/s] [ <=>
receiving full stream of data/test@autosnapshot-2016-10-21-014702 into data/backup@autosnapshot-2016-10-21-014702

received 40,4KB stream in 1 seconds (40,4KB/sec)

Replication complete.

but running it toward a remote destination (all the zfs user / zfs-shell / sudo setup is OK) outputs:

lap bin # ./zreplicate -o -v --create-destination --no-replication-stream localhost:data/test zfs@edge1:rpool/DATA/backup/test
Replicating dataset localhost:data/test into zfs@edge1:rpool/DATA/backup/test...
Assessing that the source dataset exists...
Assessing that the destination dataset exists...
Creating missing destination dataset exists as requested
Traceback (most recent call last):
File "./zreplicate", line 78, in
dst_conn.create_dataset(destination_dataset_name)
File "/home/llu/praca/zfs/zfs-tools/bin/../src/zfstools/connection.py", line 66, in create_dataset
return self.pools.lookup(name)
File "/home/llu/praca/zfs/zfs-tools/bin/../src/zfstools/connection.py", line 58, in _get_poolset
self._poolset.parse_zfs_r_output(stdout2,properties)
File "/home/llu/praca/zfs/zfs-tools/bin/../src/zfstools/models.py", line 179, in parse_zfs_r_output
fs = self.pools[poolname]
KeyError: 'rpool'

[feature] zreplicate setting to specify the number of remote snapshots

Hi

Today, zreplicate synchronizes all snapshots from source to destination.
There are cases where the space constraints differ between the source and the destination.
Sometimes it is the source that is space-constrained, sometimes the destination.

It would be nice if we could specify the number of source and remote snapshots separately, perhaps via zbackup settings.

Looking at zbackup's documentation, I haven't been able to tell whether the snapshot-limit setting already provides this.

Thanks

Formally Deprecate Python 2.x Support

As Python 2 has now been formally deprecated for over a year and is no longer receiving security updates, how about this project formally deprecates Python 2 support and no longer allows new releases to run on Python 2? This would make maintaining the code much easier.

I just finished some queries of PyPI data to see how many downloads there were of zfs-tools.

Here is the count of zfs-tools downloads in the last n months from PyPI, by Python version:

version | 1 month | 2 months | 6 months | 12 months
2.7     | 0       | 0        | 8        | 20 
3.4     | 0       | 0        | 0        | 1
3.5     | 0       | 0        | 0        | 20 
3.6     | 0       | 2        | 3        | 13 
3.7     | 3       | 3        | 10       | 27 
3.8     | 3       | 7        | 17       | 21 
3.9     | 0       | 0        | 1        | 1   

Note: pypi does not store version information from all downloads so these numbers are a small subset of total downloads but should still be reflective of the big picture.

FYI - Data was retrieved from Google https://console.cloud.google.com/bigquery using the following SQL:

SELECT
  REGEXP_EXTRACT(details.python, r"[0-9]+\.[0-9]+") AS python_version,
  COUNT(*) AS num_downloads,
FROM `the-psf.pypi.file_downloads`
WHERE
  DATE(timestamp)
    BETWEEN DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 2 MONTH), MONTH)
    AND CURRENT_DATE()
  AND file.project = 'zfs-tools'
GROUP BY `python_version`
ORDER BY python_version

zreplicate performance enhancements

On large datasets, the zfs get creation and zfs list invocations that happen when zreplicate starts take ages. These need to query specifically and only the dataset subtree they are going to manipulate, and, if possible, run those queries in parallel.

zsnap property KeyError: 'creation'

root@stor2:~# zsnap -k 24 -p auto-hourly- dpool
Traceback (most recent call last):
File "/usr/local/bin/zsnap", line 5, in
pkg_resources.run_script('zfs-tools==0.4.1', 'zsnap')
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 528, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1394, in run_script
execfile(script_filename, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/zfs_tools-0.4.1-py2.7.egg/EGG-INFO/scripts/zsnap", line 82, in
ssn = sorted([ (x.get_property('creation'), x.name, x) for x in source_dataset.get_snapshots(flt) ])
File "/usr/local/lib/python2.7/dist-packages/zfs_tools-0.4.1-py2.7.egg/zfstools/models.py", line 95, in get_property
return self._properties[name]
KeyError: 'creation'

I'm looking into this issue; it appears that the parse_zfs_r_output function in models.py isn't populating the creation property. Reverting to commit 56b56fe resolves the problem for the time being.

Version of zbackup in this repo is not safe

I'm the author of zbackup, which is in use here on 5 production fileservers replicating 100s of TB of data. Rather than implement it in my own repo, separate from zfs-tools, I decided to contribute it to the zfs-tools repo, mainly because of the close integration with the underlying scripts and library. In hindsight, that may have been a mistake.

I submitted a pull request for an enhancement I made to zbackup in September 2014 in my repo. This was not accepted, but in my view is essential if zbackup is to be used safely in a large-scale environment.

So, for now I recommend that any serious users of zbackup use my original version.

I see another issue (#18) has been raised about zbackup functionality, but the author of this repo is not able to respond meaningfully because he is not the author of zbackup itself.

I am therefore wondering whether zbackup should now be decoupled from the underlying zfs-tools, so that proper maintenance may be performed by the original author, without getting bogged down in inconsequential disagreements (like the location of the lock directory).

What do you think?

Refactor Class Structure for Dataset, Pool, Snapshot

With the current class design, it's not possible to distinguish objects using isinstance(obj, <type>), because Pool and Snapshot both inherit from Dataset.

Refactor the class layout to include an 'abstract' class ZfsItem(object) from which Pool, Dataset and Snapshot would all inherit. This would let you share properties/methods whilst allowing strong type assertions on all the key objects.
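A minimal sketch of the proposed layout (hypothetical classes, not the actual zfs-tools code; the shared name/property handling is an assumption about what the base class would carry):

```python
class ZfsItem(object):
    """Abstract base: state shared by every ZFS object."""
    def __init__(self, name):
        self.name = name
        self._properties = {}

    def get_property(self, key):
        return self._properties[key]


# Siblings instead of a Pool/Snapshot-inherits-Dataset chain,
# so isinstance() can tell the three kinds apart.
class Pool(ZfsItem):
    pass

class Dataset(ZfsItem):
    pass

class Snapshot(ZfsItem):
    pass
```

With this layout, isinstance(obj, Pool) is true only for pools, while isinstance(obj, ZfsItem) still matches everything.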

I have a hard fork of your code in my own library, called [zfslib](https://github.com/JavaScriptDude/zfslib), that includes this change. I just made the tweak, so it's not fully tested, but basic smoke testing passed.

Blacken code

Once all outstanding PRs are merged, I plan to run the code through black so it doesn't look as terrible as it does today (this was written almost back in the days when pep8 was barely beginning to be a thing on the world wide webs).

zbackup: Not possible to omit a single nested filesystem

Hi,

With zbackup, it appears not to be possible to back up all filesystems except for a particular one.

Say I have a set of ZFS filesystems:

tank
tank/home
tank/data
tank/var/cache

I'd like to zbackup everything except for 'tank/var/cache'.

Here are my attempts. I have the following zbackup properties:

# zbackup -l
tank daily-snapshots=6 weekly-snapshots=5 replica=backuphost:extbackup replicate=daily

To exclude tank/var/cache, I figured I should unset 'replica' and 'replicate' properties:

# zbackup --unset tank/var/cache replica replicate
zfs inherit com.github.tesujimath.zbackup:replica tank/var/cache
zfs inherit com.github.tesujimath.zbackup:replicate tank/var/cache

But of course that has no effect, because the properties were inherited to begin with.

Let's try setting to a bogus value of 'none':

# zbackup --set tank/var/cache replica=none replicate=none
tank daily-snapshots=6 weekly-snapshots=5 replica=backuphost:extbackup replicate=daily
tank/var/cache replica=none replicate=none

# zbackup -v daily
tank replica=backuphost:extbackup local
tank replicate=daily local
tank daily-snapshots=6 local
tank/var/cache replica=none local
tank/var/cache replicate=none local
========== zsnap -k 6 -p auto-daily- -v tank ==========
.....

No luck, tank/var/cache is still backed up. This is because zbackup's backup_or_reap_snapshots function only considers the 'replica' property of the root ('tank').

I imagine 'fixing' this would be a major endeavour, as it would require making 'zreplicate' aware of ZFS properties (to notice that 'replica' differs halfway down the tree). If so, perhaps this should just be considered a 'known limitation' of an otherwise nicely done tool.

NameError: name 'file' is not defined

file() is not supported in Python 3. In zfstools/connection.py, line 103:

p = SpecialPopen(cmd, stdin=file(os.devnull), stdout=subprocess.PIPE, bufsize=bufsize)

Fix:

p = SpecialPopen(cmd, stdin=open(os.devnull), stdout=subprocess.PIPE, bufsize=bufsize)
Thank you.

zreplicate does not accurately detect when the transfer is supposed to be local

If the hostname of the sending/receiving side evaluates to the IP address of the machine where the zreplicate transfer is executing, then SSH should be elided altogether for that leg of the transfer.

The idea being that 127.0.0.x, or any hostname pointing to an IP address that $HOSTNAME also points to, should simply be interpreted as a local transfer.
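A minimal sketch of such a check (a hypothetical helper, not current zreplicate code), using only the standard library; it only covers single-address resolution, not every interface alias:

```python
import socket

def is_local(hostname):
    """Return True if hostname resolves to loopback or to this host's address."""
    try:
        addr = socket.gethostbyname(hostname)
    except socket.error:
        return False
    if addr.startswith("127."):
        # Any 127.0.0.x address is loopback, hence local.
        return True
    try:
        # Same address as $HOSTNAME resolves to? Treat as local too.
        return addr == socket.gethostbyname(socket.gethostname())
    except socket.error:
        return False
```

zreplicate could then skip spawning ssh for that leg whenever is_local() returns True.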

zreplicate hangs sometimes

The send/receive pipeline needs to handle SIGPIPE the way the subprocess documentation recommends when replacing a shell pipeline. The shell pipeline

output=`dmesg | grep hda`

becomes

from subprocess import Popen, PIPE

p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close()  # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]

AttributeError: 'Snapshot' object has no attribute 'get_creation'

Hi,

I'm trying to back up my ZFS to an external disk. Following the sequence in the README:

zfs create tank/foo
zfs create zextbackup/foo
zreplicate tank/foo zextbackup/foo
zsnap tank/foo
zreplicate tank/foo zextbackup/foo

The last step blows up with:

jturner-desktop ~ # zreplicate tank/foo zextbackup/foo
Traceback (most recent call last):
  File "/usr/local/bin/zreplicate", line 5, in <module>
    pkg_resources.run_script('zfs-tools==0.4.3', 'zreplicate')
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 528, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1394, in run_script
    execfile(script_filename, namespace, namespace)
  File "/usr/local/lib/python2.7/dist-packages/zfs_tools-0.4.3-py2.7.egg/EGG-INFO/scripts/zreplicate", line 114, in <module>
    operation_schedule = recursive_replicate(source_dataset,destination_dataset)
  File "/usr/local/lib/python2.7/dist-packages/zfs_tools-0.4.3-py2.7.egg/zfstools/sync.py", line 19, in recursive_replicate
    all_snapshots = [ y[1] for y in sorted([ (x.get_creation(), x.name) for x in all_snapshots ]) ]
AttributeError: 'Snapshot' object has no attribute 'get_creation'

It seems that the get_creation() method doesn't exist. This is the case in the 0.4.3 release and git. Am I doing anything wrong?

Code Question

In models.py::extract_properties(), you have the line assert len(items) == len(properties), (properties, items).

Q) What does the , (properties, items) part do?

Speedup snapshot list

Can you make the _get_poolset functions list a specific dataset instead of listing all snapshots, please?

[zfs-tools]# time zfs list -Hpr -o name,creation -t all | wc -l

19574
real	2m17.044s
user	0m1.931s
sys	0m12.320s

instead of

[zfs-tools]# time zfs list -Hpr -o name,creation -t all pool/path | wc -l
126

real	0m0.354s
user	0m0.019s
sys	0m0.135s

Thank you for your job and time.

deb package creation failure

command: dpkg-buildpackage -uc -us
[...]
Creating tar archive
removing 'build/bdist.linux-x86_64/dumb' (and everything under it)
tar -C zxvmf dist/zfs-tools-.linux-.tar.gz
tar: You must specify one of the `-Acdtrux' or `--test-label' options
Try `tar --help' or `tar --usage' for more information.
make[1]: *** [install] Error 2
make[1]: Leaving directory `/data/src/rudd-o-zfs-backup/zfs-tools'
dh_auto_build: make -j1 returned exit code 2
make: *** [binary] Error 2
dpkg-buildpackage: error: debian/rules binary gave error exit status 2

First, as you can see, DESTDIR isn't specified, so there is no argument for -C.

Then, if it's specified in the Makefile:
[...]
Creating tar archive
removing 'build/bdist.linux-x86_64/dumb' (and everything under it)
tar -C /opt zxvmf dist/zfs-tools-.linux-.tar.gz
tar: You must specify one of the `-Acdtrux' or `--test-label' options
Try `tar --help' or `tar --usage' for more information.
make[1]: *** [install] Error 2
make[1]: Leaving directory `/data/src/rudd-o-zfs-backup/zfs-tools'
dh_auto_build: make -j1 returned exit code 2
make: *** [binary] Error 2
dpkg-buildpackage: error: debian/rules binary gave error exit status 2

Using a dash (-zxvmf) fixes the issue.

So, trying again:
[...]
dh_usrlocal
dh_usrlocal: debian/zfs-tools/usr/local/bin/zreplicate is not a directory
dh_usrlocal: debian/zfs-tools/usr/local/bin/zfs-shell is not a directory
dh_usrlocal: debian/zfs-tools/usr/local/bin/zsnap is not a directory
"rmdir debian/zfs-tools/usr/local/bin"
rmdir: failed to remove ‘debian/zfs-tools/usr/local/bin’: Directory not empty
dh_usrlocal: rmdir debian/zfs-tools/usr/local/bin returned exit code 1
make: *** [binary] Error 1

I can't figure out why it ends up under /usr/local; unfortunately, I'm not familiar with packaging.

Distributor ID: Ubuntu
Description: Ubuntu 13.10
Release: 13.10
Codename: saucy
