tjjh89017 / ezio

BT-based Disk Deployment tool

License: GNU General Public License v2.0

C++ 60.35% Python 35.88% CMake 2.83% Shell 0.13% Dockerfile 0.81%
pxe torrent deployment disk clone

ezio's Introduction

EZIO Developer and User Guide

Introduction

EZIO is a tool for rapid cloning and deployment of server disk images within a local area network. It uses the BitTorrent protocol to speed up data distribution. It also uses partclone to dump only the used filesystem blocks, and the EZIO receiver writes received blocks directly to the raw disk, which greatly improves performance.

Motivation

EZIO is inspired by Clonezilla and BTsync (Resilio) and their approaches to data transfer and massive deployment. The problem with Clonezilla is that its multicast mode is too slow in practice. All clients must register with the Clonezilla server before deployment starts, which costs too much time. In addition, whenever a client fails to receive data, or receives incorrect data, and needs a re-transfer, the server has to do a lot of extra work. Most importantly, in most cases the clients that cannot receive data correctly are themselves broken, so the server re-transfers the data again and again until it reaches its re-transfer limit and quits. To address these issues, EZIO changes the transfer mechanism: it implements the transfer function on top of BitTorrent, which greatly improves deployment speed.

Feature

  • Faster than Clonezilla, because data transfer is implemented on top of the BitTorrent protocol. Clonezilla uses multicast for transfer, which in practice is extremely slow due to the limitations of multicast and the status of the clients. As an example of the multicast limitation, too much time is spent waiting for all the clients to register with the server. As an example of client status, when a few computers do not have enough disk storage or are broken, they fail to get data from the server and need re-transfers, which costs a lot of time.

  • Many file systems are supported: (1) ext2, ext3, ext4, reiserfs, reiser4, xfs, jfs, btrfs, f2fs and nilfs2 of GNU/Linux (2) FAT12, FAT16, FAT32, NTFS of MS Windows (3) HFS+ of Mac OS (4) UFS of FreeBSD, NetBSD, and OpenBSD (5) minix of Minix (6) VMFS3 and VMFS5 of VMWare ESX. Therefore you can clone GNU/Linux, MS Windows, Intel-based Mac OS, FreeBSD, NetBSD, OpenBSD, Minix, VMWare ESX and Chrome OS/Chromium OS, whether the OS is 32-bit (x86) or 64-bit (x86-64). For these file systems, only the used blocks in a partition are saved and restored. For unsupported file systems, EZIO performs a sector-to-sector copy with dd.

  • Unlike BTsync's file-level transfer, EZIO uses block-level transfer. With file-level transfer, whenever a client gets wrong data, re-transferring the whole file takes a lot of time. With block-level transfer, only the specific piece of data needs to be re-transferred.

  • Saves the data on the hard disk by using partclone.
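
The block-level idea in the list above can be sketched as follows. This is a minimal illustration, not EZIO's actual code: the helper name `find_bad_pieces` and the tiny 4-byte piece length are assumptions chosen for the example. It compares per-piece SHA-1 digests and reports only the pieces that would need to be fetched again.

```python
import hashlib


def find_bad_pieces(data: bytes, expected: list[bytes], piece_length: int) -> list[int]:
    """Return indices of pieces whose SHA-1 digest differs from the expected one."""
    bad = []
    for index, digest in enumerate(expected):
        piece = data[index * piece_length:(index + 1) * piece_length]
        if hashlib.sha1(piece).digest() != digest:
            bad.append(index)
    return bad


# Hypothetical example: a 12-byte "disk" split into three 4-byte pieces.
PIECE_LENGTH = 4
original = b"AAAABBBBCCCC"
expected_digests = [
    hashlib.sha1(original[i:i + PIECE_LENGTH]).digest()
    for i in range(0, len(original), PIECE_LENGTH)
]

received = b"AAAABxBBCCCC"  # one corrupted byte in the second piece
print(find_bad_pieces(received, expected_digests, PIECE_LENGTH))
```

Only the index of the damaged piece is reported, so only that piece has to travel over the network again.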

Installation

Minimum System Requirements

  • 64-bit
  • 1 GB RAM

Dependencies

  • Debian 11 or above
  • libtorrent-rasterbar>=2.0.8
  • libboost>=1.74
  • cmake>=3.16
  • spdlog
  • gRPC
sudo apt install build-essential cmake libboost-all-dev libtorrent-rasterbar-dev libgrpc-dev libgrpc++-dev libprotobuf-dev protobuf-compiler-grpc libspdlog-dev

Build and Install

mkdir build
cd build
cmake ../
make
sudo make install

We also provide a Dockerfile for ease of installation and CI testing. To build the image, run:

docker build . -t ezio-latest-img

Usage

Partclone

Partclone provides utilities to save and restore used filesystem blocks (and skips the unused blocks) from/to a partition.

The newest partclone can dump your disk to an EZIO image and generate torrent.info at the same time.

sudo partclone.extfs -c -T -s /dev/sda1 -O target/ --buffer_size 16777216

Or, if you want to generate the torrent but do not want the BT image:

sudo partclone.extfs -c -t -s /dev/sda1 -O target/ --buffer_size 16777216

When the dump finishes, the output files appear in the target directory. Then use utils/partclone_create_torrent.py to generate the torrent for deployment.

utils/partclone_create_torrent.py -c CloneZilla -p sda1 -i <some_path>/torrent.info -o sda1.torrent -t 'http://<some tracker>:6969/announce'

EZIO

Once you have sda1.torrent, you can deploy or clone your disk over the network.

Help

Allowed Options:
  -h [ --help ]          some help
  -F [ --file ]          read data from file rather than raw disk
  --listen arg           gRPC service listen address and port, default is 

Seeding

  • Seeding from BT image
./ezio -F
./utils/create_proto_py.sh
./utils/add_torrent_seed.py sda1.torrent /some/path/to/sda1
  • Seeding from Disk
./ezio
./utils/create_proto_py.sh
./utils/add_torrent_seed.py sda1.torrent /dev/sda1

Downloading

  • Downloading to Disk
./ezio
./utils/create_proto_py.sh
./utils/add_torrent.py sda1.torrent /dev/sda1
  • Proxy or save the image
./ezio -F
./utils/create_proto_py.sh
./utils/add_torrent.py sda1.torrent /some/path/to/save/sda1

Proxy

If you want to deploy over the Internet or across a bottleneck, you can proxy the torrent with regular BT software such as qBittorrent. Do not let internal peers connect to the outside directly.

Easy Usage to Deploy Disk or OS via EZIO

Use CloneZilla Live (version>=testing-2.6.0-31). CloneZilla includes EZIO in its Lite Server Mode. This is the easiest way to deploy your disk or OS via BT.

Design

raw_storage.cpp implements a libtorrent custom storage that allows the receiver to write received blocks directly to the raw disk.

We store the "offset" in hex as the file path in the torrent, and the "length" as the file's length attribute, so BT knows where each block is and can use the offset to seek on the disk.

{
    'announce': 'http://tracker.site1.com/announce',
    'info':
    {
        'name': 'root',
        'piece length': 262144,
        'files':
        [
            {'path': ['0000000000000000'], 'length': 4096}, // store offset and length of blocks
            {'path': ['0000000000020000'], 'length': 8192},
            ...
        ],
        'pieces': 'some piece hash here'
    }
}
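
To illustrate how such a layout can be consumed, here is a minimal sketch (not the code in raw_storage.cpp). It assumes each `path` entry is a hexadecimal byte offset, as described above; the helper names `parse_extents` and `write_block` are hypothetical, and an in-memory buffer stands in for the raw disk.

```python
import io


def parse_extents(files):
    """Turn torrent 'files' entries into (offset, length) pairs.

    Assumes the single path component is the block offset in hex.
    """
    return [(int(entry["path"][0], 16), entry["length"]) for entry in files]


def write_block(disk, offset, data):
    """Seek to the block's offset on the target and write the data there."""
    disk.seek(offset)
    disk.write(data)


files = [
    {"path": ["0000000000000000"], "length": 4096},
    {"path": ["0000000000020000"], "length": 8192},
]
extents = parse_extents(files)
print(extents)  # [(0, 4096), (131072, 8192)]

# An in-memory buffer stands in for the raw disk in this sketch.
disk = io.BytesIO()
for offset, length in extents:
    write_block(disk, offset, b"\xff" * length)
```

The gap between the two blocks is never written, which mirrors the idea of transferring only used blocks.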

Benchmark

Comparison of CloneZilla Multicast Mode with EZIO Mode.

Experimental environment

  • Network: Cisco 3560G
  • Server: Dell T1700 with Intel Xeon E3-1226, 16 GB RAM, 1 TB hard disk
  • PC Client: 32 clients, same hardware as the server
  • Image: Ubuntu Linux with 50 GB of data on disk. The multicast image is compressed with pzstd. The BT image is a raw file.

Result

Time in seconds

| Number of clients | Time (Unicast) | Time (EZIO) | Time (Multicast) | Ratio (BT/Multicast) |
|---|---|---|---|---|
| 1 | 474 | 675 | 390 | 1.731 |
| 2 | 948 | 1273 | 474 | 2.686 |
| 4 | 1896 | 1331 | 638 | 2.086 |
| 8 | 3792 | 1412 | 980 | 1.441 |
| 16 | 7584 | 1005 | 1454 | 0.691 |
| 24 | 11376 | 1048 | 1992 | 0.526 |
| 32 | 15168 | 1143 | 2203 | 0.519 |
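
The ratio column can be reproduced from the timings. The short script below is only a check of the arithmetic, using the numbers copied from the table; it also shows that every value in the unicast column equals 474 seconds times the number of clients.

```python
# (clients, unicast, ezio, multicast), times in seconds, copied from the table.
results = [
    (1, 474, 675, 390),
    (2, 948, 1273, 474),
    (4, 1896, 1331, 638),
    (8, 3792, 1412, 980),
    (16, 7584, 1005, 1454),
    (24, 11376, 1048, 1992),
    (32, 15168, 1143, 2203),
]

for clients, unicast, ezio, multicast in results:
    ratio = ezio / multicast  # below 1.0 means EZIO finished faster than multicast
    linear = unicast == 474 * clients
    print(f"{clients:>2} clients: EZIO/Multicast = {ratio:.3f}, unicast is 474 * clients: {linear}")
```

The ratio drops below 1.0 from 16 clients upward, which is where EZIO overtakes multicast in this benchmark.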

Open Access Journal

More details about EZIO design and benchmark are in A Novel Massive Deployment Solution Based on the Peer-to-Peer Protocol.

Limitation

  • Making a torrent takes a lot of time because a SHA-1 hash must be computed for every single piece of data.
  • EZIO is slow when the number of clients is small; in the benchmark above it is slower than multicast for 8 clients or fewer.
  • Due to a partclone limitation, for unsupported filesystems EZIO performs a sector-to-sector copy with dd.
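
To give a sense of the hashing cost mentioned in the first item, the sketch below builds a `pieces` field the way BitTorrent v1 metadata does: one 20-byte SHA-1 digest per piece, concatenated. It is an illustration rather than the code in partclone or utils/; the function name `hash_pieces` is hypothetical, and the piece length matches the 262144 used in the Design example.

```python
import hashlib
import io


def hash_pieces(stream, piece_length=262144):
    """Read the stream piece by piece and concatenate the SHA-1 digests."""
    digests = []
    while True:
        piece = stream.read(piece_length)
        if not piece:
            break
        digests.append(hashlib.sha1(piece).digest())
    return b"".join(digests)


# 1 MiB of zero bytes stands in for image data: 4 pieces of 256 KiB each.
pieces = hash_pieces(io.BytesIO(bytes(1024 * 1024)))
print(len(pieces) // 20)  # number of pieces hashed
```

Every byte of the image has to be read and hashed once, so the time to create a torrent grows with the amount of data dumped.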

Future

Contribute

Support

If you are having issues, please let us know. EZIO main developer email is located at: [email protected]

Special Thanks

  • National Center for High-performance Computing, NCHC, Taiwan
    • Provided many devices for stability testing, as well as knowledge support.

License

The project is licensed under the GNU General Public License v2.0.

ezio's People

Contributors

chameleon10712, chengchingwen, ching-kuo, cuda-chen, dependabot[bot], leepupu, mangokingtw, stevenshiau, tjjh89017, yanglin5689446

ezio's Issues

Test cleaner

Clean the disk of torrent info before re-testing.

Remove all finish condition from server side

Remove all finish conditions from the server side.
Migrate all stop-server conditions to the client side (or gRPC client side).
Make ezio a generic daemon.
Let ezio be GPL and let the gRPC client be any license.

How does it work ?

1° Does ezio simply create, on a source host, a magnet that contains /dev/sda (the complete disk) or /dev/sda1, and seed it?

2° Then, does a destination host boot from the network (PXE) and download the magnet directly to the corresponding /dev/sdxY?

3° How does ezio send only the used blocks?

4° Does it work for a Windows partition (which is fragmented)?

5° How do you handle the partition table? (multiboot issue)

RSS Support

RSS support for pipelining.
partclone also needs to be modified.

Refactor for libtorrent 2.0

Development environment: Debian sid

Must

  • singleton for easy grpc
  • log for debug (need torrent status) SPDLOG sudo apt install libspdlog-dev
  • re-write disk_interface (only disk_interface could create disk_buffer_holder and pass it to job)
    • re-write mmap_storage (4th)
    • write buffer_pool (3rd)
      • read pool
      • write pool
    • threading pool (std::thread, not pthread) (1st)
    • job pool (2nd)
    • update README.md (5th)

Optional

  • dial out mode for controller
    • controller could get seq to setup 1 to 1 peer

Simple UI

  • Need Simple UI
  • Need Simple CLI display only

Resume Recovery

TODO
New Feature: Resume Recovery
Approach: set flag with resume status, and check the partition hash for torrent

Tuning Cache

settings_pack::volatile_read_cache might help

Torrent is larger than 4MB

Torrent is larger than 4MB

Running: ezio_add_torrent_seed.py /home/partimag/btzone/2023-06-12-12-img-CAD/sda3.torrent /home/partimag/btzone/2023-06-12-12-img-CAD
Traceback (most recent call last):
  File "/usr/sbin/ezio_add_torrent_seed.py", line 32, in <module>
    stub.AddTorrent(request)
  File "/usr/lib/python3/dist-packages/grpc/_channel.py", line 946, in _call_
    return _end_unary_response_blocking(state, call, False, None)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.RESOURCE_EXHAUSTED
        details = "Received message larger than max (9777509 vs. 4194304)"
        debug_error_string = "UNKNOWN:Error received from peer ipv4:127.0.0.1:50051 {grpc_message:"Received message larger than max (9777509 vs. 4194304)", grpc_status:8, created_time:"2023-07-05T03:46:39.257029777+00:00"}"

Running: ezio_add_torrent_seed.py /home/partimag/btzone/2023-06-12-12-img-CAD/sda4.torrent /home/partimag/btzone/2023-06-12-12-img-CAD
[2023-07-05 03:46:39.366] [info] [service.cpp:81] AddTorrent

https://sourceforge.net/p/clonezilla/discussion/Clonezilla_live/thread/7ed2c3c74c/?limit=25&fbclid=IwAR1xcHmvyX6lqlhm1kCoQfJWk65LE9k_k1eSsBbbHlq3w7Sr93YSqPAc3d4#1a88/164a
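
The numbers in the error message can be checked against gRPC's default 4 MiB receive limit (4194304 bytes). The sketch below is a hypothetical pre-flight check, not part of ezio's utilities: `fits_default_grpc_limit` is an assumed helper name, and it ignores the small overhead of the other fields in the request message.

```python
# 4194304 bytes, the maximum shown in the log above.
DEFAULT_GRPC_MAX_RECEIVE = 4 * 1024 * 1024


def fits_default_grpc_limit(message_size: int) -> bool:
    """Return True if a message of this size is within the 4 MiB limit."""
    return message_size <= DEFAULT_GRPC_MAX_RECEIVE


# 9777509 is the message size reported in the traceback.
print(fits_default_grpc_limit(9777509))
```

Because the log says the error was received from the peer, the limit appears to be enforced on the receiving side (here the ezio daemon), so that is presumably where it would need to be raised.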

Compilation fail on including gRPC header and generated_message_table_driven.h

When I tried to compile ezio I got the following messages:

$ make -j3
[ 11%] Building CXX object CMakeFiles/ezio.dir/daemon.cpp.o
[ 33%] Building CXX object CMakeFiles/ezio.dir/buffer_pool.cpp.o
[ 33%] Building CXX object CMakeFiles/ezio.dir/main.cpp.o
[ 44%] Building CXX object CMakeFiles/ezio.dir/ezio.pb.cc.o
In file included from /home/jio/cpp_code/ezio/build/ezio.pb.cc:4:
/home/jio/cpp_code/ezio/build/ezio.pb.h:17:2: error: #error This file was generated by an older version of protoc which is
   17 | #error This file was generated by an older version of protoc which is
      |  ^~~~~
/home/jio/cpp_code/ezio/build/ezio.pb.h:18:2: error: #error incompatible with your Protocol Buffer headers. Please
   18 | #error incompatible with your Protocol Buffer headers. Please
      |  ^~~~~
/home/jio/cpp_code/ezio/build/ezio.pb.h:19:2: error: #error regenerate this file with a newer version of protoc.
   19 | #error regenerate this file with a newer version of protoc.
      |  ^~~~~
In file included from /home/jio/cpp_code/ezio/build/ezio.pb.cc:4:
/home/jio/cpp_code/ezio/build/ezio.pb.h:27:10: fatal error: google/protobuf/generated_message_table_driven.h: No such file or directory
   27 | #include <google/protobuf/generated_message_table_driven.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [CMakeFiles/ezio.dir/build.make:118: CMakeFiles/ezio.dir/ezio.pb.cc.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from /home/jio/cpp_code/ezio/daemon.hpp:5,
                 from /home/jio/cpp_code/ezio/daemon.cpp:3:
/home/jio/cpp_code/ezio/service.hpp:8:10: fatal error: grpc++/grpc++.h: No such file or directory
    8 | #include <grpc++/grpc++.h>
      |          ^~~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [CMakeFiles/ezio.dir/build.make:92: CMakeFiles/ezio.dir/daemon.cpp.o] Error 1
In file included from /home/jio/cpp_code/ezio/daemon.hpp:5,
                 from /home/jio/cpp_code/ezio/main.cpp:10:
/home/jio/cpp_code/ezio/service.hpp:8:10: fatal error: grpc++/grpc++.h: No such file or directory
    8 | #include <grpc++/grpc++.h>
      |          ^~~~~~~~~~~~~~~~~
compilation terminated.

FYI, here're the commands I use for compilation:

$ cmake -DCMAKE_INSTALL_PREFIX=$MY_INSTALL_DIR ..
$ make -j3

I also provide the version and installation method of dependencies:

  • Ubuntu 20.04
  • libtorrent 2.0.6
    • Compiled from source using CMake.
  • libboost 1.71.0
  • CMake 3.16.3
  • gRPC 1.45.0
    • Compiled from source using CMake with tutorial procedures.
    • Set custom installation directory ($MY_INSTALL_DIR) as ~/.local/.
    • The gRPC itself contains Protobuf compiler version 3.19.4
  • ProtoBuf 3.20.1
    • Compiled from source using CMake

some help

I installed python bindings of libtorrent 1.1.3 with

python setup.py build
python setup.py install

and then

root@g25:~/bt# partclone/src/partclone.extfs -c -T -s /dev/sda2 -o eziopackage/test | ezio/utils/partclone_create_torrent.py
Partclone v0.3.5g http://partclone.org
Starting to clone device (/dev/sda2) to image (eziopackage/test)
Partclone fail, please check /var/log/partclone.log !
Traceback (most recent call last):
  File "ezio/utils/partclone_create_torrent.py", line 3, in <module>
    import libtorrent as lt
ImportError: /usr/local/lib/python2.7/dist-packages/libtorrent.so: undefined symbol: _ZTIN5boost6python15instance_holderE
root@g25:~/bt# python -c "import libtorrent; print libtorrent.version"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: /usr/local/lib/python2.7/dist-packages/libtorrent.so: undefined symbol: _ZTIN5boost6python15instance_holderE

It seems to be a known bug. How did you get past it?
