
sohonetlabs / scp-chunk


License: MIT License

Python 100.00%
transfer scp-chunk python scp chunk transfer-files latency-links transferring-files rsync

scp-chunk's Introduction

scp-chunk

Why?

scp-chunk is for transferring files over high-latency links. Depending on the TCP/IP stack and the installed version of ssh, latency can limit the speed that a single connection will achieve. To work around this, scp-chunk transfers multiple chunks at the same time.
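As a rough illustration of why latency caps a single connection: a sender can have at most one window of data in flight per round trip, so throughput is bounded by window size divided by round-trip time. The sketch below uses assumed numbers (a 150 ms round trip and a 64 KiB effective window); real window sizes depend on the TCP stack and ssh version, so treat the figures as hypothetical.

```python
def max_throughput_bytes_per_s(window_bytes, rtt_seconds):
    """Upper bound for one connection: one full window per round trip."""
    return window_bytes / float(rtt_seconds)


RTT = 0.150          # seconds; assumed round-trip time of the link
WINDOW = 64 * 1024   # bytes; assumed effective window, real values vary

single = max_throughput_bytes_per_s(WINDOW, RTT)
for connections in (1, 3, 10):
    # Independent connections each get their own window, so the bound scales.
    print("%2d connection(s): %.2f MB/s" % (connections, connections * single / 1e6))
```

Under these assumptions one connection tops out well below what the link itself could carry, which is the gap that parallel chunk transfers are meant to fill.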

It runs on the system Python without having to install any other Python packages: just put the script on the machine and go.

Can use rsync instead of scp.

How it works

scp-chunk splits a large file into chunks and transfers them over multiple scp connections. It then joins the chunks back together at the remote end, checks the checksum, and cleans up all the chunks at both the local and remote ends. At peak, it uses disk space equal to twice the size of the file being transferred at each end.
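The split, join, and checksum steps can be sketched locally in a few lines of standard-library Python. This is a minimal illustration, not the script's actual code: the function names are hypothetical, and the `<file>.00000` naming simply mirrors the chunk names shown in the example output.

```python
import hashlib
import os
import shutil
import tempfile


def md5_of(path, block=1024 * 1024):
    """Return the hex MD5 of a file, reading it in blocks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for data in iter(lambda: fh.read(block), b""):
            digest.update(data)
    return digest.hexdigest()


def split_file(path, chunk_size, block=1024 * 1024):
    """Write path.00000, path.00001, ... and return the chunk paths in order."""
    chunks = []
    with open(path, "rb") as src:
        index = 0
        while True:
            remaining = chunk_size
            data = src.read(min(block, remaining))
            if not data:
                break  # end of file, no empty trailing chunk
            chunk_path = "%s.%05d" % (path, index)
            with open(chunk_path, "wb") as out:
                while data:
                    out.write(data)
                    remaining -= len(data)
                    if remaining == 0:
                        break
                    data = src.read(min(block, remaining))
            chunks.append(chunk_path)
            index += 1
    return chunks


def join_chunks(chunks, dest):
    """Concatenate the chunks, in order, into dest."""
    with open(dest, "wb") as out:
        for chunk_path in chunks:
            with open(chunk_path, "rb") as fh:
                shutil.copyfileobj(fh, out)


# Small demonstration in a temporary directory.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "sample.bin")
with open(source, "wb") as fh:
    fh.write(b"0123456789")
parts = split_file(source, chunk_size=4, block=3)
rebuilt = os.path.join(workdir, "rebuilt.bin")
join_chunks(parts, rebuilt)
print(len(parts), md5_of(source) == md5_of(rebuilt))
shutil.rmtree(workdir)
```

The original file and its chunks exist side by side until cleanup, which is where the "twice the disk space" figure comes from.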

Requirements

Uses rsync or scp to transfer the files to the remote system in parallel, and expects the user to be pre-keyed to the remote systems (key-based SSH login already set up). See the article here on how to set this up.
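One way to run the transfers in parallel is a small pool of worker threads pulling chunks from a queue and re-queueing failures. The sketch below is an assumption about the general shape, not the script's implementation: `transfer_all` and `make_scp_transfer` are hypothetical names, and the default of zero retries is a guess. The transfer callable is injected so the pool logic can be exercised without a remote host.

```python
import subprocess
import threading

try:
    import queue  # Python 3
except ImportError:
    import Queue as queue  # Python 2


def transfer_all(chunks, transfer, threads=3, retries=0):
    """Call transfer(chunk) for every chunk from a pool of worker threads.

    transfer must return True on success. A failed chunk is re-queued up to
    `retries` times. Returns the chunks that never succeeded.
    """
    work = queue.Queue()
    for chunk in chunks:
        work.put((chunk, 0))
    failed = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                chunk, attempts = work.get_nowait()
            except queue.Empty:
                return
            if transfer(chunk):
                continue
            if attempts < retries:
                work.put((chunk, attempts + 1))  # this worker will pick it up again
            else:
                with lock:
                    failed.append(chunk)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return failed


def make_scp_transfer(srv, dst):
    """Build a transfer callable that shells out to scp (needs key-based login)."""
    def transfer(chunk):
        return subprocess.call(["scp", chunk, "%s:%s" % (srv, dst)]) == 0
    return transfer


# Demonstration with a stand-in transfer that always succeeds.
print(transfer_all(["chunk.00000", "chunk.00001"], lambda chunk: True))
```

Because each scp call authenticates separately, a password prompt would block every worker, which is why key-based login is expected.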

Goal

Use the system python, without having to install any other python packages, just using the programs listed below.

The remote shell is expected to provide access to the following commands:

remote system

  • openssl, to calculate the checksum: openssl md5 <filename>
  • cat, to reassemble chunks: cat <filename> >> <filename>
  • rm, to remove chunks: rm <filename>
  • rsync, to transfer chunks when --use_rsync is given

local system

  • scp, to copy files to the remote system
  • rsync, to transfer chunks when --use_rsync is given
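To make the remote side concrete, the sketch below builds the ssh invocations for the three remote commands listed above, in the order described under "How it works" (reassemble, checksum, clean up). The function name, host, and paths are hypothetical, and the script itself may issue these commands differently.

```python
try:
    from shlex import quote  # Python 3
except ImportError:
    from pipes import quote  # Python 2


def remote_steps(srv, remote_file, remote_chunks):
    """Return argument lists for ssh: append chunks, checksum, remove chunks."""
    target = quote(remote_file)
    steps = []
    for chunk in remote_chunks:
        # cat <chunk> >> <file> appends each chunk in order
        steps.append(["ssh", srv, "cat %s >> %s" % (quote(chunk), target)])
    # openssl md5 <file> prints the checksum of the reassembled file
    steps.append(["ssh", srv, "openssl md5 %s" % target])
    for chunk in remote_chunks:
        steps.append(["ssh", srv, "rm %s" % quote(chunk)])
    return steps


# Hypothetical host and paths, printed rather than executed.
for step in remote_steps("user@host.example", "/Store/big.mov",
                         ["/Store/big.mov.00000", "/Store/big.mov.00001"]):
    print(" ".join(step))
```

Each list can be passed to `subprocess.call` as-is; quoting the paths keeps file names with spaces from being split by the remote shell.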

Usage

usage: scp-chunk.py [-h] [-c CYPHER] [-s SIZE] [-r RETRIES] [-t THREADS] [--use_rsync]
                    src srv dst

Chunk a file and then kick off multiple SCP threads.Speeds up transfers over high latency links

positional arguments:
  src                   source file
  srv                   remote server and user if required e.g [email protected]
  dst                   directory (if remote home dir then specify . )

optional arguments:
  -h, --help            show this help message and exit
  -c CYPHER, --cypher CYPHER
                        cypher to use, from transfer see: ssh
  -s SIZE, --size SIZE  size of chunks to transfer.
  -r RETRIES, --retries RETRIES
                        number of times to retry transfer.
  -t THREADS, --threads THREADS
                        number of threads (default 3)
  --use_rsync           Use rsync instead of scp, scp is being deprecated
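For readers who want to see how such a command line could be declared, here is a sketch using argparse that mirrors the options in the help text above. Only the thread default of 3 comes from the help text; the size parser (binary multiples with K/M/G/T suffixes, as in `--size 1G`), the retry default, and the function names are assumptions made for illustration.

```python
import argparse

# Assumed: suffixes are binary multiples. The real script may differ.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(text):
    """Turn '1G', '512M' or a plain byte count into an integer number of bytes."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1]])
    return int(text)


def build_parser():
    parser = argparse.ArgumentParser(prog="scp-chunk.py")
    parser.add_argument("src", help="source file")
    parser.add_argument("srv", help="remote server and user if required")
    parser.add_argument("dst", help="directory (if remote home dir then specify . )")
    parser.add_argument("-c", "--cypher", help="cypher to use")
    parser.add_argument("-s", "--size", type=parse_size,
                        help="size of chunks to transfer")
    parser.add_argument("-r", "--retries", type=int, default=0,  # default assumed
                        help="number of times to retry transfer")
    parser.add_argument("-t", "--threads", type=int, default=3,
                        help="number of threads (default 3)")
    parser.add_argument("--use_rsync", action="store_true",
                        help="use rsync instead of scp")
    return parser


# Hypothetical invocation, parsed but not executed.
args = build_parser().parse_args(
    ["/Stuff/big.mov", "user@host.example", "/Store/", "--threads", "10", "--size", "1G"])
print(args.size, args.threads, args.use_rsync)
```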

Example output

python scp-chunk.py  2GB.mov [email protected] . --threads 10

spliting file
uploading MD5 (d8ce4123aaacaec671a854f6ec74d8c0) checksum to remote site
starting transfers
Starting chunk: chunk_.00000 1:5 remaining 4 retries 0
Starting chunk: chunk_.00001 2:5 remaining 3 retries 0
Starting chunk: chunk_.00002 3:5 remaining 2 retries 0
Starting chunk: chunk_.00003 4:5 remaining 1 retries 0
Starting chunk: chunk_.00004 5:5 remaining 0 retries 0
Finished chunk: chunk_.00004 5:5 remaining 0
Finished chunk: chunk_.00002 3:5 remaining 0
Finished chunk: chunk_.00001 2:5 remaining 0
Finished chunk: chunk_.00000 1:5 remaining 0
Finished chunk: chunk_.00003 4:5 remaining 0
re-assembling file at remote end
processing chunk_.00004 -
re-assembled
checking remote file checksum
PASSED checksums match
cleaning up
removing file chunks
removing file chunk chunk_.00004 \
transfer complete

PING transfer.example.com (xxx.xxx.xxx.xxx): 56 data bytes
64 bytes from xxx.xxx.xxx.xxx: icmp_seq=0 ttl=58 time=151.308 ms
64 bytes from xxx.xxx.xxx.xxx: icmp_seq=1 ttl=58 time=151.264 ms
64 bytes from xxx.xxx.xxx.xxx: icmp_seq=2 ttl=58 time=151.449 ms
64 bytes from xxx.xxx.xxx.xxx: icmp_seq=3 ttl=58 time=150.927 ms


python scp-chunk.py /Stuff/23GBlargefile.mov  [email protected] /Store/ben_test/ --threads 10 --size 1G
spliting file
uploading MD5 (5e631de28dd45d1b05952c885a882be1) checksum to remote site
copying /Stuff/23GBlargefile.mov to /Store/ben_test/23GBlargefile.mov.md5
starting transfers
Starting chunk: /Stuff/23GBlargefile.mov.00000 1:29 remaining 28 retries 0
Starting chunk: /Stuff/23GBlargefile.mov.00001 2:29 remaining 27 retries 0
Starting chunk: /Stuff/23GBlargefile.mov.00002 3:29 remaining 26 retries 0
Starting chunk: /Stuff/23GBlargefile.mov.00003 4:29 remaining 25 retries 0
Starting chunk: /Stuff/23GBlargefile.mov.00004 5:29 remaining 24 retries 0
Starting chunk: /Stuff/23GBlargefile.mov.00005 6:29 remaining 23 retries 0
Starting chunk: /Stuff/23GBlargefile.mov.00006 7:29 remaining 22 retries 0
Starting chunk: /Stuff/23GBlargefile.mov.00007 8:29 remaining 21 retries 0
Starting chunk: /Stuff/23GBlargefile.mov.00008 9:29 remaining 20 retries 0
Starting chunk: /Stuff/23GBlargefile.mov.00009 10:29 remaining 19 retries 0
Finished chunk: /Stuff/23GBlargefile.mov.00008 9:29 remaining 19
Starting chunk: /Stuff/23GBlargefile.mov.00010 11:29 remaining 18 retries 0
<SNIP>
Finished chunk: /Stuff/23GBlargefile.mov.00019 20:29 remaining 2
Starting chunk: /Stuff/23GBlargefile.mov.00027 28:29 remaining 1 retries 0
Finished chunk: /Stuff/23GBlargefile.mov.00017 18:29 remaining 1
Starting chunk: /Stuff/23GBlargefile.mov.00028 29:29 remaining 0 retries 0
Finished chunk: /Stuff/23GBlargefile.mov.00014 15:29 remaining 0
Finished chunk: /Stuff/23GBlargefile.mov.00028 29:29 remaining 0
Finished chunk: /Stuff/23GBlargefile.mov.00020 21:29 remaining 0
Finished chunk: /Stuff/23GBlargefile.mov.00024 25:29 remaining 0
Finished chunk: /Stuff/23GBlargefile.mov.00021 22:29 remaining 0
Finished chunk: /Stuff/23GBlargefile.mov.00025 26:29 remaining 0
Finished chunk: /Stuff/23GBlargefile.mov.00023 24:29 remaining 0
Finished chunk: /Stuff/23GBlargefile.mov.00022 23:29 remaining 0
Finished chunk: /Stuff/23GBlargefile.mov.00026 27:29 remaining 0
Finished chunk: /Stuff/23GBlargefile.mov.00027 28:29 remaining 0
re-assembling file at remote end
processing 23GBlargefile.mov /
re-assembled
checking remote file checksum
PASSED checksums match
cleaning up
removing file chunks
removing file chunk /Stuff/23GBlargefile.mov.00028 -
transfer complete
--------------------------------------------------------------------------------
file size              :28.2 GB
transfer rate          :25.7 MB/s
                       :205.9 Mb/s
transfer time          :18 minutes 43 seconds 
local chunking time    :10 minutes 35 seconds 
remote reassembly time :4 minutes 20 seconds 
remote checksum time   :1 minute 28 seconds 
total transfer rate    :13.3 MB/s
                       :106.6 Mb/s
total time             :36 minutes 8 seconds

It would be faster if the disks at the source end were not rubbish. An SSD at each end would make chunking the file faster.
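The two rates in the summary can be reproduced from the reported size and times. The sketch below assumes the report uses binary units (1 GB = 1024 MB), an assumption that happens to match the printed figures; the function name is hypothetical.

```python
import math

GIB = 1024 ** 3
MIB = 1024 ** 2


def rate_mb_per_s(size_bytes, seconds):
    """Average rate in MB/s, taking 1 MB as 1024 * 1024 bytes."""
    return size_bytes / float(MIB) / seconds


size = 28.2 * GIB             # "file size" from the summary
transfer = 18 * 60 + 43       # "transfer time": 18 minutes 43 seconds
total = 36 * 60 + 8           # "total time": 36 minutes 8 seconds

print("transfer rate: %.1f MB/s" % rate_mb_per_s(size, transfer))  # 25.7 MB/s
print("total rate:    %.1f MB/s" % rate_mb_per_s(size, total))     # 13.3 MB/s
print("chunks at 1G:  %d" % math.ceil(size / GIB))                 # 29
```

The transfer rate counts only the time spent moving chunks, while the total rate also includes local chunking, remote reassembly, and the remote checksum, which is why it is roughly half. The Mb/s lines are the same rates multiplied by 8; they differ slightly from 8 times the rounded MB/s values, presumably because the tool works from unrounded numbers.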

Thank you

Bytes-to-human / human-to-bytes converter

Humanize time

scp-chunk's People

Contributors

benroeder, patsumby-sohonet, quantifiedcode-bot

scp-chunk's Issues

password mode

implement password mode, rather than just shared ssh keys

Directory support and local transfer between servers/NFS

Would be nice if there was an option to transfer files locally too, e.g. over NFS. No need for "srv" argument in this case.

Also, when I try to transfer a directory, that is not supported either ("Error: Source is not a file"), but I have entire directories that need to be transferred.

or maybe you know of a copy tool that supports chunked transfers and directory support? I continue to search...

scp and cipher

It would be nice if there was an option to not specify the cipher and just let scp use its default cipher.

Also it would be good to stop using scp and use only rsync. The maintainers of ssh have deprecated scp.

verify / retry

individual md5 check in event of failure / recovery option.

startup error for recent debian

I am testing this app, and it looks like I need to update line 13 to get it working on Python 3 / Debian 10:

from multiprocessing import Queue

Initial Update

Hi 👊

This is my first visit to this fine repo, but it seems you have been working hard to keep all dependencies updated so far.

Once you have closed this issue, I'll create separate pull requests for every update as soon as I find one.

That's it for now!

Happy merging! 🤖
