plow / sync-utils

sync-utils

Scripts to ensure file synchronization across multiple sources with different directory structures and file names

Mounting

Mounting the NAS volumes:

sudo su
mount -t nfs -o nfsvers=3 192.168.0.10:/volume1/photos-tmp /media/photos-tmp/

Pre-Hashing

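Check that all files are readable before hashing. The command below lists every file the current user cannot read; fix the permissions of any listed file with chmod: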

find . ! -readable
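If the command lists any files, owner read permission can be added in one pass. This is a minimal sketch: the function name make_readable is hypothetical, GNU find is assumed (for -readable), and you are assumed to own the files (otherwise run it with sudo). Unreadable directories additionally need the execute bit and may require a second pass, which this sketch does not handle.

```shell
# Add owner read permission to every entry below the given directory
# (default: current directory) that the current user cannot read.
# Assumes GNU find; make_readable is an illustrative name only.
make_readable() {
    find "${1:-.}" ! -readable -exec chmod u+r {} +
}
```

Afterwards, find . ! -readable should print nothing.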

Rename image files according to their creation date:

exiftool '-filename<CreateDate' -d IMG_%Y%m%d_%H%M%S%%-c.%%le -r -ext JPG -ext jpg .

Hashing

Create Hashes

Create hashes recursively:

hashdeep -c md5 -v -r -l -W hashdeep_out.txt .

Validate Hashes

Validate a hash file without re-hashing:

./hashing/hash_file_validation.sh

The script checks

  • whether the total size in bytes of all files is the same according to the file system and according to the hash file, and
  • whether the list of files in the current directory matches the file list in the hash file.

Input: Expects a hashdeep_out.txt in the current directory.

Output: The result of the file size check and a list of the files that are unique either to the file system or to the hash file (only file names and relative paths are considered).
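The two checks can be approximated with standard tools. The following is a minimal sketch, not the contents of hash_file_validation.sh. The function name validate_hash_file and the output wording are hypothetical. It assumes hashdeep's default output layout (header lines starting with % or #, data lines of the form size,hash,path), paths recorded as ./relative, GNU find, and file names without newlines.

```shell
# Sketch of the two checks; names and messages are assumptions.
# The body runs in a subshell "( ... )" so its variables stay local.
validate_hash_file() (
    hash_file=${1:-hashdeep_out.txt}
    self="./$(basename "$hash_file")"
    tmp=$(mktemp -d) || exit 1

    # File system side: "size<TAB>path" per regular file, hash file excluded.
    find . -type f ! -path "$self" -printf '%s\t%p\n' > "$tmp/fs"
    # Hash file side: data lines only, any entry for the hash file itself excluded.
    awk -v self="$self" '!/^[%#]/ {
        path = $0; sub(/^[^,]*,[^,]*,/, "", path)
        if (path != self) print
    }' "$hash_file" > "$tmp/hf"

    # Check 1: total bytes on disk vs. total bytes recorded in the hash file.
    fs_bytes=$(awk -F'\t' '{ s += $1 } END { printf "%.0f\n", s }' "$tmp/fs")
    hf_bytes=$(awk -F, '{ s += $1 } END { printf "%.0f\n", s }' "$tmp/hf")
    if [ "$fs_bytes" = "$hf_bytes" ]; then
        echo "size check: OK ($fs_bytes bytes)"
    else
        echo "size check: MISMATCH (file system: $fs_bytes, hash file: $hf_bytes)"
    fi

    # Check 2: paths that appear on one side only.
    cut -f2- "$tmp/fs" | LC_ALL=C sort > "$tmp/fs_paths"
    cut -d, -f3- "$tmp/hf" | LC_ALL=C sort > "$tmp/hf_paths"
    LC_ALL=C comm -23 "$tmp/fs_paths" "$tmp/hf_paths" | sed 's/^/only on file system: /'
    LC_ALL=C comm -13 "$tmp/fs_paths" "$tmp/hf_paths" | sed 's/^/only in hash file: /'

    rm -rf "$tmp"
)
```

Call validate_hash_file from the directory that contains hashdeep_out.txt. No hashes are recomputed, so the check is fast even for large trees.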

Merging hashes

Merge all hashdeep_out.txt files found in the subdirectories of the current working directory (one level only, not recursively). The output is printed to stdout and starts with a hashdeep-like header.

./hashing/merge_hash_files.sh > hashdeep_out.txt
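For illustration, a merge of this kind could look like the sketch below. It is not the contents of merge_hash_files.sh: the function name merge_hash_files, the exact header text, and the rewriting of ./path entries to ./subdir/path are all assumptions, chosen so that the merged paths are relative to the current directory.

```shell
# Sketch only; header text and path rewriting are assumptions.
# The body runs in a subshell "( ... )" so its variables stay local.
merge_hash_files() (
    # Emit a single hashdeep-like header for the merged output.
    printf '%s\n' '%%%% HASHDEEP-1.0' '%%%% size,md5,filename' '## merged hash file' '##'
    for f in */hashdeep_out.txt; do
        [ -f "$f" ] || continue   # glob matched nothing
        dir=${f%/hashdeep_out.txt}
        # Skip per-file headers; split at the first two commas only,
        # because the path itself may contain commas.
        awk -v dir="$dir" '!/^[%#]/ {
            n = index($0, ","); rest = substr($0, n + 1)
            m = index(rest, ","); path = substr(rest, m + 1)
            if (path ~ /^\.\//) path = "./" dir "/" substr(path, 3)
            print substr($0, 1, n) substr(rest, 1, m) path
        }' "$f"
    done
)
```

Only the first level of subdirectories is visited because the glob */hashdeep_out.txt does not recurse.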

Auditing

Auditing using hashdeep (hashing on the fly)

Run an audit of the files in the current directory against the hashes listed in the provided hash file:

hashdeep -vvv -a -r -k /media/photos-tmp/2017/photos_new/hashdeep_out.txt . > hashdeep_audit.txt 2>&1

Search the audit file for entries without a match:

grep -oP '.*No\smatch' hashdeep_audit.txt

Auditing using audit script (previously created hash files)

The audit script checks all hashes in the current directory against the hashes in a reference directory. It lists the files that are considered unique to the current directory, i.e. files that do not yet exist in the reference directory. The script expects a hashdeep_out.txt file in the current directory. The hash file of the reference directory is passed as an argument. Example:

/path/sync-utils/auditing/audit.sh /media/photos-arch/2020/hashdeep_out.txt
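The comparison behind such an audit can be sketched with awk alone. This is not the contents of audit.sh: the function name list_unique_to_current is hypothetical, and the sketch assumes that both files use hashdeep's size,hash,path data lines and that a file counts as already present when its size and hash both appear in the reference file, regardless of its name or location.

```shell
# Sketch only. Usage: list_unique_to_current REFERENCE [CURRENT]
# Prints the paths from CURRENT (default: hashdeep_out.txt) whose
# size+hash pair does not occur in REFERENCE.
list_unique_to_current() {
    awk '
        /^[%#]/ { next }                      # skip hashdeep header lines
        {
            # Split at the first two commas only; paths may contain commas.
            n = index($0, ","); rest = substr($0, n + 1)
            m = index(rest, ",")
            key = substr($0, 1, n) substr(rest, 1, m - 1)
        }
        FILENAME == ARGV[1] { known[key] = 1; next }   # reference file
        !(key in known)     { print substr(rest, m + 1) }
    ' "$1" "${2:-hashdeep_out.txt}"
}
```

For example, list_unique_to_current /media/photos-arch/2020/hashdeep_out.txt prints the files in the current directory that still need to be archived.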

Handling of duplicates

Find duplicate files within a particular directory. Out of each set of identical files, the first one listed in the input is kept and all others are reported as duplicates.

Input: Expects a hashdeep_out.txt in the current directory.

Output: List of duplicates (file paths only).

find_duplicates_in_hash_file.sh
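The "keep the first, report the rest" rule can be sketched as follows. This is not the contents of find_duplicates_in_hash_file.sh: the function name find_duplicates is hypothetical, and the sketch assumes hashdeep's size,hash,path data lines and treats two files as identical when both size and hash match.

```shell
# Sketch only. Usage: find_duplicates [HASH_FILE]
# Prints the path of every entry whose size+hash pair was already seen
# earlier in HASH_FILE (default: hashdeep_out.txt).
find_duplicates() {
    awk '
        /^[%#]/ { next }                      # skip hashdeep header lines
        {
            # Split at the first two commas only; paths may contain commas.
            n = index($0, ","); rest = substr($0, n + 1)
            m = index(rest, ",")
            key = substr($0, 1, n) substr(rest, 1, m - 1)
            if (seen[key]++) print substr(rest, m + 1)
        }
    ' "${1:-hashdeep_out.txt}"
}
```

Because the first occurrence is never printed, the order of entries in the hash file decides which copy is kept.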

Move the duplicates to a separate folder. The directory structure is not preserved in the destination folder ../photos_duplicates/. However, existing destination files are backed up (numbered) rather than overwritten. Hint: If some files have already been moved (so moving them again fails), append 2>&1 >/dev/null | grep -v 'No such file or directory' to the command to suppress the errors for missing files.

find_duplicates_in_hash_file.sh | xargs -d '\n' mv --backup=t -t ../photos_duplicates/

Archiving

Directories can be encrypted and prepared for archiving (e.g. on cloud storage) using the archiving scripts.
