marmiky / scorch

This project forked from trapexit/scorch

Silent CORruption CHecker - My personal/experimental version

License: ISC License

Python 100.00%

scorch (Silent CORruption CHecker)

A tool to help discover silent corruption on a filesystem. If you use ZFS, BTRFS, SnapRaid, or similar, this tool isn't needed. Tools like {md5,sha1}sum can hash files and check them against those hashes, but scorch provides a full workflow for doing so.

Usage

usage: scorch [-h] -d DB [-v] [-r {sticky,readonly}] [-f FNFILTER]
              [-s {none,radix,reverse-radix,natural,reverse-natural,random}]
              [-m MAX] [-b]
              {add,append,check,check+update,delete,cleanup,list,list-unhashed,list-dups}
              dir [dir ...]

a tool to help discover file corruption

positional arguments:
  {add,append,check,check+update,delete,cleanup,list,list-unhashed,list-dups}
                        actions
  dir                   directories to work on

optional arguments:
  -h, --help            show this help message and exit
  -d DB, --db DB        database which stores hashes
  -v, --verbose         print details of files
  -r {sticky,readonly}, --restrict {sticky,readonly}
                        restrict action to certain types of files
  -f FNFILTER, --fnfilter FNFILTER
                        restrict action to files which match regex
  -s {none,radix,reverse-radix,natural,reverse-natural,random}, --sort {none,radix,reverse-radix,natural,reverse-natural,random}
                        when adding/appending/checking sort files before
                        acting on them
  -m MAX, --max MAX     max number of actions to take
  -b, --break-on-error  break on first failure / error

Instructions

  • add: for each regular file found, compute and store its hash and metadata.
  • append: for each regular file not found in the hash database, compute and store its hash and metadata.
  • check: for each file in the database, recompute the hash and report a mismatch. If the file's mtime or size differs from the stored values, warn and list the differences.
  • check+update: same as check, but when a file is found to have changed, recompute its hash and metadata and overwrite the existing entry.
  • delete: remove files from hash database.
  • cleanup: remove files no longer in filesystem from hash database.
  • list: an md5sum-compatible listing of files and their hashes.
  • list-unhashed: list files which are not hashed in the database.
  • list-dups: returns a listing of files which have the same hash value.
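
The add and append actions come down to hashing each file and recording enough metadata to later tell an intentional edit from corruption. A minimal sketch of that step, assuming MD5 (the digests in the example output of this README are MD5) and a hypothetical record layout; `FileRecord` and `hash_file` are illustrative names, not scorch's actual internals:

```python
import hashlib
import os
from typing import NamedTuple


class FileRecord(NamedTuple):
    """Hypothetical per-file entry; scorch's real database layout may differ."""
    digest: str
    size: int
    mtime: int


def hash_file(path: str, chunk_size: int = 1 << 20) -> FileRecord:
    """Hash a file in chunks so large files are never loaded whole."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    st = os.stat(path)
    return FileRecord(md5.hexdigest(), st.st_size, int(st.st_mtime))
```

Storing size and mtime next to the digest is what lets a later check distinguish a deliberately edited file from one whose contents changed silently.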

Example

$ ls -lh /tmp/files
total 0
-rw-rw-r-- 1 bile bile 0 May  3 16:30 a
-rw-rw-r-- 1 bile bile 0 May  3 16:30 b
-rw-rw-r-- 1 bile bile 0 May  3 16:30 c

$ scorch -v -d /var/tmp/hash.db add /tmp/files
1/3 /tmp/files/c: d41d8cd98f00b204e9800998ecf8427e
2/3 /tmp/files/a: d41d8cd98f00b204e9800998ecf8427e
3/3 /tmp/files/b: d41d8cd98f00b204e9800998ecf8427e

$ scorch -v -d /var/tmp/hash.db check /tmp/files
1/3 /tmp/files/a: OK
2/3 /tmp/files/b: OK
3/3 /tmp/files/c: OK

$ echo asdf > /tmp/files/d

$ scorch -v -d /var/tmp/hash.db list-unhashed /tmp/files
/tmp/files/d

$ scorch -v -d /var/tmp/hash.db append /tmp/files
1/1 /tmp/files/d: 2b00042f7481c7b056c4b410d28f33cf

$ scorch -v -d /var/tmp/hash.db list-dups /tmp/files
d41d8cd98f00b204e9800998ecf8427e /tmp/files/a /tmp/files/b /tmp/files/c

$ echo foo > /tmp/files/a
$ scorch -v -d /var/tmp/hash.db check+update /tmp/files
1/4 /tmp/files/b: OK
2/4 /tmp/files/c: OK
3/4 /tmp/files/a: FILE CHANGED
 - size: 0 4
 - mtime: 1462307452 1462307725
 - updated hash: d3b07384d113edec49eaa6238ad5ff00
4/4 /tmp/files/d: OK

$ scorch -v -d /var/tmp/hash.db list /tmp/files | md5sum -c
/tmp/files/c: OK
/tmp/files/d: OK
/tmp/files/a: OK
/tmp/files/b: OK
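
The list-dups step in the example above amounts to grouping paths by digest and keeping the groups with more than one member. A minimal sketch, assuming the database can be viewed as a path-to-digest mapping; `find_dups` and the `db` dictionary are hypothetical stand-ins, not scorch's own code:

```python
from collections import defaultdict
from typing import Dict, List


def find_dups(hashes: Dict[str, str]) -> Dict[str, List[str]]:
    """Group paths by digest and keep only digests shared by 2+ files."""
    by_digest = defaultdict(list)
    for path, digest in sorted(hashes.items()):
        by_digest[digest].append(path)
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}


# Illustrative data mirroring the example session (three empty files plus d).
db = {
    "/tmp/files/a": "d41d8cd98f00b204e9800998ecf8427e",
    "/tmp/files/b": "d41d8cd98f00b204e9800998ecf8427e",
    "/tmp/files/c": "d41d8cd98f00b204e9800998ecf8427e",
    "/tmp/files/d": "2b00042f7481c7b056c4b410d28f33cf",
}
for digest, paths in find_dups(db).items():
    print(digest, *paths)
```

With this data the loop prints a single line listing a, b, and c under the empty-file digest, matching the list-dups output shown earlier.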

Automation

A typical setup is initialized manually with add or append. Once the database has been created, a cron job can check, update, and append to it. If scorch is not run in verbose mode, only failures are printed, and the job's output will be emailed to the user.

#!/bin/sh

scorch -d /var/tmp/hash.db check+update /tmp/files
scorch -d /var/tmp/hash.db append /tmp/files
scorch -d /var/tmp/hash.db cleanup /tmp/files

If a file's modification time or size changes, it was most likely changed intentionally, so the check+update instruction warns about the change and updates the database rather than reporting corruption ("FAILED"). A file is considered corrupted only if its mtime and size are unchanged but its hash differs.
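
The decision rule described above can be sketched as a small function. This is an illustration under assumptions, not scorch's implementation: `Meta` and `classify` are hypothetical names, and the status strings are the ones that appear in this README's output.

```python
from typing import NamedTuple


class Meta(NamedTuple):
    """Hypothetical stored/current file state used only for this sketch."""
    digest: str
    size: int
    mtime: int


def classify(stored: Meta, current: Meta) -> str:
    """Return the status implied by comparing stored and current state."""
    # A differing size or mtime suggests an intentional modification.
    if (stored.size, stored.mtime) != (current.size, current.mtime):
        return "FILE CHANGED"
    # Same size and mtime but a different digest: likely silent corruption.
    if stored.digest != current.digest:
        return "FAILED"
    return "OK"
```

For instance, `classify(Meta("aa", 4, 100), Meta("bb", 4, 100))` returns `"FAILED"`, while changing either the size or the mtime in the second argument yields `"FILE CHANGED"` regardless of the digest.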
