tapeimgr

Tapeimgr is a software application that sequentially reads all files from a data tape. After the extraction is done it also generates a checksum file with SHA-512 hashes of the extracted files. Tapeimgr is completely format-agnostic: it only extracts the raw byte streams. It is up to the user to figure out the format of the extracted files (e.g. a TAR archive), and how to further process or open them.

In short, tapeimgr tries to read sequential files from a tape until its logical end is reached. For each successive file, it automatically determines its block size using an iterative procedure.
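The iterative procedure is not spelled out in this document. The sketch below shows one plausible form of it, assuming the candidate block size is raised in 512-byte steps from an initial value until a test read succeeds. The try_read callback, the step size and the max_size cap are hypothetical stand-ins for a one-block dd read against the tape device; they are not part of tapeimgr's interface.

```python
from typing import Callable, Optional


def estimate_block_size(
    try_read: Callable[[int], bool],
    init_size: int = 512,
    step: int = 512,
    max_size: int = 1024 * 1024,
) -> Optional[int]:
    """Return the first candidate block size for which try_read succeeds.

    try_read(size) is a hypothetical callback that attempts to read one
    block of `size` bytes from the tape and reports whether it worked.
    The 512-byte step and the upper bound are assumptions of this sketch.
    """
    if init_size <= 0 or init_size % 512 != 0:
        raise ValueError("initial block size must be a positive multiple of 512")
    size = init_size
    while size <= max_size:
        if try_read(size):
            return size
        size += step
    return None  # no candidate up to max_size worked


# Simulated drive whose blocks are 2048 bytes long: test reads with a
# smaller block size are reported as failures.
def fake_drive(size: int) -> bool:
    return size >= 2048


print(estimate_block_size(fake_drive))  # 2048
```

Starting from a larger initial value simply skips the smaller candidates, which is why a well-chosen Initial Block Size can shorten the search.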

Internally tapeimgr wraps around the Linux dd and mt tools.

Warnings

At this stage tapeimgr has only had limited testing with a small number of DDS-1 and DLT-IV tapes. Use at your own risk, and please report any unexpected behaviour using the issue tracker.

For now tapeimgr can only read tapes that were written in fixed block mode; tapes written in variable block mode may result in unexpected behaviour. Support for variable block mode may be added in a future release. Note that older tapes were most likely written in fixed block mode, as variable block mode is typically only supported by more recent tape drives.

System requirements

Tapeimgr is only available for Linux (but you would probably have a hard time setting up a tape drive on Windows to begin with). So far it has been tested with Ubuntu 18.04 LTS (Bionic) and Linux Mint 18.3, which is based on Ubuntu 16.04 (Xenial). In addition it has the following dependencies (many distros have most or all of these installed by default):

  • Python 3.2 or more recent (Python 2.x is not supported)

  • Tkinter. If tkinter is not installed already, you need to use the OS's package manager to install it (tkinter cannot be installed with pip). If you're using apt this should work:

      sudo apt-get install python3-tk
    
  • dd and mt (but these are available by default on all Linux platforms)

Installation

Preparation: add default user to tape group

By default, Linux requires special permissions to access tape devices. So before proceeding any further, use the command below (replacing $USER with the name of the user who will be using tapeimgr):

sudo adduser $USER tape

The user is now added to the 'tape' system group. Now log out and then log in again for the changes to take effect.

Global install

For a global (all-users) installation run the following command:

sudo pip3 install tapeimgr

Then run:

sudo tapeimgr-config

If all goes well this should result in the following output:

INFO: writing configuration file /etc/tapeimgr/tapeimgr.json
INFO: creating desktop file /home/johan/Desktop/tapeimgr.desktop
INFO: creating desktop file /usr/share/applications/tapeimgr.desktop
INFO: tapeimgr configuration completed successfully!

User install

Use the following command for a single-user installation:

pip3 install --user tapeimgr

Then run:

~/.local/bin/tapeimgr-config

Result:

INFO: writing configuration file /home/johan/.config/tapeimgr/tapeimgr.json
INFO: creating desktop file /home/johan/Desktop/tapeimgr.desktop
INFO: creating desktop file /home/johan/.local/share/applications/tapeimgr.desktop
INFO: tapeimgr configuration completed successfully!

Tapeimgr is now ready to roll!

In the instructions that follow below, it is assumed that you have a functioning tape device attached to your machine, and that a tape is loaded (i.e. inserted in the drive).

GUI operation

You can start tapeimgr from the OS's main menu (in Ubuntu 18.04 the tapeimgr item is located under System Tools), or by clicking the tapeimgr shortcut on the desktop. Depending on your distro, you might get an "Untrusted application launcher" warning the first time you activate the shortcut. You can get rid of this by clicking on "Mark as Trusted". On startup the main tapeimgr window appears:

Use the Select Output Directory button to navigate to an (empty) directory where the extracted files are to be stored. Press the Start button to start the extraction. You can monitor the progress of the extraction procedure in the progress window:

Note that the screen output is also written to a log file in the output directory. A prompt appears when the extraction is finished:

If the extraction finished without any errors, the output directory now contains the following files:

Here, file000001.dd through file000003.dd are the extracted files; checksums.sha512 contains the SHA512 checksums of the extracted files, metadata.json contains some basic metadata and tapeimgr.log is the log file.

Options

If needed you can use the following options to customize the behaviour of tapeimgr:

Option | Description
Tape Device | Non-rewind tape device (default: /dev/nst0).
Initial Block Size | Initial block size in bytes (must be a multiple of 512). This is used as a starting value for the iterative block size estimation procedure. Block sizes smaller than 4096 are reported to give poor performance (source: forensicswiki), and this option can be useful to speed up the extraction process in such cases. Note that the user-specified value of Initial Block Size is ignored if the Fill failed blocks option (see below) is activated.
Files | Comma-separated list of files to extract. For example, a value of 2,3 will only extract the 2nd and 3rd files from the tape, and skip everything else. By default this field is empty, which extracts all files.
Prefix | Output prefix (default: file).
Extension | Output file extension (default: dd).
Fill failed blocks | Fill blocks that give read errors with null bytes. When this option is checked, tapeimgr calls dd with the flags conv=noerror,sync. The use of these flags is often recommended to ensure a forensic image with no missing/offset bytes in case of read errors (source: forensicswiki), but when used with a block size that is larger than the actual block size it will generate padding bytes that make the extracted data unreadable. Because of this, any user-specified value of the Initial Block Size setting (see above) is ignored when this option is used. WARNING: this option may result in malformed output if the actual block size is smaller than 512 bytes, or is not a multiple of 512 bytes (I have no idea if this is even possible).
Identifier | Unique identifier. You can either enter an existing identifier yourself, or press the UUID button to generate a universally unique identifier.
Description | A text string that describes the tape (e.g. the title that is written on its inlay card).
Notes | Any additional info or notes you want to record with the tape.
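To see why Fill failed blocks is unsafe with a block size that is too large, the sketch below imitates in plain Python what dd's conv=sync does: every input block shorter than the requested block size is padded with null bytes. It is a simulation only and does not call dd; the function name is made up for this example.

```python
def pad_like_dd_sync(blocks, block_size):
    """Pad each block with NUL bytes up to block_size, as conv=sync does."""
    return b"".join(block.ljust(block_size, b"\x00") for block in blocks)


# Two 512-byte blocks as they might sit on a tape.
tape_blocks = [b"A" * 512, b"B" * 512]
original = b"".join(tape_blocks)

# Correct block size: the output is identical to the data on tape.
print(pad_like_dd_sync(tape_blocks, 512) == original)  # True

# Block size too large: 512 padding bytes follow every block.
padded = pad_like_dd_sync(tape_blocks, 1024)
print(len(original), len(padded))  # 1024 2048
```

The padded output is twice as long as the original data, with null bytes interleaved between the real blocks, so any format parser reading it will go off the rails after the first block.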

Command-line operation

It is also possible to invoke tapeimgr with command-line arguments. The general syntax is:

tapeimgr [-h] [--version] [--fill] [--device DEVICE]
            [--blocksize SIZE] [--files FILES] [--prefix PREF]
            [--extension EXT] [--identifier IDENTIFIER]
            [--description DESCRIPTION] [--notes NOTES]
            dirOut

Here dirOut is the output directory. So, the command-line equivalent of the first GUI example is:

tapeimgr /home/bcadmin/test/

This will extract the contents of the tape to directory /home/bcadmin/test/, using the default options. Note that for a user install, you may need to provide the full path to tapeimgr, i.e.:

~/.local/bin/tapeimgr /home/bcadmin/test/

Options

As with the GUI interface you can customize the default behaviour by using one or more of the following optional arguments:

Argument | Description
-h, --help | Show help message and exit.
--version, -v | Show program's version number and exit.
--device DEVICE, -d DEVICE | Non-rewind tape device (default: /dev/nst0).
--blocksize SIZE, -b SIZE | Initial block size in bytes (must be a multiple of 512). This is used as a starting value for the iterative block size estimation procedure. Block sizes smaller than 4096 are reported to give poor performance (source: forensicswiki), and this option can be useful to speed up the extraction process in such cases. Note that the user-specified value of --blocksize is ignored if the --fill option (see below) is activated.
--files FILES, -s FILES | Comma-separated list of files to extract. For example, a value of 2,3 will only extract the 2nd and 3rd files from the tape, and skip everything else. By default this argument is empty, which extracts all files.
--prefix PREF, -p PREF | Output prefix (default: file).
--extension EXT, -e EXT | Output file extension (default: dd).
--fill, -f | Fill blocks that give read errors with null bytes. When this option is used, tapeimgr calls dd with the flags conv=noerror,sync. The use of these flags is often recommended to ensure a forensic image with no missing/offset bytes in case of read errors (source: forensicswiki), but when used with a block size that is larger than the actual block size it will generate padding bytes that make the extracted data unreadable. Because of this, any user-specified value of the --blocksize setting (see above) is ignored when this option is used. WARNING: this option may result in malformed output if the actual block size is smaller than 512 bytes, or is not a multiple of 512 bytes (I have no idea if this is even possible).
--identifier IDENTIFIER, -i IDENTIFIER | Unique identifier. You can either enter an existing identifier yourself, or enter the special value @uuid to generate a universally unique identifier.
--description DESCRIPTION, -c DESCRIPTION | A text string that describes the tape (e.g. the title that is written on its inlay card).
--notes NOTES, -n NOTES | Any additional info or notes you want to record with the tape.
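The command-line options listed above map naturally onto Python's argparse module. The parser below is a sketch reconstructed from that list, not tapeimgr's actual source; the dest names, the placeholder version string and the parse_files helper are assumptions made for illustration.

```python
import argparse


def parse_files(value):
    """Turn a string such as '2,3' into [2, 3]; an empty string means all files."""
    return [int(item) for item in value.split(",") if item.strip()]


def build_parser():
    parser = argparse.ArgumentParser(prog="tapeimgr")
    parser.add_argument("dirOut", help="output directory")
    parser.add_argument("--version", "-v", action="version",
                        version="%(prog)s (sketch)")
    parser.add_argument("--fill", "-f", action="store_true", dest="fillBlocks")
    parser.add_argument("--device", "-d", default="/dev/nst0", dest="device")
    parser.add_argument("--blocksize", "-b", type=int, default=512, dest="size")
    parser.add_argument("--files", "-s", type=parse_files, default=[], dest="files")
    parser.add_argument("--prefix", "-p", default="file", dest="pref")
    parser.add_argument("--extension", "-e", default="dd", dest="ext")
    parser.add_argument("--identifier", "-i", default="", dest="identifier")
    parser.add_argument("--description", "-c", default="", dest="description")
    parser.add_argument("--notes", "-n", default="", dest="notes")
    return parser


args = build_parser().parse_args(
    ["--files", "2,3", "-b", "4096", "/home/bcadmin/test/"]
)
print(args.files, args.size, args.dirOut)  # [2, 3] 4096 /home/bcadmin/test/
```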

Metadata file

The file metadata.json contains metadata in JSON format. Below is an example:

{
    "acquisitionEnd": "2019-01-21T13:26:38.813304+01:00",
    "acquisitionStart": "2019-01-21T13:25:53.570770+01:00",
    "checksumType": "SHA-512",
    "checksums": {
        "file000001.dd": "e58279519cd394870f7d39fe59d722bf85c64fb95a9b4c8a893fde0a606f5e270529d17d0598d9e703f9b259a2355c91aaa3721249a64982e580f2f18e6e52f5",
        "file000002.dd": "19f200700afeab90d45d3beec0cdaf5290ca517574b9049feee80d1257c5d11677d9ecd101c31e30b02d58b4d84c8cac0ea7326d10342d1d7ea4ce40dde663ca",
        "file000003.dd": "9f4f5ea7cc3639c07ca88b4dcda9b976d2802d229a26415d960962d4fdc2d920ea1ce554343dfc5f22c3c616cad3977e24ddf33c86417dc0112a8560e2f1e75f"
    },
    "description": "Test tape October 25 2018",
    "extension": "dd",
    "files": "",
    "fillBlocks": false,
    "identifier": "2d257dec-1d77-11e9-9594-2c4138b5272c",
    "initBlockSize": 512,
    "notes": "This tape only contains some old rubbish. Created specifically for testing tapeimgr.",
    "prefix": "file",
    "successFlag": true,
    "tapeDevice": "/dev/nst0",
    "tapeimagrVersion": "0.4.0b1"
}
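Because metadata.json records a SHA-512 hash for every extracted file, it can be used to check a tape directory for corruption at a later date. The function below is a minimal sketch of such a check, assuming the layout shown in the example above; verify_checksums is a name chosen for this illustration and is not a tapeimgr command.

```python
import hashlib
import json
import os


def verify_checksums(dir_out, metadata_name="metadata.json"):
    """Recompute SHA-512 hashes and compare them with metadata.json.

    Returns a dict that maps each file name to True (match) or False.
    """
    with open(os.path.join(dir_out, metadata_name), encoding="utf-8") as f:
        metadata = json.load(f)

    results = {}
    for name, expected in metadata["checksums"].items():
        digest = hashlib.sha512()
        with open(os.path.join(dir_out, name), "rb") as f:
            # Read in chunks so that large tape files do not fill memory.
            for chunk in iter(lambda: f.read(1024 * 1024), b""):
                digest.update(chunk)
        results[name] = digest.hexdigest() == expected
    return results
```

Calling verify_checksums("/home/bcadmin/test/") would return one True or False entry for each of file000001.dd through file000003.dd.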

Configuration file

Tapeimgr's internal settings (default values for output file names, tape device, etc.) are defined in a configuration file in JSON format. For a global installation it is located at /etc/tapeimgr/tapeimgr.json; for a user install it can be found at ~/.config/tapeimgr/tapeimgr.json. The default configuration is shown below:

{
    "checksumFileName": "checksums.sha512",
    "defaultDir": "",
    "extension": "dd",
    "files": "",
    "fillBlocks": "False",
    "initBlockSize": "512",
    "logFileName": "tapeimgr.log",
    "metadataFileName": "metadata.json",
    "prefix": "file",
    "tapeDevice": "/dev/nst0",
    "timeZone": "Europe/Amsterdam"
}

You can change tapeimgr's default settings by editing this file. Most of the above settings are self-explanatory, with the exception of the following:

  • defaultDir: this allows you to change the default file path that is opened after pressing Select Output Directory. By default tapeimgr uses the current user's home directory. However, if defaultDir points to a valid directory path, that directory is used instead.

  • timeZone: time zone string that is used to correctly format the acquisitionStart and acquisitionEnd date/time strings. You can adapt it to your own location by using the TZ database name from this list of tz database time zones.

Note that it is not recommended to change the value of initBlockSize, as it may result in unexpected behaviour. If you accidentally messed up the configuration file, you can always restore the original one by running the tapeimgr-config tool again.
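Note that the default configuration stores fillBlocks and initBlockSize as strings ("False" and "512"), so any script that reads the file has to convert them. The sketch below shows one way to do that, together with the defaultDir fallback described above. It is an illustration under those assumptions; load_config is a hypothetical helper and does not show how tapeimgr itself parses the file.

```python
import json
import os


def load_config(path):
    """Read a tapeimgr-style configuration file and normalise its values."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)

    # The default file stores these two settings as strings.
    config["fillBlocks"] = str(config["fillBlocks"]).strip().lower() == "true"
    config["initBlockSize"] = int(config["initBlockSize"])

    # Fall back to the home directory unless defaultDir is a valid directory.
    if not os.path.isdir(config.get("defaultDir", "")):
        config["defaultDir"] = os.path.expanduser("~")
    return config
```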

Uninstalling tapeimgr

To remove tapeimgr, first run tapeimgr-config with the --remove flag to remove the configuration file and the start menu and desktop files. For a global install, run:

sudo tapeimgr-config --remove

For a user install, run:

~/.local/bin/tapeimgr-config --remove

The resulting output (shown below for a user install):

INFO: removing configuration file /home/johan/.config/tapeimgr/tapeimgr.json
INFO: removing configuration directory /home/johan/.config/tapeimgr
INFO: removing desktop file /home/johan/Desktop/tapeimgr.desktop
INFO: removing desktop file /home/johan/.local/share/applications/tapeimgr.desktop
INFO: tapeimgr configuration completed successfully!

Then remove the Python package with the following command (global install):

sudo pip3 uninstall tapeimgr

For a user install use this:

pip3 uninstall tapeimgr

Testing tapeimgr without a tape drive

If you want to test tapeimgr without having access to a physical tape drive, check out mhvtl, the Linux based Virtual Tape Library (and also these rough notes which explain how to install mhvtl as well as its basic usage).

Note: based on some limited tests, it seems that rewinding a virtual tape in mhvtl with the mt command (which is used by tapeimgr) doesn't actually rewind the tape at all! This has the effect that after a tape has been processed by tapeimgr, running tapeimgr again on the same device will cause mt to freeze (and tapeimgr with it). A workaround is to unload the tape, and then load it again using something like this:

mtx -f /dev/sg11 unload 1 0
mtx -f /dev/sg11 load 1 0

After this the virtual tape device works normally again.

Contributors

Written by Johan van der Knijff.

License

Tapeimgr is released under the Apache License 2.0.


tapeimgr's Issues

Support for tapes that were written in variable-block mode?

Apparently block size can be variable within a file that's written to tape, see:

https://github.com/torvalds/linux/blob/master/Documentation/scsi/st.rst

In variable block mode, the byte count in write() determines the size
of the physical block on tape. When reading, the drive reads the next
tape block and returns to the user the data if the read() byte count
is at least the block size. Otherwise, error ENOMEM is returned.

Don't know how common this is, but this is something to keep in mind if tapeimgr fails on a tape. In theory support for variable-block tapes could be added by reading one block at a time, and then doing a new blockSize estimation once a block results in read errors.

TODO: see if variable-block tapes can be emulated with mhvtl, and use that for testing.

It also implies that the 'fill failed blocks' option is likely to yield corrupted content for variable-block tapes.

See also:

https://unix.stackexchange.com/a/366217/187090

and:

https://www.mkssoftware.com/docs/man4/tape.4.asp

which says:

SCSI tape drives can support variable block mode, fixed block mode, or both. Older drives typically support only fixed block mode. Newer drives often either support variable block mode only, or both variable block and fixed block modes.

So if true this means older tapes are unlikely to be affected.

Report imaging status at end of each run

Would need an overall success/fail flag, based on the dd / mt exit codes. If fail, notify user via pop-up at end of imaging run (and refer to log file for details)

Move getConfiguration() in gui.py and cli.py to Tape class

The exact same function is now replicated in both cli.py and gui.py:

def getConfiguration(self):

def getConfiguration(self):

It would be better to move it to the Tape class. This would require some changes to the Tape class and how it is called by gui.py and cli.py. In particular:

  • Define default values for all Tape class arguments
  • Create a Tape class instance before Submit function is called
  • Then use additional function calls to update the Tape instance variables.

Add I/O checks

IOError, PermissionError for log file; check if dirOut already contains files, etc.

ERROR - 'int' object is not callable at end of run

Sometimes the following error occurs at the end of a tapeimgr run:

2018-12-04 14:34:48,206 - ERROR - 'int' object is not callable
Traceback (most recent call last):
  File "/home/johan/.local/lib/python3.5/site-packages/tapeimgr/gui.py", line 341, in main
    time.sleep(0.1)
TypeError: 'int' object is not callable

The error refers to the following line (sleep value for main loop in GUI):

 time.sleep(0.1)

Running tapeimgr in GUI mode on BitCurator / Ubuntu 18.04LTS (Bionic)

The current launcher uses gksudo to run tapeimgr as root. But gksudo is not installed by default on BitCurator / Ubuntu 18.04LTS (Bionic). BitCurator does have the pkexec tool (which it uses to launch Guymager as root), but running the command:

pkexec tapeimgr

Results in this error:

_tkinter.TclError: no display name and no $DISPLAY environment variable

It seems that this is not a bug, but a result of the pkexec/PolicyKit policy settings. See also:

https://groups.google.com/forum/#!topic/comp.lang.python/s4WMWIdkXMY

More on differences between pkexec and gksudo:

https://askubuntu.com/questions/78352/when-to-use-pkexec-vs-gksu-gksudo

And specifically on Python:

https://askubuntu.com/questions/288087/can-i-use-pkexec-in-a-python-script-or-a-desktop-file

Verify code for Fill Failed blocks data entry + reset_gui

Current code:

self.fillblocks_entry = tk.Checkbutton(self, variable=self.tape.fillBlocks)

Shouldn't this be:

self.fillblocks_entry = tk.Checkbutton(self, variable=self.tape.fBlocks)

Also, the code in reset_gui doesn't look quite right. See also omimgr.

Package with Pypi

Note: tapeimgr needs root access, so menu/desktop launcher command should be sth like gksudo tapeimgr. This might need some kind of post-install hook.

Repeatedly reads only first block

Trying to recover old DDS-90 tapes on a Sony DAT drive. Keeps reading only the first 32K block. I let it run on a tape and stopped it when the first 40+ .dd files were identical. Tried a couple others and get either this or it just keeps increasing the test size and never reads a block.

Installed fresh Ubuntu 20.x and kept using apt-get until it ran.

Support user installs

In that case:

Global install goes to:

/usr/local/lib/python3.5/dist-packages/

User install goes to:

/home/johan/.local/lib/python3.5/site-packages/

So rule, sth like:

  • If packageDir in $HOME :

    • configuration to ~/.config/tapeimgr directory
    • menu entry in ~/.local/share/applications
    • desktop launcher to ~/Desktop
  • Else:

    • configuration to /etc/tapeimgr directory
    • menu entry in /usr/share/applications
    • desktop launcher to ~/Desktop
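The rule above could be sketched in Python as follows. This is an illustration of the proposed logic only, not the code that ended up in tapeimgr-config; install_locations and its home parameter are hypothetical names introduced for this example.

```python
import os


def install_locations(package_dir, home=None):
    """Pick configuration and launcher locations from the package directory."""
    home = os.path.abspath(home or os.path.expanduser("~"))
    package_dir = os.path.abspath(package_dir)
    # A package that lives below the home directory indicates a user install.
    user_install = os.path.commonpath([package_dir, home]) == home
    if user_install:
        config_dir = os.path.join(home, ".config", "tapeimgr")
        menu_dir = os.path.join(home, ".local", "share", "applications")
    else:
        config_dir = "/etc/tapeimgr"
        menu_dir = "/usr/share/applications"
    return {
        "config": config_dir,
        "menu": menu_dir,
        "desktop": os.path.join(home, "Desktop"),
    }


print(install_locations("/usr/local/lib/python3.5/dist-packages/",
                        home="/home/johan"))
print(install_locations("/home/johan/.local/lib/python3.5/site-packages/",
                        home="/home/johan"))
```

Comparing whole path components with os.path.commonpath avoids the false match that a plain string prefix test would give for a directory such as /home/johan2.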

Update documentation

  • New annotation options / interface
  • Metadata output file
  • New items in configuration file
  • Default dir in config file

Additional check on dd exit status after each file/session?

For DDS-3 tape, fsr exit status after 3 sessions is 0, but subsequent reads with dd result in:

dd: error reading '/dev/nst0': Input/output error
0+0 records in
0+0 records out
0 bytes copied, 0.0206987 s, 0.0 kB/s

The result is that tapeimgr gets stuck in an infinite loop, producing a 0-byte file for each iteration. It is unclear whether the cause is the tape, the reader or something else. A possible fix is to do a check on dd's exit status, and quit if it isn't 0 (but this could have some unintended side-effects).
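A check on the exit status could look roughly like the loop below, which stops as soon as a read returns a non-zero status instead of writing another empty file. The read_file callback is a hypothetical stand-in for one dd invocation and max_files is an extra safety cap added for this sketch; neither is taken from tapeimgr's actual extraction loop.

```python
def extract_all(read_file, max_files=1000):
    """Call read_file(index) for successive files until one fails.

    read_file is a hypothetical callback that runs dd for one tape file
    and returns its exit status. Returns the number of files extracted.
    """
    extracted = 0
    for index in range(1, max_files + 1):
        status = read_file(index)
        if status != 0:
            # Non-zero exit status: stop instead of looping forever.
            break
        extracted += 1
    return extracted


# Simulated tape: three good sessions, then every read fails.
def fake_dd(index):
    return 0 if index <= 3 else 1


print(extract_all(fake_dd))  # 3
```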

Add tool that creates summary CSV file of dir that contains multiple tapes

This utility should traverse all 1st-order subdirs, look for a tapeimgr metadata file (naming based on config file, with option to override to user-defined name), and create a summary file in CSV format that lists, for each tape:

  • tape directory file path (relative to parent dir)
  • identifier
  • description
  • acquisitionStart
  • acquisitionEnd
  • successFlag
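A first version of such a utility could be as small as the sketch below. It assumes that every tape directory holds a metadata file with the keys listed above; the function name, the tapeDir column header and the choice to leave missing keys empty are assumptions of this example rather than part of the proposal.

```python
import csv
import json
import os

FIELDS = ["identifier", "description", "acquisitionStart",
          "acquisitionEnd", "successFlag"]


def write_summary(parent_dir, csv_path, metadata_name="metadata.json"):
    """Write one CSV row per first-order subdirectory that holds metadata."""
    rows = []
    for entry in sorted(os.scandir(parent_dir), key=lambda e: e.name):
        metadata_path = os.path.join(entry.path, metadata_name)
        if not entry.is_dir() or not os.path.isfile(metadata_path):
            continue
        with open(metadata_path, encoding="utf-8") as f:
            metadata = json.load(f)
        # Missing keys become empty cells rather than raising an error.
        rows.append([entry.name] + [metadata.get(key, "") for key in FIELDS])

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["tapeDir"] + FIELDS)
        writer.writerows(rows)
    return len(rows)
```

The metadata_name parameter covers the option to override the file name; reading the default name from the configuration file is left out of this sketch.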

Packages do not match the hashes from the requirements file.

When attempting to install this tape imager using the command
sudo install pip3 tapeimgr
I encounter this error.
Screenshot_14
I'm running Mint 19.3 64-bit MATE.
I copied the error into a note because it shows up as red text in the terminal and is difficult to read.
I'm not sure how to continue.
