
wummel / patool


patool is a portable command line archive file manager

Home Page: http://wummel.github.io/patool/

License: GNU General Public License v3.0

Python 97.20% Shell 0.95% Makefile 1.85%

patool's Introduction

Patool

Patool is an archive file manager.

Various archive formats can be created, extracted, tested, listed, searched, repacked and compared with patool. The advantage of patool is its simplicity in handling archive files without having to remember a myriad of programs and options.

The archive format is determined by the file(1) program and as a fallback by the archive file extension.

patool supports 7z (.7z, .cb7), ACE (.ace, .cba), ADF (.adf), ALZIP (.alz), APE (.ape), AR (.a), ARC (.arc), ARJ (.arj), BZIP2 (.bz2), BZIP3 (.bz3), CAB (.cab), CHM (.chm), COMPRESS (.Z), CPIO (.cpio), DEB (.deb), DMS (.dms), FLAC (.flac), GZIP (.gz), ISO (.iso), LRZIP (.lrz), LZH (.lha, .lzh), LZIP (.lz), LZMA (.lzma), LZOP (.lzo), RPM (.rpm), RAR (.rar, .cbr), RZIP (.rz), SHN (.shn), TAR (.tar, .cbt), XZ (.xz), ZIP (.zip, .jar, .cbz), ZOO (.zoo) and ZSTANDARD (.zst) archive formats.

It relies on helper applications to handle those archive formats (for example xz for XZ (.xz) archives).

The archive formats TAR, ZIP, BZIP2 and GZIP are supported natively and do not require helper applications to be installed.

Examples

# Extract several archives with different formats
patool extract archive.zip otherarchive.rar

# Extract archive with password
patool extract --password somepassword archive.rar

# Test archive integrity
patool test --verbose dist.tar.gz

# List files stored in an archive
patool list package.deb

# Create a new archive
patool create --verbose /path/to/myfiles.zip file1.txt dir/

# Create a new archive with password
patool create --verbose --password somepassword /path/to/myfiles.zip file1.txt dir/

# Show differences between two archives
patool diff release1.0.tar.gz release2.0.zip

# Search for text inside archives
patool search "def urlopen" python-3.3.tar.gz

# Repackage an archive in a different format
patool repack linux-2.6.33.tar.gz linux-2.6.33.tar.bz2

Website

See https://wummel.github.io/patool/ for more info and downloads.

API

You can use patool functions from other Python applications. Log output will be on sys.stdout and sys.stderr. On errors, PatoolError will be raised. Note that extra options or customization for specific archive programs are not supported.

import patoolib
patoolib.extract_archive("archive.zip", outdir="/tmp")
patoolib.test_archive("dist.tar.gz", verbosity=1)
patoolib.list_archive("package.deb")
patoolib.create_archive("/path/to/myfiles.zip", ("file1.txt", "dir/"))
patoolib.diff_archives("release1.0.tar.gz", "release2.0.zip")
patoolib.search_archive("def urlopen", "python3.3.tar.gz")
patoolib.repack_archive("linux-2.6.33.tar.gz", "linux-2.6.33.tar.bz2")
patoolib.is_archive("package.deb")

See https://wummel.github.io/patool/ for detailed API documentation.

Test suite status

Patool has extensive unit tests to ensure the code quality.

Bash completion

Install the argcomplete python package with apt-get install python3-argcomplete, then run eval "$(register-python-argcomplete patool)". After that typing patool, a <SPACE> and then <TAB> lists available options and commands.

patool's People

Contributors

a1346054, andriyor, benjaminwinger, c01o, careljonkhout, cyledoux, falon, jfcherng, kevinmatthe, parona-source, sanand0, sr-verde, trellixvulnteam, vitlav, wummel, yarikoptic, yuejiyueren


patool's Issues

patool not responding

if extension.lower() in ['.zip', '.rar']:
    patoolib.extract_archive(archive=doc['url'], outdir=os.path.join(os.getcwd(), temp_folder))
patool: Extracting C:\Temp\docmover_temp\temp\Afternoon Reports.rar ...
patool: running "C:\Program Files\7-Zip\7z.EXE" x -oC:\Temp\docmover_temp\temp -- "C:\Temp\docmover_temp\temp\Afternoon Reports.rar"

I noticed that patool is unresponsive when I run it from Python, because the files already exist at the output location.

If I take that command and run it in a command prompt, it asks me for an action: "would you like to overwrite the existing file?"

thanks,
Atti

Lack good documentation on how to use patool on Windows

I was looking for a general-purpose tool to extract files on a Windows server with Python and Django. After checking a couple of other libraries I tried patool.

After some research the tool looks good, but it really lacks good documentation. First of all, I couldn't install it on Windows by just doing

pip install patool
or
python setup.py install

To install it I had to copy patoolib into my working directory and then import patoolib, and this is not documented anywhere.

Secondly, patoolib supports TAR, ZIP, BZIP2 and GZIP, and even tar.gz, which is really impressive; however, it doesn't support any other format without helper applications.

Take RAR as an example. It is supported, but it needs a helper application, and there is no documentation on how to install that helper or where to set its path to get a working example.

I tried various things and finally entered the path of unrar.exe in the patoolib file to make it work. It works and it's great, but it really lacks documentation.

In my case, adding the path of unrar.exe in the patoolib file helped, i.e. in patoolib, in the ArchivePrograms = {} dictionary, the values are
ArchivePrograms = {
    'rar': {
        None: ('rar',),
        'extract': ('unrar',),
        'list': ('unrar', '7z'),
        'test': ('unrar', '7z'),
    },
}

change the values to
ArchivePrograms = {
    'rar': {
        None: ('rar',),
        'extract': (r'D:\UnRAR.exe', '7z'),
        'list': (r'D:\UnRAR.exe', '7z'),
        'test': (r'D:\UnRAR.exe', '7z'),
    },
}

where r'D:\UnRAR.exe' is the path of the Windows unrar application (a raw string, so the backslash is not read as an escape).

What I mean is that this should be documented somewhere.
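
Editing patoolib itself can probably be avoided. The following is a minimal sketch, assuming patool finds helper programs by searching PATH (the error messages quoted elsewhere on this page list candidate program names such as rar, unrar and 7z, which suggests a lookup by name). It checks which helper is visible and prepends the helper's folder to PATH for the current process. The names find_helper and add_to_path are made up for this illustration and are not part of patool.

```python
import os
import shutil


def find_helper(candidates, extra_dirs=()):
    """Return the full path of the first candidate program found, or None."""
    # Search the extra directories first, then the regular PATH.
    search_path = os.pathsep.join(list(extra_dirs) + [os.environ.get("PATH", "")])
    for name in candidates:
        found = shutil.which(name, path=search_path)
        if found:
            return found
    return None


def add_to_path(directory):
    """Prepend a directory to PATH for this process and its child processes."""
    os.environ["PATH"] = directory + os.pathsep + os.environ.get("PATH", "")
```

Calling add_to_path with the folder that holds UnRAR.exe before using patoolib would then make the helper visible without touching the installed package, provided the PATH assumption holds.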

patool and Python 3

From what I've read on PyPI, you don't support Python 3 (there are no classifiers for it). On the site you say that patool supports specific versions of Python.

Please update the classifiers.

patool cli not found

First of all thanks for patool. It is appreciated.

After following the installation instructions on the page below (pip install patool), 'pip list' shows that patool is installed: it returns 'patool (1.7)'. However, I am unable to run the command line interface with 'patool extract myfile.rar'. patool can't be found in the path. Where can I find it?

http://wummel.github.io/patool/

"patool repack a.tar.bz2 b.tar.gz" should not untar

For repacking if both archives have the same archiver but different compressions, only the compression should be changed (ie. from tar.bz2 to tar.gz).
It is not necessary to untar the complete archive.

Currently only TAR archives will need this.
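
The idea can be sketched with the standard library alone. This is not how patool implements repack; it only illustrates that changing the compression of a TAR archive needs a decompress-and-recompress stream, with no archive members unpacked to disk. The function name recompress_tar is hypothetical.

```python
import bz2
import gzip
import shutil


def recompress_tar(src_bz2, dst_gz, chunk_size=1024 * 1024):
    """Stream a .tar.bz2 into a .tar.gz without unpacking the tar members."""
    # The tar container passes through untouched; only the outer
    # compression layer is decoded and re-encoded, chunk by chunk.
    with bz2.open(src_bz2, "rb") as src, gzip.open(dst_gz, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
```

For example, recompress_tar("a.tar.bz2", "b.tar.gz") would cover the case named in the title.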

Create Archive Example

Hello,

I have a question about create_archive and filenames.

patoolib.create_archive("/path/to/myfiles.zip", ("file1.txt", "dir/"))

How do you compress a folder and all the sub-folders?
If you want a file to go into a specific location in the compress folder like:

'root_file.txt'
'/subfolder/sub_folder.txt'

How do you pass that into the create_archive?
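
One possible approach, sketched with Python's zipfile module rather than patool: the create_archive call shown above takes file and directory names only and does not show a way to choose where a member lands inside the archive, so this sketch sidesteps patool for that case. zip_with_layout and its layout mapping are hypothetical names used only here.

```python
import os
import zipfile


def zip_with_layout(zip_path, layout):
    """Write a ZIP where `layout` maps archive paths to files or folders on disk."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for arcname, source in layout.items():
            if os.path.isdir(source):
                # Walk the folder so that every sub-folder is included.
                for root, _dirs, files in os.walk(source):
                    for name in files:
                        full = os.path.join(root, name)
                        rel = os.path.relpath(full, source)
                        zf.write(full, os.path.join(arcname, rel))
            else:
                # A single file goes exactly where the mapping says.
                zf.write(source, arcname)
```

A call such as zip_with_layout("myfiles.zip", {"root_file.txt": "file1.txt", "subfolder": "dir/"}) would place file1.txt at the archive root under a new name and the contents of dir/ below subfolder/.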

Thank you

Return an iterator with list_archive

Hi,

First of all, great library, thank you.

I have a question regarding list_archive(). It lists the archive contents directly to stdout. To use the output, sys.stdout needs to be captured by some means of redirection and restoring.

Is it possible to perhaps have it return an iterator instead or in addition to the sys.stdout() output?

Edit: A tuple would be great too.
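
Until something like this exists in patool, a workaround for the formats Python can read by itself is to list members with the standard library. This is a sketch only: iter_archive_names is a hypothetical helper, it does not call patool, and it covers ZIP and TAR (including compressed TAR) but none of the formats that need helper applications.

```python
import tarfile
import zipfile


def iter_archive_names(path):
    """Yield member names of a ZIP or TAR archive (including compressed TARs)."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            yield from zf.namelist()
    elif tarfile.is_tarfile(path):
        with tarfile.open(path) as tf:
            for member in tf:
                yield member.name
    else:
        raise ValueError("unsupported archive format: %r" % (path,))
```

Wrapping the call in tuple(...) gives the tuple variant mentioned in the edit above.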

UnicodeEncodeError: 'utf-8' codec can't encode character '\udcfc'

Converting some old archive created once upon a time on an Amiga:

<class 'UnicodeEncodeError'> 'utf-8' codec can't encode character '\udcfc' in position 164: surrogates not allowed
Traceback (most recent call last):
  File "/tmp/pa/bin/patool", line 210, in main
    res = globals()["run_%s" % args.command](args)
  File "/tmp/pa/bin/patool", line 100, in run_repack
    patoolib.repack_archive(args.archive_src, args.archive_dst, verbosity=args.verbosity)
  File "/tmp/pa/lib64/python3.4/site-packages/patoolib/__init__.py", line 734, in repack_archive
    res = _repack_archive(archive, archive_new, verbosity=verbosity)
  File "/tmp/pa/lib64/python3.4/site-packages/patoolib/__init__.py", line 617, in _repack_archive
    _create_archive(archive, files, **kwargs)
  File "/tmp/pa/lib64/python3.4/site-packages/patoolib/__init__.py", line 507, in _create_archive
    run_archive_cmdlist(cmdlist, verbosity=verbosity)
  File "/tmp/pa/lib64/python3.4/site-packages/patoolib/__init__.py", line 409, in run_archive_cmdlist
    return util.run_checked(cmdlist, verbosity=verbosity, **runkwargs)
  File "/tmp/pa/lib64/python3.4/site-packages/patoolib/util.py", line 210, in run_checked
    retcode = run(cmd, **kwargs)
  File "/tmp/pa/lib64/python3.4/site-packages/patoolib/util.py", line 190, in run
    log_info("running %s" % " ".join(map(shell_quote_nt, cmd)))
  File "/tmp/pa/lib64/python3.4/site-packages/patoolib/util.py", line 502, in log_info
    print("patool:", msg, file=out)
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcfc' in position 164: surrogates not allowed
System info:
patool 1.8
Python 3.4.3 (default, Jul  1 2015, 18:38:11) 
[GCC 4.9.2] on linux
Local time: 2015-10-04 13:07:14+002
sys.argv ['/tmp/pa/bin/patool', 'repack', 'some-old-archive.lha', 'some-old-archive.tar.xz']
LANGUAGE = 'de_DE.UTF-8:de'
LC_CTYPE = 'de_DE.UTF-8'
LANG = 'de_DE.UTF-8'

I also tried LC_CTYPE=C LANGUAGE=C LANG=C LC_ALL=C '/tmp/pa/bin/patool' ... and LC_CTYPE=de.ISO-8859-1 LANGUAGE=de.ISO-8859-1 LANG=de.ISO-8859-1 LC_ALL=de.ISO-8859-1 '/tmp/pa/bin/patool' 'repack' with the same result.

But I was able to solve this by running:
LC_ALL=ISO-8859-1 '/tmp/pa/bin/patool'

Unfortunately I now have Latin-1 characters in the filenames of the new archive.

Perhaps some --input-encoding option would be useful.
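
For background, the '\udcfc' in the error is what Python produces when a file name contains the byte 0xFC (a Latin-1 "ü") and the name is decoded as UTF-8 with the surrogateescape error handler, which is how Python 3 decodes file names in a UTF-8 locale. The sketch below reproduces that and shows two ways of coping; the sample name and the helpers recover_name and printable are invented for illustration, and the assumption that the archive's names are Latin-1 comes from the report above.

```python
# Byte 0xFC is not valid UTF-8, so it is smuggled through as U+DCFC.
raw_name = b"M\xfcller.txt"
decoded = raw_name.decode("utf-8", "surrogateescape")


def recover_name(name, source_encoding="iso-8859-1"):
    """Turn a surrogate-escaped name back into bytes and decode it properly."""
    original_bytes = name.encode("utf-8", "surrogateescape")
    return original_bytes.decode(source_encoding)


def printable(name):
    """Make a name safe to print by writing lone surrogates as escapes."""
    return name.encode("utf-8", "backslashreplace").decode("utf-8")
```

printable() is the kind of step that would keep a log line from raising UnicodeEncodeError, while recover_name() is roughly what an input-encoding option would have to do.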

Can't gzip file when 7z is installed

$ patool --verbose create 1.gz 1
patool: Creating 1.gz ...
patool error: error creating 1.gz: 'module' object has no attribute 'create_gzip'

I guess it is due to the too-short list in programs/p7zip.py:
create_zip =
create_xz =
create_7z

Use patool as python module.

Hi,
Great project. One tool to rule them all. I was looking for a project like yours to integrate into my project. However, there are two doubts I would like you to clear up.

  1. Are any third-party tools or executables such as 7zip required to be installed on the host system? This is not mentioned anywhere in your documentation.
  2. What if I want to integrate patool into my project and simply use import patool? Again, this is not mentioned in your documentation. I am particularly interested in ISO extraction and CPIO creation and extraction.
    Bear with me if these questions sound stupid, as I am still on the learning curve.
    Regards.

patool doesn't create usable patool.bat/.exe on windows

running pip install patool (within virtualenv) does create patool python script (without .py extension) which is useless on windows since what is needed is an obscure .exe which would call into that script. So far the most straightforward way to generate such .exe files I found is to use machinery available within setuptools and which it provides for entry_points, see e.g. primitive setup.py of ours: https://github.com/yarikoptic/datalad/blob/nf-custom-remotes/setup.py#L30 where we tell it to create two cmdline scripts invoking two different cmdline handlers.

Unfortunately patool's setup.py is too evolved for me to provide any complete recipe here, if only I knew more why inno scripts etc.

Cheers

Need an overwrite mode

When extracting a RAR archive it asks:

a_file_name.txt already exists. Overwrite it ?
[Y]es, [N]o, [A]ll, n[E]ver, [R]ename, [Q]uit

Sometimes it needs to work non-interactively, with the ability to turn an overwrite mode on or off.

PyPI wheel format has incorrect patool shebang line

The source code for the patool script correctly has a shebang line of #!/usr/bin/env python

However, if you pip install patool in a clean environment and ensure you're using wheels rather than building from the github source, the shebang line is changed. Instead it now has #!/usr/bin/python which means if you're running in a virtualenv, the script tries to reference the system python which may not have patool installed.

You can verify the contents of the wheel on PyPI by doing the following:

  1. Go to https://pypi.python.org/pypi/patool/
  2. Download the listed wheel on that page. Available from this link: https://pypi.python.org/packages/43/94/52243ddff508780dd2d8110964320ab4851134a55ab102285b46e740f76a/patool-1.12-py2.py3-none-any.whl#md5=d8ce0b8cff19d7afeb38db4ec9cddb3c
  3. Unzip the wheel into a folder: unzip patool-1.12-py2.py3-none-any.whl
  4. Look at the patool script under patool-1.12.data/scripts
  5. You will see that it has the #!/usr/bin/python shebang instead of #!/usr/bin/env python.
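
The manual check above can also be scripted. A minimal sketch, assuming the wheel keeps its scripts under a "<name>-<version>.data/scripts/" folder as in step 4; script_shebangs is a hypothetical helper name.

```python
import zipfile


def script_shebangs(wheel_path):
    """Map each file under '*.data/scripts/' in a wheel to its first line."""
    result = {}
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            if ".data/scripts/" in name and not name.endswith("/"):
                with wheel.open(name) as script:
                    first = script.readline().decode("utf-8", "replace")
                    result[name] = first.rstrip()
    return result
```

Running it on the downloaded patool-1.12-py2.py3-none-any.whl should show which interpreter line the packaged patool script carries.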

make non-interactive mode less interactive

Hi,

First of all: Thanks for your awesome work on this API. It simplifies things a lot.

There is one issue I still need to solve, though. As you clearly state: patool does not support password encrypted archives. This is a valid design decision of the project. Nevertheless I think it should not fail because of password protected archives.

I use it in non-interactive mode like
patoolib.extract_archive("somefile.zip", outdir="/path/to/somewhere/", interactive=False)

and have to handle archives from an untrusted source. Some of those are password protected, I don't know which ones.

So what happens is that once I have to handle a password-protected archive, the non-interactive mode becomes quite interactive. Although I don't see the output from the unpack tools used in the backend, my program stalls and waits for input. It continues once I hit enter.

So I propose to stop trying to get passwords, at least in non-interactive mode. The used tools seem to support this, but so far I did not test all of them.

I created two password protected test archives to illustrate the issue and possible solutions.

unzip fails directly when given a wrong password:

$ unzip -v
UnZip 6.00 of 20 April 2009, by Debian. Original by Info-ZIP.

Latest sources and executables are at ftp://ftp.info-zip.org/pub/infozip/ ;
see ftp://ftp.info-zip.org/pub/infozip/UnZip.html for other sites.

Compiled with gcc 6.2.1 20161124 for Unix (Linux ELF).

[...]

$ unzip -P supersecret test.zip
Archive:  test.zip
   skipping: test.txt                incorrect password

unrar is quite interactive when being confronted with password protection:

$ unrar x -y test.rar

UNRAR 5.30 beta 2 freeware      Copyright (c) 1993-2015 Alexander Roshal

Enter password (will not be echoed) for test.rar:

Checksum error in the encrypted file test.rar. Corrupt file or wrong password.
No files to extract

We could try giving it some pseudo-password. Still interactive:

$ unrar x -p supersecret -y test.rar

Enter password (will not be echoed):

Reenter password:

ERROR: Passwords do not match

Enter password (will not be echoed):
[...]

But yeah, we can tell it to ignore password protection:

$ unrar x -p- -y test.rar

UNRAR 5.30 beta 2 freeware      Copyright (c) 1993-2015 Alexander Roshal

Checksum error in the encrypted file test.rar. Corrupt file or wrong password.
No files to extract

Installation and Usage of Patool on windows . import patoollib gives module not found error

Hi ,

I need to use patool and my test server is windows . When i try to install patool by :
pip install patool i get error

Hello i am from site pakages
Downloading/unpacking patool
Downloading patool-1.7.tar.gz (69kB): 69kB downloaded
Running setup.py egg_info for package patool
Hello i am from site pakages

warning: no files found matching 'doc*.tmpl'
Installing collected packages: patool
Running setup.py install for patool
Hello i am from site pakages
usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: -c --help [cmd1 cmd2 ...]
or: -c --help-commands
or: -c cmd --help

error: option --single-version-externally-managed not recognized
Complete output from command C:\Python2.7.5\python.exe -c "import setuptools
;file='c:\users\abcde\appdata\local\temp\pip_build_abcde\patool\se
tup.py';exec(compile(open(file).read().replace('\r\n', '\n'), file, 'exe
c'))" install --record c:\users\abcde\appdata\local\temp\pip-rbjtsv-record\inst
all-record.txt --single-version-externally-managed:
Hello i am from site pakages

usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]

or: -c --help [cmd1 cmd2 ...]

or: -c --help-commands

or: -c cmd --help

error: option --single-version-externally-managed not recognized

Cleaning up...
Command C:\Python2.7.5\python.exe -c "import setuptools;file='c:\users\pba
tra\appdata\local\temp\pip_build_abcde\patool\setup.py';exec(compile(open
(file).read().replace('\r\n', '\n'), file, 'exec'))" install --record c:
\users\abcde\appdata\local\temp\pip-rbjtsv-record\install-record.txt --single-v
ersion-externally-managed failed with error code 1 in c:\users\abcde\appdata\lo
cal\temp\pip_build_abcde\patool
Storing complete log in C:\Users\abcde\pip\pip.log

I even tried downloading the Windows binaries for patool named patool-0.17.exe and patool-1.7.exe; however, I am still not able to import patool, and even when I try to use it from pyunpack I get an error

  File "D:/unzipping_pyunpack.py", line 13, in <module>
    Archive(rfilename).extractall(rDir)
  File "C:\Python2.7.5\lib\site-packages\pyunpack\__init__.py", line 83, in extractall
    raise ValueError("no backend for archive file: %s (is patool installed?)" % str(self.filename))
ValueError: no backend for archive file: D:\a.rar (is patool installed?)

Can you advise how this can be fixed?

pyunpack not working from cron (python3)

Works from manual start, not working from crontab

patool: error: unrecognized arguments: --non-interactive
Traceback (most recent call last):
  File "/opt/ip_location/ip_location.py", line 183, in <module>
    main()
  File "/opt/ip_location/ip_location.py", line 91, in main
    Archive(curDownloadedBase).extractall(temporaryDir)
  File "/usr/local/lib/python3.4/dist-packages/pyunpack/__init__.py", line 90, in extractall
    self.extractall_patool(directory, patool_path)
  File "/usr/local/lib/python3.4/dist-packages/pyunpack/__init__.py", line 62, in extractall_patool
    raise PatoolError('patool can not unpack\n' + str(p.stderr))
pyunpack.PatoolError: patool can not unpack
usage: patool [-h] [--verbose] {extract,list,create,test,repack,diff,search,formats} ...
patool: error: unrecognized arguments: --non-interactive

TypeError: extract_archive() got an unexpected keyword argument 'interactive'

<type 'exceptions.TypeError'> extract_archive() got an unexpected keyword argument 'interactive'
Traceback (most recent call last):
  File "/opt/django/gc2/env/bin/patool", line 213, in main
    res = globals()["run_%s" % args.command](args)
  File "/opt/django/gc2/env/bin/patool", line 33, in run_extract
    patoolib.extract_archive(archive, verbosity=args.verbosity, interactive=args.interactive, outdir=args.outdir)
TypeError: extract_archive() got an unexpected keyword argument 'interactive'
System info:
Patool 1.4
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Local time: 2016-01-22 12:42:31+002
sys.argv ['/opt/django/gc2/env/bin/patool', 'extract', '/opt/django/gc2/gc2/media/files/storage1/logged-users/2/2016-01-22_12-42-25/in/test.tar.gz', '--outdir=/opt/django/gc2/gc2/media/files/storage1/logged-users/2/2016-01-22_12-42-25/in']
LANGUAGE = 'en_US'
LC_ALL = 'en_US.UTF-8'
LANG = 'en_US.UTF-8'

******** Patool internal error, over and out ********

Make bootable iso

I was wondering if it's possible to create a bootable iso with patool out of the box. If not, how can someone add custom flags to a command (ie. for genisoimage)?

Support for the package unar

File Roller has switched to unar instead of unrar-nonfree to extract RAR files, since unar has very good support for the different RAR formats, besides being released under a free license. patool fails to extract RAR files on my system, since I don't have unrar installed on my system (of course I can install unrar, but not depending on non-free software is the issue at hand).

Could not find an executable program to extract fomar rar

Hello,
I'm trying to use patool to open a RAR file.

I've installed patool using pip (patool 1.12)

But using:

import patoolib
patoolib.extract_archive("archive.zip", outdir="/tmp")

I get the error:
PatoolError: could not find an executable program to extract format rar; candidates are (rar,unrar,7z)

I'm using Python 2.7 on a Mac with OS X El Capitan

What am I missing? I couldn't find any tutorials or help for that.
Thanks for any help

uncompressed nested compress file

This is my compressed file structure.

mainzip is the primary file; mainfileA1.zip and mainfileA2.zip are the two files inside the mainzip file.

I want to extract the main zip file together with all the inner zip files.

mainzip.zip

  • mainfileA1.zip

  • mainfileA2.zip

Kindly suggest the solution.
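
One way to approach this for ZIP files, sketched with the standard library instead of patool: extract the outer archive, then keep extracting any .zip member that appears. extract_nested is a hypothetical helper; each inner archive is unpacked into a folder named after it, which is a design choice of this sketch and not something patool prescribes. Archives from untrusted sources can nest very deeply, so a real version would want a depth or size limit.

```python
import os
import zipfile


def extract_nested(zip_path, outdir):
    """Extract a ZIP, then every .zip found inside it, until none are left."""
    pending = [(zip_path, outdir)]
    while pending:
        archive, target = pending.pop()
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
            names = zf.namelist()
        for name in names:
            inner = os.path.join(target, name)
            if name.lower().endswith(".zip") and zipfile.is_zipfile(inner):
                # Unpack each inner archive into a folder named after it.
                pending.append((inner, os.path.splitext(inner)[0]))
```

With the layout above, extract_nested("mainzip.zip", "out") would leave out/mainfileA1/ and out/mainfileA2/ next to the two inner .zip files.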

make test error

When I tried packaging patool, an error occurred.

patoolib/__init__.py:446:


filename = '/home/rpmaker/rpmbuild/BUILD/patool-1.2/tests/data/t.txt.lzma.foo'

def get_archive_format (filename):
    """Detect filename archive format and optional compression."""
    mime, compression = util.guess_mime(filename)
    if not (mime or compression):
      raise util.PatoolError("unknown archive format for file `%s'" % filename)

E PatoolError: unknown archive format for file `/home/rpmaker/rpmbuild/BUILD/patool-1.2/tests/data/t.txt.lzma.foo'

patoolib/__init__.py:277: PatoolError

Wrong issue-URL in error message when requesting bugreport

The URL in this error message is wrong:

You have found an internal error in patool. Please write a bug report
at http://wummel.github.io/patool/issues and include at least the information below:

Either add a redirect page to wummel.github.io (github is able to do this somehow) or correct that URL.

a plethora of tests failing on Windows

not that I like to deal with Windows -- I thought to use patool as a panacea for dealing with archives across platforms, thus was running our tests across platforms as well. But then in attempt to troubleshoot a recent failure on our end decided to run patool's tests on that windows VM -- oh well -- wasn't pleasant. Windows kept popping up with some dialogs about crashed bzip2 etc. Here you can find the total dump of the 'make test' run on that VM: http://www.onerussian.com/tmp/patool-make-test-windows.log cmdline extractors are available as they came with git/git-annex.

No module named programs.p7zip

I am using windows 7 Enterprise, python 2.7.8, patool 1.7.

When running from cmd:
c:\temp>patool test test.cab
patool: Testing test.cab ...
patool: running "C:\Program Files (x86)\7-Zip\7z.EXE" t -- test.cab
patool: ... tested ok.

But when running from within python script:
patoolib.test_archive('test.cab')
I get the following back:
patool: Testing test.cab ...
No module named programs.p7zip

How to extract .zip file which requires password?

Command

In [18]: patoolib.extract_archive('c:/test.zip')
patool: Extracting c:/test.zip ...

Error message

PatoolError: error extracting c:/temp.zip: File <zipfile.ZipInfo object at 0x04E2DB28> is encrypted, password required for extraction

Using zipfile directly works, but I would prefer to use patool:

import zipfile

# `password` is defined elsewhere; on Python 3 it must be a bytes object.
with zipfile.ZipFile('c:/temp.csv', 'r') as zf:
    zf.setpassword(password)
    zf.extractall()

support for "split" archives

Quite often people split large archives into multiple files, e.g. due to restrictions of file systems or hosting providers. Sample schemes I found that would need support:

(git)smaug:/mnt/btrfs/datasets/datalad/crawl/crcns/hc-2[incoming]ec014.277
$> ls -L ec014.277.dat.tar.gz-*
ec014.277.dat.tar.gz-a  ec014.277.dat.tar.gz-b  ec014.277.dat.tar.gz-c  ec014.277.dat.tar.gz-d

$> file -L ec014.277.dat.tar.gz-*
ec014.277.dat.tar.gz-a: gzip compressed data, last modified: Wed Jan 12 02:47:03 2011, from Unix
ec014.277.dat.tar.gz-b: data
ec014.277.dat.tar.gz-c: data
ec014.277.dat.tar.gz-d: data
$> ls -L neurovault_snapshot_2018_01_17.zip.00*
neurovault_snapshot_2018_01_17.zip.001  neurovault_snapshot_2018_01_17.zip.002  neurovault_snapshot_2018_01_17.zip.003

$> file -L neurovault_snapshot_2018_01_17.zip.00*
neurovault_snapshot_2018_01_17.zip.001: Zip archive data, at least v2.0 to extract
neurovault_snapshot_2018_01_17.zip.002: data
neurovault_snapshot_2018_01_17.zip.003: data

so it is just a matter of either concatenating into a big file (if possible) or piping them all to a single extraction process. .00* suffix splitting seems to be quite common in the .zip world, and some tools seem to support it "natively":

$> 7z x neurovault_snapshot_2018_01_17.zip.001
...
Extracting archive: neurovault_snapshot_2018_01_17.zip.001
--
Path = neurovault_snapshot_2018_01_17.zip.001
Type = Split
Physical Size = 4290772992
Volumes = 3
Total Physical Size = 11130414625
----
Path = neurovault_snapshot_2018_01_17.zip
Size = 11130414625
--
Path = neurovault_snapshot_2018_01_17.zip
Type = zip
Physical Size = 11130414625
64-bit = +

Everything is Ok

Folders: 341
Files: 12148
Size:       11128089339
Compressed: 11130414625
...

so seems to extract all 3. patool (1.12-3 in debian) unfortunately doesn't know how to handle them:

$> patool extract neurovault_snapshot_2018_01_17.zip.001
patool: Extracting neurovault_snapshot_2018_01_17.zip.001 ...
patool error: error extracting neurovault_snapshot_2018_01_17.zip.001: unknown archive format for file `neurovault_snapshot_2018_01_17.zip.001'
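
The concatenation route mentioned above can be sketched as follows, assuming the parts are plain byte-wise slices of one archive (as the file(1) output suggests, where only the first part has a recognizable header). join_parts is a hypothetical helper; the caller is responsible for passing the parts in the right order.

```python
import shutil


def join_parts(parts, dest):
    """Concatenate split-archive parts, in the given order, into one file."""
    with open(dest, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Copy in chunks so that multi-gigabyte parts never sit in memory.
                shutil.copyfileobj(src, out)
    return dest
```

Something like join_parts(sorted(glob.glob("neurovault_snapshot_2018_01_17.zip.00*")), "neurovault_snapshot_2018_01_17.zip") would produce a single file that can then be handed to patool extract; this needs as much free disk space as the whole archive.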

large zip -- use jar to extract? unzip uses -v (list files) while extracting?

from time to time we encounter some large .zip files so I guess some version of ZIP64 support is enabled/used. e.g. now got 16GB zip

patool tries to use 7z on those but that one blows right away without even trying to extract anything. I have tried to trick it with

patoolib.ArchivePrograms['zip'].pop(None)

to cause it to use unzip. But that one also blows right away on unzip -v -- file -d outdir command, and -v is apparently a flag not to be verbose, but to cause listing of the files in the archive. So I wondered if that is correct behavior.

Well -- even without '-v' unzip also doesn't work fully correctly -- just extracts some first 4GB of data and then exits with -2.

This was on a Debian jessie box, tried also with patool straight from git.

So overall question is -- have you encountered large zip files patool (with the tools employed) could not extract? ;)

patool should avoid conflating its diagnostic logging with target output of the commands

ATM all of the output is simply provided to stdout, and I don't see any '--quiet' option which would prevent any internal output from patool (atm on 1.10 from debian), so

  1. situation could be as confusing as
$> patool list package_maker.tar.gz 2>/dev/null
patool: Listing package_maker.tar.gz ...
patool: running /bin/tar --list -z --file package_maker.tar.gz
patool: Listing
patool: Listing package_maker.tar.gz ...

(no one forbids files like patool: Listing in the archive ;) )

  2. at the level of the Python module, one either has to monkey-patch the logging facilities or run patool commands while swallowing all stdout to avoid unwanted printouts to stdout (see e.g. our ugly attempts to mitigate here: https://github.com/datalad/datalad/blob/master/datalad/support/archives.py#L34)

IMHO it would be sweet if patool just used Python's standard logging module for logging and possibly even directed its diagnostic output to stderr (as e.g. git does), while commands such as list used stdout solely for their output.

It would also be neat to have e.g. a --quiet option that wouldn't spit out any logging unless there is an error
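
To make the proposal concrete, here is a small sketch of that separation using only the logging module. It is not patool code: the logger name "archive-sketch", configure_logging and list_members are all invented for the illustration.

```python
import logging
import sys

log = logging.getLogger("archive-sketch")


def configure_logging(quiet=False):
    """Send diagnostics to stderr; with quiet=True only errors are shown."""
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    log.handlers[:] = [handler]
    log.setLevel(logging.ERROR if quiet else logging.INFO)
    log.propagate = False


def list_members(archive_name, members, out=None):
    """Log progress on stderr and write only member names to `out` (stdout)."""
    if out is None:
        out = sys.stdout
    log.info("Listing %s ...", archive_name)
    for member in members:
        print(member, file=out)
```

With this split, redirecting stderr (2>/dev/null) leaves only the member names on stdout, so a member that happens to be called "patool: Listing" can no longer be confused with a diagnostic line.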

Cheers and thanks!

patool fails to find 7-Zip on windows 64bit

When running 32 bit python on Windows AMD64, patool does not properly add the 7-zip directory to the path when looking for programs to extract with.

Code should be updated to use the proper key opening when the user is on AMD64 windows. Fixed function attached.
updated.zip

Means of programmatically overwriting pre-existing files?

Hi,

I'm using patool to programmatically unarchive a variety of archivables by calling it via subprocess in python. Here's my basic code:

def unarchive(file_path, file_dir):
  """
  Use patool for universal extraction.
  """
  args = ['patool', 'extract', file_path, '--outdir', file_dir]
  p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  (out, err) = p.communicate()
  return out, err

However, in some cases if there are duplicate files in an archive, patool will ask for user input as how to proceed:

EO-926055986-N81-2014/CAT. No. EO-926055986-N81-2014       14-PL-0179.xlsx already exists. Overwrite it ?
[Y]es, [N]o, [A]ll, n[E]ver, [R]ename, [Q]uit A

Unfortunately, this means that the subprocess just hangs without proceeding and p.communicate() never completes. I was hoping there might be a way to add a command line flag to pass in a default response to this prompt (somewhat like -y for sudo apt-get install).
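
Independent of any patool option, the hang itself can usually be avoided on the calling side. A minimal sketch, assuming the helper program reads its answer from standard input: give the child no stdin and add a timeout as a safety net. run_noninteractive is a hypothetical helper; tools that read prompts directly from the terminal device instead of stdin would still need the timeout to get unstuck.

```python
import subprocess


def run_noninteractive(args, timeout=300):
    """Run a command that cannot read from the terminal and cannot hang forever.

    Returns (returncode, stdout, stderr); returncode is None on timeout.
    """
    try:
        done = subprocess.run(
            args,
            stdin=subprocess.DEVNULL,   # any prompt immediately sees end-of-file
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,            # safety net if the tool still blocks
        )
    except subprocess.TimeoutExpired as exc:
        return None, exc.stdout or b"", exc.stderr or b""
    return done.returncode, done.stdout, done.stderr
```

The unarchive() function above could call run_noninteractive(['patool', 'extract', file_path, '--outdir', file_dir]) and treat a non-zero or None return code as "needs manual attention". How the helper reacts to end-of-file on a prompt (skip, abort, or overwrite) depends on the helper.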

util.p7zip_supports_rar() absolute path

The util.p7zip_supports_rar() function uses an absolute path in /usr/lib/ to search for the rar codec, this is not appropriate for all systems.

Personally I am on a Mac with p7zip installed through Homebrew.

The easy fix would be to at least search /usr/local/lib/ too.

Allow passwords on non-interactive

I need an ability to provide passwords on non-interactive runs of patool.

I don't actually need this on the CLI, only through the programmatic API. Essentially, I need the functions patoolib.extract_archive and patoolib.create_archive to take an optional password parameter. If the password parameter is not None, the provided password is used for extraction or creation.

I'm more than willing to implement this myself.

Is this a feature you would take a pull request for? If so, do you want to expose this feature via a cli option and how do you want to handle archive formats that don't support a password? My plan was to log a warning if a password is passed to an unsupported format. If the capability is exposed via cli, I was going to explicitly state in the option the list of formats that support a password.

Failed with list or unpack rar archive with 7z on x86_64 system

$ patool --verbose list a1.rar
patool: Listing a1.rar ...
patool error: error listing a1.rar: could not find an executable program to list format rar; candidates are (rar,unrar,7z),

$ strace -f patool list a1.rar
[pid 381328] stat("/usr/bin/7z", {st_mode=S_IFREG|0755, st_size=459000, ...}) = 0
[pid 381328] access("/usr/bin/7z", X_OK) = 0
[pid 381328] stat("/usr/bin/7z", {st_mode=S_IFREG|0755, st_size=459000, ...}) = 0
[pid 381328] stat("/usr/lib/p7zip/Codecs/Rar29.so", 0x7fff8a10a120) = -1 ENOENT (No such file or directory)
[pid 381328] stat("/usr/local/lib/p7zip/Codecs/Rar29.so", 0x7fff8a10a120) = -1 ENOENT (No such file or directory)
[pid 381328] write(2, "patool error:", 13patool error:) = 13

Real path on x86_64 is /usr/lib64/p7zip/Codecs/Rar29.so
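One way to avoid a single hard-coded path is to probe several library directories in turn. This is a minimal sketch, not patool's implementation: the function name find_rar_codec and the CANDIDATE_LIBDIRS tuple are assumptions, and the directory list only covers the locations mentioned in these reports.

```python
import os

# Assumed candidate library directories; illustrative, not exhaustive.
CANDIDATE_LIBDIRS = ("/usr/lib", "/usr/lib64", "/usr/local/lib")


def find_rar_codec(libdirs=CANDIDATE_LIBDIRS, codec="Rar29.so"):
    """Return the first existing path to the p7zip RAR codec, or None."""
    for libdir in libdirs:
        candidate = os.path.join(libdir, "p7zip", "Codecs", codec)
        if os.path.isfile(candidate):
            return candidate
    return None
```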

don't store full absolute paths in a tar archive

When patool is used to create a tar archive, for example
$ patool create some.tar /path/to/dir
the resulting tarball contains the full path/to/dir path.
With other archive types (zip, 7z, etc.) this works as expected: only the last path component is stored in the archive.
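For comparison, the behaviour asked for here can be sketched with Python's standard tarfile module by passing an explicit arcname. The function name create_tar is hypothetical; this is not how patool builds tar archives.

```python
import os
import tarfile


def create_tar(archive, path):
    """Create a tar archive storing `path` under its last component only."""
    # normpath() strips a trailing separator so basename() is never empty.
    arcname = os.path.basename(os.path.normpath(path))
    with tarfile.open(archive, "w") as tar:
        # arcname replaces the leading directories of every added member.
        tar.add(path, arcname=arcname)
    return arcname
```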

Not extracting archives with Japanese names

I'm getting the following error on files with Japanese characters.

Here are examples of the names that cause these errors:

(C86) [C-CLAYS] - 番ヒノ翼 -ツガヒノツバサ-.7z
(C92) 幽閉 少女 ア ク テ ィ ブ NEETs - 第 3 幕 協奏曲 「華 鳥 風月」 SIDE B [KNCD-0015].rar

If I rename these archives to a.7z, no error occurs, but the folder names are crucial for my script. Is there any way to fix this?

I can send you these archives if you want.

error: patool can not unpack


********** Oops, I did it again. *************

You have found an internal error in patool. Please write a bug report
at https://github.com/wummel/patool/issues/ and include at least the information below:

Not disclosing some of the information below due to privacy reasons is ok.
I will try to help you nonetheless, but you have to give me something
I can work with ;) .

<class 'UnicodeEncodeError'> 'charmap' codec can't encode characters in position 47-50: character maps to <undefined>
Traceback (most recent call last):
  File "C:\Users\S\AppData\Local\Programs\Python\Python36-32\Scripts\patool", line 213, in main
    res = globals()["run_%s" % args.command](args)
  File "C:\Users\S\AppData\Local\Programs\Python\Python36-32\Scripts\patool", line 33, in run_extract
    patoolib.extract_archive(archive, verbosity=args.verbosity, interactive=args.interactive, outdir=args.outdir)
  File "C:\Users\S\AppData\Local\Programs\Python\Python36-32\lib\site-packages\patoolib\__init__.py", line 683, in extract_archive
    util.log_info("Extracting %s ..." % archive)
  File "C:\Users\S\AppData\Local\Programs\Python\Python36-32\lib\site-packages\patoolib\util.py", line 516, in log_info
    print("patool:", msg, file=out)
  File "C:\Users\S\AppData\Local\Programs\Python\Python36-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 47-50: character maps to <undefined>
System info:
patool 1.12
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:04:45) [MSC v.1900 32 bit (Intel)] on win32
Local time: 2017-12-23 17:25:54+003
sys.argv ['C:\\Users\\S\\AppData\\Local\\Programs\\Python\\Python36-32\\Scripts\\patool', '--non-interactive', 'extract', 'D:\\New folder (2)\\(C86) [C-CLAYS] - \u756a\u30d2\u30ce\u7ffc -\u30c4\u30ac\u30d2\u30ce\u30c4\u30d0\u30b5-.7z', '--outdir=D:\\New folder (2)\\tmp']
LANG = 'en_US.UTF-8'

 ******** patool internal error, over and out ********
error: patool can not unpack


********** Oops, I did it again. *************

You have found an internal error in patool. Please write a bug report
at https://github.com/wummel/patool/issues/ and include at least the information below:

Not disclosing some of the information below due to privacy reasons is ok.
I will try to help you nonetheless, but you have to give me something
I can work with ;) .

<class 'UnicodeEncodeError'> 'charmap' codec can't encode characters in position 35-36: character maps to <undefined>
Traceback (most recent call last):
  File "C:\Users\S\AppData\Local\Programs\Python\Python36-32\Scripts\patool", line 213, in main
    res = globals()["run_%s" % args.command](args)
  File "C:\Users\S\AppData\Local\Programs\Python\Python36-32\Scripts\patool", line 33, in run_extract
    patoolib.extract_archive(archive, verbosity=args.verbosity, interactive=args.interactive, outdir=args.outdir)
  File "C:\Users\S\AppData\Local\Programs\Python\Python36-32\lib\site-packages\patoolib\__init__.py", line 683, in extract_archive
    util.log_info("Extracting %s ..." % archive)
  File "C:\Users\S\AppData\Local\Programs\Python\Python36-32\lib\site-packages\patoolib\util.py", line 516, in log_info
    print("patool:", msg, file=out)
  File "C:\Users\S\AppData\Local\Programs\Python\Python36-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 35-36: character maps to <undefined>
System info:
patool 1.12
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:04:45) [MSC v.1900 32 bit (Intel)] on win32
Local time: 2017-12-23 17:25:54+003
sys.argv ['C:\\Users\\S\\AppData\\Local\\Programs\\Python\\Python36-32\\Scripts\\patool', '--non-interactive', 'extract', 'D:\\New folder (2)\\(C92) \u5e7d\u9589 \u5c11\u5973 \u30a2 \u30af \u30c6 \u30a3 \u30d6 NEETs - \u7b2c 3 \u5e55 \u5354\u594f\u66f2 \u300c\u83ef \u9ce5 \u98a8\u6708\u300d SIDE B [KNCD-0015].rar', '--outdir=D:\\New folder (2)\\tmp']
LANG = 'en_US.UTF-8'

 ******** patool internal error, over and out ********
PS D:\New folder (2)>
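Both tracebacks end in print() failing because the console encoding (cp1252) cannot represent the Japanese characters in the archive name. A possible workaround, sketched below under the assumption that escaping is acceptable in log output, is to escape unencodable characters before printing. The function name log_info_safe is hypothetical and is not patool's util.log_info.

```python
import sys


def log_info_safe(msg, out=None):
    """Print msg, escaping characters the stream's encoding cannot represent."""
    out = sys.stdout if out is None else out
    encoding = getattr(out, "encoding", None) or "utf-8"
    # Round-trip through the target encoding: unencodable characters
    # become backslash escapes instead of raising UnicodeEncodeError.
    safe = msg.encode(encoding, errors="backslashreplace").decode(encoding)
    print("patool:", safe, file=out)
```

Only the log message is escaped; the archive path handed to the helper program is left unchanged, so the extracted folder names are not affected.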

GZipped rar file fails using `unrar` and `rar`

I'm on Ubuntu 16.04.

I have the gzipped rar file archive.rar.gz created by:

echo "Hi" > text.txt
rar a archive.rar text.txt
gzip archive.rar

If unrar is installed, patool fails.

> sudo apt-get remove rar
> sudo apt-get install unrar
> patool extract archive.rar.gz
patool: Extracting archive.rar.gz ...
patool: running /usr/bin/unrar x -- /home/cledoux/Downloads/archive.rar.gz
patool:     with cwd='./Unpack_eRzOXK'
patool error: error extracting archive.rar.gz: Command `['/usr/bin/unrar', 'x', '--', '/home/cledoux/Downloads/archive.rar.gz']' returned non-zero exit status 10

If rar is installed, same problem.

> sudo apt-get remove unrar
> sudo apt-get install rar
> patool extract archive.rar.gz
patool: Extracting archive.rar.gz ...
patool: running /usr/bin/rar x -- /home/cledoux/Downloads/archive.rar.gz
patool:     with cwd='./Unpack_AX8Xbs'
patool error: error extracting archive.rar.gz: Command `['/usr/bin/rar', 'x', '--', '/home/cledoux/Downloads/archive.rar.gz']' returned non-zero exit status 10

The problem is that patool is detecting gzip as an encoding and rar as the compression format. patool doesn't do anything to strip the gzip layer, however, and the unrar tools don't know how to handle the gzip layer.
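Stripping the gzip layer before handing the file to unrar or rar could be done with the standard library, roughly as in this sketch. The function name strip_gzip_layer is hypothetical, and where patool would call such a step is not addressed here.

```python
import gzip
import os
import shutil


def strip_gzip_layer(path, outdir):
    """Decompress `path` (e.g. archive.rar.gz) into outdir; return the new path."""
    name = os.path.basename(path)
    if name.endswith(".gz"):
        name = name[:-3]
    target = os.path.join(outdir, name)
    # Stream the data so large archives are not read into memory at once.
    with gzip.open(path, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

The returned path (archive.rar in the example above) can then be passed to the rar tools as usual.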

The API doesn't seem to be python-friendly

Let's consider the API to list the content of an archive:

patoolib.list_archive("archive.zip")

This statement "prints" the list of files to the screen. How can I use that list? Why don't we just return a list of strings? Like this:

>>> patoolib.list_archive("package.deb")
["file.txt", "picture.jpg"]
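For the formats Python can read natively, a list-returning function is straightforward. The sketch below covers only ZIP and TAR through the standard library; the name list_members is hypothetical, and formats such as .deb that need helper programs would require parsing the helper's output instead.

```python
import tarfile
import zipfile


def list_members(archive):
    """Return the member names of a ZIP or TAR archive as a list of strings."""
    if zipfile.is_zipfile(archive):
        with zipfile.ZipFile(archive) as zf:
            return zf.namelist()
    if tarfile.is_tarfile(archive):
        with tarfile.open(archive) as tf:
            return tf.getnames()
    raise ValueError("unsupported archive format: %s" % archive)
```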

Allow Cygwin tar to extract full Windows paths

When we run patoolib.extract_archive('D:/file.tar') we get the following error on Windows:

>>> patoolib.extract_archive('D:/file.tar')
patool: Extracting D:/file.tar ...
patool: running D:\cygwin64\bin\tar.EXE --extract -z --file D:/file.tar --directory .\Unpack_8xxk9ccq
tar (child): Cannot connect to D: resolve failed

This can be resolved by adding a --force-local to the tar command. Can we assume that most people would use patool for local files only and add this to the command list?

Happy to send a merge request if you don't mind @wummel
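The proposed change amounts to adding one option when the tar command line is built. A minimal sketch follows, assuming GNU tar (which provides --force-local). The function name tar_extract_command and its force_local parameter are hypothetical and do not mirror patool's internal signatures.

```python
def tar_extract_command(cmd, archive, outdir, force_local=True):
    """Build an argument list for extracting an archive with GNU tar."""
    cmdlist = [cmd, "--extract"]
    if force_local:
        # GNU tar: treat the archive name as a local file even if it
        # contains a colon, as in a Windows drive letter such as D:/file.tar.
        cmdlist.append("--force-local")
    cmdlist.extend(["--file", archive, "--directory", outdir])
    return cmdlist
```

Keeping the option behind a parameter leaves room for tar implementations that do not accept --force-local.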

patool on Windows XP results in error

Thanks for your promising tool. Alas, I met a problem:

C:\hih>patool -v test myfile.rar
patool: Testing myfile.rar ...
patool error: error testing myfile.rar: No module named programs.p7zip

C:\hih>patool -v test yourfile.zip
patool: Testing yourfile.zip ...
patool error: error testing yourfile.zip: No module named programs.p7zip

Extract and repack fail as well.

Windows XP, SP3; 7zip is installed. Patool installed (and uninstalled) via pip, now installed using the windows binary. Version 1.2. Happy to supply you with any additional information to get the issue fixed!

ACE support in windows

Hi,
we needed to open an ACE archive on a Windows platform, but we ran into some problems.
Specifically, we had to install PeaZip with the ACE add-on (in order to have unace.exe) and then edit the extract_ace function in unace.py.

Default code output was:

>>> x = patoolib.extract_archive("pippo.ace", verbosity=1, outdir=".")
patool: Extracting pippo.ace ...
patool: running "C:\Program Files\PeaZip\res\unace\unace.exe" x pippo.ace ./
Info: creating file list

Error: no files specified
Info: finished creating file list

ActiveACE operation return code: 2
patool: ... pippo.ace extracted to .

We had to change the x flag to e (extract into the current folder) and remove the output folder argument:

def extract_ace (archive, compression, cmd, verbosity, interactive, outdir):
    """Extract an ACE archive."""
    return [cmd, 'e', archive]

>>> x = patoolib.extract_archive("pippo.ace", verbosity=1, outdir=".")
patool: Extracting pippo.ace ...
patool: running "C:\Program Files\PeaZip\res\unace\unace.exe" e pippo.ace
Info: creating file list
Info: adding file to file list
Info: finished creating file list
Extracting pippo.ace

  Extracting pippo.txt  (12 byte uncompressed, 12 byte compressed)
    processed: 0 of 12 bytes (0 of 12 bytes)
    processed: 0 of 12 bytes (0 of 12 bytes)
    processed: 0 of 12 bytes (0 of 12 bytes)
    processed: 12 of 12 bytes (12 of 12 bytes)
   CRC OK

ActiveACE operation return code: 0
patool: ... pippo.ace extracted to `.'.

Is there a better way to proceed?

Execution of 7z binary hangs for existing output files

7z prompts the user to confirm whether an existing file should be overwritten. This causes the process to hang, since the user is unable to supply a response to the 7z process. The program supports the -y option, which answers yes to all prompts and so overwrites all existing output files. patool should supply -y whenever the 7z binary is used, to make the behavior consistent with other tools such as gunzip and tar.
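A sketch of what supplying -y might look like when the command line is assembled. The function names and the interactive parameter are hypothetical, not patool's own code; -y, -o and x are existing 7z options. Closing stdin is an extra precaution so that any remaining prompt cannot block on input.

```python
import subprocess


def p7zip_extract_command(cmd, archive, outdir, interactive=False):
    """Build a 7z extraction command; answer yes to prompts unless interactive."""
    cmdlist = [cmd, "x"]
    if not interactive:
        # -y: assume "yes" on all queries, e.g. overwrite confirmations.
        cmdlist.append("-y")
    cmdlist.extend(["-o%s" % outdir, "--", archive])
    return cmdlist


def run_noninteractive(cmdlist):
    """Run a command with stdin detached so it cannot wait for user input."""
    return subprocess.run(
        cmdlist, stdin=subprocess.DEVNULL, capture_output=True, text=True
    )
```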
