mafintosh / tar-fs
fs bindings for tar-stream
License: MIT License
I tried to understand your documentation, and I looked a little at the source code but I cannot understand how I would use your library to add files to a tarball with specific user, group, file, and directory permissions.
I am trying to create a tarball where certain directories have certain permissions, certain files have certain permissions and certain files are owned by a particular user and group.
Is this possible with your library?
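One way to approach this, assuming tar-fs's documented `map(header)` pack option (which lets you rewrite each entry's header before it is written): set `mode`, `uid`, and `gid` per entry. The specific paths, modes, and ids below are hypothetical examples, not values from this thread.

```javascript
// Sketch: rewrite each entry's header via tar-fs's `map` pack option.
// All modes/ids here are hypothetical examples.
function applyOwnership(header) {
  if (header.type === 'directory') {
    header.mode = 0o750 // rwxr-x--- for directories
  } else if (header.name.endsWith('.sh')) {
    header.mode = 0o755 // executable scripts
  } else {
    header.mode = 0o644 // regular files
  }
  header.uid = 1000 // hypothetical user id
  header.gid = 1000 // hypothetical group id
  return header
}

// Usage (untested sketch):
// const tar = require('tar-fs')
// const fs = require('fs')
// tar.pack('./src', { map: applyOwnership }).pipe(fs.createWriteStream('out.tar'))
```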
I can't get the "DONE!" to print out here (even though tar-fs successfully extracts the whole tar):
fs.createReadStream('./my-file.tar')
  .pipe(tar.extract('./my-dir'))
  .on('error', function(err) {
    console.error('ERROR', err);
  })
  .on('end', function() {
    console.log('DONE!');
  });
I've also tried listening for finish instead of end (as I saw that name used in your tests), but that doesn't work either. Not sure what I'm doing wrong.
I'm working on a project that watches for new files in a directory, packs them via tar-fs, and then transfers them to a remote server. Because I know exactly which files to pack, it's not necessary to traverse the whole directory. I think it may be useful to support a whitelist in this scenario.
If you need, I can submit a pr of this feature.
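For reference, tar-fs's documented `entries` pack option already behaves as a whitelist: only the listed paths are read, without traversing the rest of the directory. A minimal sketch, with hypothetical file names:

```javascript
// Sketch: build pack options from a list of changed files, so only those
// entries are packed via tar-fs's documented `entries` whitelist option.
function whitelistOptions(changedFiles) {
  return { entries: changedFiles.slice() } // copy to avoid later mutation
}

// Usage (hypothetical paths):
// const tar = require('tar-fs')
// const fs = require('fs')
// tar.pack('./watched-dir', whitelistOptions(['a.txt', 'sub/b.txt']))
//    .pipe(fs.createWriteStream('changed-files.tar'))
```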
After the pack succeeds, the tar is corrupt (it cannot be opened, and the file size is 0 bytes).
Vagrant, and more specifically VirtualBox and vboxfs, don't allow creating hard links, so I propose adding an option to create a new file if we get an EPERM when trying to extract a hard link, so that the prebuild and prebuild-install modules could work correctly inside Vagrant. I think the key line is at
Line 250 in ebc956e
I'm getting odd errors with this module. I create a tar-stream of the current directory and then write it out to a file. When I open the tar, about 1/8 of the directories are in the wrong place (they were 1 level deep; now they're in the top directory), and multiple files are missing. The same directories/files seem to be moved/missing each time.
I'm using transforms in the mapStream function for each file. Several I've used work as expected; however, I'm unable to use zlib.createGzip as a transform.
mapStream: function(fileStream, header) {
  var compressor = zlib.createGzip();
  return fileStream.pipe(compressor);
}
The following works
mapStream: function(fileStream, header) {
  var cipher = crypto.createCipher('aes-256-ctr', 'Thisisatest')
  return fileStream.pipe(cipher);
}
Am I missing something? I've also tried zlib.createDeflate, zlib.createDeflateRaw, and several other transforms without success.
Thank you in advance for your time and consideration.
If there is a single file in the dir, the result does not include the dir entry, but with multiple files it does.
The "pack" action is not working for this scenario:
I think it should work fine even if some files are in use inside sourceCodeFolder, as it should only require read access to the files being packed.
The pack process does not fail, but the result is a zero-byte .tar file. If I double-click it (on Mac at least), Archive Utility extracts another file with the exact same name but with a .cpgz extension. If I then try to open this .cpgz file, I end up once again with a .tar file with the same name (with " 2" added to the file's name due to duplication). This turns into an infinite loop of uncompressing the .tar and then the .cpgz over and over again.
Is it possible to have better error handling when extracting (or packing) or does it already exist?
For instance, if I try to extract a tarball containing filenames differing only in case onto a file-system that is only case-preserving, I can get exceptions in a number of different ways - for instance, if two files differ only in case and one is a file and the other is a directory, or if one of these files tries to be a symlink to the other. I'm also concerned that other types of errors might occur which I simply haven't run into yet.
I have tried catching such errors using process.on("uncaughtException", f), but after the exception handler runs, no more files are processed from the tarball.
Is there some kind of best practice for handling errors that I am missing, or is this a bug in tar-fs?
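One pattern worth noting: Node's .pipe() does not forward errors between streams, so a sketch like the helper below, which attaches an 'error' listener to every stream in the chain, keeps filesystem failures on the streams instead of becoming uncaught exceptions. The helper name is hypothetical; with tar-fs the chain would be the read stream and the stream returned by tar.extract().

```javascript
// Sketch: .pipe() does not propagate 'error' events, so attach a handler
// to each stream in the chain instead of using process.on('uncaughtException').
function pipeWithErrors(streams, onError) {
  streams.forEach(function (s) { s.on('error', onError) })
  return streams.reduce(function (a, b) { return a.pipe(b) })
}

// Usage (hypothetical):
// pipeWithErrors([fs.createReadStream('in.tar'), tar.extract('./out')],
//                function (err) { console.error('extract failed:', err) })
```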
Hey there, first off thanks for making tar faster.
I'm trying to unpack + untar binaries on Linux which have a specific file permission. It looks like you honor the file permission (https://github.com/mafintosh/tar-fs/blob/master/index.js#L166), but my test case fails: steffenmllr@5e546a5
Any ideas?
Hi:
Curious why an error is not emitted on invalid tar files?
echo "this is just a plain text file" > file.tar
var tarFs = require('tar-fs')
var fs = require('fs')

fs.createReadStream('./file.tar')
  .pipe(tarFs.extract('./test-o'))
  .on('error', function (err) {
    console.log(err);
  })
  .on('finish', function () {
    console.log('finished!');
  })
It doesn't seem to work on Windows 10.
I am piping .tar file from a web server into tar-fs.extract.
request.get({
  url: config.api.core + "/resource/download",
  json: true,
  qs: {r: task.resource_id, p: instance._id + "/_thaw/" + path.basename(file.path)},
  headers: auth_headers,
}).pipe(tar.extract(dirname))
I get the following error message.
stream.js:74
throw er; // Unhandled stream error in pipe.
^
Error: EISDIR: illegal operation on a directory, open 'C:\Users\Soichi\Desktop\Python-2.7.5.5717f7b43e583c602758d8e9\python'
at Error (native)
I am using
$ node --version
v4.2.6
While trying to decompress the Linux kernel 4.6 with tar-fs, with the option {strip: 1}, I got the following error:
{ Error: ENOENT: no such file or directory, symlink '' -> 'tmp/linux/arch/arm/boot/dts/sun8i-a23-ippo-q8h-v1.2.dts'
at Error (native)
errno: -2,
code: 'ENOENT',
syscall: 'symlink',
path: '',
dest: 'tmp/linux/arch/arm/boot/dts/sun8i-a23-ippo-q8h-v1.2.dts' }
at errorPurge (/home/piranna/Proyectos/download-manager/index.js:200:15)
at ClientRequest.<anonymous> (/home/piranna/Proyectos/download-manager/index.js:215:11)
at ClientRequest.g (events.js:286:16)
at emitNone (events.js:86:13)
at ClientRequest.emit (events.js:185:7)
at emitAbortNT (_http_client.js:247:8)
at _combinedTickCallback (internal/process/next_tick.js:71:11)
at process._tickCallback (internal/process/next_tick.js:98:9)
Looking at the Linux kernel itself, that symlink should point to sun8i-a23-q8-tablet.dts instead of being an empty string. Reviewing the tar-fs source code, I think the problem is that it unconditionally rewrites all symlink destinations, while it shouldn't do so when they are relative paths.
The readme file mentions using mapStream during pack.entry, but any transformation on the stream that alters the size of the contents results in a "size mismatch" error, thrown from tar-stream here: https://github.com/mafintosh/tar-stream/blob/bbb3e91a44fde7bdb3d179e27afdaa4b08fa74ac/pack.js#L174.
The error is silent: unless debugging logs are added to the onnextentry function of tar-fs, it manifests as the stream simply freezing.
Is there a way to get around this?
The standard Linux tar command includes a command-line switch that allows following symlinks. It would be great to add this as an option to this module. My project requires archiving quite a lot of data on the fly and passing the output stream directly to an HTTP socket. Some of the files are big and copying them would be inefficient, which is why they are symbolic links.
I use this package in my app to compress some folder content. As you said:
tar.pack(source).pipe(fs.createWriteStream(fullAddress));
// source = folder address
// fullAddress = tar file destination
After calling it, the tar file is created and everything is OK, but when I try to uncompress that file
fs.createReadStream(TarFile).pipe(tar.extract(destinationFolder));
I get this error:
events.js:72
throw er; // Unhandled 'error' event
^
Error: EPERM, utime '/code/' <-------------- destination address
31 May 05:42:48 - [nodemon] app crashed - waiting for file changes before starting...
The code folder's permission is 777.
Why does this happen, and how can I solve it?
Hi,
In the https://github.com/yarnpkg/yarn project, we faced a strange problem with both node-tar and tar-fs while unpacking many streams at the same time.
The issue is yarnpkg/yarn#2629, and the interesting bit of information that led to this issue can be found starting at yarnpkg/yarn#2629 (comment). The conclusion was that when yarn downloads very fast, with more than 12 concurrent tar extractions (on one of the yarn dev machines; @bestander found the source of the issue), some of the extractions fail at the filesystem level because of the extreme concurrency.
Originally tar-fs was tried as an alternative to the currently used node-tar, and the problem was visible with it as well. The Yarn people are interested in using tar-fs and were wondering if you could help:
I hope we can find a good solution together to improve the state of tar extraction in the node ecosystem :)
Thanks!
This is useful when the tarball was created on windows (which can result in dirs not being readable)
Can you elaborate on this? If I use tar-fs to pack a tar and then later unpack it on the same Windows machine, could I still have problems? Under what circumstances would dirs be unreadable?
node-tar supports { strip: 1 }. Is this possible to do in tar-fs?
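One possible workaround, assuming tar-fs's documented `map(header)` extract option: drop the first path component of each entry name yourself. This is a sketch, not tar-fs's own strip implementation, and real code would also need to handle entries that become empty (e.g. the bare top-level directory):

```javascript
// Sketch: emulate node-tar's { strip: 1 } with tar-fs's `map` hook by
// removing the first path component of every entry name. Entries with a
// single component are left unchanged; handling empty results is omitted.
function stripOne(header) {
  var parts = header.name.split('/')
  if (parts.length > 1) header.name = parts.slice(1).join('/')
  return header
}

// Usage (hypothetical): tar.extract('./out', { map: stripOne })
```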
If you're using a UNIX machine and you tar up the files by referencing the current directory with "./", such as
tar -cvf testietartwo.tar ./
all the files within the tar will have a prefixed period, like so:
./F96F692411E4E85800000080EF059BE0_190358.tsv
(You can see this by doing a vi on the .tar.)
When this tar is untarred with tar-fs and its name is passed to the ignore function in tar-fs, the file has a prefix of "._", which is obviously something that we want to ignore.
What can be done to fix this? I've looked over the source code and didn't find anything that blatantly prepends it to the file name; is it one of the encoders?
I am getting an issue from one of your sub-dependencies, mkdirp: it has a deprecated notation of
mode = 0777 & (~process.umask());
This is solved in 0.5.1 (https://github.com/substack/node-mkdirp/issues/98). Would it be possible to update this dependency? I will open a PR for it later today.
I am using your pack in this project https://www.npmjs.com/package/any-prebuilt
Add a umask option for packing, in addition to dmode and fmode, similar to the one for extract, so files can be stored with their permissions filtered without needing to run chmod -R beforehand.
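Until such an option exists, the documented `map(header)` pack hook could filter the permission bits itself. A minimal sketch; the mask value is just an example:

```javascript
// Sketch: emulate a pack-time umask by masking each header's mode bits
// through tar-fs's documented `map` hook. The umask value is an example.
function maskMode(umask) {
  return function (header) {
    header.mode = header.mode & ~umask
    return header
  }
}

// Usage (hypothetical): tar.pack('./dir', { map: maskMode(0o022) })
```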
I'm using tar-fs to periodically download tar files and extract them into a file store with hash identifiers as directory names. Is it possible to set the name of the output directory during extraction or does this need to be done after the fact?
Hello, I'm wrapping tar.extract within a promise for my application, and I'm having a bit of trouble listening for the end of the extraction.
Here is the code I'm using, assume that it's wrapped within a promise and resolve
and reject
are defined and work properly. As well, incoming
is an incoming file stream.
var extractor = tar.extract(folder);
incoming.on("error", reject);
extractor.on("error", reject);
extractor.on("end", resolve);
incoming.pipe(extractor);
When we execute fs.createReadStream('my-other-tarball.tar').pipe(tar.extract('./my-other-directory')), Docker hits an error: "Invalid tar header. Maybe the tar is corrupted or it needs to be gunzipped?".
As indicated at prebuild/prebuild#110, tar-fs needs to have support for hard links, and ideally it would also add support for the remaining file types supported by tar-stream.
Hello,
I wanted to use the header map option while packing in order to modify the header mode based on the header type.
Currently, whatever the entry is (directory or file), the type on the header given to the mapping function is always 'file'.
Looking at the source code, it seems it cannot be anything else.
Is that deliberate? Why is it so?
Hi,
when trying to open (with another tar utility, such as gnu tar or docker) very large (1 GB+) tar files generated with tar-fs, one gets errors such as:
FYI
tar-fs seems to set permissions on the directories too early, preventing itself from later writing their contents.
Minimal example:
const tar = require("tar-fs")
const fs = require("fs")
fs.createReadStream("ro.tar").pipe(tar.extract("ro-extracted"))
With https://sphalerite.org/dump/ro.tar in the CWD, this results in:
events.js:183
throw er; // Unhandled 'error' event
^
Error: EACCES: permission denied, open 'ro-extracted/ro/foo'
This happens because the directory ro is created read-only. GNU tar creates the ro directory read-write and only sets it read-only after populating it, resulting in a successful extraction.
Hi ! Nice work @mafintosh !
Could you provide an example packing several different directories/files please.
This simplified example leads me to an error:
...
let pack = require('tar-stream').pack();

tarfs.pack(path.join(__dirname, 'a', 'file.ini'), {
  pack: pack,
  map: function (header) {
    header.name = `file/${header.name}`;
    return header;
  }
});

tarfs.pack(path.join(__dirname, 'any', 'folder'), {
  pack: pack,
  map: function (header) {
    header.name = `folder/${header.name}`;
    return header;
  }
});

pack.pipe(outputStream);
This error :
if (this._stream) throw new Error('already piping an entry')
^
Error: already piping an entry
at Pack.entry (E:\svn\michelin\lp2r\trunk\dev\decoupe\node_modules\tar-stream\pack.js:108:27)
at onstat (E:\svn\michelin\lp2r\trunk\dev\decoupe\node_modules\tar-fs\index.js:107:19)
at E:\svn\michelin\lp2r\trunk\dev\decoupe\node_modules\tar-fs\index.js:40:9
at FSReqWrap.oncomplete (fs.js:82:15)
I can't seem to find out how to know when tar.pack is done so I can call the callback.
I have
function tarMyDir(done) {
  tar.pack('my-dir')
    .pipe(fs.createWriteStream('my-file.tar'))
    .on('error', done)
    .on('end', done);
}
But the done callback is never called. If I listen for finish instead of end, it makes no difference. What am I missing?
When extracting with process.getuid() === 0, chown is (correctly!) run on any files that get extracted. Directories, on the other hand, do not have their owners modified, so they'll always end up owned by root.
I don't mind putting up my own PR for this, and if y'all have particular opinions on how to do it, I can totally do it that way.
Cheers!
When running the pack example tar.pack('./my-directory').pipe(fs.createWriteStream('my-tarball.tar')) on a Linux system, the generated tar contains a "." entry.
tar -tf my-tarball.tar
.
my-file.txt
another-file.txt
This entry should not be included.
Hi,
Stream events like 'data', 'end', and 'error' are not triggered during the pipe operation on Linux systems, but the same events trigger perfectly on Windows. Kindly provide a solution to this issue.
Thanks,
Pravin
Hello,
I have a tar file that contains a folder with a colon in its name. On Windows this is an illegal character, and tar-fs seems to hang on it.
Most utilities, like 7z, will convert colons to underscores.
As discussed in #49, track test coverage to prevent regressions due to corner cases.
Is there any particular limitation that prevents a header object from being passed to the ignore function? I would like to be able to filter on type.
I just can't get it to work (with gunzip-maybe).
This is how I try it; no error occurs, but nothing gets extracted:
fs.createReadStream('/opt/app/docker/data/test/GeoIP2-Country_20160223.tar.gz').pipe(gunzip()).pipe(tar.extract('/home/core/blup'));
I'm using tar-fs and am running into an issue where an exception is thrown when a device node is encountered in a tarball:
Error: unsupported type for dev/dsp1 (character-device)
Looking at the code, it appears that tar-fs only knows how to handle normal files, directories, and symlinks. It's not clear to me how to work around this so my program won't crash or improperly handle device nodes.
Is there a possibility to filter files by the metadata (called header)?
Passing the header to the ignore func should do the job:
if (ignore(name)) {
  stream.resume()
  return next()
}
https://github.com/mafintosh/tar-fs/blob/master/index.js#L224-L227
if (ignore(name, header)) {
  stream.resume()
  return next()
}
But this is only the export side; we need the import side to generate and pass the header as well, or else pass no header to the import ignore func, but that would lead to an asymmetric API.
Hi,
I am using your tool in a script that updates some node projects. This includes copying symlinks.
Your latest version fails to unpack bundles I packed with your tool, because of line 337 in index.js.
I see you have opts.dereference available in the pack() function to choose between stat and lstat.
Can you please make the same available in extract() too?
Thanks
(I'm currently running a fork of your version, but that seems overkill for this minor change.)
Hi,
I get an error when trying to run my script.
This is my code:
import * as path from 'path';
import * as fs from 'fs';
import * as tar from 'tar-fs';
const folderPath = path.resolve(__dirname+'/../tozip/');
tar.pack(folderPath).pipe(fs.createWriteStream('my-tarball.tar'));
Does anyone have any idea about this behavior?
Thank you in advance.
It is fine to specify which entries to pack using the entries option.
entries: ['file1', 'subdir/file2'] // only the specific entries will be packed
How can we also rename file1 to fileXyz and file2 to fileAbc during packing?
We could use a map function, but it seems the syntax is very limited?
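One way this could work, combining the documented `entries` whitelist with a `map` hook that consults a rename table. The table below is a hypothetical example matching the names in the question:

```javascript
// Sketch: rename whitelisted entries at pack time via tar-fs's `map` hook.
// The rename table is a hypothetical example.
var renames = {
  'file1': 'fileXyz',
  'subdir/file2': 'subdir/fileAbc'
}

function renameEntries(header) {
  if (renames[header.name]) header.name = renames[header.name]
  return header
}

// Usage (hypothetical):
// tar.pack('./src', { entries: Object.keys(renames), map: renameEntries })
```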
Hi,
Would it be possible for some form of events or callbacks to be implemented? As far as I am aware, I cannot execute tasks that depend on the write (extract) being finished.
I'm guessing this solution only works for *nix environments?