kevva / download
Download and extract files
License: MIT License
By using
var Download = require('download')
var progress = require('download-status')

const NODEJS_VERSION = "v0.11.14"
const NODEJS_URL = "http://nodejs.org/dist/" + NODEJS_VERSION + "/node-" + NODEJS_VERSION + ".tar.gz"
const SRC_DIR = 'deps/node'

Download({ extract: true, strip: 1 })
  .get(NODEJS_URL, SRC_DIR)
  // .use(progress())
  .run(function (error) {
    // ...
  })
the source code of Node.js is downloaded and extracted correctly at deps/node, but using
var Download = require('download')
var progress = require('download-status')

const NODEJS_VERSION = "v0.11.14"
const NODEJS_URL = "http://nodejs.org/dist/" + NODEJS_VERSION + "/node-" + NODEJS_VERSION + ".tar.gz"
const SRC_DIR = 'deps/node'

Download({ extract: true, strip: 1 })
  .get(NODEJS_URL, SRC_DIR)
  .use(progress())
  .run(function (error) {
    // ...
  })
the archive doesn't get extracted and the .tar.gz only gets stored at deps/node instead.
When downloading several files, the module currently waits for all of them to finish before decompressing any of them. Both steps should overlap: as soon as one file finishes downloading, it can start decompressing while the others are still downloading.
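A minimal sketch of the idea, using core http/zlib plus the tar-fs package (an assumption; any streaming tar extractor would do): each response is piped straight into the extractor, so extraction of one archive overlaps with the download of the others.

var http = require('http');
var zlib = require('zlib');
var tar = require('tar-fs'); // assumed streaming extractor

var urls = [
  'http://nodejs.org/dist/v0.11.14/node-v0.11.14.tar.gz',
  'http://nodejs.org/dist/v0.10.33/node-v0.10.33.tar.gz'
];

urls.forEach(function (url, i) {
  // all downloads start immediately...
  http.get(url, function (res) {
    // ...and each one is gunzipped and extracted chunk by chunk as it
    // arrives, instead of waiting for every download to finish first
    res.pipe(zlib.createGunzip()).pipe(tar.extract('deps/' + i));
  });
});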
Notify the user when a file is being extracted. It would be good to show a progress bar, the same way as when the file is being downloaded.
Can run() return a stream instead of cb(err, files), like here?
Tried to grab the Linux kernel zip, and the process ate WAY too much memory on the extraction.
Can you come up with a more efficient unzipping method?
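A minimal sketch of a lazier approach, assuming the yauzl package: entries are read from disk one at a time and streamed out, so the whole archive is never inflated in memory (output paths are flattened here for brevity).

var fs = require('fs');
var path = require('path');
var yauzl = require('yauzl'); // assumed lazy-entry unzipper

yauzl.open('linux-master.zip', { lazyEntries: true }, function (err, zipfile) {
  if (err) throw err;
  zipfile.readEntry();
  zipfile.on('entry', function (entry) {
    if (/\/$/.test(entry.fileName)) {
      zipfile.readEntry(); // skip directory entries
      return;
    }
    zipfile.openReadStream(entry, function (err, readStream) {
      if (err) throw err;
      readStream
        .pipe(fs.createWriteStream(path.basename(entry.fileName)))
        .on('finish', function () {
          zipfile.readEntry(); // only move on once this entry is written
        });
    });
  });
});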
On 4.x.x, support for download-status was removed, so now there's no way to get feedback on the current progress. Since several files can now be downloaded at once, it would be OK if the reported progress were the total across all files, but a better option is independent progress for each file, provided they don't interfere on the screen.
Not sure if I'm doing something wrong, but this:
var d = new Download({ extract: true, strip: 1 }, 'https://github.com/google/web-starter-kit/archive/v0.5.2.zip').dest('.');
d.run(function () { console.log('done'); });
Results in:
$ ls -l
total 64
---------- 1 crhyme 5000 11859 Feb 15 10:39 LICENSE
---------- 1 crhyme 5000 5111 Feb 15 10:39 README.md
drwxr-xr-x 14 crhyme 5000 476 Feb 15 10:39 app
drwxr-xr-x 4 crhyme 5000 136 Feb 15 10:39 docs
---------- 1 crhyme 5000 5625 Feb 15 10:39 gulpfile.js
---------- 1 crhyme 5000 981 Feb 15 10:39 package.json
(note the file permissions in the first column)
Add an option to specify the MD5 of the file to download, and raise an error if it doesn't match.
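A minimal sketch of the proposed option (the function and option names are hypothetical): hash the response while it streams to disk and fail if the digest doesn't match.

var http = require('http');
var fs = require('fs');
var crypto = require('crypto');

function downloadWithMd5(url, dest, expectedMd5, cb) {
  http.get(url, function (res) {
    var hash = crypto.createHash('md5');
    res.on('data', function (chunk) { hash.update(chunk); });
    res.pipe(fs.createWriteStream(dest)).on('finish', function () {
      var actual = hash.digest('hex');
      if (actual !== expectedMd5) {
        return cb(new Error('md5 mismatch: expected ' + expectedMd5 + ', got ' + actual));
      }
      cb(null);
    });
  });
}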
So I can work on the files in-memory.
The is-url-superb package throws an error when dealing with domains defined on the local intranet (for example, .local domains). Can I suggest/contribute a validateURL option?
I would like the ability to register an event handler similar to this:
download.on('progress-update', function (percentage) {
  console.log(percentage + "%");
});
To get progress updates during large file downloads.
The .dest() method defines the location of all the downloads, and if you call it several times the last one wins. Instead, it would be better to download to that location only the files registered after the call to .dest(), with the current dir as the default. That is, internally each download would store the location where it must be saved.
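A sketch of the proposed semantics (the method names are the existing API; the per-call behaviour is the proposal):

var Download = require('download');

new Download()
  .get('http://example.com/a.zip') // no .dest() yet -> current dir
  .dest('vendor')
  .get('http://example.com/b.zip') // -> vendor/
  .dest('assets')
  .get('http://example.com/c.zip') // -> assets/
  .run(function (err) {
    if (err) throw err;
  });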
As pointed out at #42 (comment), extraction of files is sometimes incorrect: the execution bit is not maintained, and files and folders go missing.
I have code like this:
var Download = require('download');
var progress = require('download-status');

var download = new Download({ extract: false, mode: '755' })
  .get(url)
  .dest(cachePath)
  .use(progress());
When executing, the error in the title shows up. Not sure what's going on; it used to work a week ago. Is there any API update in v4.0.0 that could cause an issue like this?
If I download a really large file, it eats up a lot of memory. Is there any way to have it stream the output to the destination?
Try downloading a 1GB file
When I download web resources (multiple files), how can I keep the directory structure? I used download.get(url).dest(dirname), but it doesn't work. What should I do? Thanks. My attempt is below, and a plain-JavaScript sketch of the idea follows it.
# requires
fs       = require "node-fs"
url      = require "url"
Download = require "download"
progress = require "download-status"
_        = require "underscore"

# argv
if process.argv.length < 3
  console.log "Usage: node server.js aa.json [dest]"
  process.exit 1

# output directory
dest = process.argv[3] or "dest"

# download instance
download = new Download().use(progress())

# read the Chrome HAR json
fs.readFile process.argv[2], (err, data) ->
  throw err if err
  jsonObj = JSON.parse(data)

  # loop over the entries
  _.each jsonObj.log.entries, (entry, i) ->
    _url = entry.request.url
    _url = _url.replace /\?.+$/, ''  # strip the query string
    parsed = url.parse _url          # was commented out, but is used below
    dirname = dest + parsed.pathname.replace(/\/[^\/]+$/, '')
    download.get(_url).dest(dirname) # the per-file dest that doesn't work
    return

  # start
  download.run (err, files, stream) ->
    throw err if err
    console.log "File downloaded successfully!"
    return
  return
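For reference, a plain-JavaScript sketch (names hypothetical) of deriving a per-file destination from the URL path, which is what the script above attempts:

var url = require('url');
var path = require('path');

// map http://host/assets/js/app.js under rootDir -> rootDir/assets/js
function destFor(fileUrl, rootDir) {
  var pathname = url.parse(fileUrl).pathname;
  return path.join(rootDir, path.dirname(pathname));
}

// download.get(fileUrl).dest(destFor(fileUrl, 'dest'));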
download should call the got module using a tunnel agent when the user is behind a corporate proxy.
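A minimal sketch of the idea, assuming the tunnel package and got's agent option: build a tunneling agent from http_proxy and hand it to got.

var url = require('url');
var got = require('got');
var tunnel = require('tunnel'); // assumed tunneling-agent package

var proxy = process.env.http_proxy && url.parse(process.env.http_proxy);
var agent = proxy && tunnel.httpOverHttp({
  proxy: { host: proxy.hostname, port: Number(proxy.port) || 8080 }
});

// got hands the agent down to http.request
got('http://example.com/file.tar.gz', { agent: agent }, function (err, data) {
  if (err) throw err;
  // ...
});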
Hi,
is it possible to display progress for each file? And is there any option to limit the maximum concurrent requests at a time?
Thanks
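For the concurrency part, a minimal sketch using async.eachLimit (an assumption; any queue would do) that caps parallel downloads at five:

var async = require('async');
var Download = require('download');

var urls = ['http://example.com/a.zip', 'http://example.com/b.zip'];

async.eachLimit(urls, 5, function (url, done) {
  new Download().get(url).dest('out').run(done);
}, function (err) {
  if (err) throw err;
});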
When downloading several files at a time, the progress callback (for example, when using download-status) should report the progress globally instead of per individual download.
In Line 152 in 5f19588.
Cheers on a solid module. Any chance of an example outlining how one would use this with thunkify/co?
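Not an official example, but a minimal sketch of wrapping run() so it can be yielded inside co (co@3-style invocation):

var co = require('co');
var thunkify = require('thunkify');
var Download = require('download');

co(function* () {
  var download = new Download().get('http://example.com/foo.zip').dest('out');
  var run = thunkify(download.run.bind(download));
  // run's callback is (err, files, stream), so the yield gives [files, stream]
  var results = yield run();
  var files = results[0];
  console.log('downloaded %d files', files.length);
})();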
When a Download is created with a timeout and run multiple times (in a for loop, or via async.each), it generates the error:
Error: callback() can only be called once.
Codebit:
function downloadFiles() {
  async.each(toDownload, function (p, done) {
    var download = new Download({
      timeout: 10000,
      pool: {
        maxSockets: Infinity
      }
    }).get(p).dest(dldir);

    download.run(function (err, files, stream) {
      if (err) {
        errorFiles.push(p);
      } else {
        console.log("Downloaded");
      }
      done();
    });
  }, function (err) {
    if (err) console.log(err);
    console.log("");
  });
}
Please add a "progress" event using request-progress, an add-on for request (which you're currently using).
I'm having two problems which could be solved with a url property on the vinyl files being created. One is to support better renaming to prevent files being overwritten (see issue #30). The other is that I want to create a download-manifest.json with a list of url:path pairs (like the rev-manifest from gulp-rev). But the reference to the URL is lost at the moment in Line 152 in 5f19588.
I'm happy to send a pull request if you think this would be a useful addition.
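For illustration, a sketch of the manifest the proposed file.url property would enable (given a configured download instance; the url property is the proposal, not the current API):

var fs = require('fs');

download.run(function (err, files) {
  if (err) throw err;
  var manifest = {};
  files.forEach(function (file) {
    manifest[file.url] = file.relative; // url -> local path
  });
  fs.writeFileSync('download-manifest.json', JSON.stringify(manifest, null, 2));
});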
I'm trying to download a resource which requires authentication via a token in an HTTP header.
Is there a way to inject an HTTP header before triggering the download?
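A minimal sketch of what that could look like (the headers option name is an assumption, mirroring got/request):

var Download = require('download');

var token = process.env.API_TOKEN; // hypothetical token source

new Download({
  headers: { Authorization: 'Bearer ' + token }
})
  .get('https://example.com/protected/file.zip')
  .dest('out')
  .run(function (err) {
    if (err) throw err;
  });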
When trying to download http://gd.tuwien.ac.at/gnu/gcc/snapshots/5-20150616/gcc-5-20150616.tar.bz2 I get the following exception:
buffer.js:71
throw new RangeError('Attempt to allocate Buffer larger than maximum ' +
^
RangeError: Attempt to allocate Buffer larger than maximum size: 0x3fffffff bytes
at new Buffer (buffer.js:71:11)
at outputStream.writeByte (/home/piranna/Proyectos/NodeOS/node_modules/download/node_modules/gulp-decompress/node_modules/decompress/node_modules/decompress-tarbz2/node_modules/seek-bzip/seek-bzip/index.js:474:23)
at Bunzip._read_bunzip (/home/piranna/Proyectos/NodeOS/node_modules/download/node_modules/gulp-decompress/node_modules/decompress/node_modules/decompress-tarbz2/node_modules/seek-bzip/seek-bzip/index.js:430:25)
at Function.Bunzip.decode (/home/piranna/Proyectos/NodeOS/node_modules/download/node_modules/gulp-decompress/node_modules/decompress/node_modules/decompress-tarbz2/node_modules/seek-bzip/seek-bzip/index.js:508:10)
at DestroyableTransform._transform (/home/piranna/Proyectos/NodeOS/node_modules/download/node_modules/gulp-decompress/node_modules/decompress/node_modules/decompress-tarbz2/index.js:59:19)
at DestroyableTransform.Transform._read (/home/piranna/Proyectos/NodeOS/node_modules/download/node_modules/through2/node_modules/readable-stream/lib/_stream_transform.js:184:10)
at DestroyableTransform.Transform._write (/home/piranna/Proyectos/NodeOS/node_modules/download/node_modules/through2/node_modules/readable-stream/lib/_stream_transform.js:172:12)
at doWrite (/home/piranna/Proyectos/NodeOS/node_modules/download/node_modules/through2/node_modules/readable-stream/lib/_stream_writable.js:237:10)
at writeOrBuffer (/home/piranna/Proyectos/NodeOS/node_modules/download/node_modules/through2/node_modules/readable-stream/lib/_stream_writable.js:227:5)
at DestroyableTransform.Writable.write (/home/piranna/Proyectos/NodeOS/node_modules/download/node_modules/through2/node_modules/readable-stream/lib/_stream_writable.js:194:11)
Seems to me that it's trying to decompress the file in memory all at once instead of streaming it chunk by chunk.
The request response headers have a 'content-type' attribute; its MIME type can be converted to an extension name.
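A minimal sketch, assuming the mime-types package for the mapping:

var http = require('http');
var mime = require('mime-types'); // assumed mapping package

http.get('http://example.com/asset', function (res) {
  var ext = mime.extension(res.headers['content-type']); // e.g. 'html', 'png'
  var filename = 'download.' + (ext || 'bin');
  console.log(filename);
  res.resume();
});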
I'm probably doing something wrong, but when I attempt to download and extract a zip file, it generates a directory for every single file and then puts the files deeply nested within those directories. For example, try this:
download('http://themefortress.com/getreverie4', 'app', { extract: true });
npm 1.3.11
node v0.10.20
When doing the same thing with tarballs, everything works fine.
Perhaps as opts.request, which would be an object passed to got as the second arg at https://github.com/kevva/download/blob/master/index.js#L102. This would allow setting auth and cert related options. I can send a PR if that's cool.
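A sketch of how the proposed pass-through could be used (the request option name is from this proposal; the cert paths are hypothetical):

var fs = require('fs');
var Download = require('download');

new Download({
  request: {
    auth: 'user:pass',
    cert: fs.readFileSync('client.crt'),
    key: fs.readFileSync('client.key')
  }
})
  .get('https://example.com/secure.tar.gz')
  .dest('out')
  .run(function (err) {
    if (err) throw err;
  });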
Upgraded from node 0.10.32 to 0.10.33 and download stopped working.
Upgraded from an old version of download to the new version, and did not account for the API changes. My bad, 👍 the new API!
Is there a way to get the file download stream?
The request dependency is out of date.
var Download = require('download');
var progress = require('download-status');

var download = new Download()
  .get("https://s3.amazonaws.com/mozilla-games/emscripten/releases/emsdk-portable.tar.gz", "temp", { extract: true })
  .use(progress());

download.run(function (error) {
  if (error) throw error;
});
/project/lib/install.js:6
if (error) throw error;
^
Error: EISDIR, open 'temp/emsdk_portable/'
Is there currently a way to set it?
When trying to extract a .tar.xz file, it keeps the original file instead of extracting it. Using .tar.gz works correctly.
buffer.js:203
buf.copy(buffer, pos);
^
TypeError: undefined is not a function
at Function.Buffer.concat (buffer.js:203:9)
at DuplexWrapper.<anonymous> (/home/piranna/Proyectos/NodeOS/node_modules/nodeos-barebones/node_modules/nodeos-cross-toolchain/node_modules/download/node_modules/read-all-stream/index.js:35:21)
at DuplexWrapper.g (events.js:199:16)
at DuplexWrapper.emit (events.js:129:20)
at /home/piranna/Proyectos/NodeOS/node_modules/nodeos-barebones/node_modules/nodeos-cross-toolchain/node_modules/download/node_modules/stream-combiner2/node_modules/duplexer2/node_modules/readable-stream/lib/_stream_readable.js:934:16
at process._tickCallback (node.js:355:11)
I get this each time I try to use it; it seems the Buffer object doesn't have a copy() method. I'm using Node.js 0.12.0, if that matters... The script is the same as yesterday.
I was using node-webkit, which uses this package. Everything goes fine until it gets to the extracting phase, which hangs for upwards of 5 minutes extracting this package. Yet when I use the tar on my system, it takes 2 seconds:
$ time tar xzf node-webkit-v0.8.5-linux-x64.tar.gz
real 0m2.040s
user 0m1.864s
sys 0m0.824s
That is an unbearably slow delay.
According to the docs, its behaviour is equivalent to tar's --strip-components. That argument only strips the leading directories, until it finds one that contains a file or several directories, but the download and decompress modules are applying it to all the files, making the decompression wrong. This is particularly a problem for Java namespaced classes, where it's usual to have several chained namespace folders.
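For comparison, a sketch of a fixed per-entry strip, applied to each archive entry path independently (names hypothetical):

var path = require('path');

// drop the first n path segments of an archive entry, tar-style
function strip(entryPath, n) {
  return entryPath.split('/').slice(n).join(path.sep);
}

strip('node-v0.11.14/lib/module.js', 1); // -> 'lib/module.js'
strip('com/example/app/Main.java', 1);   // -> 'example/app/Main.java'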
Make use of regular streams instead.
Could it be possible that downloading more than 150 files with this package causes some errors? At the moment it throws "Maximum call stack size exceeded".
When the proxy is set via the http_proxy environment variable, download ignores it.
I'm doing a download which I want to be extracted at the end. I am then listening to some events of the returned stream (especially the 'end' from the res). That covers the end of the download, but not the whole process (i.e., download + decompression).
How am I able (if it's possible now) to know when everything has finished?
For me the file has zero length when used in combination with download-status. Anyone else?
Allow adding a file to download to a currently executing instance, updating the global progress statistics.
In run(err, files), files contains directories too. Can you filter them out?
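A sketch of filtering on the caller's side, assuming each vinyl file carries an fs.Stats object on stat:

var Download = require('download');

new Download({ extract: true })
  .get('http://example.com/archive.tar.gz')
  .dest('out')
  .run(function (err, files) {
    if (err) throw err;
    var onlyFiles = files.filter(function (file) {
      return file.stat && file.stat.isFile(); // drop directory entries
    });
    console.log('%d real files', onlyFiles.length);
  });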
Should be run separately imo, thoughts?
Hello,
It's me again ;)
Could you update download to use the latest version of url-regex?
Thank you in advance :)
David