This project forked from neerolyte/node-walk

A semi-port of python's os.walk

License: Apache License 2.0

node-walk

A Node.js walk implementation.

This is somewhat of a port of Python's os.walk, but using Node.js conventions.

  • EventEmitter
  • Asynchronous
  • Chronological (optionally)
  • Built-in flow-control
  • includes Synchronous version (same API as Asynchronous)

As few file descriptors are opened at a time as possible. This is particularly well suited for single hard disks which are not flash or solid state.
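
To make the flow-control idea concrete, here is a minimal sketch of visiting directory entries strictly one at a time, so that only one stat is in flight at any moment. It is not node-walk's actual implementation; the names eachEntrySerially, onEntry and done are hypothetical and exist only for illustration.

```javascript
'use strict';

var fs = require('fs');
var path = require('path');

// Hypothetical helper (not part of node-walk): stat the entries of `dir`
// one at a time. `onEntry(fullPath, stat, next)` must call next() to move on;
// `done(err)` runs after the last entry, or with the readdir error.
function eachEntrySerially(dir, onEntry, done) {
  fs.readdir(dir, function (err, names) {
    if (err) { return done(err); }
    var i = 0;

    function next() {
      if (i >= names.length) { return done(null); }
      var fullPath = path.join(dir, names[i]);
      i += 1;
      fs.lstat(fullPath, function (statErr, stat) {
        if (statErr) { return next(); } // skip entries that cannot be stat'ed
        onEntry(fullPath, stat, next);
      });
    }

    next();
  });
}
```

Because the next entry is only requested from inside next(), a slow handler simply pauses the traversal instead of piling up open files.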

Installation

npm install --save walk

Getting Started

Choose wisely the path you walk, like so:

"use strict";

var walk    = require('walk')
  , fs      = require('fs')
  , path    = require('path')
  , walker  = walk.walk("/tmp", { followLinks: false })
  ;

walker.on("file", fileHandler);
walker.on("errors", errorsHandler); // plural
walker.on("end", endHandler);

Where your handlers might look something like these:

'use strict';

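function fileHandler(root, fileStat, next) {
  // Node-style callbacks receive the error first, then the data
  fs.readFile(path.resolve(root, fileStat.name), function (err, buffer) {
    if (err) {
      console.error("[ERROR] " + fileStat.name + ": " + err.message);
    } else {
      console.log(fileStat.name, buffer.byteLength);
    }
    next();
  });
}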

function errorsHandler(root, nodeStatsArray, next) {
  nodeStatsArray.forEach(function (n) {
    console.error("[ERROR] " + n.name)
    console.error(n.error.message || (n.error.code + ": " + n.error.path));
  });
  next();
}

function endHandler() {
  console.log("all done");
}

Common Events

All single event callbacks are in the form of function (root, stat, next) {}.

All multiple event callbacks are in the form of function (root, stats, next) {}, except names, whose second argument is an array of strings.

All error event callbacks are in the form function (root, stat/stats, next) {}. stat.error contains the error.

  • names
  • directory
  • directories
  • file
  • files
  • end
  • nodeError (stat failed)
  • directoryError (stat succeeded, but readdir failed)
  • errors (a collection of any errors encountered)
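
As an illustration of the multiple-event callback form, here is a sketch of a handler that could be registered with walker.on("files", filesHandler). The handler name and the totals object are hypothetical; the only things taken from the text above are the (root, stats, next) signature and the name attribute on each stat object.

```javascript
'use strict';

var path = require('path');

// Running totals, updated by the hypothetical handler below.
var totals = { count: 0, bytes: 0, largest: null };

// Intended for walker.on("files", filesHandler): `fileStatsArray` is an
// array of stat objects that carry the extra `name` attribute.
function filesHandler(root, fileStatsArray, next) {
  fileStatsArray.forEach(function (stat) {
    totals.count += 1;
    totals.bytes += stat.size;
    if (!totals.largest || stat.size > totals.largest.size) {
      totals.largest = { path: path.join(root, stat.name), size: stat.size };
    }
  });
  next(); // hand control back to the walker
}
```

The handler can be exercised without a walker by calling it directly with plain objects such as { name: 'a.txt', size: 10 }.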

A typical stat event looks like this:

{ dev: 16777223,
  mode: 33188,
  nlink: 1,
  uid: 501,
  gid: 20,
  rdev: 0,
  blksize: 4096,
  ino: 49868100,
  size: 5617,
  blocks: 16,
  atime: Mon Jan 05 2015 18:18:10 GMT-0700 (MST),
  mtime: Thu Sep 25 2014 21:21:28 GMT-0600 (MDT),
  ctime: Thu Sep 25 2014 21:21:28 GMT-0600 (MDT),
  birthtime: Thu Sep 25 2014 21:21:28 GMT-0600 (MDT),
  name: 'README.md',
  type: 'file' }
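
For readers wondering where name and type come from, the sketch below shows one way such an object could be assembled from a plain fs.Stats. It is an assumption about the shape only, not node-walk's own code: the helpers typeOf and describeNode are hypothetical, and the type labels other than 'file' and 'directory' are guessed from the event names listed in this document.

```javascript
'use strict';

var fs = require('fs');
var path = require('path');

// Hypothetical helper: map an fs.Stats object to a type label.
// Labels beyond 'file' and 'directory' are assumed from the event names.
function typeOf(stat) {
  if (stat.isFile()) { return 'file'; }
  if (stat.isDirectory()) { return 'directory'; }
  if (stat.isSymbolicLink()) { return 'symbolicLink'; }
  if (stat.isBlockDevice()) { return 'blockDevice'; }
  if (stat.isCharacterDevice()) { return 'characterDevice'; }
  if (stat.isFIFO()) { return 'FIFO'; }
  if (stat.isSocket()) { return 'socket'; }
  return 'unknown';
}

// Hypothetical helper: lstat one entry and attach `name` and `type` to it.
function describeNode(dir, name) {
  var stat = fs.lstatSync(path.join(dir, name));
  stat.name = name;
  stat.type = typeOf(stat);
  return stat;
}
```

Using lstat rather than stat means a symbolic link is reported as a link instead of as its target, which corresponds to walking with followLinks set to false.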

Advanced Example

Both Asynchronous and Synchronous versions are provided.

(function () {
  "use strict";

  var walk = require('walk')
    , fs = require('fs')
    , path = require('path')
    , options
    , walker
    ;

  options = {
    followLinks: false
    // directories with these keys will be skipped
  , filters: ["Temp", "_Temp"]
  };

  walker = walk.walk("/tmp", options);

  // OR
  // walker = walk.walkSync("/tmp", options);

  walker.on("names", function (root, nodeNamesArray) {
    nodeNamesArray.sort(function (a, b) {
      if (a > b) return 1;
      if (a < b) return -1;
      return 0;
    });
  });

  walker.on("directories", function (root, dirStatsArray, next) {
    // dirStatsArray is an array of `stat` objects with the additional attributes
    // * type
    // * error
    // * name
    
    next();
  });

  walker.on("file", function (root, fileStats, next) {
    fs.readFile(path.join(root, fileStats.name), function () {
      // doStuff
      next();
    });
  });

  walker.on("errors", function (root, nodeStatsArray, next) {
    next();
  });

  walker.on("end", function () {
    console.log("all done");
  });
}());

Sync

Note: You can't use EventEmitter if you want a truly synchronous walker (although it's synchronous under the hood, it appears not to be due to the use of process.nextTick()).

Instead, you must use options.listeners for a truly synchronous walker.

Although the sync version uses all of the fs.readSync, fs.readdirSync, and other sync methods, I don't think I can prevent the process.nextTick() that EventEmitter calls.

(function () {
  "use strict";

  var walk = require('walk')
    , fs = require('fs')
    , path = require('path')
    , options
    , walker
    ;

  // To be truly synchronous in the emitter and maintain a compatible api,
  // the listeners must be listed before the object is created
  options = {
    listeners: {
      names: function (root, nodeNamesArray) {
        nodeNamesArray.sort(function (a, b) {
          if (a > b) return 1;
          if (a < b) return -1;
          return 0;
        });
      }
    , directories: function (root, dirStatsArray, next) {
        // dirStatsArray is an array of `stat` objects with the additional attributes
        // * type
        // * error
        // * name

        next();
      }
    , file: function (root, fileStats, next) {
        // fileStats.name is only the base name, so join it with root
        fs.readFile(path.join(root, fileStats.name), function () {
          // doStuff
          next();
        });
      }
    , errors: function (root, nodeStatsArray, next) {
        next();
      }
    }
  };

  walker = walk.walkSync("/tmp", options);

  console.log("all done");
}());
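
The scheduling difference mentioned in the note at the top of this section can be reproduced without the library. The sketch below only contrasts calling a listener directly with deferring it through process.nextTick(); it makes no claim about how node-walk schedules its events internally, and callDirect and callDeferred are hypothetical names.

```javascript
'use strict';

var order = [];

// Calls the listener immediately, before returning.
function callDirect(listener) {
  listener();
}

// Defers the listener until the current synchronous code has finished.
function callDeferred(listener) {
  process.nextTick(listener);
}

callDeferred(function () { order.push('deferred listener'); });
callDirect(function () { order.push('direct listener'); });
order.push('end of synchronous code');

// Queued after the deferred listener, so it runs once that has been recorded.
process.nextTick(function () {
  console.log(order.join(' -> '));
  // prints: direct listener -> end of synchronous code -> deferred listener
});
```

A listener that is deferred this way runs only after the statement following walkSync(), which is why a line such as console.log("all done") would otherwise print too early.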

API

Emitted Values

  • on('XYZ', function(root, stats, next) {})

  • root - the directory containing the files to be inspected

  • stats[Array] - a single stats object or an array with some added attributes

    • type - 'file', 'directory', etc
    • error
    • name - the name of the file, dir, etc
  • next - no more files will be read until this is called

Single Events - fired immediately

  • end - No files, dirs, etc left to inspect

  • directoryError - Error when fstat succeeded, but reading path failed (probably due to permissions).

  • nodeError - Error when fstat did not succeed.

  • node - a stats object for a node of any type

  • file - includes links when followLinks is true

  • directory - NOTE you could get a recursive loop if followLinks and a directory links to its parent

  • symbolicLink - always empty when followLinks is true

  • blockDevice

  • characterDevice

  • FIFO

  • socket

Events with Array Arguments - fired after all files in the dir have been stat'ed

  • names - before any stat takes place. Useful for sorting and filtering.

    • Note: the array is an array of strings, not stat objects
    • Note: the next argument is a noop
  • errors - errors encountered by fs.stat when reading nodes in a directory

  • nodes - an array of stats of any type

  • files

  • directories - modification of this array - sorting, removing, etc - affects traversal

  • symbolicLinks

  • blockDevices

  • characterDevices

  • FIFOs

  • sockets
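
Since the names event is described as useful for filtering, here is a small sketch of a listener that drops dot-files and sorts what is left. It assumes, as the sorting examples in this document suggest, that the walker keeps using the same array object, so the array is mutated in place rather than replaced. The name namesHandler is hypothetical.

```javascript
'use strict';

// Hypothetical "names" listener: remove dot-files, then sort, all in place.
function namesHandler(root, nodeNamesArray) {
  // Walk backwards so that splice() does not shift unvisited indexes.
  for (var i = nodeNamesArray.length - 1; i >= 0; i -= 1) {
    if (nodeNamesArray[i].charAt(0) === '.') {
      nodeNamesArray.splice(i, 1);
    }
  }
  nodeNamesArray.sort();
}
```

Reassigning the parameter (for example nodeNamesArray = nodeNamesArray.filter(...)) would only change the local variable, so the caller would never see the filtered list.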

Warning: beware of infinite loops when followLinks is true (using the walk-recurse variant).
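
One common way to guard against such loops is to remember the resolved path of every directory already visited. The sketch below is a general technique, not a feature of node-walk; createLoopGuard and firstVisit are hypothetical names.

```javascript
'use strict';

var fs = require('fs');

// Hypothetical guard: remembers the resolved path of each visited directory.
function createLoopGuard() {
  var seen = new Set();

  // Returns true the first time a directory is seen, false on a revisit.
  return function firstVisit(dir) {
    var real = fs.realpathSync(dir); // resolves symbolic links in the path
    if (seen.has(real)) { return false; }
    seen.add(real);
    return true;
  };
}
```

A possible use, given that modifying the directories array affects traversal, is to remove from that array any entry for which the guard returns false.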

Performance

Tested on my /System containing 59,490 (+ self) directories (and lots of files). The size of the text output was 6 MB.

find:

time bash -c "find /System -type d | wc"
59491   97935 6262916

real  2m27.114s
user  0m1.193s
sys 0m14.859s

find.js:

Note that find.js omits the start directory

time bash -c "node examples/find.js /System -type d | wc"
59490   97934 6262908

# Test 1 
real  2m52.273s
user  0m20.374s
sys 0m27.800s

# Test 2
real  2m23.725s
user  0m18.019s
sys 0m23.202s

# Test 3
real  2m50.077s
user  0m17.661s
sys 0m24.008s

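In conclusion, the Node.js asynchronous walk is slower than regular "find": wall-clock time was comparable in these runs (2m23s to 2m52s versus 2m27s), but it used far more CPU time (roughly 18 to 20s user versus about 1.2s).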

LICENSE

node-walk is available under the following licenses:

  • MIT
  • Apache 2

Copyright 2011 - Present AJ ONeal
