thecodrr / fdir

⚡ The fastest directory crawler & globbing library for NodeJS. Crawls 1m files in < 1s

Home Page: https://thecodrr.github.io/fdir/

License: MIT License

JavaScript 9.85% TypeScript 90.15%
fdir fast nodejs javascript os sys fs filesystem walker directory

fdir's Introduction

The Fastest Directory Crawler & Globber for NodeJS

The Fastest: Nothing similar (in the NodeJS world) beats fdir in speed. It can easily crawl a directory containing 1 million files in < 1 second.

💡 Stupidly Easy: fdir uses an expressive Builder pattern to build the crawler, which improves code readability.

🤖 Zero Dependencies*: fdir only uses NodeJS fs & path modules.

🕺 Astonishingly Small: < 2KB in size gzipped & minified.

🖮 Hackable: Extending fdir is extremely simple now that the new Builder API is here. Feel free to experiment around.

* picomatch must be installed manually by the user to support globbing.

Support

Do you like this project? Support me by donating, creating an issue, becoming a stargazer, or opening a pull request. Thanks.

🚄 Quickstart

Installation

You can install using npm:

$ npm i fdir

or Yarn:

$ yarn add fdir

Usage

import { fdir } from "fdir";

// create the builder
const api = new fdir().withFullPaths().crawl("path/to/dir");

// get all files in a directory synchronously
const files = api.sync();

// or asynchronously
api.withPromise().then((files) => {
  // do something with the result here.
});

Documentation:

Documentation for all methods is available here.

📊 Benchmarks:

Please check the benchmark against the latest version here.

🙏Used by:

fdir is downloaded over 200k times a week by projects around the world. Here's a list of some notable projects using fdir in production:

Note: if you think your project should be here, feel free to open an issue. Notable is anything with a considerable amount of GitHub stars.

  1. mdn/yari
  2. streetwriters/notesnook
  3. zhangdaren/miniprogram-to-uniapp
  4. imba/imba
  5. moroshko/react-scanner
  6. netlify/build
  7. FredKSchott/snowpack*
  8. yassinedoghri/astro-i18next
  9. immich-app/CLI
  10. selfrefactor/rambda
  * snowpack has since been discontinued.

🦮 LICENSE

Copyright © 2023 Abdullah Atta under MIT. Read full text here.

fdir's People

Contributors

cramshaw, dependabot[bot], ianvs, krinkle, kyleknighted, omgimalexis, papb, simov, stonecypher, thecodrr

fdir's Issues

Typescript def seems to miss exports

As sync now returns Output, we need a way to tell the compiler which output it is.
One thing we could do is just cast x = x as PathsOutput; but that needs the types combined to create Output to be exported in the .d.ts file.

Errors thrown inside .filter() are swallowed

Sample code to reproduce:

const { fdir } = require("fdir");

const api = new fdir()
  .withFullPaths()
  .filter((filePath) => {
    if (Math.random() > 0.5) {
      throw new Error("Oh crap!");
    }
    return true;
  })
  .crawl("./client");
console.log(api.sync().length);

You can run this over and over and it will just console log out a different number each time.

Suppose it wasn't if (Math.random() > 0.5) { but something like return doubleCheck(fiePath) you wouldn't get any feedback about the accidental typo of fiePath.
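
Until errors like this surface on their own, one way to avoid losing them is to wrap the predicate before passing it to .filter(). The sketch below is plain JavaScript and does not depend on fdir; guardFilter, doubleCheck and the errors array are hypothetical names used for illustration, not part of the library.

```javascript
// Hypothetical helper (not part of fdir): wraps a filter predicate so that
// any exception it throws is recorded instead of silently disappearing.
function guardFilter(predicate, errors) {
  return (filePath, isDirectory) => {
    try {
      return Boolean(predicate(filePath, isDirectory));
    } catch (err) {
      errors.push({ filePath, err });
      return false; // treat a failing predicate as "exclude this path"
    }
  };
}

// Stand-in for the real check; the typo `fiePath` below is deliberate.
const doubleCheck = (p) => p.endsWith(".js");

const errors = [];
const safe = guardFilter((filePath) => doubleCheck(fiePath), errors);
const kept = ["a.js", "b.js"].filter((p) => safe(p, false));

console.log(kept.length, errors.length); // 0 2
```

After a crawl, checking errors.length (and rethrowing errors[0].err if it is non-zero) turns the silent miscount into a visible ReferenceError.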

Fdir can't get content of subfolders

System: Windows Server 2019 Version 1809 (Build 17763.737)
Node.js Version: 12.18.2
fdir Version: 3.4.3

In my project, I have a OneDrive sync mapped as a network drive.
When I run fdir on it, I only get the files in the root directory of the network drive.
But when I map another folder as a network drive and run the same code,
fdir outputs all directories and files, as it should.

My Code:

const fdir = require("fdir")

var paths = "R:\\"


var files = new fdir().withDirs().withMaxDepth(3).crawlWithOptions(paths,{includeBasePath:true, includeDirs:true, suppressErrors:false}).sync();

files.forEach(file_path => {
	console.log(file_path)
})

Screenshot of the execution in Drive R:\ :
image

Screenshot of the execution in Drive S:\ :
image

Do you have an idea for a fix for this issue?
How can I debug fdir? How can I see errors?

A big THANK YOU is in order

Cheekily, just wanted to abuse an issue for a good cause.

MDN Web Docs (now) uses fdir and it's looking extremely promising.
We used to use glob.sync() and a whole mixed bag of other path.join() and tricky stuff.
Today I was able to replace all of it with fdir and I did it in such a way that I kept the old function so I could compare.

All the numbers are in mdn/yari#3537 and it might be a bit hard to read, but "So basically, the new function is 10x faster." is easy to understand :)

Awesome work!

Also, how I wish Node could get a low-level C++ native implementation, in the standard library, of path.join() and those guys to make this problem go away.

Need help debugging an issue?? with fdir

I've had an issue opened over on a repo - glenn2223/vscode-live-sass-compiler#145

I can't for the life of me figure out why fdir has returned a file count of 0 when the filters applied all return true for one file

They have an SSH connection in VS Code but I don't see why that would be a problem


I have copied out a section of code that produces the below output (copied from this comment)

Code

const isMatch = picomatch(fileList, { ignore: excludeItems, dot: true, nocase: true });

OutputWindow.Show(OutputLevel.Trace, "Searching folder", null, false);

const searchLogs: Map<string, string[]> = new Map<string, string[]>();

const searchFileCount = (
    (await new fdir()
        .crawlWithOptions(basePath, {
            filters: [
                (filePath) =>
                    filePath.toLowerCase().endsWith(".scss") ||
                    filePath.toLowerCase().endsWith(".sass"),
                (filePath) => {
                    const result = isMatch(path.relative(basePath, filePath));

                    searchLogs.set(`Path: ${filePath}`, [
                        `  isMatch: ${result}`,
                        `   - Base path: ${basePath}`,
                        `   - Rela path: ${path.relative(basePath, filePath)}`,
                    ]);

                    return result;
                },
                (filePath) => {
                    const result =
                        path
                            .toNamespacedPath(filePath)
                            .localeCompare(path.toNamespacedPath(sassPath), undefined, {
                                sensitivity: "accent",
                            }) === 0;

                    searchLogs
                        .get(`Path: ${filePath}`)
                        ?.push(
                            `  compare: ${result}`,
                            `   - Orig file path: ${filePath}`,
                            `   - Orig sass path: ${sassPath}`
                        );

                    return result;
                },
            ],
            includeBasePath: true,
            onlyCounts: true,
            resolvePaths: true,
            suppressErrors: true,
        })
        .withPromise()) as OnlyCountsOutput
).files;

const x = await new fdir()
    .crawlWithOptions(basePath, {
        includeBasePath: true,
        group: true,
        resolvePaths: true,
        suppressErrors: true,
    })
    .withPromise();

OutputWindow.Show(OutputLevel.Trace, "FDIR OUTPUT", [JSON.stringify(x)]);

OutputWindow.Show(OutputLevel.Trace, "Search results", undefined, false);

searchLogs.forEach((logs, key) => {
    OutputWindow.Show(OutputLevel.Trace, key, logs, false);
});

And here is the section from the output

Searching folder
FDIR OUTPUT
[
   {
      "dir":"/home/dave/MagentoAPI",
      "files":[ ]
   },
   {
      "dir":"/home/dave/MagentoAPI/dev",
      "files":[
         "/home/dave/MagentoAPI/dev/docker-compose.yml",
         "/home/dave/MagentoAPI/dev/dockerfile"
      ]
   },
   {
      "dir":"/home/dave/MagentoAPI/dev/.app",
      "files":[
         "/home/dave/MagentoAPI/dev/.app/composer.json",
         "/home/dave/MagentoAPI/dev/.app/composer.lock"
      ]
   },
   {
      "dir":"/home/dave/MagentoAPI/dev/.vscode",
      "files":[
         "/home/dave/MagentoAPI/dev/.vscode/settings.json"
      ]
   },
   {
      "dir":"/home/dave/MagentoAPI/dev/mysql-data",
      "files":[
         /// Bunch of SQL files removed for brevity
      ]
   },
   {
      "dir":"/home/dave/MagentoAPI/dev/php-confs",
      "files":[
         "/home/dave/MagentoAPI/dev/php-confs/override.ini"
      ]
   },
   {
      "dir":"/home/dave/MagentoAPI/dev/src",
      "files":[
         /// Bunch of PHP files removed for brevity
         "/home/dave/MagentoAPI/dev/src/AttributesSamplePayload.json",
         "/home/dave/MagentoAPI/dev/src/MageAPIdev.code-workspace",
         "/home/dave/MagentoAPI/dev/src/newfile.txt",
         "/home/dave/MagentoAPI/dev/src/style.css",
         "/home/dave/MagentoAPI/dev/src/workspace.code-workspace"
      ]
   },
   {
      "dir":"/home/dave/MagentoAPI/dev/.app/vendor",
      "files":[
         "/home/dave/MagentoAPI/dev/.app/vendor/autoload.php"
      ]
   }
]
--------------------
Search results
Path: /home/dave/MagentoAPI/dev/src/styles/main.scss
  isMatch: true
   - Base path: /home/dave/MagentoAPI
   - Rela path: dev/src/styles/main.scss
  compare: true
   - Orig file path: /home/dave/MagentoAPI/dev/src/styles/main.scss
   - Orig sass path: /home/dave/MagentoAPI/dev/src/styles/main.scss

Typescript definitions seem a bit wrong

Thank you for this fast and useful library.

I have a problem: VSCode is not happy that the fdir.async method is declared as returning String[]. If the declared type is string[], the issue goes away.

Crawl with options group interface no longer correct in v5.2.0 (v5.1.0 was the same)

crawlWithOptions API in combination with withPromise no longer outputs a GroupOutput as per the typings.

The test script:

const { fdir } = require('../');

const test = async (dir) => {
  const files = await new fdir()
    .crawlWithOptions(dir, {
      includeBasePath: true,
      group: true,
    })
    .withPromise();

  console.log(files);
};

test(`${__dirname}/dir`);

$ tree fdir-api-change/
fdir-api-change/
├── dir
│   ├── a
│   │   ├── a.txt
│   │   └── b
│   │       ├── b.txt
│   │       └── x.txt
│   └── dir.txt
└── test.js
$ git checkout 8f1c4b9 && node fdir-api-change/test.js 
HEAD is now at 8f1c4b9 fix: make all tests pass
[
  'home/source/fdir/fdir-api-change/dir/': [ 'home/source/fdir/fdir-api-change/dir/dir.txt' ],
  'home/source/fdir/fdir-api-change/dir/a/': [ 'home/source/fdir/fdir-api-change/dir/a/a.txt' ],
  'home/source/fdir/fdir-api-change/dir/a/b/': [
    'home/source/fdir/fdir-api-change/dir/a/b/b.txt',
    'home/source/fdir/fdir-api-change/dir/a/b/x.txt'
  ]
]
$ git checkout 8f1c4b9^ && node fdir-api-change/test.js
Previous HEAD position was 8f1c4b9 fix: make all tests pass
HEAD is now at 0cafa69 feat: refactor & minor performance improvements
[]
$ git checkout 8f1c4b9^^ && node fdir-api-change/test.js
Previous HEAD position was 0cafa69 feat: refactor & minor performance improvements
HEAD is now at 16d0790 feat: add withRelativePaths option (fix #51)
[
  {
    dir: 'home/source/fdir/fdir-api-change/dir',
    files: [ 'home/source/fdir/fdir-api-change/dir/dir.txt' ]
  },
  {
    dir: 'home/source/fdir/fdir-api-change/dir/a',
    files: [ 'home/source/fdir/fdir-api-change/dir/a/a.txt' ]
  },
  {
    dir: 'home/source/fdir/fdir-api-change/dir/a/b',
    files: [
      'home/source/fdir/fdir-api-change/dir/a/b/b.txt',
      'home/source/fdir/fdir-api-change/dir/a/b/x.txt'
    ]
  }
]

Unless I'm missing something very obvious (entirely possible), I have to lock at 5.1.0 and miss out on performance boosts or change my api to call Object.entries, which I assume is much slower than the original version.
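
As a stopgap, a small adapter can accept either of the two shapes printed above and return the { dir, files } records. This is only a sketch: normalizeGroups is a hypothetical name, and the keyed input is simulated here instead of being produced by fdir.

```javascript
// Hypothetical adapter: accepts either grouped shape shown above and returns
// an array of { dir, files } records.
function normalizeGroups(output) {
  // Older shape: a real array of { dir, files } objects.
  if (Array.isArray(output) && output.length > 0) return output;

  // Newer shape: directory paths stored as string keys on the result.
  return Object.entries(output).map(([dir, files]) => ({
    // Drop the trailing slash so both shapes report the directory the same way.
    dir: dir.length > 1 && dir.endsWith("/") ? dir.slice(0, -1) : dir,
    files,
  }));
}

// Simulate the keyed shape printed at 8f1c4b9.
const keyed = [];
keyed["dir/"] = ["dir/dir.txt"];
keyed["dir/a/"] = ["dir/a/a.txt"];

console.log(normalizeGroups(keyed));
```

The Object.entries pass visits one entry per directory, not one per file, so its cost is probably small next to the crawl itself, though that is worth measuring on a real tree.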

Fdir failing to pickup deep paths?

I'm having trouble getting fdir to work like globby does, as I'm trying to replace it. In fact it seems to fail pretty randomly.

[Info  - 8:40:11 PM] Glob /home/razze/Development/elm-spa-example/src/**/*.elm
[Info  - 8:40:11 PM] Globby 33 - /home/razze/Development/elm-spa-example/src/Api.elm,/home/razze/Development/elm-spa-example/src/Article.elm,/home/razze/Development/elm-spa-example/src/Asset.elm,/home/razze/Development/elm-spa-example/src/Author.elm,/home/razze/Development/elm-spa-example/src/Avatar.elm,/home/razze/Development/elm-spa-example/src/CommentId.elm,/home/razze/Development/elm-spa-example/src/Email.elm,/home/razze/Development/elm-spa-example/src/Loading.elm,/home/razze/Development/elm-spa-example/src/Log.elm,/home/razze/Development/elm-spa-example/src/Main.elm,/home/razze/Development/elm-spa-example/src/Page.elm,/home/razze/Development/elm-spa-example/src/PaginatedList.elm,/home/razze/Development/elm-spa-example/src/Profile.elm,/home/razze/Development/elm-spa-example/src/Route.elm,/home/razze/Development/elm-spa-example/src/Session.elm,/home/razze/Development/elm-spa-example/src/Timestamp.elm,/home/razze/Development/elm-spa-example/src/Username.elm,/home/razze/Development/elm-spa-example/src/Viewer.elm,/home/razze/Development/elm-spa-example/src/Api/Endpoint.elm,/home/razze/Development/elm-spa-example/src/Article/Body.elm,/home/razze/Development/elm-spa-example/src/Article/Comment.elm,/home/razze/Development/elm-spa-example/src/Article/Feed.elm,/home/razze/Development/elm-spa-example/src/Article/Slug.elm,/home/razze/Development/elm-spa-example/src/Article/Tag.elm,/home/razze/Development/elm-spa-example/src/Page/Article.elm,/home/razze/Development/elm-spa-example/src/Page/Blank.elm,/home/razze/Development/elm-spa-example/src/Page/Home.elm,/home/razze/Development/elm-spa-example/src/Page/Login.elm,/home/razze/Development/elm-spa-example/src/Page/NotFound.elm,/home/razze/Development/elm-spa-example/src/Page/Profile.elm,/home/razze/Development/elm-spa-example/src/Page/Register.elm,/home/razze/Development/elm-spa-example/src/Page/Settings.elm,/home/razze/Development/elm-spa-example/src/Page/Article/Editor.elm
[Info  - 8:40:11 PM] Fdir 33 - /home/razze/Development/elm-spa-example/src/Api/Endpoint.elm,/home/razze/Development/elm-spa-example/src/Api.elm,/home/razze/Development/elm-spa-example/src/Article/Body.elm,/home/razze/Development/elm-spa-example/src/Article/Comment.elm,/home/razze/Development/elm-spa-example/src/Article/Feed.elm,/home/razze/Development/elm-spa-example/src/Article/Slug.elm,/home/razze/Development/elm-spa-example/src/Article/Tag.elm,/home/razze/Development/elm-spa-example/src/Article.elm,/home/razze/Development/elm-spa-example/src/Asset.elm,/home/razze/Development/elm-spa-example/src/Author.elm,/home/razze/Development/elm-spa-example/src/Avatar.elm,/home/razze/Development/elm-spa-example/src/CommentId.elm,/home/razze/Development/elm-spa-example/src/Email.elm,/home/razze/Development/elm-spa-example/src/Loading.elm,/home/razze/Development/elm-spa-example/src/Log.elm,/home/razze/Development/elm-spa-example/src/Main.elm,/home/razze/Development/elm-spa-example/src/Page/Article/Editor.elm,/home/razze/Development/elm-spa-example/src/Page/Article.elm,/home/razze/Development/elm-spa-example/src/Page/Blank.elm,/home/razze/Development/elm-spa-example/src/Page/Home.elm,/home/razze/Development/elm-spa-example/src/Page/Login.elm,/home/razze/Development/elm-spa-example/src/Page/NotFound.elm,/home/razze/Development/elm-spa-example/src/Page/Profile.elm,/home/razze/Development/elm-spa-example/src/Page/Register.elm,/home/razze/Development/elm-spa-example/src/Page/Settings.elm,/home/razze/Development/elm-spa-example/src/Page.elm,/home/razze/Development/elm-spa-example/src/PaginatedList.elm,/home/razze/Development/elm-spa-example/src/Profile.elm,/home/razze/Development/elm-spa-example/src/Route.elm,/home/razze/Development/elm-spa-example/src/Session.elm,/home/razze/Development/elm-spa-example/src/Timestamp.elm,/home/razze/Development/elm-spa-example/src/Username.elm,/home/razze/Development/elm-spa-example/src/Viewer.elm
[Info  - 8:40:11 PM] Glob /home/razze/Development/elm-spa-example/tests/**/*.elm
[Info  - 8:40:11 PM] Globby 1 - /home/razze/Development/elm-spa-example/tests/RoutingTests.elm
[Info  - 8:40:11 PM] Fdir 1 - /home/razze/Development/elm-spa-example/tests/RoutingTests.elm
[Info  - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/NoRedInk/elm-json-decode-pipeline/1.0.0/src/**/*.elm
[Info  - 8:40:11 PM] Globby 1 - /home/razze/.elm/0.19.1/packages/NoRedInk/elm-json-decode-pipeline/1.0.0/src/Json/Decode/Pipeline.elm
[Info  - 8:40:11 PM] Fdir 0 - 
[Info  - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/**/*.elm
[Info  - 8:40:11 PM] Globby 11 - /home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Browser.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Browser/AnimationManager.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Browser/Dom.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Browser/Events.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Browser/Navigation.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Debugger/Expando.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Debugger/History.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Debugger/Main.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Debugger/Metadata.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Debugger/Overlay.elm,/home/razze/.elm/0.19.1/packages/elm/browser/1.0.0/src/Debugger/Report.elm
[Info  - 8:40:11 PM] Fdir 0 - 
[Info  - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/**/*.elm
[Info  - 8:40:11 PM] Globby 18 - /home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Array.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Basics.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Bitwise.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Char.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Debug.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Dict.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/List.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Maybe.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Platform.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Process.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Result.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Set.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/String.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Task.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Tuple.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Elm/JsArray.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Platform/Cmd.elm,/home/razze/.elm/0.19.1/packages/elm/core/1.0.0/src/Platform/Sub.elm
[Info  - 8:40:11 PM] Fdir 0 - 
[Info  - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm/html/1.0.0/src/**/*.elm
[Info  - 8:40:11 PM] Globby 5 - /home/razze/.elm/0.19.1/packages/elm/html/1.0.0/src/Html.elm,/home/razze/.elm/0.19.1/packages/elm/html/1.0.0/src/Html/Attributes.elm,/home/razze/.elm/0.19.1/packages/elm/html/1.0.0/src/Html/Events.elm,/home/razze/.elm/0.19.1/packages/elm/html/1.0.0/src/Html/Keyed.elm,/home/razze/.elm/0.19.1/packages/elm/html/1.0.0/src/Html/Lazy.elm
[Info  - 8:40:11 PM] Fdir 0 - 
[Info  - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm/http/1.0.0/src/**/*.elm
[Info  - 8:40:11 PM] Globby 3 - /home/razze/.elm/0.19.1/packages/elm/http/1.0.0/src/Http.elm,/home/razze/.elm/0.19.1/packages/elm/http/1.0.0/src/Http/Internal.elm,/home/razze/.elm/0.19.1/packages/elm/http/1.0.0/src/Http/Progress.elm
[Info  - 8:40:11 PM] Fdir 0 - 
[Info  - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm/json/1.0.0/src/**/*.elm
[Info  - 8:40:11 PM] Globby 2 - /home/razze/.elm/0.19.1/packages/elm/json/1.0.0/src/Json/Decode.elm,/home/razze/.elm/0.19.1/packages/elm/json/1.0.0/src/Json/Encode.elm
[Info  - 8:40:11 PM] Fdir 0 - 
[Info  - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm/time/1.0.0/src/**/*.elm
[Info  - 8:40:11 PM] Globby 1 - /home/razze/.elm/0.19.1/packages/elm/time/1.0.0/src/Time.elm
[Info  - 8:40:11 PM] Fdir 0 - 
[Info  - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm/url/1.0.0/src/**/*.elm
[Info  - 8:40:11 PM] Globby 5 - /home/razze/.elm/0.19.1/packages/elm/url/1.0.0/src/Url.elm,/home/razze/.elm/0.19.1/packages/elm/url/1.0.0/src/Url/Builder.elm,/home/razze/.elm/0.19.1/packages/elm/url/1.0.0/src/Url/Parser.elm,/home/razze/.elm/0.19.1/packages/elm/url/1.0.0/src/Url/Parser/Internal.elm,/home/razze/.elm/0.19.1/packages/elm/url/1.0.0/src/Url/Parser/Query.elm
[Info  - 8:40:11 PM] Fdir 0 - 
[Info  - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm-explorations/markdown/1.0.0/src/**/*.elm
[Info  - 8:40:11 PM] Globby 1 - /home/razze/.elm/0.19.1/packages/elm-explorations/markdown/1.0.0/src/Markdown.elm
[Info  - 8:40:11 PM] Fdir 0 - 
[Info  - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/rtfeldman/elm-iso8601-date-strings/1.1.0/src/**/*.elm
[Info  - 8:40:11 PM] Globby 1 - /home/razze/.elm/0.19.1/packages/rtfeldman/elm-iso8601-date-strings/1.1.0/src/Iso8601.elm
[Info  - 8:40:11 PM] Fdir 0 - 
[Info  - 8:40:11 PM] Glob /home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/**/*.elm
[Info  - 8:40:11 PM] Globby 15 - /home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Expect.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Float.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Fuzz.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Lazy.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/MicroRandomExtra.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/RoseTree.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Shrink.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Test.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Fuzz/Internal.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Lazy/List.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Test/Expectation.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Test/Fuzz.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Test/Internal.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Test/Runner.elm,/home/razze/.elm/0.19.1/packages/elm-explorations/test/1.0.0/src/Test/Runner/Failure.elm
[Info  - 8:40:11 PM] Fdir 0 - 

const x = new fdir().glob(`**/*.elm`).withFullPaths().crawl(globUri).sync();

const y = globby.sync(`${globUri}/**/*.elm`, { suppressErrors: true });

I also tried const x = new fdir().glob(`${globUri}/**/*.elm`).withFullPaths().crawl(".").sync(); which also failed to find the same files.

I tried increasing the depth, but that did not help either.

v7.0 Plan

fdir has reached maximum performance. I am not saying that to discourage others from trying to make a faster directory crawler. Not at all. But I have stripped fdir down to its bones and haven't noticed any significant increase in speed. The only space left for improvement is directly in NodeJS internals. So where does that leave us?

I had initially planned to freeze the API of fdir but fdir has no proper "API". What I would be freezing, most probably, would be more features. I can't do that though. Not yet, anyway. Why?

  1. fdir is not feature complete. I don't want to make just the fastest directory crawler but also the best one.
  2. I don't want anyone using fdir to ever feel like it can't do X or Z. An impossible feat but...
  3. I am not happy with the current state of the codebase. Even though an API freeze doesn't mean I can't change code, I want to be able to break things.

So what's the plan for v6.0?

  • Modernize the codebase — the whole codebase is ES3 or something. In my quest for performance, I went to the extreme. Too much, it seems. Maybe we can even use Typescript this time around (but I like keeping things simple).
  • Strip the core directory crawler down to the bare minimum — right now, everything flows in and out of fdir with a free pass. The result is that there is a lot going on at one time. It is time to simplify things...and that brings me to:
  • Make fdir pluggable — after some initial thought, this is the best way forward. The idea is to allow anyone to directly tap into the crawling process and control it.
  • Move all the extra features & flags to their own plugins — this will enable a more robust system. There will be an "official" set of plugins that will act as a foundation for others to play with the Plugin API.
  • Improve documentation — this doesn't even need explaining but yes! It's time to move to Docusaurus 😆

Does this mean that the Builder API is dying? Nope. I like the Builder API and it'll stay. It will act as a kind of a proxy for the plugins underneath. That means the Builder API will also need to be extensible.

The end goal is to make the following features possible:

  1. Streams & Iterators
  2. Globbing (the current one is a joke, I am talking about a proper globber)
  3. Multiple filesystems (locking fdir to fs is a bad idea)
  4. Deno support, maybe (not sure about this)
  5. And other stuff...you guys can suggest.

In the end, the idea is to reduce complexity, increase flexibility, improve readability, and maintain the performance (and become the de-facto directory crawler in the Nodeverse).


In any case, that's the plan. I would love to hear what you guys think about this. Ideas, suggestions, possible ways to implement a plugin system, ideas for plugins, etc, etc, etc. is all welcome!

What's the correct way to use globbing?

What is the correct way to utilize the globbing functionality in fdir? When I run the following test, it does not properly filter the results.

const fs = require('fs');
const { fdir } = require('fdir');
//make some temp files to scan
if (!fs.existsSync('./temp')) {
    fs.mkdirSync('./temp');
    fs.writeFileSync('./temp/alpha.json', '');
    fs.writeFileSync('./temp/beta.txt', '');
}
//find only the files that match the glob
const results = new fdir().glob('*.txt').crawlWithOptions('./temp', {}).sync();
console.log(results);

The glob seems to have no effect, and I'm instead given all of the files in the directory.
image

Am I using it wrong? I've installed picomatch, and tried this on fdir 5.2.0, 5.1.0, 5.0.0, 4.1.0, and 4.0.0, all with the same results.

Running the same test but fast-glob (which also uses picomatch) returns the expected results.

const results = require('fast-glob').sync('*.txt', { cwd: './temp' });

image

Remove callback API in next major

Code could be simplified by removing the callback API. There are native promise apis for most functions now.

Not sure how performance is affected though - this would be the primary concern.

import { readdir } from 'fs/promises';

const dir = '.'; // placeholder: the directory to list

try {
  const files = await readdir(dir);
  for (const file of files)
    console.log(file);
} catch (err) {
  console.error(err);
}

This would pave the way for supporting modern API interfaces such as Node Streams, Web Streams, and AsyncIterators. AbortController could also be used for early-exit scenarios.
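
To make the idea concrete, here is a minimal sketch of a promise-based walker exposed as an async iterator, with an AbortSignal for early exit. It uses only Node's fs/promises and is not how fdir is implemented; walk and firstN are hypothetical names.

```javascript
const { opendir } = require("node:fs/promises");
const path = require("node:path");

// Sketch only: an async generator that yields file paths one at a time and
// stops as soon as the optional AbortSignal is aborted.
async function* walk(root, { signal } = {}) {
  const pending = [root];
  while (pending.length > 0 && !signal?.aborted) {
    const dir = await opendir(pending.pop());
    // Leaving the for-await loop (normally or via return) closes the directory.
    for await (const entry of dir) {
      if (signal?.aborted) return;
      const full = path.join(dir.path, entry.name);
      if (entry.isDirectory()) pending.push(full);
      else yield full;
    }
  }
}

// Usage: collect the first `n` files, then abort the rest of the crawl.
async function firstN(root, n) {
  const controller = new AbortController();
  const found = [];
  for await (const file of walk(root, { signal: controller.signal })) {
    found.push(file);
    if (found.length >= n) controller.abort();
  }
  return found;
}
```

Whether this style can match the callback implementation's speed is exactly the open question raised above; the sketch only shows the shape of the interface.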

Limit files option

Hello, can you please provide an option for a max file limit that once X files have been found, stop further scanning operations?

I understand an operation may be in progress that returns file lists greater than the limit, so it would be fine if it just discarded the excess.

The use case is scanning directories with a large number of files when you only want to pull a few of them out at a time. For example, consider a "drop/pickup" directory where external sources drop huge numbers of files and the Node process scans it and picks up work from it: we may have millions of files but only want to process a few thousand at a time.

One primary reason is that memory usage is really high when the folder is large, due to generating strings for all of the file names and holding them in memory during the operations, and I would like to constrain that.

I tried to implement count limits by filtering like so, but this doesn't stop the library from doing further work; it just discards more of the work.

export async function walkDir(dir: string, filter: (file: string, isDirectory: boolean) => boolean | void, limit?: number): Promise<Array<string>> {
    let apiBuilder = (new fdir()).withFullPaths();
    if (limit) {
        let count = 0;
        apiBuilder = apiBuilder.filter(function (file, isDirectory) {
            return (isDirectory ? count < limit : count++ < limit) && (!filter || filter(file, isDirectory));
        });
    } else if (filter) {
        apiBuilder = apiBuilder.filter(filter);
    }
    return apiBuilder.crawl(dir).withPromise();
}

So a way to improve this memory wise would be nice.
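
For comparison, this is roughly what an early-stopping crawl looks like when written directly against Node's fs module. It is a sketch of the requested behaviour, not fdir's API; collectFiles is a hypothetical name.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Sketch only (not fdir's API): collect at most `limit` file paths and stop
// reading directories as soon as the limit is reached.
function collectFiles(root, limit = Infinity) {
  const files = [];
  const pending = [root];
  while (pending.length > 0 && files.length < limit) {
    const dirPath = pending.pop();
    const dir = fs.opendirSync(dirPath);
    try {
      let entry;
      // readSync() hands back one entry per call (Node buffers a small batch
      // internally), so a huge directory is never returned as one big array.
      while (files.length < limit && (entry = dir.readSync()) !== null) {
        const full = path.join(dirPath, entry.name);
        if (entry.isDirectory()) pending.push(full);
        else files.push(full);
      }
    } finally {
      dir.closeSync();
    }
  }
  return files;
}
```

Because the loop stops reading once the limit is hit, both the work and the number of path strings held in memory are bounded by the limit, not by the size of the directory.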

Node exits when an error occurs using withPromise

See this example file:

const { fdir } = require("fdir");

async function main() {
  const files = await new fdir()
    .withBasePath()
    .crawl("non-existent") // directory does not exist!
    .withPromise();

  console.log("this will never be written");
}

if (require.main === module) {
  main().catch((e) => {
    process.exitCode = 1;
    console.error(e);
  });
}

(save as index.js, for example, and run node ./index.js)

What is happening?

The Node.js process just exits (exit code 0) and no error is shown. The message "this will never be written" is also not shown.

What is expected to happen?

As the docs say that all errors are suppressed, I would expect files to be an empty array and program flow to continue normally (printing the "this will never be written" message).

Have you gone mad trying to figure out why your program just exits without any message?

I prefer not to say.

Environment

$ node -v
v15.4.0
$ cat node_modules/fdir/package.json | grep version
  "version": "4.1.0",

Is there a workaround?

When adding .withErrors(), the error is reported as expected ("ENOENT: no such file or directory, scandir 'non-existent'") so using that option, together with a try-catch or .catch() could be a possible workaround.

Next release?

Hey there,

do you have any plan for when you would like to do the next release?

Cheers

Benchmarks against globbing libraries

fdir is without a doubt the fastest globbing library as well (I benchmarked) but evidence needs to be published before it can be advertised.

Make a note about the `sync` api being much slower

I think a lot of my early testing of this package was using sync api which made me think it was not that fast.

It's common for a Node dev to reach for sync apis for file access in cli tools...as you wouldn't normally think async would give speed gains, unless you are running in a webserver where you don't want to block the event loop.

But for fdir it offers huge benefits in speed when the threadpool is increased. On macOS at least.

UV_THREADPOOL_SIZE=8

776457
fdir: 2.336s
776457
fdir.sync: 8.799s

UV_THREADPOOL_SIZE=2

776457
fdir: 6.870s
776457
fdir.sync: 7.804s

Crazy that it's set so low by default. https://www.sebastienvercammen.be/your-libuv-thread-pool-size-is-too-small/


Also, make a note of increasing UV_THREADPOOL_SIZE.
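
For such a note, here is a minimal sketch of raising the pool size from inside the entry script. It assumes the variable is set before anything uses the libuv threadpool; setting it from the shell (`UV_THREADPOOL_SIZE=8 node ./index.js`) is the safer route, and the in-process approach may not take effect on every platform. The value 8 is just the one used in the measurements above, and `readManyTimes` is a hypothetical helper.

```javascript
// Must run before anything touches the libuv threadpool (fs, dns, crypto, zlib).
// Keep a value supplied from the shell if there is one.
process.env.UV_THREADPOOL_SIZE = process.env.UV_THREADPOOL_SIZE || "8";

const fs = require("fs");
const os = require("os");

// Hypothetical helper: fire many async readdir calls at once. These are the
// kind of calls that the threadpool runs in parallel.
function readManyTimes(dir, times) {
  const jobs = [];
  for (let i = 0; i < times; i++) jobs.push(fs.promises.readdir(dir));
  return Promise.all(jobs);
}

console.time("async readdir x200");
readManyTimes(os.tmpdir(), 200).then((results) => {
  console.timeEnd("async readdir x200");
  console.log(results.length);
});
```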

Feature Request: cwd option

Hi,

Assuming current cwd is /Users/user/Development/project, crawl("src/target") returns:

withFullPaths(): /Users/user/Development/project/src/target/a/b/c,
withBasePath(): src/target/a/b/c,

It would be very nice to have one of the following:

  • withRelativePath(): a/b/c
  • withRelativePath("src"): target/a/b/c.
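
Until something like this exists, the requested output can probably be approximated by post-processing the results with Node's `path.relative`. `toRelative` below is a hypothetical helper, not an fdir method, and the input mirrors the withBasePath() output from the example above.

```javascript
const path = require("path");

// Hypothetical helper: rewrite crawled paths so they are relative to `base`.
function toRelative(files, base) {
  return files.map((file) => path.relative(base, file));
}

// Shape of what crawl("src/target") returns with withBasePath() in the example above.
const crawled = ["src/target/a/b/c"];

console.log(toRelative(crawled, "src/target")); // [ 'a/b/c' ] on POSIX systems
console.log(toRelative(crawled, "src")); // [ 'target/a/b/c' ] on POSIX systems
```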

Thanks,

Typescript support

Will this package ever have typescript support?

depends on 'fdir'. CommonJS or AMD dependencies can cause optimization bailouts.

Filtering performance improvements

Currently, fdir is the fastest directory crawler in the Node.js world even with filtering/globbing. However, the filtering performance is not up to par with non-filtering performance i.e., the gap is too big.

Current:

Running "Synchronous (2642 files, 330 folders)" suite...

  fdir simple sync:
    283 ops/s, ±0.63%   | fastest

  fdir filter sync:
    278 ops/s, ±0.35%   | 1.77% slower

  fdir glob sync:
    259 ops/s, ±0.33%   | slowest, 8.48% slower

Running "Asynchronous (2642 files, 330 folders)" suite...

  fdir simple async:
    468 ops/s, ±2.32%   | fastest

  fdir filter async:
    428 ops/s, ±2.55%   | 8.55% slower

  fdir glob async:
    378 ops/s, ±2.45%   | slowest, 19.23% slower

Okay, filter performance is ~2-10% slower while glob performance is ~10-20% slower. Relatively speaking, that is quite slow.

So the question is: How do we reduce this performance gap?

  • If we move all the filtering to the end of crawling operation and simply do array.filter over the results, the performance will increase by 2x (I think). However, we might face an issue with grouped output.
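
To make the idea in the bullet above concrete, here is a sketch of filtering once after the crawl, including the grouped case. The sample data is made up, and the `{ dir, files }` group shape is an assumption for illustration. The 2x figure above is a guess, and this sketch does not measure it.

```javascript
// Flat output: a single pass over the finished result.
function filterFlat(paths, accept) {
  return paths.filter(accept);
}

// Grouped output (assumed shape: [{ dir, files }]): filter inside each group.
// Groups that end up empty are kept here; dropping them is a design choice.
function filterGrouped(groups, accept) {
  return groups.map((group) => ({
    dir: group.dir,
    files: group.files.filter(accept),
  }));
}

const isJs = (p) => p.endsWith(".js");

console.log(filterFlat(["src/index.js", "README.md"], isJs));
// [ 'src/index.js' ]

console.log(filterGrouped([{ dir: "src", files: ["index.js", "notes.txt"] }], isJs));
// [ { dir: 'src', files: [ 'index.js' ] } ]
```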

Ideas for mocking

Hi, I'm attempting to convert from globby to fdir, and I'm curious if you have any ideas for a good way to mock fdir returns. Previously, we had:

(globby.sync as jest.Mock).mockReturnValueOnce([]);

But, since fdir uses a fluent api, I can't just mock fdir.sync. Is this something you've run into before? And if so, how did you handle it?
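
One possible approach, sketched here without any test framework, is a chainable stand-in built on `Proxy`: every builder method returns the same object, and only the terminal calls return the canned result. `fluentMock` is a hypothetical helper. Treating `sync` and `withPromise` as the terminal methods follows the readme usage, and how this is wired into a Jest module mock is left open.

```javascript
// Hypothetical test helper: a fake builder whose chain always ends in `result`.
function fluentMock(result) {
  const mock = new Proxy(
    {},
    {
      get(_target, prop) {
        if (prop === "sync") return () => result;
        if (prop === "withPromise") return () => Promise.resolve(result);
        // Keep the mock from looking like a thenable if it is ever awaited.
        if (prop === "then") return undefined;
        // Any other builder method just keeps the chain going.
        return () => mock;
      },
    }
  );
  return mock;
}

const files = fluentMock(["a.js"]).withFullPaths().crawl("any/dir").sync();
console.log(files); // [ 'a.js' ]
```

The trade-off is that such a mock accepts any method name, so a typo in the chain would go unnoticed by the test.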

[Feature] glob/crawl by symlink path, not real path

As noted in #23 (comment), currently globs match against resolved symlink paths, not the symlink paths themselves. This is a bit counter-intuitive, and I think that the symlink paths should be matched and returned instead, perhaps with an option to return real paths instead.

Feature request: Multiple filters joined via AND

I would like to use a chain of multiple .filter() operations, but have them successively refine the results. For example, I would like to find files where:

  1. extension is .html
  2. the path contains /src/
  3. the path contains /trunk/
  4. the filename does not contain test

I would like to implement that via this code:

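const crawler = new fdir()
  .withBasePath()
  // match the .html extension at the end of the path
  .filter(path => path.match(/\.html$/))
  // strings have no contains() method; includes() is the equivalent
  .filter(path => path.includes('/src/'))
  .filter(path => path.includes('/trunk/'))
  .filter(path => !path.includes('test'));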

The current implementation of multi-filters treats this like an OR, which gives me many undesired results.
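
In the meantime, AND semantics can be had by folding the conditions into a single predicate. This is only a sketch: `allOf` is a hypothetical helper, the sample paths are made up, and the resulting function would be passed to one `.filter()` call.

```javascript
const path = require("path");

// Hypothetical helper: a path passes only if every predicate accepts it.
const allOf = (...predicates) => (p) => predicates.every((accept) => accept(p));

const wanted = allOf(
  (p) => p.endsWith(".html"), // 1. extension is .html
  (p) => p.includes("/src/"), // 2. the path contains /src/
  (p) => p.includes("/trunk/"), // 3. the path contains /trunk/
  (p) => !path.basename(p).includes("test") // 4. the filename does not contain test
);

const sample = [
  "/repo/trunk/src/pages/index.html",
  "/repo/trunk/src/pages/index.test.html",
  "/repo/branches/src/pages/about.html",
  "/repo/trunk/src/app.js",
];

console.log(sample.filter(wanted)); // [ '/repo/trunk/src/pages/index.html' ]
```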

Thank you for considering my request!

[suggestion] - Promise support

Hi,
I really appreciate the work you did. Maybe Promise support would allow for more flexible code?

Thanks for your work, Olyno.

picomatch not found on base install

I know you're actively developing this; however, the documentation in the readme doesn't work out of the box.

It requires new fdir(), not fdir.new().

I also noticed that you tried to add an error for when users call glob. However, these checks don't work: fdir tries to load picomatch regardless, which means that a simple npm i fdir followed by const fdir = require("fdir").default; results in Error: Cannot find module 'picomatch'.

Also, good work on this. I would like to suggest an enhancement: case insensitivity :)

Need to list path to symlink instead of resolved path

I have a need to list the path to a symlink itself rather than the resolved path of the symlink.
Currently symlinks are ignored unless withSymLinks() is used, however if you use withSymLinks(), then the resolved path of the symlink is returned instead.
From what I can see it may be possible to list the symlink path in Walker, perhaps with a new option/chain.

How to search for a folder by its name and return the folder path?

How to search for a folder by its name and return the folder path instead of returning all the parent directories?

const files = new fdir()
  .withBasePath()
  // .withDirs()
  .withFullPaths()
  // .glob("F:/CGLibrary/**/*Video*")
  .filter((path) => path.indexOf("Video") != -1)
  .crawl("F:/")
  .sync();

console.log(files);

Compare to native options

I am interested in how much faster it could be using native code. Via child_process or N-API.

Someone already mentioned Rust's https://github.com/jessegrosjean/jwalk. Some other Rust discussion here. Would be cool to have a napi-rs wrapper on it and see the speed diff.

Rust

What’s the fastest way to read a lot of files?

C

JS Glob Libs

JS Matcher Libs

WASI

No implementation exists yet.

Would be good to try with AssemblyScript.

Cli version

Hi. It's "apanzzon" from Reddit ;-)

A fdir-cli version would be interesting.

Due to corona chaos, I currently have very little spare time.

I would avoid using a framework, to avoid dependencies, or else split the CLI out into a secondary repo.

Permission errors don't seem to be handled.

Permission errors don't seem to be handled.

import * as fdir from 'fdir';

fdir.async('/', {}).then(files => {
    console.log(files);
});

TypeError: Cannot read property 'length' of undefined
    at /Users/xo/code/r/node_modules/fdir/index.js:40:39
    at fs.js:153:23
    at FSReqCallback.req.oncomplete (fs.js:778:9)

Changing the affected line to this.

        fs.readdir(dir, readdirOpts, function(_, dirents) {
          console.log({_,dirents});
          for (var j = 0; j < dirents.length; ++j) {

I get the following.

{
  _: [Error: EACCES: permission denied, scandir '//.fseventsd'] {
    errno: -13,
    code: 'EACCES',
    syscall: 'scandir',
    path: '//.fseventsd'
  },
  dirents: undefined
}
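
A minimal sketch of the kind of guard that would avoid the TypeError: check the error argument before touching `dirents`. It uses `fs.readdir` directly, and `readdirSafe` is a hypothetical name. This is not the library's actual fix.

```javascript
const fs = require("fs");

// Read one directory. On any error (EACCES, ENOENT, ...) report an empty
// listing instead of dereferencing `dirents`, which is undefined in that case.
function readdirSafe(dir, callback) {
  fs.readdir(dir, { withFileTypes: true }, (err, dirents) => {
    if (err) return callback([], err);
    callback(dirents, null);
  });
}

readdirSafe("/definitely/not/here", (dirents, err) => {
  // The loop from the stack trace would now simply run zero times.
  console.log(dirents.length, err && err.code); // 0 ENOENT
});
```

Whether an unreadable directory should be skipped silently or reported is a separate design question.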

Export type `Options` from `crawlWithOptions(dir, Options)`

I like to use options like:

import {fdir, Options} from 'fdir'

const opts: Options = {
  includeBasePath: true,
  exclude: p => {
    return p.indexOf('node_modules') > -1
  },
  filters: [p => p.endsWith('package.json')],
}
const files = new fdir().crawlWithOptions(dir, opts).sync()
return files

TS2459: Module '"fdir"' declares 'Options' locally, but it is not exported.

Also, thoughts on runtime validation for this? With zod or something.

async/sync maxDepth broken

const fdir = require('fdir')

fdir.async('.', { maxDepth: 99 }).then((res) => {
  console.log(res, res.length)
})

In a rather large folder structure, the async & sync methods with maxDepth don't work properly:

[
  '/Users/johan/test/test-repos/.DS_Store',
  '/Users/johan/test/test-repos/.gitignore',
  '/Users/johan/test/test-repos/.jscpd.json',
  '/Users/johan/test/test-repos/.mrconfig',
  '/Users/johan/test/test-repos/Jakefile',
  '/Users/johan/test/test-repos/Makefile',
  '/Users/johan/test/test-repos/README.md',
  '/Users/johan/test/test-repos/async.js',
  '/Users/johan/test/test-repos/distribution.txt',
  '/Users/johan/test/test-repos/gamerules.txt',
  '/Users/johan/test/test-repos/games.txt',
  '/Users/johan/test/test-repos/gfw.txt',
  '/Users/johan/test/test-repos/langlib.txt',
  '/Users/johan/test/test-repos/package.json',
  '/Users/johan/test/test-repos/project.txt',
  '/Users/johan/test/test-repos/stats.txt',
  '/Users/johan/test/test-repos/yarn.lock',
  '/Users/johan/test/test-repos/.git/COMMIT_EDITMSG',
  '/Users/johan/test/test-repos/.git/FETCH_HEAD',
  '/Users/johan/test/test-repos/.git/HEAD',
  '/Users/johan/test/test-repos/.git/MERGE_RR',
  '/Users/johan/test/test-repos/.git/ORIG_HEAD',
  '/Users/johan/test/test-repos/.git/config',
  '/Users/johan/test/test-repos/.git/index',
  '/Users/johan/test/test-repos/.git/packed-refs',
  '/Users/johan/test/test-repos/.git/pre-commit',
  '/Users/johan/test/test-repos/.git/smartgit.config',
  '/Users/johan/test/test-repos/node_modules/.yarn-integrity',
  '/Users/johan/test/test-repos/games/.DS_Store',
  '/Users/johan/test/test-repos/games/README.md'
] 30

Without the maxDepth option, fdir properly scans through and finds 204516 files.

OSX Catalina / APFS

Getting only directories

Hi! 👋 I'm shopping around for a fast npm globber to use in a Visual Studio Code extension. The other npm libraries are still too slow for our very large project.

This extension needs only directories, not files. I see the withDirs function, but capturing files seems deep in the shared code:

https://github.com/thecodrr/fdir/blob/master/src/api/shared.js#L43-L44

How hard would it be to make an "onlyDirs()" builder or something? Is this something you think would be a good addition? I didn't see anything in filter either that would allow this to work.

Thanks!

Stream API / Async Iterator API

Hello, first of all thanks for this awesome package!!

The documentation says that

Stream API will be added soon.

Is it already a work in progress? I would really like to see this.

Also, I want to suggest providing an Async Iterator API instead of a stream API. The reason is that an Async Iterator can be easily converted into a readable stream using into-stream without loss of performance (into-stream automatically handles backpressure and all stream quirks), while the opposite conversion is very nontrivial (actually I think it's impossible, since readable streams start filling their internal buffers as soon as they begin flowing and therefore can't be converted to a one-step-at-a-time async iterator).
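
To illustrate the direction, here is a sketch of an async-iterator walker built only on Node's `fs` module, plus the conversion to a stream. It is not fdir code, `walk` and `walkStream` are hypothetical names, and it uses Node's built-in `Readable.from` in place of into-stream.

```javascript
const fs = require("fs");
const path = require("path");
const { Readable } = require("stream");

// Yield file paths one at a time. Subdirectories are only opened when the
// consumer asks for the next item.
async function* walk(dir) {
  for await (const entry of await fs.promises.opendir(dir)) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) yield* walk(full);
    else yield full;
  }
}

// Readable.from turns any async iterable into an object-mode stream.
function walkStream(dir) {
  return Readable.from(walk(dir));
}
```

A consumer can then write `for await (const file of walk(dir)) { ... }`, or pipe `walkStream(dir)` wherever a stream is expected.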

Any slashes should be stripped off of the end before walking.

Notice the double slash in the filename.

import * as fdir from 'fdir';

const go = async () => {
    const files = await fdir.async('/Users/xo/Downloads/', {});
    console.log(files); // ['/Users/xo/Downloads//Download.PDF']
};

go().catch(error => {
    console.error(error);
});
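
A small sketch of the normalization being asked for, using only Node's `path` module. `stripTrailingSeparators` is a hypothetical helper, not an fdir function. It leaves a bare root such as `/` untouched.

```javascript
const path = require("path");

// Strip trailing separators, but never cut into the root ("/" or "C:\").
function stripTrailingSeparators(dir) {
  const rootLength = path.parse(dir).root.length;
  let end = dir.length;
  while (end > rootLength && (dir[end - 1] === "/" || dir[end - 1] === path.sep)) {
    end--;
  }
  return dir.slice(0, end);
}

const base = stripTrailingSeparators("/Users/xo/Downloads/");
console.log(base + "/" + "Download.PDF"); // /Users/xo/Downloads/Download.PDF
```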

Adjust readme a bit

I have a pretty thick skin, and I like that you're pushing the envelope with performance here. You seem like a really smart guy, but the use of words like "shit" comes off as a bit crass. Just my opinion, take it or leave it 😄.

Globbing doesn't make sense

Currently the API allows a .glob(...patterns) method in the Builder alongside the crawl one. All the globbing libraries (fast-glob & glob) that I have used do not allow the user to specify the source directory.

Maybe glob and crawl should be at an equal API level.

Exclude function needs better docs or changes

https://github.com/thecodrr/fdir/blob/master/documentation.md#excludefunction

The docs seem to suggest that you only match on the deepest folder name, but that isn't the case: the function seems to be handed the complete path to match on. What it receives also seems to change depending on other parameters.

const elmJsonGlob = `${globUri}/**/elm.json`;

let x = new fdir()
  .glob(elmJsonGlob)
  .exclude(
    (dir) =>
      dir.startsWith(".") ||
      dir === "node_modules" ||
      dir === "elm-stuff",
  )
  .crawl(".")
  .sync();

This will lead to dir in the exclude function being something like ./path/.git/file

const elmJsonGlob = `${globUri}/**/elm.json`;

let x = new fdir()
  .glob(elmJsonGlob)
  .exclude(
    (dir) =>
      dir.startsWith(".") ||
      dir === "node_modules" ||
      dir === "elm-stuff",
  )
  .withFullPaths()
  .crawl(".")
  .sync();

This will lead to dir in the exclude function being something like home/razze/Development/elm-pages-starter/node_modules/chalk on Linux.

I feel like that would at least need to be pointed out.
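
Until the docs settle this, one defensive option is to write the predicate so it does not care which form it receives, by reducing the argument to its last segment first. This is a sketch under that assumption, and `shouldExclude` is a hypothetical name.

```javascript
const path = require("path");

const IGNORED = new Set(["node_modules", "elm-stuff"]);

// Works for a bare directory name, a relative path, or a full path, with or
// without a trailing slash, because only the last segment is inspected.
function shouldExclude(dir) {
  const name = path.basename(dir);
  if (name === "." || name === "..") return false; // never exclude the crawl root
  return name.startsWith(".") || IGNORED.has(name);
}

console.log(shouldExclude("node_modules")); // true
console.log(shouldExclude("./path/.git/")); // true
console.log(shouldExclude("/home/user/project/src")); // false
```

The result could then be passed as `.exclude(shouldExclude)`.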

What do I need to migrate from glob to fdir

Hi. I'm an active user of glob and 92% faster seems neat.

In order to switch, I need to know that my existing patterns will keep working. Could you please add a section near the top of the readme explaining how the two libraries' pattern syntaxes line up?

Some of the fuss between glob and its competitors arose when glob chose to remove features to come into line with other standards. I'd like to know where fdir stands on those topics, and if they line up, I will probably switch.

Thanks for listening

Feature request: pass full path to `exclude` function

Currently, the excludeFn function I pass to crawler.exclude(excludeFn) receives only the basename of the directory as a parameter. I would like to decide whether to exclude a directory based on its full path. I propose also passing the path as a second parameter, so that it is not a breaking change.

fdir is not a constructor

Hi there!

I'm getting this error while trying to create a new builder. Any ideas what's going on?

Thanks
