terkelg / tiny-glob

Super tiny and ~350% faster alternative to node-glob

License: MIT License

JavaScript 100.00%
glob globbing glob-pattern filesystem patterns pattern-matching expansion wildcard

tiny-glob's Introduction

Tiny Glob

Tiny and extremely fast library to match files and folders using glob patterns.


"Globs" is the common name for a specific type of pattern used to match files and folders. It's the patterns you type when you do stuff like ls *.js in your shell or put src/* in a .gitignore file. When used to match filenames, it's sometimes called a "wildcard".

Install

npm install tiny-glob

Core Features

  • 🔥 extremely fast: ~350% faster than node-glob and ~230% faster than fast-glob
  • 💪 powerful: supports advanced globbing patterns (ExtGlob)
  • 📦 tiny: only ~45 LOC with 2 small dependencies
  • 👫 friendly: simple and easy to use API
  • 🎭 cross-platform: supports both Unix and Windows

Usage

const glob = require('tiny-glob');

(async function(){
    let files = await glob('src/*/*.{js,md}');
    // => [ ... ] array of matching files
})();

API

glob(str, options)

Type: function
Returns: Array

Returns an array of matching files and folders. This function is async and returns a promise.

str

Type: String

The glob pattern to match against.

Note: Please only use forward slashes in glob expressions, even on Windows.

options.cwd

Type: String
Default: '.'

Change default working directory.

options.dot

Type: Boolean
Default: false

Allow patterns to match filenames or directories that begin with a period (.).

options.absolute

Type: Boolean
Default: false

Return matches as absolute paths.

options.filesOnly

Type: Boolean
Default: false

Skip directories and return matched files only.

options.flush

Type: Boolean
Default: false

Flush the internal cache object.

Windows

Though Windows may use /, \, or \\ as path separators, you can only use forward-slashes (/) when specifying glob expressions. Any back-slashes (\) will be interpreted as escape characters instead of path separators.

This is common across many glob-based modules; see node-glob for corroboration.
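
As a small illustration of the rule above, a native Windows path can be converted to forward slashes before it is used in a glob expression. The helper name `toGlobPattern` is hypothetical and not part of tiny-glob; the sketch relies only on Node's built-in `path` module and assumes the input contains no backslashes that are meant as escapes.

```javascript
const path = require('path');

// Hypothetical helper (not part of tiny-glob): turn Windows-style
// separators into the forward slashes that glob expressions expect.
function toGlobPattern(nativePath) {
  return nativePath.split(path.win32.sep).join(path.posix.sep);
}

console.log(toGlobPattern('src\\lib\\*.js')); // src/lib/*.js
```

A pattern that already uses forward slashes passes through unchanged.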

Benchmarks

glob x 13,405 ops/sec ±1.80% (85 runs sampled)
fast-glob x 25,745 ops/sec ±2.76% (59 runs sampled)
tiny-glob x 102,658 ops/sec ±0.79% (91 runs sampled)
Fastest is tiny-glob
┌───────────┬─────────────────────────┬─────────────┬────────────────┐
│ Name      │ Mean time               │ Ops/sec     │ Diff           │
├───────────┼─────────────────────────┼─────────────┼────────────────┤
│ glob      │ 0.00007459990597268128  │ 13,404.843  │ N/A            │
├───────────┼─────────────────────────┼─────────────┼────────────────┤
│ fast-glob │ 0.000038842529587611705 │ 25,744.976  │ 92.06% faster  │
├───────────┼─────────────────────────┼─────────────┼────────────────┤
│ tiny-glob │ 0.00000974110141018254  │ 102,657.796 │ 298.75% faster │
└───────────┴─────────────────────────┴─────────────┴────────────────┘

Advanced Globbing

Learn more about advanced globbing

License

MIT © Terkel Gjervig

tiny-glob's People

Contributors

andarist, armano2, benmccann, forsakenharmony, kuitos, lukeed, mrmlnc, terkelg, tscpp

tiny-glob's Issues

Future plans?

Hi @terkelg I was wondering if you have any future plans in the works?

Just curious.

I am thinking it could be really nice to use this library in Prettier, they seem to have struggled with the other glob libraries.

Question: Invalidate cache

I thought of this on my morning run. Right now the cache is only being added to. Objects are never removed. So if you have a really long running process and files get deleted those changes are not reflected in tiny-glob.

File name includes part of file Path

Code:

const val = 'd:\current\PROJECT_ROOT\data\sample'
const validImageFormats = ['jpg', 'jpeg', 'png']

tinyGlob(`${val}/**/*.{${validImageFormats.join(',')}}`, {
  filesOnly: true
})
  .then(files => {
    console.log(files) // Should contain only file names :(
  })
  .catch(err => {
    console.log(err)
  })

Expected Output:
files = ['blah.jpg', 'nope.jpeg', 'yo.png']

Actual Output:
files = ['data\sample\blah.jpg', 'data\sample\nope.jpeg', 'data\sample\yo.png']

Environment Info:
Windows 10
Electron.js with Node.js version 8.9.2

Additional Info:
Tests also fail on Windows 10.
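
If only the file names are wanted, one possible post-processing step is to reduce each returned path to its base name. This is a sketch built on Node's `path` module, not a tiny-glob option; `toFileNames` is a hypothetical name.

```javascript
const path = require('path');

// Hypothetical helper: reduce each result to its file name.
// path.win32.basename treats both '\' and '/' as separators, so it
// handles the Windows-style paths reported above.
function toFileNames(files) {
  return files.map((file) => path.win32.basename(file));
}

console.log(toFileNames(['data\\sample\\blah.jpg', 'data\\sample\\yo.png']));
```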

Invalid regular expression: /^(([^\\]*)|\.$/: Unterminated group

Environment

  • Windows 10
  • Node.js 12.3.1
  • tiny-glob 0.2.6
  • globrex 0.1.2

Actual behaviour

const tg = require('tiny-glob/sync');

const entries = tg('{*,./package.json,package.json}');

console.dir(entries, { colors: true });

D:\OpenSource\fast-glob\node_modules\globrex\index.js:42
                    path.segments.push(new RegExp(segment, flags));
                                       ^

SyntaxError: Invalid regular expression: /^(([^\\]*)|\.$/: Unterminated group
    at new RegExp (<anonymous>)
    at add (D:\OpenSource\fast-glob\node_modules\globrex\index.js:42:40)
    at globrex (D:\OpenSource\fast-glob\node_modules\globrex\index.js:62:13)
    at module.exports (D:\OpenSource\fast-glob\node_modules\tiny-glob\sync.js:75:20)
    at Object.<anonymous> (D:\OpenSource\fast-glob\test.js:3:17)
    at Module._compile (internal/modules/cjs/loader.js:774:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:785:10)
    at Module.load (internal/modules/cjs/loader.js:641:32)
    at Function.Module._load (internal/modules/cjs/loader.js:556:12)
    at Function.Module.runMain (internal/modules/cjs/loader.js:837:10)

Expected behaviour

I see entries.

[Feature request] Synchronous support

Is it possible to execute tiny-glob synchronously?

My specific use case is in a rollup.config.js file, I need to execute tiny-glob to retrieve a list of file paths, and then use those paths to export a config object. E.g.:

import glob from "tiny-glob";

const components = await glob("src/components/**/*.component.js").then((paths) =>
  paths.map((path) => path.match(/.*[\\\/](.*?)\.component\.js/)[1])
);

export default components.map((component) => ({
  ...
}));

But Rollup does not play nice with the await keyword being used.

Is there a way to run tiny-glob synchronously? and if not, would this feature be considered for inclusion?
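
Independently of how the paths are obtained, the name extraction in the snippet above can be sketched as a small standalone function. `componentName` is a hypothetical helper; it escapes the dots so they match literally and accepts either path separator.

```javascript
// Hypothetical helper: pull "button" out of ".../button.component.js".
// Returns null instead of throwing when the path does not match.
function componentName(filePath) {
  const match = /([^\\/]+)\.component\.js$/.exec(filePath);
  return match ? match[1] : null;
}

console.log(componentName('src/components/button/button.component.js'));
```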

New release?

Now that #44 has been merged could you publish a new release? I'm currently installing from master and would like to be on a version number before I move to production.

[DEP0128] DeprecationWarning

Using non-relative paths in the 'main' field in 'package.json' is deprecated.

(node:16276) [DEP0128] DeprecationWarning: Invalid 'main' field in 'C:\Users\elias\Projects\kolint\node_modules\tiny-glob\package.json' of 'src/index.js'. Please either fix that or report it to the module author.

Type definitions for sync.js

There's an index.d.ts for index.js. There should probably be a sync.d.ts for sync.js as well, I believe. I think the only thing that would need to change is to remove the Promise from the return type of glob.

Incorrect result for directories with dot entries

Versions

  • Windows 10 (18362)
  • Node.js 12.0.0

Problem

The tiny-glob package returns results without respecting the dot option.

Reproduce steps

directory/
  - .git
  - .editorconfig
  - .gitignore
  - package.json

const tg = require('tiny-glob');

(async () => {
	const entries = await tg('*');

	console.dir(entries, { colors: true });
})();

Actual behaviour

⚠️ The .editorconfig file excluded as expected, but not .git and .gitignore. The dot option with true as value returns .editorconfig as expected.

[
	'.git',
	'.gitignore',
	'package.json'
]

Expected behaviour

[
	'package.json'
]
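
Until the behaviour is consistent, a caller could filter dot entries out of the results. The sketch below is a workaround on the caller's side, not a description of how tiny-glob handles the dot option; `withoutDotEntries` is a hypothetical name.

```javascript
// Hypothetical post-filter: drop results whose last path segment
// begins with a period. Accepts '/' and '\' as separators.
function withoutDotEntries(entries) {
  return entries.filter((entry) => {
    const name = entry.split(/[\\/]/).pop();
    return !name.startsWith('.');
  });
}

console.log(withoutDotEntries(['.git', '.gitignore', 'package.json']));
```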

Any glob without any globs returns forward slashes on Windows

Usually and as per the documentation, input globs use forward slashes and the result paths use the native file separator.

But here are examples of globs on Windows which return forward slashes (a bug):

glob('C:/') //=> ['C:/']
glob('C:/Users') //=> ['C:/Users']
glob('/') //=> ['/']
glob('/Users') //=> ['/Users']
glob('./././') //=> ['./././']

Basically any glob which has no operators inside such as * or {} just returns the string as-is without any post-processing, even when it contains the wrong path separator. Note that the path must exist for it to not return zero results.

The result I expect is the same path I entered (not resolved or normalized or anything) just with the native path separator.
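
The expected post-processing can be sketched as a plain separator swap that neither resolves nor normalizes the path. `toNativeSeparators` is a hypothetical helper; the separator is a parameter so the Windows case can be shown on any platform.

```javascript
const path = require('path');

// Hypothetical helper: replace forward slashes with the given separator
// (the platform's own by default) and leave everything else untouched.
function toNativeSeparators(result, separator = path.sep) {
  return result.split('/').join(separator);
}

console.log(toNativeSeparators('C:/Users', path.win32.sep)); // C:\Users
```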

Arrays of globs with negation

Currently, tiny-glob takes a single glob string.

It would be nice if it could also take an array of glob strings, where negations apply to previous items in the array.

For example, given the files:

src/foo.js
src/bar.js
src/test/baz.js

then the invocation:

await glob([
  'src/**/*.js',
  '!test'
]);

would return:

src/foo.js
src/bar.js

This syntax is supported by fast-glob (ref), and is common in applications such as .gitignore files, and the package.json files array.
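
One way to read the proposal is as a filter applied to the combined results of the positive patterns. The sketch below assumes that a negation such as `!test` means "drop any path that has a segment named test", which matches the example above but is only one possible interpretation; `applyNegations` is a hypothetical name and not an existing tiny-glob function.

```javascript
// Hypothetical filter: `paths` are the matches of the positive patterns,
// `patterns` is the full pattern array including '!'-prefixed entries.
function applyNegations(paths, patterns) {
  const excluded = patterns
    .filter((pattern) => pattern.startsWith('!'))
    .map((pattern) => pattern.slice(1));
  return paths.filter((file) => {
    const segments = file.split(/[\\/]/);
    return !excluded.some((name) => segments.includes(name));
  });
}

const matches = ['src/foo.js', 'src/bar.js', 'src/test/baz.js'];
console.log(applyNegations(matches, ['src/**/*.js', '!test']));
```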

Expose CACHE object

Hi @terkelg, I like tiny-glob. Having read issue #7, I have the same request: I'm performing fs.stat on the results, which could be avoided.

I hope this gets added, perhaps as a getter like glob.cache.

Thanks.

Some files are missing when doing glob

I have this code:

const searchPattern = '{post,page}/**/*.{md}'
glob(searchPattern, {
  cwd: content,
  absolute: true
})
  .then(files => {
    console.log(files.length)
    callback(null, files)
  })
  .catch(err => callback(err))

And I would expect the files for {post,page}/**/*.{md} = {post}/**/*.{md} + {page}/**/*.{md}, but it's not in my case.

I have 2 files for {page}/**/*.{md}, and 53 files for {post}/**/*.{md}, but only 33 for {post,page}/**/*.{md}. Am I doing something wrong here? The searchPattern just works fine under node-glob, fast-glob.
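
The expectation that a brace pattern equals the union of its alternatives can be made concrete by expanding the braces by hand. This is a sketch of plain brace expansion for non-nested groups, not tiny-glob's or globrex's implementation; `expandBraces` is a hypothetical name. Each expanded pattern could then be globbed separately and the results merged.

```javascript
// Hypothetical helper: expand non-nested {a,b} groups into separate patterns.
function expandBraces(pattern) {
  const match = /\{([^{}]*)\}/.exec(pattern);
  if (!match) return [pattern];
  const before = pattern.slice(0, match.index);
  const after = pattern.slice(match.index + match[0].length);
  // Substitute each alternative, then expand any remaining groups.
  return match[1]
    .split(',')
    .flatMap((alternative) => expandBraces(before + alternative + after));
}

console.log(expandBraces('{post,page}/**/*.{md}'));
```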

Square brackets in directory name returns no results

  • Folder setup:
    • folder[1-3]
      • sub-folder-1
        • ...files
      • sub-folder-2
        • ...files
  • Input: path/to/folder[1-3]/*/
  • Call: glob(input, { absolute: true })

I get no results (empty array).

If I remove the "[1-3]" from the input folder name, I get the expected results (the two sub-folders).
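
The brackets in the folder name are presumably being read as a character class. A common approach in glob libraries is to escape such characters in the literal part of the path. The sketch below shows only the escaping step; whether tiny-glob honours the escapes for brackets is an assumption that has not been verified here, and `escapeGlob` is a hypothetical name.

```javascript
// Hypothetical helper: prefix glob metacharacters with a backslash so a
// literal folder name is not interpreted as a pattern.
function escapeGlob(literal) {
  return literal.replace(/[\\*?[\]{}()!+@|]/g, '\\$&');
}

// Escape only the literal directory, then append the real pattern.
const input = 'path/to/' + escapeGlob('folder[1-3]') + '/*/';
console.log(input);
```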

Unterminated group error for "{*,sub/*}.html"

const glob = require( "tiny-glob/sync" );

const files = glob( "{*,reporter-html/*}.html", {
	cwd: __dirname + "/../",
	filesOnly: true
} )

With an input like:

  • a.html
  • reporter-html/b.html
  • other/c.html

SyntaxError: Invalid regular expression: /^(([^/]*)|reporter-html$/: Unterminated group
    at new RegExp (<anonymous>)
    at add (/qunit/node_modules/globrex/index.js:42:40)
    at globrex (/qunit/node_modules/globrex/index.js:62:13)
    at module.exports (/qunit/node_modules/tiny-glob/sync.js:75:20)
    at Object.QUnit.module (/qunit/test/cli/structure.js:57:17)
    at processModule (/qunit/qunit/qunit.js:1325:40)
    at Object.module$1 [as module] (/qunit/qunit/qunit.js:1361:5)
    at Object.QUnit.module (/qunit/test/cli/structure.js:53:8)
    at processModule (/qunit/qunit/qunit.js:1325:40)
    at Object.module$1 [as module] (/qunit/qunit/qunit.js:1361:5)

I tried the following (unsuccessful) workaround:

glob( "{*.html,reporter-html/*.html}" );

// SyntaxError: Invalid regular expression: /^(([^/]*)\.html|reporter-html$/: Unterminated group

Support for double asterisk matches ** like bash globbing

It is not clear from the README whether this library supports ** in matching patterns. That feature is extremely important for some use cases because foo/*/*.yml is expected to match only one folder deep, while foo/**/*.yml is expected to match recursively.

Basically, ** includes the directory separator while a single * does not. Is this supported? If not, it should be stated, as it is one of the most popular globbing features.

Apparently the https://github.com/microsoft/vscode-json-languageservice/blob/main/src/utils/glob.ts implementation supports that.

Reference: https://stackoverflow.com/questions/32604656/what-is-the-glob-character
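
The difference between the two wildcards can be sketched with a tiny glob-to-regex translation. This is an illustration of the bash-style semantics described above, handling only `*` and `**`; it is not the code tiny-glob or globrex uses, and `globToRegExp` is a hypothetical name.

```javascript
// Hypothetical translator: '*' stays inside one path segment,
// '**/' spans zero or more directories.
function globToRegExp(glob) {
  let source = '';
  for (let i = 0; i < glob.length; i++) {
    const ch = glob[i];
    if (ch !== '*') {
      // Escape regex metacharacters in literal text.
      source += ch.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
    } else if (glob[i + 1] !== '*') {
      source += '[^/]*'; // single star: anything except a separator
    } else if (glob[i + 2] === '/') {
      source += '(?:.*/)?'; // '**/': any number of directories
      i += 2;
    } else {
      source += '.*'; // trailing '**': anything, separators included
      i += 1;
    }
  }
  return new RegExp('^' + source + '$');
}

console.log(globToRegExp('foo/*/*.yml').test('foo/a/b/c.yml'));  // false
console.log(globToRegExp('foo/**/*.yml').test('foo/a/b/c.yml')); // true
```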

Iterator Support for large globs

Hi

Consider a method where the return object could be an Iterator or AsyncIterator so that large file globs (as in huge numbers of files) are supported.
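
To make the request concrete, a lazy traversal can be written as a generator so that callers consume one path at a time instead of holding the full array. This is a sketch using Node's `fs` module and is not an existing tiny-glob API; `walkFiles` is a hypothetical name, and pattern matching is left out.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical lazy walker: yields file paths one by one, so a consumer
// can stop early or process huge trees without building a large array.
function* walkFiles(dir) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      yield* walkFiles(fullPath);
    } else {
      yield fullPath;
    }
  }
}
```

A caller would iterate with `for (const file of walkFiles('src')) { ... }` and could apply a matcher to each yielded path.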

ESM version of tiny-glob?

Hello,

Thanks for making such a great library! With the Node team converting more and more to ESM, and other maintainers doing the same, I'd love to see this library upgraded to include an ESM build along with CommonJS.

Is this something the team would be interested in adding alongside CJS? Would you be interested in a PR that shipped both?

support follow symlinks on directories

currently tiny-glob does not find files in symlinked directories

sample:

#!/bin/bash

mkdir dir-1
touch dir-1/file-1
ln -s dir-1 dir-1-linked

# run tiny-glob
node -e "console.log(require('tiny-glob/sync.js')('dir*/**').join('\n'))"

# actual result
dir-1/file-1

# expected result
dir-1/file-1
dir-1-linked/file-1

the problem should be

if (!stats.isDirectory()) {

which does not detect symlinked directories
where stats.isSymbolicLink() == true

fs.realpathSync(path) can be used to fully resolve the link target
but that returns the absolute path ....

other glob tools have this feature normally off
and only enabled with a follow: true option
one reason: circular symlinks give infinite recursion
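
The points above can be sketched as a walker with an opt-in follow flag and a guard against circular links. This is an illustration built on Node's `fs` module, not tiny-glob's code; `statEntry` and `walkFollowingLinks` are hypothetical names.

```javascript
const fs = require('fs');
const path = require('path');

// stat() follows symlinks, lstat() does not. Fall back to lstat() so a
// dangling link is reported as a plain entry instead of throwing.
function statEntry(fullPath, follow) {
  if (!follow) return fs.lstatSync(fullPath);
  try {
    return fs.statSync(fullPath);
  } catch (err) {
    return fs.lstatSync(fullPath);
  }
}

// `ancestors` holds the real paths of the directories currently being
// descended, so a link that points back up the tree is skipped.
function* walkFollowingLinks(dir, follow = false, ancestors = new Set()) {
  const real = fs.realpathSync(dir);
  if (ancestors.has(real)) return; // circular symlink: stop here
  const chain = new Set(ancestors).add(real);
  for (const name of fs.readdirSync(dir)) {
    const fullPath = path.join(dir, name);
    if (statEntry(fullPath, follow).isDirectory()) {
      yield* walkFollowingLinks(fullPath, follow, chain);
    } else {
      yield fullPath; // with follow = false a linked directory lands here
    }
  }
}
```

Tracking only the ancestor chain, rather than every visited directory, lets the same target be listed under both its real name and its link name while still preventing infinite recursion.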

back and forward slashes issue

Hello,

In your description you say "Though Windows may use /, \, or \\ as path separators, you can only use forward-slashes (/) when specifying glob expressions. Any back-slashes (\) will be interpreted as escape characters instead of path separators."

Tests I ran suggest it should be the opposite. I am using Windows 7.

I wanted to list all files except some, so I used forward slashes as per your suggestion '!(**.jpg|css/style.css)'
This kept throwing an error and failing. When I modified it to use back-slashes instead '!(**.jpg|css\\style.css)' it worked flawlessly.

Very slow in a directory with nested directories and one-level pattern

Versions

  • Windows 10 (17763)
  • NVMe (Samsung MZVLB512HAJQ)
  • Node.js 12.0.0

A directory tree

directory
  - file.txt
  - node_modules

Reproduce steps

const tg = require('tiny-glob/sync');

console.time('tiny-glob');
const entries = tg('directory/*');
console.timeEnd('tiny-glob');

console.dir(entries, { colors: true });

Current behaviour

The walk function is called for each directory inside the root directory. And it takes 256ms.

Expected behaviour

A single walk function call. And… it takes around 2-5ms.
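
One way to avoid descending into directories that cannot contain a match is to derive a depth limit from the pattern. The sketch below is an idea for such a limit, not tiny-glob's actual logic; `maxMatchDepth` is a hypothetical name, and it assumes braces in the pattern do not contain slashes.

```javascript
// Hypothetical helper: the largest number of path segments a match can
// have. A pattern containing the '**' globstar can match at any depth.
function maxMatchDepth(pattern) {
  const segments = pattern.split('/').filter(Boolean);
  const hasGlobstar = segments.some((segment) => segment.includes('**'));
  return hasGlobstar ? Infinity : segments.length;
}

console.log(maxMatchDepth('directory/*'));  // 2
console.log(maxMatchDepth('src/**/*.js'));  // Infinity
```

With a limit of 2 for directory/*, a walker would read directory once and never enter node_modules.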

how do you do case insensitive glob?

Is it possible to do a case-insensitive match?

I'm trying to match files with a jpg extension on Windows, but some cameras name all files in capital letters (e.g. DC1002.JPG, DCS1000.JPEG), so the glob does not match them. It would be good if I could just use My Pictures/**/*.{jpg,png,jpeg} instead of My Pictures/**/*.{jpg,JPG,png,PNG,jpeg,JPEG} plus whatever uppercase and lowercase variants the different cameras (brands, models, etc.) use.
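
A possible workaround is to glob broadly and then filter the results by extension without regard to case. This is a sketch on the caller's side using Node's `path` module, not a tiny-glob option; `hasExtension` is a hypothetical name.

```javascript
const path = require('path');

// Hypothetical filter: compare the lower-cased extension of a file
// against a list of lower-case extensions given without the dot.
function hasExtension(file, extensions) {
  const extension = path.extname(file).slice(1).toLowerCase();
  return extensions.includes(extension);
}

const wanted = ['jpg', 'png', 'jpeg'];
console.log(['DC1002.JPG', 'DCS1000.JPEG', 'notes.txt'].filter((f) => hasExtension(f, wanted)));
```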

Absolute paths provided as the source are changed and error out

Hello!

Big fan of the speed gains here, but hit a snag when I was migrating a project away from fast-glob. Due to how tiny-glob defaults to . as the cwd if you don't pass anything in, it causes the path.join call within walk to accidentally break the absolute path that was provided as the src.

In other words, when I pass in something like:

/hello/absolute/path/here/

It gets turned into this by that join due to that . being prepended to the file path:

hello/absolute/path/here/

Which then throws an error, because that path does not exist.

Not sure how major of a change it'd be, but it may be worth considering hitting a provided source with path.isAbsolute and skip the cwd prepend if it returns true?

I didn't check node-glob, but it does appear to be how fast-glob approaches this situation.

(Somewhat related, but fast-glob also uses process.cwd() instead of defaulting to the string dot . for cwd, which may also be a bit more resilient to any other weird interpretations of that pathing.)
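
The suggested check can be sketched in a few lines. The first call shows the join behaviour described above; `resolveSource` is a hypothetical helper, not the code in tiny-glob, and `path.posix` is used so the result is the same on every platform.

```javascript
const path = require('path');

// Joining '.' with an absolute path drops the leading slash.
console.log(path.posix.join('.', '/hello/absolute/path/here/'));
// hello/absolute/path/here/

// Hypothetical fix: only prepend cwd when the source is relative.
function resolveSource(cwd, src) {
  return path.posix.isAbsolute(src) ? src : path.posix.join(cwd, src);
}

console.log(resolveSource('.', '/hello/absolute/path/here/'));
// /hello/absolute/path/here/
```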

Happy to take a swing at this as well if you're interested. Thank you!

Not working for typescript

After updating to the latest version, the TypeScript compiler throws an error on:

import * as glob from 'tiny-glob'

But this works fine in older versions.

Doesn't match files without glob

When the cwd option is set tiny-glob will only match patterns that have a glob in it. Without the cwd option everything works as intended.

src/index.js

const glob = require('tiny-glob');

(async function(){
    let withGlob = await glob('*.js', { cwd: 'src' });
    let withoutGlob = await glob('index.js', { cwd: 'src' });
    console.log(withGlob, withoutGlob)
})();

The above logs [ 'index.js' ] []

File paths are always relative to the current directory unlike folders

Reproduction:

Create the following file structure:

a/
    b/
    c.txt

And run the following testcases inside the a/ folder:

glob('../a/*') //=> ['../a/b', 'c.txt']
glob('b/../*') //=> ['b', 'c.txt']
glob('/absolute/path/to/a/*') //=> ['/absolute/path/to/a/b', 'c.txt']

The paths to files are always relative to the current directory, while the paths to folders are somewhat faithful to the glob input.

I think the correct form is to stay faithful to the input, but whatever the choice is the inconsistencies are a problem.

I also think it's weird that even for folders b/.. resolves to nothing while ../a stays as-is, but it kinda makes sense since for the latter you need to know the current folder is a.
