
Comments (10)

chenxsan commented on August 23, 2024

@terkelg I thought it would be difficult for you to understand my situation, so I created a new repo here: https://github.com/chenxsan/tiny-glob-demo.

Please run npm install to install the deps, then run node index.js to check the results.

from tiny-glob.

chenxsan commented on August 23, 2024

Here's the line causing the problem: https://github.com/terkelg/tiny-glob/blob/master/index.js#L34. If I remove it, everything works as expected in my case, and no tests fail.


chenxsan commented on August 23, 2024

I just put up a failing test here: chenxsan@18f54bc


terkelg commented on August 23, 2024

Hi Sam! Thanks for having a look at the issue. Does it affect the benchmarks when you remove that line? It could also be a problem with the regex coming from globrex. Can you print the regex and the file and test them?

The idea is that every path segment (i.e. directory name) is checked before tiny-glob starts crawling that directory. When globrex converts a glob, it also breaks it into smaller regex segments, one for each folder/path segment. This lets tiny-glob check each directory and avoid spending time crawling folders that can never produce a match. I suspect the regex for the glob {post,page} could be wrong.
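
The pruning idea described above can be sketched roughly as follows. This is not tiny-glob's actual implementation: the in-memory `tree`, the `walk` helper, and `globstarIndex` are hypothetical names used only for illustration, and the regexes are the ones reported later in this thread for a `{post,page}/**/*.md` style glob. The sketch also assumes the globstar is the last directory segment, so every deeper level stays on it.

```javascript
// Hypothetical in-memory tree: objects are directories, null marks a file.
const tree = {
  draft: { 'wip.md': null },
  post: { 'a.md': null, 'firefox-os': { 'index.md': null } },
  page: { about: { 'index.md': null } },
};

// Regexes copied from the lexer output posted later in this thread.
const full = /^(post|page)\/((?:[^\/]*(?:\/|$))*)([^\/]*)\.md$/;
const segments = [/^(post|page)$/, /^((?:[^\/]*(?:\/|$))*)$/, /^([^\/]*)\.md$/];
const globstarIndex = 1; // assumption: position of the ** segment

function walk(node, prefix, level, out) {
  for (const [name, child] of Object.entries(node)) {
    const rel = prefix ? `${prefix}/${name}` : name;
    if (child === null) {
      // Files are matched against the full regex.
      if (full.test(rel)) out.push(rel);
      continue;
    }
    // Directories are checked against the segment for their depth;
    // once the globstar is reached, deeper levels keep using it.
    const rgx = segments[Math.min(level, globstarIndex)];
    if (!rgx.test(name)) continue; // prune: never crawl this directory
    walk(child, rel, level + 1, out);
  }
  return out;
}

const matches = walk(tree, '', 0, []);
console.log(matches);
```

In this sketch `draft` is skipped without being crawled, while directories nested under `post` and `page` are still visited.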


chenxsan commented on August 23, 2024

Here's the benchmark result after I removed that line:

glob x 13,438 ops/sec ±3.05% (83 runs sampled)
fast-glob x 25,485 ops/sec ±5.20% (76 runs sampled)
tiny-glob x 55,162 ops/sec ±6.97% (55 runs sampled)
Fastest is tiny-glob
┌───────────┬─────────────────────────┬────────────┬────────────────┐
│ Name      │ Mean time               │ Ops/sec    │ Diff           │
├───────────┼─────────────────────────┼────────────┼────────────────┤
│ glob      │ 0.00007441320916659413  │ 13,438.474 │ N/A            │
├───────────┼─────────────────────────┼────────────┼────────────────┤
│ fast-glob │ 0.00003923935167426461  │ 25,484.621 │ 89.64% faster  │
├───────────┼─────────────────────────┼────────────┼────────────────┤
│ tiny-glob │ 0.000018128419620750812 │ 55,162.006 │ 116.45% faster │
└───────────┴─────────────────────────┴────────────┴────────────────┘


chenxsan commented on August 23, 2024

Here's the lexer variable:

{ regex: /^(post|page)\/((?:[^\/]*(?:\/|$))*)([^\/]*)\.md$/,
  segments: [ /^(post|page)$/, /^((?:[^\/]*(?:\/|$))*)$/, /^([^\/]*)\.md$/ ],
  globstar: '/^((?:[^\\/]*(?:\\/|$))*)$/' }
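
As a quick sanity check of the full regex on its own, it can be tested against a few paths. The paths below are hypothetical examples, not the actual contents of the repo; only the regex is copied from the output above.

```javascript
// Full regex copied from the lexer output above.
const regex = /^(post|page)\/((?:[^\/]*(?:\/|$))*)([^\/]*)\.md$/;

// Hypothetical candidate paths, chosen only to exercise the regex.
const candidates = [
  'post/hello.md',
  'post/firefox-os/index.md',
  'page/about/index.md',
  'draft/todo.md',
  'post/firefox-os/cover.png',
];

const matched = candidates.filter((p) => regex.test(p));
console.log(matched);
// → [ 'post/hello.md', 'post/firefox-os/index.md', 'page/about/index.md' ]
```

If nested paths match here, the full regex itself accepts them, which would point away from the regex and toward the directory check that happens during crawling.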

And part of my directory structure:

$ tree ./post

[image: output of the tree command above]

All the .md files directly under post are included, while those inside subdirectories of post are filtered out.

So I added a console.log(rgx, file) right before if (rgx && !rgx.test(file)) continue;. Here's the printed result:

/^(post|page)$/ 'draft'
/^(post|page)$/ 'page'
/^((?:[^\/]*(?:\/|$))*)$/ 'about'
/^(post|page)$/ 'post'
/^([^\/]*)\.md$/ 'firefox-os'
/^([^\/]*)\.md$/ 'github-pages-custom-domain'
/^([^\/]*)\.md$/ 'markdown-and-table'
/^([^\/]*)\.md$/ 'srcset and sizes'
/^([^\/]*)\.md$/ 'telegram-scam-bitcoin'
...
...

Those are folders directly under post, so they should be walked into too. So the problem here might originate from the level value?
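
To make the suspicion concrete: a directory directly under post sits at the depth of the globstar segment in the lexer output above, yet the log shows it being tested against the .md segment. A minimal check of one directory name from the log against each segment (the variable names are only illustrative):

```javascript
// Segment regexes copied from the lexer output above.
const segments = [
  /^(post|page)$/,              // level 0: top-level directory
  /^((?:[^\/]*(?:\/|$))*)$/,    // level 1: globstar
  /^([^\/]*)\.md$/,             // level 2: file name
];

const name = 'firefox-os'; // a directory directly under post, from the log

const results = segments.map((rgx) => rgx.test(name));
console.log(results);
// → [ false, true, false ]
```

Only the globstar segment accepts the name, so the directory is skipped whenever it is checked against the .md segment instead.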


terkelg commented on August 23, 2024

What's going on with the directory names? Can you post the non-escaped strings for some of them?


chenxsan commented on August 23, 2024

Those are Chinese characters.

[image]

But I don't think it matters. Folders with English names like firefox-os are ignored by the glob too.


terkelg commented on August 23, 2024

Thanks a lot @chenxsan. I'll have a look at this when I get some spare time. I appreciate the help and the information you provided.


pavelloz commented on August 23, 2024

It's possible that I have the same problem, but in a different form.

I prepared a repo with a test case: git clone https://github.com/pavelloz/tg-testcase && npm i && node test.js
https://github.com/pavelloz/tg-testcase/

Shortcut:

Structure:

tinyglob-testcase|master ⇒ tree modules 
modules
└── test
    ├── private
    │   └── views
    │       └── pages
    │           └── mypage.liquid
    └── public
        └── views
            ├── pages
            │   └── page.liquid
            └── partials
                ├── data
                │   ├── one.liquid
                │   └── two.json
                └── hello.liquid

9 directories, 5 files

Code:

const tg = require('tiny-glob');

tg('**', {
  cwd: 'modules/test',
  filesOnly: true
}).then(files => {
  console.log('Non-filtered.', files.length);
  console.log(files);
});

tg('{private,public}/**', {
  cwd: 'modules/test',
  filesOnly: true
}).then(files => {
  console.log('Filtered by private/public (broken)', files.length);
  console.log(files);
});


tg('**/{private,public}/**', {
  cwd: 'modules/test',
  filesOnly: true
}).then(files => {
  console.log('Filtered, with workaround/hack applied.', files.length);
  console.log(files);
});

Results

tinyglob-testcase|master ⇒ node test.js 
Filtered by private/public (broken) 1
[ 'private/views/pages/mypage.liquid' ]
Non-filtered. 5
[
  'private/views/pages/mypage.liquid',
  'public/views/pages/page.liquid',
  'public/views/partials/data/one.liquid',
  'public/views/partials/data/two.json',
  'public/views/partials/hello.liquid'
]
Filtered, with workaround/hack applied. 5
[
  'private/views/pages/mypage.liquid',
  'public/views/pages/page.liquid',
  'public/views/partials/data/one.liquid',
  'public/views/partials/data/two.json',
  'public/views/partials/hello.liquid'
]

