Comments (4)
> But I guess when doing this one would know they are ignored and not run `gix clean -xd` without specifying a path that omits them. Maybe precious files will ultimately be the solution to that.
Yes, precious files would add a layer of protection (though you shouldn't take my word for it :)). Even so, the already implemented change at least detects directly ignored repositories as such, so one has to pass `-r` explicitly.
> Is this just for top-level ignored directories (i.e. those that are not subdirectories of ignored directories and that are ignored because they match an ignore pattern)?
> If it applies to git repositories in all ignored directories no matter how deep down, then it would make it easier to understand `-r` because it would always be needed to delete nested repositories. Maybe that would go against the desire to be as fast as possible by limiting traversal. Then again, maybe that is not a problem in deletion, where one is always taking the time to traverse at least once anyway (since a recursive deletion traverses fully).
No, it's just for the top-level, no nesting. Indeed, this is for performance reasons and there is that warning indicating that ignored directories may include repositories (but they may not be one anymore).
This performance I really must protect, as for instance in GitButler with `node_modules/` and `target/`, it takes ~0.04s with `gix clean -xd`, but ~3.8s with `git clean -nxd`. It's day and night.
Interestingly, `gix clean -xd --skip-hidden-repositories non-bare` is still faster than `git`.
❯ hyperfine -M3 -w1 'git clean -nxd' 'gix clean -xd --skip-hidden-repositories non-bare'
Benchmark 1: git clean -nxd
Time (mean ± σ): 4.404 s ± 0.738 s [User: 0.390 s, System: 3.587 s]
Range (min … max): 3.958 s … 5.255 s 3 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark 2: gix clean -xd --skip-hidden-repositories non-bare
Time (mean ± σ): 3.229 s ± 0.248 s [User: 0.408 s, System: 2.621 s]
Range (min … max): 3.075 s … 3.515 s 3 runs
Warning: The first benchmarking run for this command was significantly slower than the rest (3.515 s). This could be caused by (filesystem) caches that were not filled until after the first run. You are already using the '--warmup' option which helps to fill these caches before the actual benchmark. You can either try to increase the warmup count further or re-run this benchmark on a quiet system in case it was a random outlier. Alternatively, consider using the '--prepare' option to clear the caches before each timing run.
Summary
gix clean -xd --skip-hidden-repositories non-bare ran
1.36 ± 0.25 times faster than git clean -nxd
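The summary line can be reproduced from the two means and standard deviations reported above. This is a minimal sketch, assuming first-order error propagation for a quotient of two independent measurements. The function name `speedup` is made up for illustration, and this is not claimed to be how hyperfine computes it internally.

```rust
/// Ratio of two mean runtimes with propagated uncertainty.
/// Each argument is a `(mean, standard_deviation)` pair in seconds.
/// Assumption: both measurements are independent, so the relative
/// errors add in quadrature.
pub fn speedup(slow: (f64, f64), fast: (f64, f64)) -> (f64, f64) {
    let (slow_mean, slow_sd) = slow;
    let (fast_mean, fast_sd) = fast;
    let ratio = slow_mean / fast_mean;
    let relative_error =
        ((slow_sd / slow_mean).powi(2) + (fast_sd / fast_mean).powi(2)).sqrt();
    (ratio, ratio * relative_error)
}
```

Calling `speedup((4.404, 0.738), (3.229, 0.248))` with the figures from the run above gives roughly 1.36 and 0.25, matching the summary.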
from gitoxide.
Thanks so much for researching this and the deep analysis - I was particularly impressed by thinking of submodules that have worktrees in the superproject.
Regarding the behaviour of `git clean -f`, I thought you might be interested in the undocumented 'double-force feature', which would then indeed remove nested repositories. It's worth noting that at some point `git` seemingly removed nested ignored repositories, but stopped doing so in favor of `-ff`. This is no argument at all for not protecting worktrees, just something I thought you might find interesting.
With that said, I also believe that it should never touch the worktrees of the repository it is run in, while being unsure of what to do with worktrees of submodules that are reaching into the superproject. My take here is that it would probably be so unlikely that it basically never happens, and if it does it's more of an accident. As such, it should probably show up as eligible to be cleaned. If one day that shouldn't be desired anymore, then implementing this will be trivial by collecting worktrees of all submodules recursively as well.
I see two points of action here:
- add a way to classify worktrees and pass in their locations so that the algorithm can identify them. Having this as part of the API also makes callers aware.
- when the mode is 'for deletion', always double-check if an ignored directory is also a repository based on the passed criterion (i.e. fast if it has a `.git` entry inside, or slow by an actual repository check). This would also mean that a 'status' would identify such a worktree as untracked, but `clean` would know more. That way, accidental deletion while ignored will be prevented, even if such a repository isn't a worktree. (One will have to specify `-r`.)
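The second point could look roughly like the sketch below. It uses only the Rust standard library, and every name in it (`RepoCheck`, `is_probably_repository`, `may_delete_ignored_dir`) is hypothetical and not part of the `gix-dir` API. The "slow" variant here is only a simplified stand-in for a real repository check.

```rust
use std::path::Path;

/// How thoroughly to check whether an ignored directory is a repository.
/// Hypothetical type for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RepoCheck {
    /// Fast: any directory with a `.git` entry counts as a repository.
    DotGitEntry,
    /// Slower: also require `.git` to look like a git directory or a gitfile.
    Verify,
}

/// Returns true if `dir` looks like a repository under the given criterion.
pub fn is_probably_repository(dir: &Path, check: RepoCheck) -> bool {
    let dot_git = dir.join(".git");
    match check {
        // Only test for existence; do not follow symlinks or read anything.
        RepoCheck::DotGitEntry => dot_git.symlink_metadata().is_ok(),
        RepoCheck::Verify => {
            if dot_git.is_file() {
                // Linked worktrees and submodules use a file: `gitdir: <path>`.
                std::fs::read_to_string(&dot_git)
                    .map(|content| content.starts_with("gitdir:"))
                    .unwrap_or(false)
            } else {
                // Simplified heuristic, not a full repository validation.
                dot_git.join("HEAD").is_file() && dot_git.join("objects").is_dir()
            }
        }
    }
}

/// In 'for deletion' mode an ignored directory is only eligible if it is not
/// a repository, unless the caller explicitly opted in (think `-r`).
pub fn may_delete_ignored_dir(dir: &Path, check: RepoCheck, remove_repositories: bool) -> bool {
    remove_repositories || !is_probably_repository(dir, check)
}
```

With this shape, the fast check costs one extra metadata lookup per top-level ignored directory, so the traversal itself does not have to descend any further.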
from gitoxide.
> Regarding the behaviour of `git clean -f`, I thought you might be interested in the undocumented 'double-force feature', which would then indeed remove nested repositories. It's worth noting that at some point `git` seemingly removed nested ignored repositories, but stopped doing so in favor of `-ff`. This is no argument at all for not protecting worktrees, just something I thought you might find interesting.
Thanks--I was totally unaware of `-ff`!
> With that said, I also believe that it should never touch the worktrees of the repository it is run in, while being unsure of what to do with worktrees of submodules that are reaching into the superproject. My take here is that it would probably be so unlikely that it basically never happens, and if it does it's more of an accident. As such, it should probably show up as eligible to be cleaned. If one day that shouldn't be desired anymore, then implementing this will be trivial by collecting worktrees of all submodules recursively as well.
It occurs to me--and maybe this is what you're already thinking of--that one case where a submodule's `git worktree`-managed worktree would be present in the superproject is if it is `.gitignore`d in the superproject and created there deliberately so that it exists alongside the submodule's main worktree, in the same way that one would usually create a repository's extra worktrees in the parent directory of the repository's main working tree.
But I guess when doing this one would know they are ignored and not run `gix clean -xd` without specifying a path that omits them. Maybe precious files will ultimately be the solution to that.
> when the mode is 'for deletion', always double-check if an ignored directory is also a repository based on the passed criterion
Is this just for top-level ignored directories (i.e. those that are not subdirectories of ignored directories and that are ignored because they match an ignore pattern)?
If it applies to git repositories in all ignored directories no matter how deep down, then it would make it easier to understand `-r` because it would always be needed to delete nested repositories. Maybe that would go against the desire to be as fast as possible by limiting traversal. Then again, maybe that is not a problem in deletion, where one is always taking the time to traverse at least once anyway (since a recursive deletion traverses fully).
from gitoxide.
Please note that this was implemented as a breaking change, which unfortunately will prevent me from publishing a patch release. Thus this fix will be released in a month or two with the next regular 'breaking' one.
Technically, this is breaking just for `gix-dir` but not for `gix`, but the publishing system doesn't analyse the public API at all and thus relies on me flagging breaking changes, which are propagated 'downstream' within the workspace.
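To illustrate what 'downstream' propagation means here, the following is a minimal sketch, not the actual publishing tool. The function name `needs_breaking_release` and the dependency map passed to it are assumptions made up for this example. A crate is treated as breaking if it was flagged by hand or if it depends on a crate that is already treated as breaking.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// `deps` maps each workspace crate to the workspace crates it depends on.
/// `flagged` lists the crates manually marked as having a breaking change.
/// Returns every crate that would end up with a breaking release.
/// Hypothetical helper for illustration only.
pub fn needs_breaking_release<'a>(
    deps: &BTreeMap<&'a str, Vec<&'a str>>,
    flagged: &[&'a str],
) -> BTreeSet<&'a str> {
    let mut breaking: BTreeSet<&'a str> = flagged.iter().copied().collect();
    // Repeat until a full pass adds no further crate (a fixed point).
    loop {
        let mut changed = false;
        for (krate, its_deps) in deps {
            if !breaking.contains(krate) && its_deps.iter().any(|d| breaking.contains(d)) {
                breaking.insert(*krate);
                changed = true;
            }
        }
        if !changed {
            break;
        }
    }
    breaking
}
```

Because nothing inspects the public API, flagging `gix-dir` alone is enough to pull every dependent crate into the breaking release, whether or not its own API changed.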
from gitoxide.