
Comments (12)

Jaykul avatar Jaykul commented on June 8, 2024 3

I would love to add globbing to Windows too. With the rise of cross-platform apps, it would be awesome to be able to just count on it working in the shell.

However, I see two problems with that:

  1. Avoiding problems with old apps
  2. Compatibility with task scheduler (and CMD)

I think if there was a way to black-list problem apps from globbing (or make it opt-in), that would solve the first one, but I don't have any idea how to make apps which expect shell globbing work in task scheduler except to wrap them in PowerShell...

from powershell-rfc.

andyleejordan avatar andyleejordan commented on June 8, 2024 2

I'd like to add to the motivations: extending this globbing to native tools also brings PSDrives:

> New-PSDrive -Name Temp -PSProvider FileSystem -Root /tmp

Name           Used (GB)     Free (GB) Provider      Root    
----           ---------     --------- --------      ----         
Temp               51.08         28.92 FileSystem    /tmp     

> /bin/ls temp:
babel-8502FUB
clr-debug-pipe-10485-985232-in
clr-debug-pipe-10485-985232-out
clr-debug-pipe-12822-282429-in
clr-debug-pipe-12822-282429-out
...


DerpMcDerp avatar DerpMcDerp commented on June 8, 2024 2

Why is it so important that it has to look like:

gcc *.c

instead of:

gcc (glob *.c)


andyleejordan avatar andyleejordan commented on June 8, 2024 1

@adityapatwardhan and I agree after further investigation that it really is not possible with the current state of PowerShell to retain relative paths while resolving globs, but we can do a work-around where we strip the current directory from glob expansions. This isn't ideal, as it loses information about ../.. embedded in strings, but avoids the rough edge cases of subcommands sharing the names of files in a directory; it also reduces the noise.

I've added this implementation to the prototype.
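
The work-around can be pictured with a small Python model. This is only a sketch of the idea, not the prototype's code: strip_current_directory is a hypothetical helper, and the paths are made up for illustration.

```python
def strip_current_directory(paths, cwd):
    """Drop a leading `cwd` prefix from each absolute path, where present."""
    prefix = cwd.rstrip("/") + "/"
    stripped = []
    for path in paths:
        if path.startswith(prefix):
            # Inside the current directory: hand the tool a relative path.
            stripped.append(path[len(prefix):])
        else:
            # Outside the current directory: keep the absolute path as-is.
            stripped.append(path)
    return stripped


expanded = ["/home/user/src/README.md", "/home/user/src/docs/a.md", "/tmp/x"]
print(strip_current_directory(expanded, "/home/user/src"))
# ['README.md', 'docs/a.md', '/tmp/x']
```

Under this model a directory named status in the current directory is handed to git as plain status again, which is why the subcommand edge case goes away.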


TSlivede avatar TSlivede commented on June 8, 2024 1

I completely understand the 'Motivation' part and agree that things like gcc *.c absolutely must be supported, otherwise powershell won't be accepted as a 'real' shell on linux.

I still think this isn't the right way to do it. If I understand it correctly, this spec suggests globbing/expanding every argument, even if it was quoted or given within a variable. In my opinion, this shouldn't happen, as it makes calling behavior very unreliable. One never really knows whether an argument will be expanded to something else.

I think globbing should only be done if a glob character is given literally and unquoted.
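
A rough Python model of that rule, assuming a hypothetical parser that hands over each argument together with a flag saying whether it was quoted (the current PowerShell binder does not provide this). The names expand_unquoted and GLOB_CHARS are illustrative only.

```python
import glob

GLOB_CHARS = set("*?[")


def expand_unquoted(tokens):
    """tokens: list of (text, was_quoted) pairs from a hypothetical parser."""
    argv = []
    for text, was_quoted in tokens:
        if was_quoted or not (GLOB_CHARS & set(text)):
            # Quoted, or no wildcard typed literally: pass through untouched.
            argv.append(text)
            continue
        matches = sorted(glob.glob(text))
        # No match: keep the original text, as most shells do by default.
        argv.extend(matches if matches else [text])
    return argv


print(expand_unquoted([("echo", False), ("*", True)]))
# ['echo', '*']
```

With this shape, echo '*' always prints a literal star, while an unquoted *.c is expanded against the current directory.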

The usage of the 'verbatim marker' --% to stop globbing also leads to several problems:

  • One can't glob the second argument, but not the first
  • completely different syntax (single quote not accepted to quote arguments with spaces, ... , ...)
  • one can't expand regular variables
  • ...

That's because, IIRC, --% was intended for Windows apps that need a specially formatted command line.

Don't get me wrong, I really like the idea of globbing for native tools, but I think every bash user who types echo '*' would be very surprised if the star was expanded.


andyleejordan avatar andyleejordan commented on June 8, 2024

The proposed prototype is available at PowerShell/PowerShell#2325. Note that the prototype does not yet support both hidden and normal files.


andyleejordan avatar andyleejordan commented on June 8, 2024

I've updated the prototype to use a slightly different API that lets me pass through a new CmdletProviderContext with Force = true so that all files (including hidden) are globbed.


Jaykul avatar Jaykul commented on June 8, 2024

Does it affect Windows too?


andyleejordan avatar andyleejordan commented on June 8, 2024

Per the spec, no. But that is an open question we'd like input on.


andyleejordan avatar andyleejordan commented on June 8, 2024

I'm going to reproduce the RFC here so people can read it more easily:


RFC: 0009
Author: Andrew Schwartzmeyer
Status: Draft
Version: 0.1

Area: Globbing

Extend globbing to native tools

This is a request for comment on the extension of PowerShell globbing to native tools.
Currently, PowerShell supports globbing of file path arguments to cmdlets;
but the support does not extend to the use of native tools instead of cmdlets.

Motivation

For example, Get-ChildItem *.md will list all files ending with .md.
After parameter binding has occurred, the LocationGlobber is invoked,
which expands the wildcard * and resolves the expanded paths.
However /bin/ls *.md fails with:

/bin/ls: cannot access '*.md': No such file or directory

because the native Linux tool ls expects the calling shell
(in this case PowerShell) to have expanded * to a space-separated list of relative paths.

This problem is not as prevalent on Windows because Windows command-line tools,
in the general case, perform their own globbing when needed,
since neither the Windows command prompt nor PowerShell have performed it for them.

Conversely, Linux command-line tools were developed in an ecosystem
where all calling shells perform globbing of wild cards in arguments.
Since we can expect Linux users of PowerShell to use Linux command-line tools,
we are highly motivated to support this expected shell behavior for those tools.

This author first ran into the problem when attempting to compile a small C program while in PowerShell:

gcc *.c

which is a use case similar to many others,
and the lack of globbing by PowerShell ended my attempt.

An additional motivation is the fact that the expansion of ~ to the user's home directory
is provided by PowerShell's glob expansion.
Without globbing, it is sent to native tools literally.

Specification

Extending this globbing to native tools is conceptually easy.
While the native command parameter binder does not support advanced PowerShell binding mechanisms
(since native tools are not cmdlets),
we can call the LocationGlobber on each argument individually,
and pass the expanded arguments to the native tool.

However, this comes with multiple caveats:

First, we cannot expect to use any of our own common parameters,
because the native tool handles its own options and arguments.
So we lose the ability to specify, for instance, -Hidden,
and instead must default to expanding all possible files.

Second, we must not fail. If the argument cannot be expanded,
we have to pass it unchanged to the native tool.
This is because, without PowerShell's parameter bindings,
we cannot distinguish --option, some_arbitrary_argument,
and a/file/path/to/glob*.
So we have to attempt the globbing on each argument,
replace the argument with its expansion only if the globbing succeeds,
and pass the original argument if it fails.

Third, we should continue to respect the verbatim marker,
--%, so that all globbing can be turned off entirely.

Fourth, PowerShell's globbing does not support escape characters.
So Get-ChildItem <backtick>* is still expanded.
This is because the cmdlet supports the option of -LiteralPath, which disables globbing;
but this parameter is not available to native tools.
Supporting escape characters in the LocationGlobber would be a major breaking change,
and so is both out of scope and likely out of consideration.
The suggested solution for those needing to use glob characters literally
is to use the supported verbatim marker.
Thus point number three is even more important.

With these caveats in mind, the following implementation is proposed:

This entire extension of globbing should be compiled only for the
Linux configuration of PowerShell, so as to not break existing Windows behavior.

If the verbatim marker is specified,
remove it and do no globbing (keep existing behavior).

Else, for each argument, attempt to expand it using the existing API:

LocationGlobber.GetGlobbedProviderPathsFromMonadPath

Specifically set allowNonexistingPaths = false so that commands such as git reflog
continue to work (where reflog would otherwise be a non-existing path to glob to /pwd/reflog).
This unfortunately prevents mkdir ~/newdir from working as expected,
since ~/newdir should be globbed to /home/user/newdir even though it does not yet exist;
but there is no way to differentiate it from a random argument.

Do not check if the argument contains glob characters using the API:

LocationGlobber.StringContainsGlobCharacters

This is because ~ is not treated as a glob character, but is still expanded by the globbing system.
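
The reasoning can be illustrated with a small Python stand-in. The function below is not the real LocationGlobber.StringContainsGlobCharacters; it is a hypothetical check that assumes only the classic wildcard characters count as glob characters.

```python
import os


def contains_glob_characters(arg):
    # Hypothetical stand-in: only *, ? and [ count as glob characters here.
    return any(ch in "*?[" for ch in arg)


arg = "~/bin"
print(contains_glob_characters(arg))  # False
# The tilde is still rewritten by home-directory expansion, so skipping
# arguments that fail the wildcard check would leave ~/bin unexpanded.
print(os.path.expanduser(arg))
```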

If the expansion succeeds, replace the expanded argument with the
space-separated concatenation of the resulting expansions.
If any error in the expansion occurs (exception or no results),
keep the original argument.

Before expanding, set up the context to include both hidden and normal files;
existing Linux shells do not differentiate these types when expanding globs.
This author cannot think of a scenario where glob expansion should prefer only one type or the other,
so this does not need to be configurable.
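
The proposed steps can be modeled in Python as follows. This is a sketch under stated assumptions rather than the prototype: Python's glob module stands in for LocationGlobber, os.path.abspath stands in for PowerShell's path resolution, hidden-file handling is left out, and build_native_argv and expand_argument are hypothetical names.

```python
import glob
import os

VERBATIM_MARKER = "--%"


def expand_argument(arg):
    """Expand one argument; return it unchanged when nothing matches."""
    try:
        # expanduser handles ~; glob only returns paths that exist, which
        # mirrors allowNonexistingPaths = false.
        matches = sorted(glob.glob(os.path.expanduser(arg)))
    except (OSError, ValueError):
        matches = []
    if not matches:
        return [arg]
    # Resolve to absolute paths, as the existing globber does.
    return [os.path.abspath(match) for match in matches]


def build_native_argv(args):
    if VERBATIM_MARKER in args:
        # Marker present: remove it and do no globbing at all.
        return [arg for arg in args if arg != VERBATIM_MARKER]
    argv = []
    for arg in args:
        argv.extend(expand_argument(arg))
    return argv


print(build_native_argv(["git", "--%", "status"]))
# ['git', 'status']
```

In this model git reflog passes through untouched when no file named reflog exists, while mkdir ~/newdir receives the literal ~/newdir, matching the limitation described above.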

Alternate Proposals and Considerations

Absolute versus relative paths

The default behavior of PowerShell's globber is to resolve to the absolute path of the match;
but existing Linux environments expect globbing to return relative paths.
(However, the major exception to this rule is expansion of ~ to
the absolute path of the user's home directory.)

Common globbing behavior on Linux looks like these examples:

  • The argument *.md expands to README.md CHANGELOG.md.
  • The argument ~/bin expands to /home/user/bin.
  • The argument ~/sub/../* expands to /home/user/sub/../one /home/user/sub/../two,
    (i.e. the ~ is expanded but the .. is left intact).

But using PowerShell's default behavior, globbing looks like this:

  • The argument *.md expands to /path/to/README.md /path/to/CHANGELOG.md.
  • The argument ~/bin expands to /home/user/bin
    (this will not need to be changed).
  • The argument ~/sub/../* expands to /home/user/one /home/user/two,
    (i.e. the ~ is expanded but the .. is resolved away).
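
The effect of resolving paths can be shown with a short Python illustration. It only models lexical resolution with posixpath (no symlink lookup), and resolve_lexically is a hypothetical name, not a PowerShell API.

```python
import posixpath


def resolve_lexically(path, cwd):
    """Join onto cwd and collapse '.' and '..' segments (no symlink lookup)."""
    return posixpath.normpath(posixpath.join(cwd, path))


# A shell-style expansion keeps the '..' segment; resolving removes it.
print(resolve_lexically("/home/user/sub/../one", "/path/to"))  # /home/user/one
# A relative argument such as '../' becomes an absolute path.
print(resolve_lexically("../", "/path/to"))  # /path
```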

The differences between these two behaviors are mostly cosmetic.
While there are a few edge cases where relative paths will need to be kept intact,
for instance, when creating relative symlinks with ln -s ../ previous_dir,
the majority of use cases will see no difference between absolute and relative paths,
(the former are simply noisier).
There is an additional edge case where a subcommand shares
the same name as a file existing in the current directory.

A contrived example:

> mkdir status
> git status
git: '/home/andrew/src/PowerShell/status' is not a git command. See 'git --help'.
> git --% status
On branch globbing
Your branch is up-to-date with 'origin/globbing'.

These edge cases can be handled with the use of the verbatim marker to disable globbing.

With the additional consideration that disabling path resolution within
PowerShell's globber would introduce lots of extra (potentially breaking) changes,
the initial extension of globbing to native tools should operate with resolved (absolute) paths.

However, if we could remove the absolute path resolution,
then status would be expanded to status, resolving the above edge case.
Likewise we could set allowNonexistingPaths = true,
and enable the use of mkdir ~/newdir.
So if these changes can be made with minimal risk of breakage,
they should be made.

Configurability

It may be desirable for the extension of native globbing to be togglable,
perhaps by a new PowerShell preference variable, $NativeGlobbingPreference.
If the extension is brought back to the Windows build of PowerShell,
this will be necessary, so as to not break existing Windows PowerShell behavior.
However, it would be sufficient for the initial implementation to exist only on Linux,
where a breaking change such as this is not breaking anything,
and togglability is available via the verbatim marker.

Cross-platform inconsistency

For tools that exist on both Windows and Linux, such as Git and GCC,
we should avoid breaking scripts which use these tools.
But this might not be that difficult.

Git, for instance, implements its own globbing on wildcard characters passed
to it literally. This was done so that git add file* works as expected by Git users
on Windows where globbing is not normally done.
This is also the case for Git on Linux in PowerShell;
it does not differentiate between platforms,
but it handles both cases by globbing when there is a glob character,
and accepting pre-globbed arguments when the shell performed the globbing.

GCC through MinGW (i.e. not Cygwin), behaves differently on Windows.
GCC on Linux, per the example above, does not perform its own globbing.
But GCC on Windows through MinGW performs globbing.
Regardless, if the shell starts to perform globbing for GCC through MinGW,
it will not break as it will receive file paths,
which are an accepted input.

Thus, at least for the initial implementation,
there does not seem to be a requirement to implement globbing on Windows.


andyleejordan avatar andyleejordan commented on June 8, 2024

So, while I recognize that the current implementation is not Bash-like globbing, the unfortunate truth is that to get globbing that behaves like Bash, the current prototype cannot be used, as it uses PowerShell's built-in globbing.

If we do not want (cannot) reuse PowerShell's existing globbing engine, then an entire new globbing engine will need to be implemented. Additionally, the current parameter binder would need to be modified to pass all arguments verbatim to the new engine, as currently the binder receives arguments stripped of both surrounding quotes (so can't detect '*') and stripped of PowerShell's escape character (backtick i.e. <backtick>*), which leads to the current limitations.

The current proposal and prototype reuse PowerShell's existing globbing engine. If the @PowerShell/powershell-committee decides that it is insufficient, then this RFC should be rejected, and someone should open a new one for implementing Bash-like globbing.

I personally would suggest accepting this RFC and prototype as-is to enable some globbing (which can be disabled with --% in the cases where it breaks things), with the intention of somebody being unhappy enough with it to write a new RFC and implement a Bash-like globber.


TSlivede avatar TSlivede commented on June 8, 2024

Well, I think globbing doesn't necessarily need to be Bash-like.
(There are good reasons to make it similar to Bash, but if that's much more work, ok.)

Personally, I think it's absolutely important not to use --% for disabling globbing.
If I remember correctly, --% was added to allow PowerShell on Windows to call executables with a non-standard command line (executables that don't follow these rules).
In my opinion, it was always kind of a hack, because --% does not say "copy the next string argument verbatim to the command line without modification" but instead "disable the PowerShell parser and enable some kind of cmd parsing".

Using --% to disable globbing brings all those Windows command-line parsing problems over to *nix, and I think this must not happen.

To be constructive, here are some ideas:
I think the best way to avoid --% would be to adjust the parser (or parameter binder) to detect whether * was quoted or literal.
If this is too much work, maybe something like what @DerpMcDerp suggested.
And if that's too verbose, one could maybe introduce a new operator to enable (opt-in) globbing or an operator to disable (opt-out) globbing for one argument. (I'd prefer opt-in.)

