
bitranox / igittigitt

A spec-compliant gitignore parser for Python

License: MIT License

Makefile 1.83% Python 70.96% Jupyter Notebook 1.91% Shell 23.99% Batchfile 1.31%

igittigitt's People

Contributors

bitranox, mapleccc

igittigitt's Issues

Default ignore file in home directory

  • **I'm submitting a ... **
    • bug report
    • feature request
    • support request

Git provides for reading default rules from the user's home directory (e.g. ~/.gitignore). This requires special treatment since ignore rules are interpreted relative to the directory in which the ignore file is located and not just the user's home directory.

This functionality can already be emulated with the optional base_dir argument to _parse_rule_file (except that it is currently broken). I would like to recommend that the base_dir parameter be returned to working order, and that _parse_rule_file be promoted to a public method (i.e. just remove the leading underscore '_').

Fixing base_dir can be accomplished with a single line of code:

    path_base_dir = base_dir if base_dir is not None else path_rule_file.parent

Users can then effectively append multiple default ignore files as if they were all part of a single ignore file in the target directory.

    igittigitt.parse_rule_file('~/.myignore', base_dir=target_dir)
    igittigitt.parse_rule_files(target_dir, '.myignore')
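
As a rough illustration of the base_dir fallback proposed above, the following standard-library sketch reads a rule file and pairs every pattern with the directory it should be interpreted against. The helper name iter_rules is hypothetical and not part of igittigitt, and the comment handling is deliberately simplified.

```python
from pathlib import Path
from typing import Iterator, Optional, Tuple


def iter_rules(
    rule_file: Path, base_dir: Optional[Path] = None
) -> Iterator[Tuple[str, Path]]:
    """Yield (pattern, base_dir) pairs for each rule line in rule_file.

    Hypothetical helper, not part of igittigitt.
    """
    # Same fallback as the proposed one-line fix: default to the file's own directory.
    path_base_dir = base_dir if base_dir is not None else rule_file.parent
    for line in rule_file.read_text().splitlines():
        # Simplification: skip blank lines and comments, ignore escaping rules.
        if line.strip() and not line.startswith("#"):
            yield line.rstrip(), path_base_dir
```

Each yielded pair could then be handed to a call such as add_rule(pattern, base_path), which other issues on this page use, so that rules from a home-directory file behave as if they lived in the target directory.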

not parsing subdirs of ignored directories

  • **I'm submitting a ... **
    • bug report
    • feature request
    • support request

If this behavior is intended, I need support; if not, it might be a bug.

  • What is the current behavior?

Currently files in subdirectories of ignored directories are not matched.

rule in .gitignore: src_old/
file not matched: src_old/foo/bar.py

  • If the current behavior is a bug, please provide the steps to reproduce and, if possible, a minimal demo of the problem

.gitignore

src_old/
.idea/

code

import igittigitt
import pprint
import pathlib

current_path = pathlib.Path(__file__).parent.absolute()

gitignore_parser = igittigitt.IgnoreParser()
gitignore_parser.parse_rule_files(current_path)

pp = pprint.PrettyPrinter(indent=4)
pp.pprint(gitignore_parser.rules)

print(gitignore_parser.match(pathlib.Path(f'{current_path}/src_old')))
print(gitignore_parser.match(pathlib.Path(f'{current_path}/src_old/foo.py')))
print(gitignore_parser.match(pathlib.Path(f'{current_path}/src_old/foo')))
print(gitignore_parser.match(pathlib.Path(f'{current_path}/src_old/foo/bar.py')))

output

[   IgnoreRule(pattern_fnmatch='/home/md/PycharmProjects/test/**/.idea', pattern_original='.idea/', is_negation_rule=False, source_file=PosixPath('/home/md/PycharmProjects/test/.gitignore'), source_line_number=2),
    IgnoreRule(pattern_fnmatch='/home/md/PycharmProjects/test/**/.idea/*', pattern_original='.idea/', is_negation_rule=False, source_file=PosixPath('/home/md/PycharmProjects/test/.gitignore'), source_line_number=2),
    IgnoreRule(pattern_fnmatch='/home/md/PycharmProjects/test/**/src_old', pattern_original='src_old/', is_negation_rule=False, source_file=PosixPath('/home/md/PycharmProjects/test/.gitignore'), source_line_number=1),
    IgnoreRule(pattern_fnmatch='/home/md/PycharmProjects/test/**/src_old/*', pattern_original='src_old/', is_negation_rule=False, source_file=PosixPath('/home/md/PycharmProjects/test/.gitignore'), source_line_number=1),
    IgnoreRule(pattern_fnmatch='/home/md/PycharmProjects/test/.idea', pattern_original='.idea/', is_negation_rule=False, source_file=PosixPath('/home/md/PycharmProjects/test/.gitignore'), source_line_number=2),
    IgnoreRule(pattern_fnmatch='/home/md/PycharmProjects/test/.idea/*', pattern_original='.idea/', is_negation_rule=False, source_file=PosixPath('/home/md/PycharmProjects/test/.gitignore'), source_line_number=2),
    IgnoreRule(pattern_fnmatch='/home/md/PycharmProjects/test/src_old', pattern_original='src_old/', is_negation_rule=False, source_file=PosixPath('/home/md/PycharmProjects/test/.gitignore'), source_line_number=1),
    IgnoreRule(pattern_fnmatch='/home/md/PycharmProjects/test/src_old/*', pattern_original='src_old/', is_negation_rule=False, source_file=PosixPath('/home/md/PycharmProjects/test/.gitignore'), source_line_number=1)]
True
True
True
False
  • What is the expected behavior?

Every subdir and every file in subdirs of an ignored dir should be matched.
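
The expected behavior can be sketched without igittigitt: a directory rule such as src_old/ should cover every path that has that directory as an ancestor, at any depth. The function name under_ignored_dir is hypothetical, and the sketch only handles plain directory names, not wildcards or negation.

```python
from pathlib import PurePosixPath
from typing import Iterable


def under_ignored_dir(
    rel_path: str, dir_names: Iterable[str], is_dir: bool = False
) -> bool:
    """Return True if rel_path is, or lies below, an ignored directory.

    Hypothetical helper: dir_names are rules like "src_old/" without the slash.
    """
    names = set(dir_names)
    parts = PurePosixPath(rel_path).parts
    # The last component counts only if the path itself is a directory.
    dir_parts = parts if is_dir else parts[:-1]
    return any(part in names for part in dir_parts)
```

With the rules from this report, under_ignored_dir("src_old/foo/bar.py", ["src_old", ".idea"]) is True, which is the result the fourth print statement above was expected to give.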

  • What is the motivation / use case for changing the behavior?

In 1.0.6, it matched all subdirectories and files of an ignored directory.

  • Please tell us about your environment:
  • Release Number of the Repository used : 2.0.0
  • Python Version : 3.8.5
  • OS, OS Version : Linux gfz 5.8.6-1-MANJARO #1 SMP PREEMPT Thu Sep 3 14:19:36 UTC 2020 x86_64 GNU/Linux
  • Other information (e.g. detailed explanation, stack traces, related issues, suggestions how to fix, links for us to have context, e.g. Stack Overflow, Gitter, etc.)

Use glob.glob instead of Pathlib.glob to follow symlink

  • **I'm submitting a ... **

    • bug report
    • feature request
    • support request
  • What is the current behavior?

parse_rule_files relies on pathlib's glob method, which is affected by a bug that prevents it from following symlinks. As a result, .gitignore files under symlinked directories are never parsed.

  • If the current behavior is a bug, please provide the steps to reproduce and, if possible, a minimal demo of the problem

Place a .gitignore file in a symlinked directory and use parse_rule_files on the parent folder.

  • What is the expected behavior?

Following the Stack Overflow answer linked below, it would be best to rely on glob.glob, which follows symlinks correctly.

  • What is the motivation / use case for changing the behavior?

  • Please tell us about your environment:

  • Release Number of the Repository used : 2.0.3
  • Python Version : 3.6.3
  • OS, OS Version : MacOS
  • Other information (e.g. detailed explanation, stack traces, related issues, suggestions how to fix, links for us to have context, e.g. Stack Overflow, Gitter, etc.)

https://stackoverflow.com/questions/46529760/getting-glob-to-follow-symlinks-in-python

Proposed solution:

import os
from glob import glob

# glob.glob() with recursive=True follows symlinked directories
rule_files = sorted(
    glob(f"{os.path.abspath(base_dir)}/**/.gitignore", recursive=True)
)
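
The proposed solution can be wrapped in a small, self-contained function to check the behavior on a test tree. This is only a sketch: the name find_rule_files and its filename parameter are assumptions, not igittigitt's API, and glob.escape is added so that special characters in the base path are not treated as wildcards.

```python
import os
from glob import escape, glob
from typing import List


def find_rule_files(base_dir: str, filename: str = ".gitignore") -> List[str]:
    """Return all rule files below base_dir, including those behind symlinks.

    Hypothetical helper illustrating the glob.glob approach proposed above.
    """
    # Escape the base path so characters like '[' are matched literally.
    root = escape(os.path.abspath(base_dir))
    # '**' with recursive=True matches zero or more directory levels.
    pattern = os.path.join(root, "**", filename)
    return sorted(glob(pattern, recursive=True))
```

Note that following symlinks also means a link that points back to one of its own ancestors can produce a very long traversal, so a caller may want to guard against that.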

CLI to query local directory

The CLI isn't very useful at the moment.

It would be useful to be able to specify an ignore file and list files like git ls-files does. Even better would be the ability to pass one or more globs as arguments, which the program expands and then filters to drop anything that should be ignored.
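
The core of such a command could look roughly like the sketch below. It is not an existing igittigitt feature: list_unignored is a hypothetical name, and is_ignored stands for any callable that decides whether a path is ignored (for example, the match method of a parser that has already loaded the ignore file).

```python
import glob
from typing import Callable, Iterable, List


def list_unignored(
    patterns: Iterable[str], is_ignored: Callable[[str], bool]
) -> List[str]:
    """Expand each glob pattern and drop every path that is_ignored() flags.

    Hypothetical sketch of the requested CLI behavior.
    """
    candidates = set()
    for pattern in patterns:
        # recursive=True lets callers use '**' in their patterns.
        candidates.update(glob.glob(pattern, recursive=True))
    return sorted(path for path in candidates if not is_ignored(path))
```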

Handle symlinks correctly

This is a follow-up to issue #16. The glob issue is now resolved; however, I encountered some remaining problems.

  1. Using pathlib.Path().resolve() follows symlinks

The match() method resolve()s the target before matching it against the ignore rules. That means that you may not be able to match against a file under a symlinked directory. Consider the following example:

prey/
   if-you-can.txt
catch_me/ -> prey/
.gitignore

where catch_me is a symlink to prey. The gitignore contains the following:

catch_me

Now, match('catch_me') will fail because the path is resolved to prey before it is matched against the rule. A solution would be to avoid resolving the path, for example:

    def match(self, file_path) -> bool:
        # os.path.abspath() normalizes the path without resolving symlinks
        # (requires "import os" at module level).
        str_file_path = os.path.abspath(file_path)
        is_file = os.path.isfile(str_file_path)
        match = self._match_rules(str_file_path, is_file)
        if match:
            match = self._match_negation_rules(str_file_path)
        return match
  2. Add FOLLOW flag

In version 3 and later, globmatch does not follow symlinks unless the FOLLOW flag is set, e.g.

if wcmatch.glob.globmatch(
    str_file_path,
    [self.last_matching_rule.pattern_glob],
    flags=wcmatch.glob.DOTGLOB
    | wcmatch.glob.GLOBSTAR
    | wcmatch.glob.FOLLOW,
):
    ...  # body of the existing branch, unchanged
I believe the follow flag should be included.
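
The difference behind the first point can be reproduced with the standard library alone. The sketch below rebuilds the prey/catch_me layout in a temporary directory and compares the final path component after os.path.abspath() and after Path.resolve(). The function name demo is illustrative only.

```python
import os
import tempfile
from pathlib import Path
from typing import Tuple


def demo() -> Tuple[str, str]:
    """Return the last path component as seen by abspath() and by resolve()."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "prey").mkdir()
        link = root / "catch_me"
        link.symlink_to(root / "prey", target_is_directory=True)
        # abspath() only normalizes the string; resolve() follows the symlink.
        return Path(os.path.abspath(link)).name, link.resolve().name
```

Here demo() returns ("catch_me", "prey"): a rule written for catch_me can only match the first form, which is why resolving before matching loses the symlink's own name.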

Doesn't correctly handle base directories with a symlink in their components

  • **I'm submitting a ... **
    • bug report
    • feature request
    • support request

Currently, if the base directory provided to an IgnoreParser method is a symlink or a descendant of a symlink, the parser will not match any files provided to it.

MVCE:

#!/usr/bin/env python3
from pathlib import Path
from igittigitt import IgnoreParser

dir01 = Path("/tmp/igittigitt01")
dir01.mkdir()
dir02 = Path("/tmp/igittigitt02")
dir02.symlink_to(dir01)
(dir02 / "file.txt").touch()

parser = IgnoreParser()
parser.add_rule("*.txt", dir02)
print(parser.match(dir02 / "file.txt"))  # This prints "False", but it should print "True"
  • What is the expected behavior?

A rule added for a directory path p should be honored for any filepath that starts with p.

  • What is the motivation / use case for changing the behavior?

A rule added for a directory path p should be honored for any filepath that starts with p.

  • Please tell us about your environment:
  • Release Number of the Repository used : 2.1.0
  • Python Version : 3.9.13
  • OS, OS Version : Debian 11 (bullseye)
  • Other information (e.g. detailed explanation, stack traces, related issues, suggestions how to fix, links for us to have context, e.g. Stack Overflow, Gitter, etc.)

I believe this bug is due to the use of the pathlib.Path.resolve() method, which resolves symlinks.
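
If that is indeed the cause, the mismatch would arise when one side of the comparison is resolved and the other is not. The following sketch shows a containment check that treats both sides the same way. is_within is a hypothetical helper, not igittigitt code, and it compares paths purely lexically.

```python
import os


def is_within(path, base) -> bool:
    """Return True if path lies inside base, without resolving symlinks.

    Hypothetical helper: both arguments are normalized with abspath() only.
    """
    abs_path = os.path.abspath(path)
    abs_base = os.path.abspath(base)
    # commonpath() returns the longest shared ancestor of the two paths.
    return os.path.commonpath([abs_path, abs_base]) == abs_base
```

With the layout from the MVCE, the unresolved dir02 / "file.txt" is within dir02, whereas its resolved form points into dir01 and is not. Resolving both sides, or neither, keeps the comparison consistent.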

issue with negates pattern

  • **I'm submitting a ... **

    • bug report
    • feature request
    • support request
  • What is the current behavior?
    According to the gitignore specification, it is not possible to re-include a file if a parent directory of that file is excluded.
    https://git-scm.com/docs/gitignore#_pattern_format
    I am trying to exclude everything except a specific directory, based on this example from the documentation:

Example to exclude everything except a specific directory foo/bar (note the /* - without the slash, the wildcard would also exclude everything within foo/bar):
 $ cat .gitignore
    # exclude everything except directory foo/bar
    /*
    !/foo
    /foo/*
    !/foo/bar
  • If the current behavior is a bug, please provide the steps to reproduce and, if possible, a minimal demo of the problem
import igittigitt

# Based on https://git-scm.com/docs/gitignore#_examples
# Example to exclude everything except a specific directory foo/bar
# (note the /* - without the slash, the wildcard would also exclude everything within foo/bar):
#  $ cat .gitignore
#     # exclude everything except directory foo/bar
#     /*
#     !/foo
#     /foo/*
#     !/foo/bar

gitignore = igittigitt.IgnoreParser()
base_path = "/example/"
gitignore.add_rule("/*", base_path)
gitignore.add_rule("!/foo", base_path)
gitignore.add_rule("/foo/*", base_path)
gitignore.add_rule("!/foo/bar", base_path)
assert not gitignore.match(base_path + "foo/bar/file.txt")
assert gitignore.match(base_path + "foo/other/tile.txt")  # fails on the current version
  • What is the expected behavior?
    Based on the example above, gitignore.match(base_path + "foo/other/tile.txt") should return True.
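
The rule evaluation this example relies on can be sketched in a few lines: within one level the last matching rule wins, and a path is excluded as soon as any of its parent directories is excluded. The code below is a simplified model, not igittigitt's implementation. It assumes anchored patterns only (leading slash), treats * as matching within a single path segment, and all function names are hypothetical.

```python
import re
from pathlib import PurePosixPath
from typing import List, Optional


def _to_regex(pattern: str) -> "re.Pattern":
    # Assumption: anchored patterns only; '*' never crosses a '/'.
    body = pattern.lstrip("/")
    return re.compile(
        "".join("[^/]*" if ch == "*" else re.escape(ch) for ch in body)
    )


def _last_verdict(rel_path: str, rules: List[str]) -> Optional[bool]:
    """Return True/False from the last matching rule, or None if none match."""
    verdict = None
    for rule in rules:
        negated = rule.startswith("!")
        if _to_regex(rule[1:] if negated else rule).fullmatch(rel_path):
            verdict = not negated  # the last matching rule wins
    return verdict


def is_ignored(rel_path: str, rules: List[str]) -> bool:
    parts = PurePosixPath(rel_path).parts
    # Walk from the top-level directory down to the path itself.
    for depth in range(1, len(parts) + 1):
        if _last_verdict("/".join(parts[:depth]), rules):
            return True  # an excluded parent cannot be re-included below it
    return False
```

With the four rules from the example, this model reports foo/bar/file.txt as not ignored and foo/other/tile.txt as ignored, matching the two assertions in the report.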

  • Please tell us about your environment:

  • Release Number of the Repository used : 2.0.4
  • Python Version : 3.8.6
  • OS, OS Version : MacOS

Match ignore file with different base paths?

  • **I'm submitting a ... **

    • bug report
    • feature request
    • support request
  • Do you want to request a feature or report a bug?
    None

  • What is the current behavior?

  • If the current behavior is a bug, please provide the steps to reproduce and, if possible, a minimal demo of the problem
    Not a bug

  • What is the expected behavior?

  • What is the motivation / use case for changing the behavior?

  • Please tell us about your environment:

  • Release Number of the Repository used : v2.1.4
  • Python Version : 3.10.11
  • OS, OS Version : Windows 11
  • Other information (e.g. detailed explanation, stack traces, related issues, suggestions how to fix, links for us to have context, e.g. Stack Overflow, Gitter, etc.)

I want to process all files and subdirectories in a given directory and match them against an ignore file.
For example:
My project is located in D:\project.
The ignore file is located at D:\project\.ignore and contains *.txt.
I am trying to pass an arbitrary path from elsewhere on the system, such as c:\myfile.txt, to check whether it matches.
No files or directories are matched when their base path differs from that of the ignore file.
Is it even possible to do something like this?
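
One possible workaround, sketched here without igittigitt, is to compare only the file name against slash-free patterns such as *.txt, since gitignore applies those at any directory level. The helper name_matches is hypothetical, and it does not handle negation, directory rules, or patterns containing a slash.

```python
from fnmatch import fnmatchcase
from pathlib import PureWindowsPath
from typing import Iterable


def name_matches(path: str, patterns: Iterable[str]) -> bool:
    """Return True if the file name of path matches any slash-free pattern.

    Hypothetical helper: the directory part of path is ignored entirely.
    """
    # PureWindowsPath parses Windows-style paths on any operating system.
    name = PureWindowsPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)
```

Another option that may be worth trying is registering the same patterns a second time with a different base path through add_rule(pattern, base_path), which other issues on this page use, though whether that covers this use case is untested here.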

'IgnoreParser' object has no attribute 'parse_rule_file'

  • **I'm submitting a ... **

    • bug report
    • feature request
    • support request
  • Do you want to request a feature or report a bug?

Report a bug.

  • What is the current behavior?
    The example in the README appears to be incorrect or outdated. The parse_rule_file method does not appear to exist. If I attempt to use it, Python raises an error: AttributeError: 'IgnoreParser' object has no attribute 'parse_rule_file'.

  • If the current behavior is a bug, please provide the steps to reproduce and, if possible, a minimal demo of the problem
    Running Python in a directory with a .gitignore file:

>>> import igittigitt
>>> parser = igittigitt.IgnoreParser()
>>> parser.parse_rule_file(".gitignore")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'IgnoreParser' object has no attribute 'parse_rule_file'
  • What is the expected behavior?
    I can't find any parse_rule_file function in the source code. Guessing parse_rule_files is the intended function to use.
>>> parser.parse_rule_files(pathlib.Path('/home/bitranox/project/'))
  • What is the motivation / use case for changing the behavior?
    I spent way too long trying to get this nonexistent function working, and I'd rather no one else ever has to go through the pain again.

  • Please tell us about your environment:

  • Release Number of the Repository used : v2.0.4
  • Python Version : 3.9
  • OS, OS Version : Windows 10, build 20H2
  • Other information (e.g. detailed explanation, stack traces, related issues, suggestions how to fix, links for us to have context, e.g. Stack Overflow, Gitter, etc.)
