codeowners-checker's People

Contributors

ag4ta, bamorim, dhamidi, ekadlecova, gnechita-toptal, id-ilych, lowang, mmrazik, mpapis, nilbus, ojab, rusllonrails, teonimesic, zhukovpe, zinovyev

codeowners-checker's Issues

Specify the requirement for Ruby 2.5

codeowners-checker uses yield_self, which requires Ruby 2.5. However, it happily installs under Ruby 2.4.

$ ruby -v
ruby 2.4.4p296 (2018-03-28 revision 63013) [x86_64-linux]
$ codeowners-checker check
/home/test-user/.gem/ruby/gems/codeowners-checker-1.0.1/lib/codeowners/checker/code_owners.rb:41:in `list': undefined method `yield_self' for []:Array (NoMethodError)
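
One way to make RubyGems refuse the install on older interpreters is to declare the minimum Ruby version in the gemspec. The following is a minimal sketch, assuming a cut-down specification: the gem name, version, summary, and author are placeholders and not the real gemspec of codeowners-checker.

```ruby
require 'rubygems'

# Hypothetical, cut-down specification used for illustration only.
spec = Gem::Specification.new do |s|
  s.name    = 'example-checker'
  s.version = '0.0.1'
  s.summary = 'Placeholder gem illustrating required_ruby_version'
  s.authors = ['example']
  # yield_self needs Ruby 2.5, so older interpreters are rejected at install time.
  s.required_ruby_version = '>= 2.5'
end

requirement = spec.required_ruby_version
old_ruby_ok = requirement.satisfied_by?(Gem::Version.new('2.4.4'))
new_ruby_ok = requirement.satisfied_by?(Gem::Version.new('2.5.0'))
puts "Ruby 2.4.4 allowed: #{old_ruby_ok}, Ruby 2.5.0 allowed: #{new_ruby_ok}"
```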

Missing fetch command

When running the fetch command via CLI, an error indicates the command does not exist:

$ codeowners-checker fetch
Could not find command "fetch".

I installed via gem install codeowners-checker, which may not have updated the gemspec since the change was made (?)

Support to ignore @ghost

In my CODEOWNERS file, I set @ghost as the owner of a path; it's a trick to avoid getting review requests. The problem is that when I run the checker through GitHub Actions, it returns this error:

[err] line 13: User "@ghost" is not a member of the organization

(because in fact, this "user" does not exist)

Reading the documentation, I thought that perhaps not_owned_checker_skip_patterns could solve this problem. But I couldn't solve it 😢

Do you have any other ideas?

Thanks 🌷

Refactor `Line`

Rationale

Right now, descendants of Line (mostly Pattern) call parse in initialize, and the initialize method always takes a raw line as its parameter. Because of this, we cannot initialize a Pattern with pattern, *owners and always need to parse.

Possible solution

Move parse to a class method, so that Line.build would call return klass.parse(line) if klass.match?(line). For patterns, Pattern.parse would call Pattern.new(pattern, *owners), which makes Pattern.new available for building pattern lines.
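
A rough sketch of how that could look. The class and method names follow the text above; the method bodies, the Comment class, and the rule for what counts as a pattern are assumptions made only to keep the example runnable.

```ruby
# Base class; build dispatches to the first subclass that recognizes the line.
class Line
  def self.build(line)
    [Comment, Pattern].each do |klass|
      return klass.parse(line) if klass.match?(line)
    end
    nil # unrecognized (here: blank) line
  end
end

# Hypothetical comment line, included so that build has two kinds to pick from.
class Comment < Line
  attr_reader :text

  def self.match?(line)
    line.strip.start_with?('#')
  end

  def self.parse(line)
    new(line.strip)
  end

  def initialize(text)
    @text = text
  end
end

class Pattern < Line
  attr_reader :pattern, :owners

  # Assumption: any non-blank, non-comment line is a pattern line.
  def self.match?(line)
    stripped = line.strip
    !stripped.empty? && !stripped.start_with?('#')
  end

  # Parsing lives in the class method, so initialize no longer needs a raw line.
  def self.parse(line)
    pattern, *owners = line.split
    new(pattern, *owners)
  end

  def initialize(pattern, *owners)
    @pattern = pattern
    @owners = owners
  end

  def to_s
    [pattern, *owners].join(' ')
  end
end
```

With this shape, Pattern.new('/lib/', '@owner') builds a pattern line directly, while Line.build('lib/ @a @b') still goes through parsing.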

List unrecognized lines when running the checker with --no-interactive option

When running the checker with the --no-interactive option, it lists new files that were introduced and patterns that are contained in the CODEOWNERS file but missing in the repository. However, when an unrecognized line is found, it is not listed as an inconsistency.

The unrecognized lines should be listed as part of the inconsistencies.

List PR files for team to review

Some PRs with a lot of files are flagged by GitHub for review by multiple teams. Add an option to show the list of files for a given PR and team / person (it might be hard to infer the person from the team).

Fix suggesting groups

When checking for new files and suggesting subgroups, if a pattern is chosen to be added at the end of the CODEOWNERS file once, the method subgroups_owned_by always returns an empty array for the following patterns, even if groups should be found. As a result, the only option is to add the patterns at the end of the file.

Ordering of search paths for CODEOWNERS file

directories = ['', '.github', 'docs', '.gitlab']

GitHub and GitLab have different ideas on where to look for the CODEOWNERS file.

Find out which order we should use and fix it; if necessary, contact GitHub/GitLab to fix the order.

It is possible we might need to check the repository remote(s) to decide on an order.

Another consideration is we might actually issue a warning if there are multiple files. (add tests for having multiple files)
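
A small sketch of the "warn on multiple files" idea. The directory list is the one quoted above; the helper names codeowners_candidates and locate_codeowners are hypothetical, and the search order is left as it is because the right order is exactly the open question.

```ruby
require 'tmpdir'
require 'fileutils'

# Search order as quoted in this issue; not verified against GitHub or GitLab.
DIRECTORIES = ['', '.github', 'docs', '.gitlab'].freeze

# Returns every CODEOWNERS file found under repo_root, in search order.
def codeowners_candidates(repo_root)
  DIRECTORIES
    .map { |dir| File.join(*[repo_root, dir, 'CODEOWNERS'].reject(&:empty?)) }
    .select { |path| File.file?(path) }
end

# Picks the first candidate and collects a warning when several exist.
def locate_codeowners(repo_root)
  found = codeowners_candidates(repo_root)
  warnings = []
  warnings << "Multiple CODEOWNERS files found: #{found.join(', ')}" if found.size > 1
  [found.first, warnings]
end

# Demonstration in a throwaway directory with two competing files.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, '.github'))
  File.write(File.join(root, 'CODEOWNERS'), "* @owner\n")
  File.write(File.join(root, '.github', 'CODEOWNERS'), "* @owner\n")
  path, warnings = locate_codeowners(root)
  puts "Using #{path}"
  warnings.each { |message| warn message }
end
```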

Discussion: where we want to go with this tool

Hi Folks, I want to share some thoughts that I've been having while playing around with this gem and implementing the cleanup command.

The reason for opening an issue here is to open space for discussion on these sensitive topics and then we should follow up by creating issues per thing we want to actually do.

DISCLAIMER: This is an attempt to do a mental dump of what is in my head, so I probably missed a lot of points. Also, I've been looking at this thing for less than one week, so I probably made a lot of mistakes, please feel free to correct me. This is the reason why I'm creating the issue. To discuss.

Overview

A wild guess is that the direction we want to take with this tool changed drastically over time. It probably started as a way to manage CODEOWNERS in an "easier way" that was still very manual, hence the "interactive mode", and over time started drifting more towards just reporting errors and automating some fixes (like rubocop -a, I'd say).

My guess comes from the fact that interactive mode is the default and should be disabled with --no-interactive.

Simplifying what we do

As with all projects, a lot of complexity arises as they evolve. Here I'll make some bold suggestions on removing features to make the code simpler, so that we can move forward smoothly.

Take these suggestions with a grain of salt as they come from my very narrow experience with this and talks with @jonatas and @dennissivia.

With that in mind, the two things I'd remove are:

  • Interactive mode: this was probably once the "heart" of the system, but a lot of complexity arises from that piece of code and makes changes hard. I'd strip it out entirely and focus first on just error reporting and later on automatic error correction. I think properly reporting errors and fixing them in a text editor is not that big of a downgrade. In some cases, it may even be better (if you currently have a lot of problems, the interactive mode makes you follow a specific path, while if you go on your own, you can choose your path and maybe fix the more important issues first)
  • The "group" concept on CODEOWNERS: I don't see value in extending the "AST" of a CODEOWNERS file to make it look like it is structured with "groups". We are "inventing" special comments to group things. I'd just treat the CODEOWNERS file as it is: a list of rules, comments and blank lines (and occasionally, some malformed lines that should be reported). I think this is not that complex.

Where we want to go

Again, this is based on my conversations with @jonatas and @dennissivia, so take again with a grain of salt.

We want to:

  • Report errors better:
    • Being able to report only a subset of errors is good (like only changed files, as described in #82)
    • Being able to maybe report only a type of error
  • Make it easy to fix errors:
    • Automatically fix some errors (like removing a rule if no file matches)
    • Commands to make fixing errors easier (like adding a group to files missing rules, as described in #81) or renaming a team
  • Not messing up (one problem we realized when implementing #80 is that sorting lines can be dangerous because of precedence)

With that in mind, I'd like to make some suggestions on features/architectural changes we should make first:

  • Remove the git-related features such as --from and --to
  • Split every "check" into its own class, like RuboCop's Cops. Rules would take as input: the list of files in the repo, the list of rules, the list of owners, and the whitelist. Preload all those inputs to avoid checks doing unnecessary work and to make it easy to test things (we can pass a list of files without actually
  • Implement something like "Unit of Work" to allow multiple fixes to be applied consistently (e.g., deleting a line should not immediately delete it from the file, but instead mark it for deletion so it is removed from the file later on)
  • Add an optional "fix" method for the "Problem" class and implement them for some of the checks
  • Re-implement git related features on top of this simpler system by simply running checks with the inputs before and after (--from and --to), and somehow diffing the generated problems
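
To make the shape of these suggestions more concrete, here is a minimal sketch of a check class, a Problem with an optional fix, and a Unit of Work. Every name in it (Rule, Problem, UnitOfWork, RuleWithoutFile) is hypothetical, and the pattern matching is deliberately simplified; it is not how the gem works today.

```ruby
# Hypothetical value object for one CODEOWNERS rule.
Rule = Struct.new(:line_number, :pattern, :owners)

# A reported inconsistency with an optional fix.
class Problem
  attr_reader :message

  def initialize(message, &fix)
    @message = message
    @fix = fix
  end

  def fixable?
    !@fix.nil?
  end

  # Records the fix in the unit of work; nothing is written yet.
  def fix(unit_of_work)
    @fix.call(unit_of_work) if fixable?
  end
end

# Collects pending changes so several fixes can be applied in one pass.
class UnitOfWork
  def initialize(lines)
    @lines = lines
    @deleted = []
  end

  def mark_for_deletion(line_number)
    @deleted << line_number
  end

  # Returns the new file content; line numbers are 1-based.
  def commit
    @lines.each_with_index
          .reject { |_line, index| @deleted.include?(index + 1) }
          .map(&:first)
  end
end

# One check per class; all inputs are preloaded and passed in.
class RuleWithoutFile
  def call(files:, rules:)
    unmatched = rules.reject { |rule| files.any? { |file| matches?(rule.pattern, file) } }
    unmatched.map do |rule|
      Problem.new("No file matches #{rule.pattern}") do |unit_of_work|
        unit_of_work.mark_for_deletion(rule.line_number)
      end
    end
  end

  private

  # Simplified matching for the sketch; not the gem's real matcher.
  def matches?(pattern, file)
    File.fnmatch(pattern.sub(%r{\A/}, ''), file, File::FNM_PATHNAME)
  end
end
```

Because the check only receives plain lists, it can be tested without a repository, and because fixes only mark lines, the original content stays untouched until commit is called.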

The checks

We currently implement 4 checks:

  • File missing rule
  • Rule missing file
  • Rule with nonexistent owner
  • Invalid line

As far as #80 goes, I think the best way to address some of its points is by creating new checks:

  • Check to verify owner syntax (@name, @org/team, or an email address)
  • Check to verify duplicated lines (fix: remove all but one)
  • Check to verify duplicated owners in one rule (e.g.: file @owner @owner, fix: remove duplicate occurrences)
  • Check to verify valid pattern (for example, !a is a valid pathspec, but it is not valid to have it in a CODEOWNERS file, and according to the comment of GitHub staff on this question it would make the file unparseable; fix: just remove the line)
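
The two duplicate checks from the list above could be sketched as follows. The helper names dedupe_owners and dedupe_lines are hypothetical, and rules are treated as plain "pattern owner..." strings for brevity.

```ruby
require 'set'

# Removes repeated owners within one rule, keeping the first occurrence.
def dedupe_owners(line)
  pattern, *owners = line.split
  [pattern, *owners.uniq].join(' ')
end

# Keeps the first copy of each rule; blank lines and comments are left alone.
def dedupe_lines(lines)
  seen = Set.new
  lines.select do |line|
    stripped = line.strip
    next true if stripped.empty? || stripped.start_with?('#')

    # Normalize whitespace so "a  @x" and "a @x" count as the same rule.
    seen.add?(stripped.split.join(' '))
  end
end

puts dedupe_owners('file @owner @owner')
puts dedupe_lines(['a @x', 'a  @x', 'b @y']).inspect
```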

CLI interface

If we organize our code with the "Check" and "Problem" model, then I think a nice way of automating fixes in a predictable and discoverable CLI interface is to use the name of the check and maybe some strategy as input. Examples:

codeowners-check \
  --on-rule-without-file remove-rule \
  --on-rule-with-nonexistent-owner set-to:@owner \
  --on-invalid-rule remove-rule \
  --on-file-without-rule set-to:@owner \
  --on-duplicate-rule keep-first \
  --on-duplicate-owner remove-duplicates

The default behavior would be warn (or nothing), which means no fix would be made and the problem would only be reported.

With that, a check could also be disabled altogether. Some suggestions on how to do it:

codeowners-check --disable-rule-without-file --disable-invalid-rule
codeowners-check --rule-without-file=false --invalid-rule=false
codeowners-check --disable-checks=rule-without-file,invalid-rule
codeowners-check --checks={all but rule-without-file and invalid-rule}
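
As an illustration of the third variant, here is a sketch using Ruby's standard OptionParser. The flag and the check names are taken from the proposals above and do not exist in the tool today; parse_checks and KNOWN_CHECKS are hypothetical names.

```ruby
require 'optparse'

# Check names derived from the --on-* flags proposed above (hypothetical).
KNOWN_CHECKS = %w[
  rule-without-file
  rule-with-nonexistent-owner
  invalid-rule
  file-without-rule
  duplicate-rule
  duplicate-owner
].freeze

# Returns the list of checks that remain enabled for the given arguments.
def parse_checks(argv)
  enabled = KNOWN_CHECKS.dup
  parser = OptionParser.new do |opts|
    opts.on('--disable-checks=LIST', Array, 'Comma-separated checks to skip') do |list|
      unknown = list - KNOWN_CHECKS
      raise OptionParser::InvalidArgument, unknown.join(',') unless unknown.empty?

      enabled -= list
    end
  end
  parser.parse(argv)
  enabled
end

puts parse_checks(['--disable-checks=rule-without-file,invalid-rule']).inspect
```

A single list-valued flag keeps the option surface small as checks are added, at the cost of being slightly less discoverable than one flag per check.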

Feature: Cleanup command

After some discussion I am suggesting to add a cleanup command to tackle the following problems:

Problem

When a CODEOWNERS file becomes bigger the following problems arise:

  • Duplicate patterns
  • Hard to find what belongs to the same owner (lack of sorting)
  • Obsolete patterns (no file matches the given pattern)

Proposal

Add a new command --cleanup to do the following things.

  • Sort the file based on the rules listed below (see Sorting)
  • Remove duplicate patterns (only full duplications)
  • Expand each pattern and, if it matches 0 files, remove the pattern

Sorting

Our initial discussion led to the following proposal:

  • Sort (and thus group) by owner
  • Within one owner sort patterns alphabetically
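
The proposed ordering can be sketched in a few lines. Entry and sort_entries are hypothetical names. Note that in CODEOWNERS the last matching pattern wins, so reordering overlapping patterns can change ownership; this sketch does not guard against that.

```ruby
# Hypothetical value object for one rule.
Entry = Struct.new(:pattern, :owners)

# Sorts by owner list first, then alphabetically by pattern within one owner.
def sort_entries(entries)
  entries.sort_by { |entry| [entry.owners.join(' '), entry.pattern] }
end

entries = [
  Entry.new('/b.rb', ['@team-b']),
  Entry.new('/z.rb', ['@team-a']),
  Entry.new('/a.rb', ['@team-a'])
]
sort_entries(entries).each { |entry| puts "#{entry.pattern} #{entry.owners.join(' ')}" }
```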

Related issues

#21 also asks for some of these topics.

Whitespace painting of patterns

  1. Implement option to allow no_whitespace, preserve and enforce
  2. Implement detecting whitespace painting
  3. Implement the whitespace painting of patterns according to the option

example whitespace painting:

/file.rb        @owner
/dir/another.rb @owner
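
A minimal sketch of the enforce and no_whitespace modes, assuming rules are already parsed into [pattern, owner, ...] arrays. The paint helper is hypothetical; preserve is not covered, because it needs the original spacing of each line.

```ruby
# rules: array of [pattern, owner, ...]; mode names follow the list above.
def paint(rules, mode: :enforce)
  # In :enforce mode every pattern is padded to the longest pattern's width.
  width = mode == :enforce ? rules.map { |pattern, *| pattern.length }.max.to_i : 0
  rules.map do |pattern, *owners|
    "#{pattern.ljust(width)} #{owners.join(' ')}"
  end
end

rules = [['/file.rb', '@owner'], ['/dir/another.rb', '@owner']]
puts paint(rules)
puts paint(rules, mode: :no_whitespace)
```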

Add tests for Group

# Everything
* @owner

# Group1

## Group2

### Group3
directory/alien_file.rb @Michal

=>

# Everything
* @owner

Fix grouping of content in CODEOWNERSHIP file

Examples:

Problems:

  • multiple possible paths: CODEOWNERS or docs/CODEOWNERS or .github/CODEOWNERS
  • comments are used for grouping patterns or multiple patterns
  • comments can be standalone
  • meaningful double empty lines
  • owner can be a team, user or email

Ideas:

  • we could introduce different record types for comments, filters, empty lines
  • every new empty line would be starting a new group and all the records following it will be linked with the group number
  • when adding new entries they should:
    • find groups with the requested owner
    • list the groups and ask to pick one or create one
    • in the group patterns should be ordered alphabetically
  • when removing patterns we should remove the whole group if it is the last pattern
  • when parsing file ask what to do with unrecognized patterns

Folder patterns are not recognized properly

Steps to reproduce:

#!/bin/bash
mkdir -p .github lib spec
printf "lib/ @jonatas\nspec/ @other" > .github/CODEOWNERS
printf "@jonatas\n@other" > .github/OWNERS
touch lib/foobar.rb
touch spec/foobar.rb
git add lib spec
git commit -m "test"
bin/codeowners-checker check .

Actual behavior:

# NOTE: The gem reports the new file as missing an owner, but it should be matched by the existing pattern
File added: "lib/foobar.rb". Add owner to the CODEOWNERS file?
(y) yes
(i) ignore
(q) quit and save
 [y, i, q] y
Owners:
1 - @jonatas
2 - @other
Choose owner, add new one or leave empty to use "".
New owner:  1
Possible groups to which the pattern belongs:
# NOTE: Gem recognizes the possible group for that file but for spec/foobar.rb it's missing
1 - lib/ @jonatas
Choose group:  1
File added: "spec/foobar.rb". Add owner to the CODEOWNERS file?
(y) yes
(i) ignore
(q) quit and save
 [y, i, q] y
Owners:
1 - @jonatas
2 - @other
Choose owner, add new one or leave empty to use "".
New owner:  2
Add to the end of the CODEOWNERS file?
Commit changes? n

Expected:

Since existing patterns already assign owners to these new files, I think codeowners-checker shouldn't complain about them. The following output is what we need in this case:

✅ File is consistent

Ask to create group when adding pattern

Right now, when adding a pattern, users can specify into which group the pattern should be added. Add an option to create a new group (and ask where to put it). See Main::suggest_subgroups_for_pattern.

Add command to view the tree version of the groups

We have to_tree implemented already; it would be nice to have a viewer that shows the nesting of the groups.
Try to find out if there are better characters to represent the tree than the current ones:

 + group start
 | group continuation (0-*)
 \ group end

Feature suggestion: PR mode

Motivation

At the moment, all files that do not have an owner are always taken into consideration for the final result. We also introduced a whitelist concept to ignore certain files/directories.
However, for a code base with a lot of legacy code, having to fix existing issues would just block progress. This is why tools like pronto allow rubocop and other linters to only run against "changed files".

Possible solution

I therefore suggest adding a similar mode to codeowners-checker so that we can test only the files that have been changed. That flag could be named --pr-mode or --changed-only to indicate that only changed files are checked.
This should also have the benefit of being much faster for big code bases.

Validate owners with a file / external API

There was one instance where the owner/team was wrong and it broke the automatic reviewers process on GitHub. We should be able to validate owners against either a file or, when possible, external APIs like GitHub or GitLab.

As a first step, we could add a regexp to validate one of: @person, @company/team, or an email address.
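
A sketch of what such a regexp-based first step might look like. The character classes below are assumptions chosen for illustration, not the official GitHub or GitLab grammar, and valid_owner? is a hypothetical helper.

```ruby
# Assumed shapes; real username/team rules may be stricter or looser.
USER_OWNER  = /\A@[A-Za-z0-9][A-Za-z0-9-]*\z/
TEAM_OWNER  = %r{\A@[A-Za-z0-9][A-Za-z0-9-]*/[A-Za-z0-9][A-Za-z0-9._-]*\z}
EMAIL_OWNER = /\A[^@\s]+@[^@\s]+\.[^@\s]+\z/

# True when the owner looks like a user, a team, or an email address.
def valid_owner?(owner)
  [USER_OWNER, TEAM_OWNER, EMAIL_OWNER].any? { |shape| shape.match?(owner) }
end

%w[@jonatas @mycomp/team1 dev@example.com ghost @org/].each do |owner|
  puts "#{owner}: #{valid_owner?(owner)}"
end
```

This only checks syntax; whether the owner actually exists still needs a file or an API lookup.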

Fix Pattern matching

The tests are failing for ./spec/codeowners/checker/group/pattern_spec.rb.

  • use File.fnmatch for matching patterns
    • maybe we can use git ignore for the validations?
  • add tests for patterns starting with /
  • add tests for patterns including [a-z] or ?
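
A simplified matcher built on File.fnmatch, as a starting point for those tests. It is a sketch under assumptions, not a full gitignore implementation: pattern_match? is a hypothetical helper, negation and escaping are ignored, and the anchoring rules are reduced to the three cases named in the comments.

```ruby
# FNM_PATHNAME stops "*" and "?" from matching "/"; FNM_DOTMATCH lets them match dotfiles.
MATCH_FLAGS = File::FNM_PATHNAME | File::FNM_DOTMATCH

# Simplified gitignore-style rules assumed here:
# - a leading or inner "/" anchors the pattern at the repository root
# - a trailing "/" matches everything below that directory
# - any other pattern may match at any depth
def pattern_match?(pattern, path)
  anchored = pattern.chomp('/').include?('/')
  glob = pattern.sub(%r{\A/}, '')
  glob = "#{glob}**/*" if glob.end_with?('/')
  globs = anchored ? [glob] : [glob, "**/#{glob}"]
  globs.any? { |candidate| File.fnmatch(candidate, path, MATCH_FLAGS) }
end

puts pattern_match?('/Gemfile', 'Gemfile')
puts pattern_match?('/Gemfile', 'sub/Gemfile')
puts pattern_match?('lib/**/*.rb', 'lib/x/y/a.rb')
puts pattern_match?('/lib/[a-c]?.rb', 'lib/b1.rb')
```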

Stale???

Is this project no longer being supported? Dependencies are outdated and becoming a problem for utilization.

Fix pattern matching inconsistent with gitignore definition

Our current implementation of matching files using ruby-git is not optimal and is inconsistent with gitignore definition. It ignores / at the beginning of patterns and doesn't differentiate between * and **.

@git.diff(@from, @to).path(patterns).name_status.keys

@git.ls_files(pattern.gsub(%r{^/}, '')).any?

Ideas:

  • we could use ruby-git only for getting the files from git and match the files using fnmatch
  • the problem with this approach is that it is very slow due to the number of files and patterns that need to be compared and we need to find a way to speed it up
  • we could segregate the lists of files by initial letters, so we can limit the amount of matches by having a hash { a: [app/..., apq/...], b: ... } which would considerably reduce the number of comparisons
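
The bucketing idea from the last point could look roughly like this. index_by_initial and candidates_for are hypothetical helpers, and the shortcut is only taken for patterns that start with "/" followed by a literal character; everything else falls back to the full file list.

```ruby
# Groups paths by their first character, e.g. { "a" => ["app/a.rb", ...] }.
def index_by_initial(files)
  files.group_by { |path| path[0] }
end

# Returns the files a pattern could possibly match, using the index when safe.
def candidates_for(pattern, index)
  all_files = index.values.flatten
  return all_files unless pattern.start_with?('/')

  initial = pattern[1]
  # A wildcard or escape right after "/" could match any initial letter.
  return all_files if initial.nil? || initial.match?(/[*?\[\\]/)

  index.fetch(initial, [])
end

index = index_by_initial(['app/a.rb', 'apq/b.rb', 'bin/run'])
puts candidates_for('/app/*.rb', index).inspect
puts candidates_for('*.rb', index).inspect
```

The returned candidates still have to be matched with fnmatch; the index only reduces how many comparisons are needed.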

Plugins

Depends on #21 - configuring which checks to enable

Consider using plugin architecture for finding the checks so the checker can be configured with external gems / libs that can be private or are very specific to a group/company

Add default values

Usually, you need to check:

  • in the current folder
  • current changes
  • against remote master

It is easier to write Code::Ownership::Checker.check! than to provide the same parameters each time.

Extend Code Ownership Checker configuration

The current configuration creates a default team in a git config file and we're probably going to expand to multiple teams and other details.

The intent here is to extend and introduce a decent configuration level:

  1. Introduce multiple teams configuration instead of just one team
  2. With step 2 we can also have a configuration of the GitHub username or more users and teams

Extending the configuration can also be useful for future ideas like blacklisting or ignoring certain folders or files that the code owners will never cover.

Dependabot can't evaluate your Ruby dependency files

Dependabot can't evaluate your Ruby dependency files.

As a result, Dependabot couldn't check whether any of your dependencies are out-of-date.

The error Dependabot encountered was:

Bundler::Dsl::DSLError with message: 
[!] There was an error parsing `Gemfile`: 
[!] There was an error while loading `codeowners-checker.gemspec`: uninitialized constant Codeowners. Bundler cannot continue.

 #  from /home/dependabot/dependabot-updater/dependabot_tmp_dir/codeowners-checker.gemspec:42
 #  -------------------------------------------
 #    unless ENV['TRAVIS']
 >      unless Codeowners::Cli::SuggestFileFromPattern.installed_fzf?
 #        "sanitized"
 #  -------------------------------------------
. Bundler cannot continue.

 #  from /home/dependabot/dependabot-updater/dependabot_tmp_dir/Gemfile:5
 #  -------------------------------------------
 #  
 >  gemspec
 #  -------------------------------------------

You can mention @dependabot in the comments below to contact the Dependabot team.

Open source Code Ownership Checker

The https://github.com/toptal/codeowners-checker seems like a great candidate to be open source.

The code is not related to our core and the tool can be useful for other companies that use the same ownership boundaries.

The objective of the task is:

  1. Remove all Toptal references in the internal code
  2. Write the necessary documentation about how to use the tool
  3. Release as open source
  4. Announce to the world

Feature: auto-add

I suggest adding a command to automatically add changed files (that don't match any existing rule) to the given owner. This would allow a contributor to assign all the changes to a given team.

Purpose

Instead of manual changes, the author would be able to just assign all changed files to a given team.

Syntax

This is just a suggestion to have a concrete example; I would be happy to discuss better options.

codeowners-checker --auto-add @mycomp/team1

Semantics

The checker will automatically find all files, create the smallest amount of patterns (optional) and add these rules to the codeowners file. Since inferring the most effective pattern seems hard, we could also start with simple approaches and automatically open an editor or something like that.

Add new checks

Checks should be executed in this order:

  • Check comments for patterns
    • matching existing files
    • avoid duplicates with existing patterns
    • ask to delete commented patterns that don't match existing files
  • groups without comments
    • add comment
    • merge with other group
  • missing references
  • useless patterns
    • avoid duplicating patterns when suggesting new ones #7
  • detect duplicate patterns
  • move groups with same owner to subgroups of a "domain" group - ask for title

It should be possible to configure which checks to run (git config?)

Suggestions should not overlap each other

We created an example of a useless pattern named Gumfile that should suggest Gemfile but the Gemfile is already in place in the code owners file. In this case, it should suggest to add one more owner instead of adding a new line to the owners' file. Or suggest to delete the line, or ignore the suggestion.

We're not sure about the best approach, we need to research and see what fits better for our flow.

Analyzing the issue, we came up with the following points:

If we suggest something that is already in the code owners we're going to duplicate the ownership lines.
If we find multiple patterns matching the same file, and the pattern is generic, covering a folder, it's ok.

But, if the pattern is only covering one file and it's duplicated in the code owners definition we could merge the owners and make it a single line.

We're confused about the direction of the implementation:

  1. Focusing on what is duplicated can lead us to fix the issue but will not avoid suggesting duplicates
  2. Processing all files against all patterns will be expensive
  3. If we remove all suggestions, we may be left without a good suggestion and will need to upgrade our fuzzy match search.
