truegitcodechurn's People

Contributors

flacle

truegitcodechurn's Issues

Please add testing

Hi @flacle !

First, thank you all for your work. Some colleagues of mine have expressed interest in this script and have requested a few features in line with some of the current issues (to which I would also like to contribute ;) ).

Before I work on these features, I wanted to establish a safety net so I don't break anything. As such, I would like to request (and then submit) some unit tests to make sure functionality does not regress.

Add plotting mechanisms/options

Sometimes you may want the tool to export a PNG (or SVG) of a plot, either as part of a pipeline or for ease of use. Users should be able to request this optionally and in a simple way.
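As a rough illustration of what an optional export could look like, here is a minimal sketch that renders a horizontal bar chart as SVG using only the standard library. `bar_chart_svg` and its parameters are hypothetical names, not part of the script, and the numbers are placeholder values chosen for the example. A PNG export would most likely need a third-party plotting library instead.

```python
from xml.sax.saxutils import escape


def bar_chart_svg(values: dict[str, int], bar_height: int = 20,
                  scale: int = 2, label_width: int = 150) -> str:
    """Render one horizontal bar per entry and return the SVG markup."""
    rows = []
    for i, (label, value) in enumerate(values.items()):
        y = i * (bar_height + 5)
        rows.append(f'<text x="0" y="{y + 15}">{escape(label)}</text>')
        # Bar length uses the absolute value so negative churn is still drawn.
        rows.append(
            f'<rect x="{label_width}" y="{y}" '
            f'width="{abs(value) * scale}" height="{bar_height}"/>'
        )
    height = len(values) * (bar_height + 5)
    widest = max((abs(v) for v in values.values()), default=0)
    width = label_width + widest * scale
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">' + "".join(rows) + "</svg>"
    )


# Placeholder numbers, for illustration only.
svg = bar_chart_svg({"contribution": 40, "churn": -9})
```

The returned string can be written to a file (for example with `pathlib.Path("churn.svg").write_text(svg, encoding="utf-8")`) from a pipeline step.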

Churn on particular commit [QUESTION]

First of all, thank you for creating this awesome package. I have one question. If you have time, can you please tell me how you would calculate the churn of a particular commit? Suppose I made 3 commits and need to find the churn for each commit separately. I also need to check whether the changes in that particular commit were made within 21 days. How can I make sure the churn is based on 21 days for each commit?
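One possible reading of this question is to anchor the date window on each commit's own date and run the script once per commit. The sketch below assumes the `after` and `before` arguments accept `YYYY-MM-DD` strings, as the script's usage text suggests; `churn_window` is a hypothetical helper, and treating the window as the 21 days leading up to the commit is an assumption about what is being asked.

```python
from datetime import date, timedelta


def churn_window(commit_date: date, days: int = 21) -> tuple[str, str]:
    """Return (after, before) date strings for the window ending on commit_date."""
    start = commit_date - timedelta(days=days)
    return start.isoformat(), commit_date.isoformat()


after, before = churn_window(date(2021, 9, 27))
print(after, before)  # 2021-09-06 2021-09-27
```

The commit date itself could be read from git, for instance with `git show -s --format=%cI <hash>`, and then passed to the helper.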

understanding the jargon used in the code

Sorry for the naive question. Can you please clarify what exactly contribution and churn mean in this output?

contribution: 11000
churn: -900

Does contribution here mean entirely new work, without any changes to existing code? I mean brand-new code that does not replace any older code?

Does churn here mean only changes to the same lines of code?

I think it would be better to also consider efficiency and legacy refactoring.

Improve analytics data by tracking remove and add counts by line in `files`

This is related to several other issues. It would make analytics easier if the `files` structure, rather than storing data as it does at present:

{
        "README.md": {
            2: 0,
            8: 0,
            10: 0,
            11: 0,
            ...
            24: 2,
            31: 0,
            33: 1,
            35: 1,
            37: 3,
            41: 12,
        },
        "gitcodechurn.py": {
            0: 0,
            1: 190,
            2: 4,
            4: 0,
            11: -1,
            15: 6,
            16: 5,
            37: 2,
            ...
            167: 1,
            172: 0,
            173: 2,
            189: 1,
            191: 1,
            192: 5,
            193: 0,
            196: 2,
            197: 0,
            198: 25,
            200: 1,
            217: 14,
            223: 1,
            224: 1,
        },
    }

instead tracked the count of removed lines and the count of added lines. This additional data would allow more detailed and nuanced analytics, for example on user-specific churn.

I propose we instead utilize a structure such as:

{
        "README.md": [
            {"added": 0, "removed": 0, "line_number": 2},
            {"added": 3, "removed": 1, "line_number": 42},
            ....
        ]
}

I would be happy to submit a PR in support of this.
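A minimal sketch of how the proposed layout could be built up while parsing, assuming added and removed counts are available per line. `record_change` and `to_records` are hypothetical helpers and do not exist in the script.

```python
from collections import defaultdict


def record_change(files, path, line_number, added=0, removed=0):
    """Accumulate added/removed counts for one line of one file."""
    entry = files[path][line_number]
    entry["added"] += added
    entry["removed"] += removed


def to_records(files):
    """Flatten the nested counters into the proposed list-of-dicts layout."""
    return {
        path: [
            {"added": c["added"], "removed": c["removed"], "line_number": n}
            for n, c in sorted(lines.items())
        ]
        for path, lines in files.items()
    }


# path -> line number -> {"added": int, "removed": int}
files = defaultdict(lambda: defaultdict(lambda: {"added": 0, "removed": 0}))
record_change(files, "README.md", 42, added=2)
record_change(files, "README.md", 42, added=1, removed=1)
record_change(files, "README.md", 2)
records = to_records(files)
```

Keeping both counters means a net value per line can still be derived later as `added - removed`, whereas the reverse is not possible from a single number.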

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 650113: invalid continuation byte

Getting this error on my latest run:

Traceback (most recent call last):
  File "/truegitcodechurn/./gitcodechurn.py", line 264, in <module>
    main()
  File "/truegitcodechurn/./gitcodechurn.py", line 94, in main
    [files, contribution, churn] = get_loc(
  File "/truegitcodechurn/./gitcodechurn.py", line 121, in get_loc
    results = get_proc_out(command, dir).splitlines()
  File "/truegitcodechurn/./gitcodechurn.py", line 247, in get_proc_out
    return process.communicate()[0].decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 650113: invalid continuation byte
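The traceback points at a strict `decode("utf-8")` on the git output, which fails as soon as the repository contains bytes that are not valid UTF-8. One possible workaround, sketched here under the assumption that losing the exact undecodable characters is acceptable, is to decode with `errors="replace"`. `decode_output` and `get_proc_out_tolerant` are hypothetical names, not the script's actual functions.

```python
import subprocess


def decode_output(raw: bytes) -> str:
    """Decode process output, substituting U+FFFD for invalid UTF-8 bytes."""
    return raw.decode("utf-8", errors="replace")


def get_proc_out_tolerant(command: list[str], cwd: str) -> str:
    """Run a command and return its stdout without raising on bad bytes."""
    result = subprocess.run(command, cwd=cwd, stdout=subprocess.PIPE, check=False)
    return decode_output(result.stdout)


# 0xd0 followed by "(" is an invalid continuation, as in the reported error.
print(decode_output(b"abc\xd0(def"))
```

Since the script counts lines rather than inspecting their text, a replacement character in place of a bad byte should not change line-based results, though that is an assumption about how the output is used.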

Add ability to exclude certain folders

Sometimes project folders or configuration files get checked in, and these may or may not be part of efforts to increase software quality. Users of this tool should be able to specify an optional parameter to exclude specific folders.
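A minimal sketch of the filtering step, assuming file paths are repository-relative with forward slashes, as git prints them. `is_excluded` is a hypothetical helper, and wildcard patterns are deliberately left out.

```python
from pathlib import PurePosixPath


def is_excluded(file_path: str, excluded_dirs: list[str]) -> bool:
    """Return True if file_path is, or lies inside, any excluded directory."""
    path = PurePosixPath(file_path)
    for excluded in excluded_dirs:
        ex = PurePosixPath(excluded)
        # Compare whole path components, so "bin" does not match "binary/".
        if ex == path or ex in path.parents:
            return True
    return False


changed = ["bin/jmeter.sh", "binary/tool.py", "src/core/a.py", "README.md"]
kept = [p for p in changed if not is_excluded(p, ["bin", "src/core/"])]
print(kept)  # ['binary/tool.py', 'README.md']
```

Matching on path components rather than string prefixes avoids accidentally excluding sibling folders that merely share a prefix.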

Search by commit instead of dates

Searching by commit would probably use optional arguments.

  • There are two arguments: the start and end commit hashes.
  • By default the search uses the short hash. Any hash longer than this is truncated to its short version; any hash shorter than this produces an error message.
  • If the timestamps of the start and end commits are not in order (reversed, equal), the order has to be fixed. If the same commit is used for both start and end, the git command should take just one hash.
  • The optional commit arguments should override the positional date arguments.
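The rules in the list above can be sketched as a small normalization step. Everything here is hypothetical: the function names, the short-hash length of 7 (git's usual default abbreviation, which can be longer in large repositories), and the assumption that commit timestamps have already been looked up as integers.

```python
SHORT_HASH_LEN = 7  # assumed short-hash length, not taken from the script


def normalize_hash(commit_hash: str) -> str:
    """Truncate long hashes to the short form; reject hashes that are too short."""
    if len(commit_hash) < SHORT_HASH_LEN:
        raise ValueError(
            f"commit hash {commit_hash!r} is shorter than {SHORT_HASH_LEN} characters"
        )
    return commit_hash[:SHORT_HASH_LEN]


def commit_range(start: str, end: str, start_ts: int, end_ts: int) -> list[str]:
    """Return the hashes to hand to git: one if identical, else [older, newer]."""
    start, end = normalize_hash(start), normalize_hash(end)
    if start == end:
        return [start]
    if start_ts > end_ts:
        # Reversed input: swap so the older commit comes first.
        start, end = end, start
    # Distinct commits with equal timestamps keep their given order here.
    return [start, end]
```

For example, `commit_range("abcdef1234", "1234567890", 200, 100)` returns `["1234567", "abcdef1"]`, and passing the same hash twice returns a single-element list.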

Reference: #10

Transition codebase to classes

Currently the script is a simple set of functions. Encapsulating all functionality in classes would allow for better interoperability with existing systems and workflows.

Add contribution guideline

Given the recent increase in support (really awesome), it makes sense to add a contribution guideline, as suggested by GitHub.

Improve user experience

True Git Code Churn can use some additional minor enhancements:

  • change the order of before & after in the documentation and usage descriptions
  • add more specific usage copy in the README
  • also print the author, to reduce copy & paste mistakes when outputs are manually copied into a sheet

dir argument is not picked up

Hi,

Please assist. I am trying to test the script, but I am getting an error when passing the "dir" argument. Here's how I am calling the script:

python ./gitcodechurn.py after="2021-09-21” before="2021-09-27” author="" dir="/Users/admin/Desktop/jmeter" -exdir="/Users/admin/Desktop/jmeter/bin"
usage: python [/]gitcodechurn.py after="YYYY[-MM[-DD]]" before="YYYY[-MM[-DD]]" author="flacle" dir="[/]path" [-exdir="[*/]path"]
gitcodechurn.py: error: the following arguments are required: dir

I am using Python 3.10.
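One plausible explanation, offered as an assumption rather than a confirmed diagnosis: if the garbled characters after the two dates in the command above were typographic (curly) closing quotes, the shell would not treat them as closing quotes, and the `after` and `before` values would merge into a single argument, leaving one positional argument short. The sketch below uses `shlex.split`, which follows POSIX shell quoting rules, on the argument part of the command to show the effect.

```python
import shlex

# \u201d is the typographic closing double quote; its presence is assumed.
command = (
    'after="2021-09-21\u201d before="2021-09-27\u201d '
    'author="" dir="/Users/admin/Desktop/jmeter"'
)
tokens = shlex.split(command)
for token in tokens:
    print(repr(token))
print(len(tokens))  # 3 arguments instead of the intended 4
```

With plain ASCII quotes (`"`) around both dates, the same split yields four separate arguments.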

Code churn per file

It would be interesting to be able to output the churn rate for each file in the repository (or a subfolder thereof), to detect potentially fragile code (a high churn rate could indicate more bugs). Would such a feature be possible to add?
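As a rough starting point, per-file change volume can be tallied from `git log --numstat` output, where each file line has the form `added<TAB>removed<TAB>path`. This is a sketch only: `changes_per_file` is a hypothetical helper, and summing added and removed lines is a simpler measure than the churn this script computes.

```python
from collections import Counter


def changes_per_file(numstat_output: str) -> Counter:
    """Sum added + removed line counts per path from --numstat text."""
    totals = Counter()
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # blank lines, commit headers
        added, removed, path = parts
        if not (added.isdigit() and removed.isdigit()):
            continue  # binary files are reported as "-"
        totals[path] += int(added) + int(removed)
    return totals


# Hand-written sample in numstat format, for illustration only.
sample = (
    "3\t1\tREADME.md\n"
    "10\t4\tgitcodechurn.py\n"
    "-\t-\tlogo.png\n"
    "\n"
    "2\t2\tgitcodechurn.py\n"
)
ranking = changes_per_file(sample).most_common()
print(ranking)  # [('gitcodechurn.py', 18), ('README.md', 4)]
```

Files at the top of the ranking would be the first candidates to inspect for fragile code.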
