casperdcl / git-fame

:star: Pretty-print `git` repository collaborators sorted by contributions

License: Other

Makefile 7.43% Python 79.89% Shell 3.61% Roff 8.69% Dockerfile 0.37%
git git-blame blame code-analysis cost loc git-log author commit shortlog

git-fame's Introduction

git-fame

Pretty-print git repository collaborators sorted by contributions.

~$ git fame --cost hour,month --loc ins
Processing: 100%|██████████████████████████| 1/1 [00:00<00:00,  2.16repo/s]
Total commits: 1775
Total ctimes: 2770
Total files: 461
Total hours: 449.7
Total loc: 41659
Total months: 151.0
| Author               |   hrs |   mths |   loc |   coms |   fils |  distribution   |
|:---------------------|------:|-------:|------:|-------:|-------:|:----------------|
| Casper da Costa-Luis |   228 |    108 | 28572 |   1314 |    172 | 68.6/74.0/37.3  |
| Stephen Larroque     |    28 |     18 |  5243 |    203 |     25 | 12.6/11.4/ 5.4  |
| pgajdos              |     2 |      9 |  2606 |      2 |     18 | 6.3/ 0.1/ 3.9   |
| Martin Zugnoni       |     2 |      5 |  1656 |      3 |      3 | 4.0/ 0.2/ 0.7   |
| Kyle Altendorf       |     7 |      2 |   541 |     31 |      7 | 1.3/ 1.7/ 1.5   |
| Hadrien Mary         |     5 |      1 |   469 |     31 |     17 | 1.1/ 1.7/ 3.7   |
| Richard Sheridan     |     2 |      1 |   437 |     23 |      3 | 1.0/ 1.3/ 0.7   |
| Guangshuo Chen       |     3 |      1 |   321 |     18 |      7 | 0.8/ 1.0/ 1.5   |
| Noam Yorav-Raphael   |     4 |      1 |   229 |     11 |      6 | 0.5/ 0.6/ 1.3   |
| github-actions[bot]  |     2 |      1 |   186 |      1 |     51 | 0.4/ 0.1/11.1   |
...

The distribution column is a percentage breakdown of loc/coms/fils. For example, in the table above, Casper has written surviving code in 172 of the 461 files (37.3%).
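As a rough check of that arithmetic, the three distribution figures for the first row can be recomputed from the totals printed above the table. This is a minimal sketch rather than git-fame's own code; the function name `distribution` is made up for illustration.

```python
def distribution(loc, coms, fils, total_loc, total_coms, total_fils):
    """Return the loc/coms/fils shares as percentages, rounded to one decimal."""
    pairs = ((loc, total_loc), (coms, total_coms), (fils, total_fils))
    return tuple(round(100 * part / total, 1) for part, total in pairs)


# Casper da Costa-Luis' row against the totals from the sample output
casper = distribution(28572, 1314, 172, 41659, 1775, 461)
print("/".join(f"{p:.1f}" for p in casper))  # 68.6/74.0/37.3
```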


Installation

Latest PyPI stable release

pip install git-fame

Latest development release on GitHub

Pull and install:

pip install "git+https://github.com/casperdcl/git-fame.git@main#egg=git-fame"

Latest Conda release

conda install -c conda-forge git-fame

Latest Snapcraft release

snap install git-fame

Latest Docker release

docker pull casperdcl/git-fame
docker run --rm casperdcl/git-fame --help
docker run --rm -v </local/path/to/repository>:/repo casperdcl/git-fame

Register alias with git

On Windows, run:

git config --global alias.fame "!python -m gitfame"

This is probably not necessary on UNIX systems. If git fame doesn't work after restarting the terminal on Linux & Mac OS, try (with single quotes):

git config --global alias.fame '!python -m gitfame'

Tab completion

Optionally, systems with bash-completion can install tab completion support. The git-fame_completion.bash file needs to be copied to an appropriate folder.

On Ubuntu, the procedure would be:

$ # Ensure completion works for `git` itself
$ sudo apt-get install bash-completion

$ # Install `git fame` completions
$ sudo wget \
    https://raw.githubusercontent.com/casperdcl/git-fame/main/git-fame_completion.bash \
    -O /etc/bash_completion.d/git-fame_completion.bash

followed by a terminal restart.

Changelog

The list of all changes is available on the GitHub Releases page.

Usage

git fame              # If alias registered with git (see above)
git-fame              # Alternative execution as python console script
python -m gitfame     # Alternative execution as python module
git-fame -h           # Print help

For example, to print statistics regarding all source files in a C++/CUDA repository (*.c/h/t(pp), *.cu(h)), carefully handling whitespace and line copies:

git fame --incl '\.[cht][puh]{0,2}$' -twMC
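To see which extensions that pattern covers, it can be tried against a few file names in Python. This is only an illustrative sketch: the file names are hypothetical, and the use of `re.search` is an assumption about how the pattern is applied to each path.

```python
import re

# Pattern copied from the command above
pattern = re.compile(r"\.[cht][puh]{0,2}$")

# Hypothetical file names, for illustration only
names = [
    "main.c", "util.h", "vec.hpp", "impl.tpp", "kernel.cu", "kernel.cuh",
    "setup.py", "notes.txt", "README.md",
]

matched = [n for n in names if pattern.search(n)]
print(matched)
```

The pattern is deliberately loose: it also accepts unintended combinations such as `.ch` or `.tu`, which is usually harmless because such extensions are rare.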

It is also possible to run from within a python shell or script.

>>> import gitfame
>>> gitfame.main(['--sort=commits', '-wt', '/path/to/my/repo'])

Documentation

Usage:
  git-fame [--help | options] [<gitdir>...]

Arguments:
  <gitdir>       Git directory [default: ./].
                 May be specified multiple times to aggregate across
                 multiple repositories.

Options:
  -h, --help     Print this help and exit.
  -v, --version  Print module version and exit.
  --branch=<b>   Branch or tag [default: HEAD] up to which to check.
  --sort=<key>   [default: loc]|commits|files|hours|months.
  --loc=<type>   surv(iving)|ins(ertions)|del(etions)
                 What `loc` represents. Use 'ins,del' to count both.
                 defaults to 'surviving' unless `--cost` is specified.
  --excl=<f>     Excluded files (default: None).
                 In no-regex mode, may be a comma-separated list.
                 Escape (\,) for a literal comma (may require \\, in shell).
  --incl=<f>     Included files [default: .*]. See `--excl` for format.
  --since=<date>  Date from which to check. Can be absolute (eg: 1970-01-31)
                  or relative to now (eg: 3.weeks).
  --cost=<method>  Include time cost in person-months (COCOMO) or
                   person-hours (based on commit times).
                   Methods: month(s)|cocomo|hour(s)|commit(s).
                   May be multiple comma-separated values.
                   Alters `--loc` default to imply 'ins' (COCOMO) or
                   'ins,del' (hours).
  -R, --recurse  Recursively find repositories & submodules within <gitdir>.
  -n, --no-regex  Assume <f> are comma-separated exact matches
                  rather than regular expressions [default: False].
                  NB: if regex is enabled ',' is equivalent to '|'.
  -s, --silent-progress    Suppress `tqdm` [default: False].
  --warn-binary  Don't silently skip files which appear to be binary data
                 [default: False].
  -e, --show-email  Show author email instead of name [default: False].
  --enum         Show row numbers [default: False].
  -t, --bytype             Show stats per file extension [default: False].
  -w, --ignore-whitespace  Ignore whitespace when comparing the parent's
                           version and the child's to find where the lines
                           came from [default: False].
  -M             Detect intra-file line moves and copies [default: False].
  -C             Detect inter-file line moves and copies [default: False].
  --ignore-rev=<rev>       Ignore changes made by the given revision
                           (requires `--loc=surviving`).
  --ignore-revs-file=<f>   Ignore revisions listed in the given file
                           (requires `--loc=surviving`).
  --format=<format>        Table format
      [default: pipe]|md|markdown|yaml|yml|json|csv|tsv|tabulate.
      May require `git-fame[<format>]`, e.g. `pip install git-fame[yaml]`.
      Any `tabulate.tabulate_formats` is also accepted.
  --manpath=<path>         Directory in which to install git-fame man pages.
  --log=<lvl>    FATAL|CRITICAL|ERROR|WARN(ING)|[default: INFO]|DEBUG|NOTSET.
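The `--excl`/`--incl` rules described in the help text (comma-separated exact matches with `-n`, commas acting as `|` otherwise, `\,` for a literal comma) can be sketched as follows. This is a hypothetical re-implementation for illustration, not git-fame's actual code: the helper name `build_matcher` and the choice of `re.search` are assumptions.

```python
import re


def build_matcher(spec, no_regex=False):
    """Return a predicate telling whether a file name matches `spec`."""
    # Split on commas not preceded by a backslash, then unescape "\," to ","
    parts = [p.replace("\\,", ",") for p in re.split(r"(?<!\\),", spec)]
    if no_regex:
        wanted = set(parts)  # exact matches only
        return lambda fname: fname in wanted
    rx = re.compile("|".join(parts))  # ',' behaves like '|'
    return lambda fname: rx.search(fname) is not None


exact = build_matcher("yarn.lock,src", no_regex=True)
regex = build_matcher("yarn.lock,src")
print(exact("yarn.lock"), exact("src/index.js"))
print(regex("yarn.lock"), regex("src/index.js"))
```

Under these assumptions, an exact-match entry such as `src` does not cover files inside the `src` directory, whereas the regular-expression form does.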

If multiple user names and/or emails correspond to the same user, aggregate git-fame statistics and maintain a git repository properly by adding a .mailmap file.

FAQs

Options such as -w, -M, and -C can increase accuracy, but take longer to compute.

Note that specifying --sort=hours or --sort=months requires --cost to be specified appropriately.

Note that --cost=months (--cost=COCOMO) approximates person-months and should be used with --loc=ins.

Meanwhile, --cost=hours (--cost=commits) approximates person-hours.

Extra care should be taken when using ins and/or del for --loc since all historical files (including those no longer surviving) are counted. In such cases, --excl may need to be significantly extended. On the plus side, it is faster to compute ins and del compared to surv.
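For a feel of how `--cost=months` relates to `loc`, the sketch below applies a COCOMO-style power law to the inserted-line counts from the sample table at the top. The coefficients 3.2 and 1.05 are an assumption: they are the organic-mode constants of intermediate COCOMO and happen to reproduce the `mths` column of that table, but they are not taken from git-fame's documentation.

```python
def cocomo_months(loc, a=3.2, b=1.05):
    """Estimate person-months from lines of code (assumed coefficients)."""
    return a * (loc / 1000) ** b


# (author, loc) pairs copied from the sample table
rows = [
    ("Casper da Costa-Luis", 28572),
    ("Stephen Larroque", 5243),
    ("pgajdos", 2606),
    ("Martin Zugnoni", 1656),
    ("Kyle Altendorf", 541),
]

estimates = {author: round(cocomo_months(loc)) for author, loc in rows}
for author, months in estimates.items():
    print(f"{author}: {months}")
```

Because the exponent is greater than 1, applying the estimate per author and summing gives a different result than applying it once to the total line count.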

Examples

CODEOWNERS

Generating CODEOWNERS:

# bash syntax function for current directory git repository
owners(){
  for f in $(git ls-files); do
    # filename
    echo -n "$f "
    # author emails if loc distribution >= 30%
    git fame -esnwMC --incl "$f" | tr '/' '|' \
      | awk -F '|' '(NR>6 && $6>=30) {print $2}' \
      | xargs echo
  done
}

# print to screen and file
owners | tee .github/CODEOWNERS

# same but with `tqdm` progress for large repos
owners \
  | tqdm --total $(git ls-files | wc -l) \
    --unit file --desc "Generating CODEOWNERS" \
  > .github/CODEOWNERS

Zenodo config

Generating .zenodo.json:

git fame -wMC --format json \
  | jq -c '{creators: [.data[] | {name: .[0]}]}' \
  | sed -r -e 's/(\{"name")/\n    \1/g' -e 's/:/: /g' \
  > .zenodo.json

Contributions

All source code is hosted on GitHub. Contributions are welcome.

LICENCE

Open Source (OSI approved): LICENCE

Citation information: DOI-URI

Authors

We are grateful for all contributions on GitHub.

git-fame's People

Contributors

bhalbayrak, casperdcl, foosel, ignatenkobrain, j-mortara, spier, waldyrious

git-fame's Issues

Excluding a folder

Hi There,

Would you please advise on how to exclude a folder (for example, the .git folder)?

Thanks

tqdm._tqdm.TqdmKeyError: "Unknown argument(s): {'lock_args': (False,)}"

$ git fame --cost hour,month
Traceback (most recent call last):
  File "/home/math/.pyenv/versions/3.8.1/bin/git-fame", line 8, in <module>
    sys.exit(main())
  File "/home/math/.pyenv/versions/3.8.1/lib/python3.8/site-packages/gitfame/_gitfame.py", line 422, in main
    run(args)
  File "/home/math/.pyenv/versions/3.8.1/lib/python3.8/site-packages/gitfame/_gitfame.py", line 370, in run
    for res in mapper(statter, gitdirs):
  File "/home/math/.pyenv/versions/3.8.1/lib/python3.8/site-packages/gitfame/_gitfame.py", line 231, in _get_auth_stats
    for fname in tqdm(file_list, desc=gitdir if prefix_gitdir else "Processing",
  File "/home/math/.local/lib/python3.8/site-packages/tqdm/_tqdm.py", line 855, in __init__
    raise (TqdmDeprecationWarning(dedent("""\
tqdm._tqdm.TqdmKeyError: "Unknown argument(s): {'lock_args': (False,)}"
  • Python 3.8.1
  • git-fame==1.12.1
  • tqdm==4.32.2

mailmap support

Not sure if this has it already, but it needs mailmap support.

When author name contains Chinese characters, redundant spaces added.

I am using git-fame 1.14.0; the problem occurs in both zsh and bash.

When a git repo author name contains both ASCII and Chinese characters, redundant spaces are added for the Chinese characters.

You can test with Chinese names of different lengths, such as "张三丰" and "李四": different name lengths produce different numbers of redundant spaces.

Table gets misaligned if author names contain fullwidth characters

I had previously reported this to oleander/git-fame-rb#95, and just checked that this issue also occurs in this project. Quoting myself from that issue:

Running git fame in a repository with authors whose names contain fullwidth characters (like Chinese, Japanese or Korean characters) results in incorrect alignment of the table.

For example, here's the (truncated) output for the https://github.com/oilshell/oil repo:

| Author                  |     loc |   coms |   fils |  distribution   |
|:------------------------|--------:|-------:|-------:|:----------------|
| Andy Chu                | 1693341 |   5111 |   4407 | 98.6/93.1/87.3  |
| Andy C                  |   15372 |    189 |    247 | 0.9/ 3.4/ 4.9   |
| granttrec               |    2288 |      6 |    103 | 0.1/ 0.1/ 2.0   |
| :                       |       : |      : |      : | :               |
| Aleks Kamko             |     114 |      2 |      8 | 0.0/ 0.0/ 0.2   |
| Matt Singletary         |      85 |      1 |      2 | 0.0/ 0.0/ 0.0   |
| Crestwave               |      74 |      1 |      2 | 0.0/ 0.0/ 0.0   |
| 조성빈                     |      73 |      1 |      3 | 0.0/ 0.0/ 0.1   |
| Batuhan Taskaya         |      72 |      1 |      6 | 0.0/ 0.0/ 0.1   |
| Yorwba                  |      70 |      8 |      9 | 0.0/ 0.1/ 0.2   |
| myfreeweb               |      60 |      1 |     58 | 0.0/ 0.0/ 1.1   |
| :                       |       : |      : |      : | :               |
| Rory O’Kane             |       0 |      1 |      0 | 0.0/ 0.0/ 0.0   |
| Waldir Pimenta          |       0 |      1 |      0 | 0.0/ 0.0/ 0.0   |
| joris                   |       0 |      2 |      0 | 0.0/ 0.0/ 0.0   |
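The misalignment comes from padding by character count rather than by terminal columns. One possible way to pad by display width is sketched below using Python's standard `unicodedata` module; the helper names `display_width` and `pad` are hypothetical, and the sketch ignores zero-width and combining characters.

```python
import unicodedata


def display_width(text):
    """Count terminal columns, treating wide/fullwidth characters as two."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def pad(text, width):
    """Right-pad `text` with spaces up to `width` terminal columns."""
    return text + " " * max(0, width - display_width(text))


# Names taken from the table above
for name in ["Andy Chu", "조성빈", "Yorwba"]:
    print(f"| {pad(name, 23)} |")
```

With this padding, the row for 조성빈 receives three fewer spaces than `str.ljust` would add, so the column separators line up in a terminal that renders Hangul as double width.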

How to exclude multiple files & directories

Hi! I'm trying to exclude multiple files or directories from the output, but it does not seem to work properly.
I'm not very familiar with Python or Bash, so it could be my mistake.

Example:

First, trying to run without --excl flag
Input: git fame -e -s
Output:

Total commits: 80
Total ctimes: 102
Total files: 47
Total loc: 12012

This is ok, looks good.

Next, trying to exclude one file:
Input: git fame -e -s --excl=yarn.lock
Output:

Total commits: 80
Total ctimes: 101
Total files: 46
Total loc: 1518

Looks good also.

But when I try to exclude both a file AND a directory, it gives me the same result as without the --excl flag
Input: git fame -e -s --excl=yarn.lock,src
Output:

Total commits: 80
Total ctimes: 102
Total files: 47
Total loc: 12012

When I add the -n flag, it gives me the same result as excluding only the yarn.lock file
Input: git fame -e -s --excl=yarn.lock,src -n
Output:

Total commits: 80
Total ctimes: 101
Total files: 46
Total loc: 1518

wrong loc after author change or repo import

A while ago, I needed to incorporate someone else's code into our STIR repo. Unfortunately, I had committed his code in the original repo. So I manipulated history to assign some commits to him. (Sadly, I didn't record exactly what I did but I followed roughly https://stackoverflow.com/a/28845565/15030207). I then incorporated the repo into our STIR repo and moved files. (Roughly along the lines of https://stackoverflow.com/a/1684694/15030207). Checking now the output of git fame, the relevant author doesn't get the credit he (=Carles Falcon) deserves. Am I doing something wrong with the call to git fame?

As running git fame -wMC on STIR takes a very long time, I've tried to show an example with one of the relevant files, https://github.com/UCL/STIR/blob/master/src/recon_buildblock/PinholeSPECTUB_Weight3d.cxx (original name wm_SPECT_mph2/weight3d_SPECT_mph.cpp). For that I get

$ git fame  -wMC '--incl=PinholeSPECTUB_Weight3d.cxx|weight3d_SPECT_mph.cpp'
Total commits: 8021
Total ctimes: 30
Total files: 2
Total loc: 73
| Author                 |   loc |   coms |   fils |  distribution   |
|:-----------------------|------:|-------:|-------:|:----------------|
| Matthew Strugari       |    69 |     15 |      1 | 94.5/ 0.2/50.0  |
| Kris Thielemans        |     4 |   5625 |      1 | 5.5/70.1/50.0   |
| Alaleh Rashidnasab     |     0 |      1 |      0 | 0.0/ 0.0/ 0.0   |
| Alexander C. Whitehead |     0 |      5 |      0 | 0.0/ 0.1/ 0.0   |
| Alexey Zverovich       |     0 |      6 |      0 | 0.0/ 0.1/ 0.0   |
| Ander Biguri           |     0 |     63 |      0 | 0.0/ 0.8/ 0.0   |
| Ashley Gillman         |     0 |     50 |      0 | 0.0/ 0.6/ 0.0   |
| Benjamin Thomas        |     0 |     22 |      0 | 0.0/ 0.3/ 0.0   |
| C. Ross Schmidtlein    |     0 |      2 |      0 | 0.0/ 0.0/ 0.0   |
| Carles Falcon          |     0 |      2 |      0 | 0.0/ 0.0/ 0.0   |
...

Note that the "Total loc" reported is 73, while actually the file has some 1100 lines. All the ones from Carles are excluded for some reason. Of course, I could be making a mistake with the --incl option but Carles gets no credit when I don't specify an include.

git blame reports the correct thing, see here. Carles gets the correct number of commits, so maybe it's the way that I merged the original repo into STIR? (e.g. the commit does not "follow" from the first STIR commit).

The "re-authored" commit is UCL/STIR@dd6fdee. The PR with the move (and other fixes) is UCL/STIR#1100

Add ability to exclude commits (.git-blame-ignore-revs)

Every once in a while, major surgery might take place on repositories, for instance to

  • fix incorrectly committed line-endings
  • send the complete source code through an auto-formatter (clang-format)
  • ...

Any commits associated with any such activity are outliers in the contribution to the fame attribution of git repositories. It would therefore be nice if there was a feature available that allows exclusion of specific commits from fame attribution, specifically:

  • the commit would be dropped from the commit count (coms)
  • the commit's immediate impact of change (loc, fils) would be disregarded
  • (undecided regarding distribution - but for consistency with coms, that would change, too)
  • summary would continue to indicate the numbers as they are, but with "(? ignored)" appended

One way to tackle this from an implementation point of view could be checking for .git-blame-ignore-revs, which is a pattern / convention which seems to be gaining in popularity, see e.g. https://chromium.googlesource.com/chromium/src.git/+/f0596779e57f46fccb115a0fd65f0305894e3031/.git-blame-ignore-revs

Also, git-blame (https://git-scm.com/docs/git-blame) has a feature that makes it quite convenient to ignore a list of commits, either directly on the command line or by referencing a file (e.g. the .git-blame-ignore-revs from above): --ignore-rev and --ignore-revs-file, as well as the git config option blame.ignoreRevsFile (see docs for git blame)

As blame and fame are so closely intertwined, I consider it quite appropriate to tack onto (model after) the blame configuration options ;)

Double -C option

Git blame command has a weird -C option:
When this option is given twice, the command additionally looks for copies from other files in the commit that creates the file

I don't think that this syntax should be copied here; it is probably enough to change git_blame_cmd.append("-C") to git_blame_cmd.append("-C -C"). Otherwise it attributes all the lines in a newly created file to its committer even if it is a complete copy of an already existing file.

explain in details the metrics

Could you please explain the metrics in detail:

Total ctimes: 2770
Total hours: 449.7
Total loc: 41659
Total months: 151.0

Thanks in advance

Bug in the counting of lines of code

Hi,

I found a bug in the counting of lines of code. Actually, git-fame only counts one line for a block of lines owned by the same author. Here is an example:

When committing this code, as there is only one author, git-fame counts one line of code.

I propose a correction, which you can see here: https://github.com/trinity357/git-fame (the last three commits are the modifications I applied).
After these changes, git-fame detects all the lines of code.

May I send you a pull request?

available as Homebrew package?

Python devs can use pipx but for people coming from other languages on macOS having a Homebrew package would be good.

Get lines of code by file extension for a particular author?

Thanks for building this tool!

I have a repo, and I want to know how many lines of Rust code I've contributed to it. Is that something this tool can tell me?

If I run git fame -t, I get

Total ._None_ext: 15309
Total .bat: 41
...
Total .rs: 81316
...
Total commits: 1006
Total ctimes: 29296
Total files: 3871
Total loc: 510071
| Author                  |   loc |   coms |   fils |  distribution   |
|:------------------------|------:|-------:|-------:|:----------------|
| Elango                  | 90506 |     65 |    505 | 17.7/ 6.5/13.0  |
| Manish Goregaokar       | 84052 |    163 |    695 | 16.5/16.2/18.0  |
| Ting-Yu Lin             | 62542 |     18 |     54 | 12.3/ 1.8/ 1.4  |
| Erik Nordin             | 56941 |     21 |    265 | 11.2/ 2.1/ 6.8  |
| Shane F. Carr           | 48403 |    347 |    613 | 9.5/34.5/15.8   |
...

The list at the top tells me contributions by language, and the table at the bottom tells me contributions by user. I want contributions by language by user. Alternatively, I would be okay invoking a command such as

$ git fame -e "[email protected]" -t

which would get analysis only for commits with that email address.

JSON Output

Exporting the results and LOC as a JSON file would be a great feature

Fails to count LOC and files when passing multiple repos

As soon I pass multiple repos, only the commit count is calculated. Lines of code and files are zero.

git fame ./repo_a ./repo_b
./repo_a: 100%|################################################################################################################################################################################| 63/63 [00:00<00:00, 434.77file/s]
./repo_b: 100%|###########################################################################################################################################################################| 63/63 [00:00<00:00, 435.13file/s]
Repos: 100%|######################################################################################################################################################################################| 2/2 [00:00<00:00, 12.81repo/s]
Total commits: 1238
Total ctimes: 0
Total files: 0
Total loc: 0

`--loc=surv` has more loc than `--loc=ins,del` (counts bin files)

--loc=surv counts all files including binary files (like image files), but --loc=ins,del doesn't? Is there an option to automatically exclude these files? (without using --excl for every single extension)
How is the surviving code 50879 lines, but the total loc for ins,del 30139?
I tried --warn-binary for it but it didn't seem to do anything.

F:\xampp\htdocs\test>git fame --loc=surv -t
Processing: 100%|███████████████████████████████████████████████████████████████| 77/77 [00:11<00:00,  6.49file/s]
Total .css: 3866
Total .gitignore: 1
Total .html: 56
Total .jpeg: 349
Total .jpg: 35950
Total .js: 658
Total .php: 4057
Total .png: 5145
Total .sql: 413
Total .svg: 384
Total commits: 181
Total ctimes: 1463
Total files: 112
Total loc: 50879

F:\xampp\htdocs\test>git fame --loc=ins,del -t
Processing: 100%|█████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.25repo/s]
Total .css: 8861
Total .gitignore: 3
Total .html: 56
Total .js: 982
Total .json: 72
Total .md: 172
Total .php: 16035
Total .sql: 3906
Total .svg: 12
Total .txt: 40
Total commits: 181
Total ctimes: 397
Total files: 144
Total loc: 30139

automate handling of binary files

Unsure as to why this warning comes up, but it seems like it can be safely silenced for a prettier git-fame output:

WARNING:gitfame._gitfame:src/best_params.pickle:'utf-8' codec can't decode byte 0x80 in position 390: invalid start byte

snap tidy

  • automate snap deployment
  • add completion
  • ship with all dependencies (i.e. git)? The disadvantage is that it may need --classic.

snap package not compatible with mounts

Hi,
I use Bitbucket and I can't get anything to display when running a simple git fame command.
I get the following error on my git folder. As you can see, the folder obviously exists and, since git status responds, it is a git repository:

nicolas@nicolas-XXXXXX:/var/www/mobile$ git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean
nicolas@nicolas-XXXXXX:/var/www/mobile$ git fame /var/www/mobile
fatal: cannot change to '/var/www/mobile': No such file or directory
Processing: 100%|###################################################################################| 1/1 [00:00<00:00, 235.70file/s]
fatal: cannot change to '/var/www/mobile': No such file or directory
Total 
| Author   | loc   | coms   | fils   |  distribution   |
|----------|-------|--------|--------|-----------------|

If you have any idea of something I might be doing wrong, advice is more than welcome.
Thanks
Nicolas

PS: I also tried with sudo and in other git repos, with the same results.

My env:
Ubuntu 20.04

document stats

I understand loc (lines of surviving code) and coms (commits), but how are hours etc. calculated? It would be nice if the meaning of each stat were documented.

"Dubious ownership in repository" error running Docker container

I tried to run git-fame via Docker using the following commands but got a "fatal: detected dubious ownership in repository at '/repo'" error:

# Build playground
cd "$(mktemp -d)"
git init .
echo "foo" > file1
echo "bar" > file2
git commit -a -m "First commit"
# Run git-fame
docker run --rm casperdcl/git-fame --help
docker run --rm -v "$(pwd)":/repo casperdcl/git-fame

Obtained output is:

fatal: detected dubious ownership in repository at '/repo'
To add an exception for this directory, call:

	git config --global --add safe.directory /repo
Processing: 100%|██████████| 1/1 [00:00<00:00, 695.69file/s]
error: too many arguments given outside repository
usage: git shortlog [<options>] [<revision-range>] [[--] <path>...]
   or: git log --pretty=short | git shortlog [<options>]

    -c, --committer       group by committer rather than author
    -n, --numbered        sort output according to the number of commits per author
    -s, --summary         suppress commit descriptions, only provides commit count
    -e, --email           show the email address of each author
    -w[<w>[,<i1>[,<i2>]]]
                          linewrap output
    --group <field>       group by field

Total 
| Author   | loc   | coms   | fils   |  distribution   |
|----------|-------|--------|--------|-----------------|

I guess the git config --global --add safe.directory /repo should be done in the casperdcl/git-fame Docker image?

Still getting UnicodeEncodeError

Hi there. First of all thank you for this tool, which seems very handy.

I ran into some errors with a lot of my repositories though, all related to unicode. I've seen you added fixes for that in the past, but I still receive the following a lot (with both stable and GitHub version):

Blame: 100%|#############################################################################| 1/1 [00:00<00:00, 10.64it/s]
Total commits: 35279
Total files: 1
Total loc: 4
Traceback (most recent call last):
  File "C:\Python27\lib\runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "C:\Python27\lib\runpy.py", line 72, in _run_code
    exec code in run_globals
  File "c:\p\hissrc\src\git-fame\gitfame\__main__.py", line 2, in <module>
    main()  # pragma: no cover
  File "c:\p\hissrc\src\git-fame\gitfame\_gitfame.py", line 271, in main
    run(args)
  File "c:\p\hissrc\src\git-fame\gitfame\_gitfame.py", line 263, in run
    print(tabulate(auth_stats, stats_tot, args["--sort"]))
  File "C:\Python27\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\ufffd' in position 1265: character maps to <undefined>

As you can see I've narrowed it down and found one particular file with only one commit and one author that causes the error. The author is in the format "surname lastname <[email protected]>", and that seems to be the problem.

I tried using different code pages in the console, but they either don't change anything (850, 1252) or are not supported by Python 2.7 (65001).

Any ideas?
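The failure in the traceback can be reproduced in isolation: the cp437 console code page has no mapping for U+FFFD (the Unicode replacement character). The sketch below uses Python 3 syntax and a made-up sample string; passing `errors="replace"` is one possible workaround and is shown as an assumption, not as the fix adopted by the project.

```python
# Hypothetical author name containing the Unicode replacement character
text = "caf\ufffd"

try:
    text.encode("cp437")
    encodable = True
except UnicodeEncodeError:
    # cp437 has no mapping for U+FFFD, so strict encoding fails
    encodable = False

# Replace unencodable characters with "?" instead of raising
safe = text.encode("cp437", errors="replace").decode("cp437")
print(encodable, safe)
```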

--since has somewhat confusing behaviour

While you can argue whether this is a bug or a feature, when using --since, the first committer following that date is attributed all the lines not changed after that date.
This is a consequence of the way git blame interprets the --since parameter, and perhaps makes sense in that context, but in the git-fame context I would expect lines not changed after the since to be ignored, or at least to have an option to get that count (lines changed after date by author).

Why LOC statistic is very different compared to other tools?

I think that the tool doesn't count lines of code properly, because it shows completely different number compared to other tools like cloc. For example the command git fame -t on a repository I'm working on gives the result:

Total .babel: 29
Total .bash-completion: 12
Total .cfg: 8
Total .gitignore: 34
Total .json: 18
Total .markdown: 511
Total .md: 5
Total .nim: 1782
Total .nimble: 637
Total .txt: 10
Total .yml: 12
Total .zsh-completion: 51
Total commits: 842
Total ctimes: 838
Total files: 314
Total loc: 3109

cloc . --vcs=git on the same repository gives:

-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Nim                             46            869            680           4928
JSON                             2              0              0             33
YAML                             1              6              1             23
Markdown                         1              1              0              4
-------------------------------------------------------------------------------
SUM:                            50            876            681           4988
-------------------------------------------------------------------------------

which seems to me to be closer to the actual count for the languages which both tools detected. For bigger repositories the difference is larger. Also, the Ruby version of git fame gives a third, different set of numbers which are even more unrealistic, but that last point is for their bug tracker. :) The point is that the differences between the tools are huge and I'm not sure which tool to believe.

Bug: LOC from multiple repos not averaged

When I add multiple repos as command line arguments to git-fame, their lines of code are not averaged because check_output reports fatal: no such path [repo-name/path] in HEAD in ~ line 287, _gitfame.py.
Root of the bug: prefix_gitdir is set if multiple repos are given, which joins the git directory to the fname.

Example:

Setup

git clone git@github.com:casperdcl/git-fame.git
cd git-fame
python -m gitfame ../git-fame ../git-fame

Output

Total commits: 492
Total ctimes: 0
Total files: 0
Total loc: 0
| Author                   |   loc |   coms |   fils |  distribution   |
|:-------------------------|------:|-------:|-------:|:----------------|
| Casper da Costa-Luis     |     0 |    470 |      0 | 0.0/95.5/ 0.0   |
| Cory Carson (Salesforce) |     0 |     10 |      0 | 0.0/ 2.0/ 0.0   |
...

Desired output

Total ctimes: 1798
Total files: 31
Total loc: 4550
| Author                   |   loc |   coms |   fils |  distribution   |
|:-------------------------|------:|-------:|-------:|:----------------|
| Casper da Costa-Luis     |  4536 |    470 |     28 | 99.7/95.5/90.3  |
| Cory Carson (Salesforce) |    10 |     10 |      1 | 0.2/ 2.0/ 3.2   |
...
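
As a sanity check on the desired output, the distribution column can be recomputed from the totals. This is only a sketch: the helper name `distribution` is made up for illustration, the commit total of 492 is taken from the "Output" block (the desired output omits it), and the width-4, one-decimal format is inferred from the table rather than taken from the git-fame source.

```python
def distribution(loc, coms, fils, total_loc, total_coms, total_fils):
    """Return the loc/coms/fils shares as a 'xx.x/xx.x/xx.x' string."""
    shares = (
        100.0 * loc / total_loc,
        100.0 * coms / total_coms,
        100.0 * fils / total_fils,
    )
    # Width 4 with one decimal is an assumption inferred from the table.
    return "/".join("{:4.1f}".format(s) for s in shares)


# Totals: loc and files from the desired output, commits from the actual output.
TOTALS = dict(total_loc=4550, total_coms=492, total_fils=31)

casper = distribution(4536, 470, 28, **TOTALS)
cory = distribution(10, 10, 1, **TOTALS)
print(casper)
print(cory)
```

Both rows reproduce the percentages shown in the desired table, which suggests the expected numbers are internally consistent.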

Environment

OS: Arch Linux
python version : Python 3.9.2

Patch which works for me

As I am unfamiliar with the code base and do not know why prefix_gitdir is set, I did not open a PR for this. My version can be found here. I am unsure whether my solution is really a fix or whether it causes problems with other command line options. However, I have attached the patch that fixes the issue for me below:

diff --git a/gitfame/_gitfame.py b/gitfame/_gitfame.py
index 59916e9..01ce6f7 100755
--- a/gitfame/_gitfame.py
+++ b/gitfame/_gitfame.py
@@ -279,14 +279,16 @@ def _get_auth_stats(
     for fname in tqdm(file_list, desc=gitdir if prefix_gitdir else "Processing",
                       disable=silent_progress, unit="file"):
 
-      if prefix_gitdir:
-        fname = path.join(gitdir, fname)
       try:
         blame_out = check_output(
             base_cmd + [branch, fname], stderr=subprocess.STDOUT)
       except Exception as err:
         getattr(log, "warn" if warn_binary else "debug")(fname + ':' + str(err))
         continue
+
+      if prefix_gitdir:
+          fname = path.join(gitdir, fname)
+
       log.log(logging.NOTSET, blame_out)
 
       # Strip boundary messages,
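
The intent of the patch can be illustrated in isolation: git is asked about the repo-relative path, and the prefix is applied only afterwards, for reporting. The sketch below is not the git-fame code; `blame_and_label` and `fake_blame` are hypothetical stand-ins (the latter replaces the real check_output call so the example runs without a repository).

```python
from os import path


def fake_blame(repo_files, fname):
    """Hypothetical stand-in for check_output: fail on unknown paths."""
    if fname not in repo_files:
        raise ValueError("fatal: no such path %s in HEAD" % fname)
    return "blame output for " + fname


def blame_and_label(repo_files, gitdir, file_list, prefix_gitdir):
    """Blame with repo-relative names; prefix gitdir only for the label."""
    results = {}
    for fname in file_list:
        try:
            # Use the repo-relative path for the blame call itself.
            out = fake_blame(repo_files, fname)
        except ValueError:
            continue
        # Prefix afterwards so files from different repos stay distinct.
        label = path.join(gitdir, fname) if prefix_gitdir else fname
        results[label] = out
    return results


files = {"README.rst", "gitfame/_gitfame.py"}
stats = blame_and_label(files, "../git-fame", sorted(files), prefix_gitdir=True)
print(sorted(stats))
```

Passing the already-prefixed name to `fake_blame` raises the "no such path" error, which mirrors the failure described in the report.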

Possible documentation glitch: "!python: event not found"

Hi there,

I tried out this tool today, looks nice! :)

When installing this I ran into the following issue

$ git config --global alias.fame "!python -m gitfame"
-bash: !python: event not found

My environment:

  • OSX
  • bash

I was able to fix it by using single quotation marks i.e.

$ git config --global alias.fame '!python -m gitfame'
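
A minimal sketch of why single quotes help, assuming an interactive bash session with history expansion enabled: inside double quotes bash still treats !python as a history reference, while single quotes leave it untouched. Python's shlex module applies the same single-quote rule, so it can be used to generate a safe command line.

```python
import shlex

# The alias value must reach git with its leading "!" intact.
argv = ["git", "config", "--global", "alias.fame", "!python -m gitfame"]

# shlex.join quotes each argument for a POSIX shell (Python 3.8+).
command = shlex.join(argv)
print(command)
```

This prints the single-quoted form of the command, matching the fix above.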

Not sure if this issue is reproducible by others.
Happy to send a PR for this, but I figured I would first confirm it with you.

Cheers :)

What is `--loc=surv`?

Hello
What is --loc=surv? I can't find a definition for it anywhere, neither in the repo nor on the net. When should it be used?

Recursing stats blocked

Hello

I am trying to create a report of all my repositories, but with -R one repository blocks at DEBUG:gitfame._get_auth_stats:358:{...
This is my command, run on a folder containing many repositories:
python3 -m gitfame --cost hour,month --loc ins --format csv --incl '\.[cht][puh]{0,2}$' --log NOTSET -eR ./folder/*

If I run it on just the repo that blocks, it works with this command:
python3 -m gitfame --cost hour,month --loc ins --format csv --incl '\.[cht][puh]{0,2}$' --log NOTSET -eR ./folder/repo13.git/
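
To see which files the --incl pattern in these commands selects, it can be tried against a few file names. This is a sketch assuming the pattern is applied as a Python regular-expression search on each path; the file names are made up.

```python
import re

# Pattern copied from the commands in this report.
INCL = re.compile(r"\.[cht][puh]{0,2}$")

# Hypothetical file names, for illustration only.
names = ["main.c", "util.h", "kernel.cu", "kernel.cuh", "vec.hpp",
         "app.cpp", "setup.py", "README.md", "notes.txt"]

selected = [n for n in names if INCL.search(n)]
print(selected)
```

Only the C, C++ and CUDA style names are selected, so files of other types in the repo do not contribute to the stats.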

How can I obtain more logs, or what else can I do to understand the problem?

Regards

recursively find repos

When running on a folder containing multiple repos (but not being a repo itself), e.g. git-fame ~, the app errors:

fatal: not a git repository (or any of the parent directories): .git
Blame: 100%|██████████| 1/1 [00:00<00:00, 183.01file/s]
error: too many arguments given outside repository

Since I wanted the fame for all repos on my system, I had to improvise and came up (again for ~) with

git-fame --cost hour -e -M -C --sort commits `find ~ -name .git -type d -prune -exec dirname {} \; 2>/dev/null | xargs`

This sums over every repo in my home folder, which is what I wanted.

This, however, picks up trash repos that I forgot to remove from my top-level home folder, as well as secondary repos like ~/.cache/pre-commit/XXX. What I would actually like is for git-fame ~/private_repos ~/work_repos to work, so that trash outside the provided folders is excluded.
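
The find one-liner could also be expressed as a small helper that only searches the folders it is given. A sketch, not part of git-fame: `find_repos` is a hypothetical name, and like the find command it treats any directory containing a .git directory as a repo.

```python
import os


def find_repos(roots):
    """Yield directories under `roots` that contain a `.git` directory."""
    for root in roots:
        for dirpath, dirnames, _ in os.walk(root):
            if ".git" in dirnames:
                yield dirpath
                # Do not descend into .git itself (like `find -prune`).
                dirnames.remove(".git")
```

The yielded paths could then be passed to git-fame as positional arguments, in the same way as the output of the find command.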

Error

If I try to run gitfame on the github.com/ansible/ansible repo, the following output appears:

Blame:   4%|█████▌    | 126/3265 [00:12<09:12,  5.68it/s]
Traceback (most recent call last):
  File "/usr/bin/gitfame", line 9, in <module>
    load_entry_point('git-fame==1.1.0', 'console_scripts', 'gitfame')()
  File "/usr/lib/python3.5/site-packages/gitfame/_gitfame.py", line 239, in main
    run(args)
  File "/usr/lib/python3.5/site-packages/gitfame/_gitfame.py", line 181, in run
    auths = RE_AUTHS.findall(blame_out)
TypeError: cannot use a string pattern on a bytes-like object
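
The traceback points at a str regex being applied to bytes: on Python 3, subprocess.check_output returns bytes unless text mode is requested. A minimal reproduction and one possible fix follow; the pattern and the sample data are simplified stand-ins, not the real RE_AUTHS or real blame output, and UTF-8 is an assumed encoding.

```python
import re

# Simplified stand-in for the real pattern (assumption, for illustration).
RE_AUTHS = re.compile(r"^author (.+)$", flags=re.M)

# Stand-in for what check_output returns on Python 3: bytes, not str.
blame_out = b"author Alice\nsummary x\nauthor Bob\n"

try:
    RE_AUTHS.findall(blame_out)
except TypeError as err:
    error_message = str(err)  # same TypeError as in the traceback
    print(error_message)

# Decoding first gives the str pattern a str to work on.
auths = RE_AUTHS.findall(blame_out.decode("utf-8", errors="replace"))
print(auths)
```

Decoding with errors="replace" is one way to avoid a second failure on files whose author names or content are not valid UTF-8.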
