Respect .gitignore (cloc issue #49, closed)

aldanial commented on June 17, 2024
Respect .gitignore

from cloc.

Comments (29)

AlDanial commented on June 17, 2024

Good tip, thanks; as a git novice I continue to be surprised by git's power and flexibility.

If you haven't tried it already, cloc --vcs git should do the same thing.

AlDanial commented on June 17, 2024

Git commit 55e616e on the master branch implements --vcs. Please give it a try with --vcs git, --vcs svn, or any file name generator. For example, --vcs 'find . -type f -name "*.c" -size +500k' counts only C files larger than 500 kB.

rstacruz commented on June 17, 2024

As a workaround, this works:

cloc $(git ls-files)

...as long as none of your paths contain spaces. A bit cumbersome, though.

suweller commented on June 17, 2024

First off, @AlDanial thanks for making and maintaining cloc.
Secondly, if you'd like to make cloc respect git, run it from git using:

# ~/.gitconfig
[alias]
  cloc = !cloc $(git ls-files)

Run git cloc from a repository root and voilà.

AlDanial commented on June 17, 2024

I first started looking at obeying .gitignore in July 2015. The more I looked, the less fun it seemed to implement. Yes, it would be a nice feature. I'll certainly entertain pull requests. However, until I have a burst of enthusiasm about this problem, implementation is a ways off.

If anyone is interested in moving this along, we can tackle this independently of cloc; that is to say, given a stand-alone solution, I'll handle integrating it into cloc. The problem can be reduced to this: given a text file containing a sorted directory tree (e.g., the output of find . -type f | sort), read each .gitignore and apply its rules to the tree. The output would be a list of the files that survive all the .gitignore files.
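To make the reduced problem concrete, here is a minimal shell sketch of such a stand-alone filter. It is not taken from cloc and it is not git's matching algorithm: it assumes a single ignore file and handles only blank lines, # comments, and slash-free glob patterns such as *.o or build/. Negation with !, patterns containing inner slashes, **, nested .gitignore files, and global ignore files are all left out, which is a fair hint of why the full problem is less fun. The names is_ignored and filter_ignored, and the demo paths, are hypothetical.

```shell
#!/bin/sh
# Simplified sketch only: this is NOT git's matching algorithm.
# Supported: blank lines, '#' comments, slash-free globs ('*.o'), and a
# trailing slash ('build/') that restricts a pattern to directories.
# Not supported: '!' negation, inner slashes, '**', nested/global ignore files.

# is_ignored PATH PATTERN_FILE
# Succeeds if any pattern matches one of the components of PATH.
is_ignored() {
    path=${1#./}
    while IFS= read -r pattern || [ -n "$pattern" ]; do
        case $pattern in
            ''|'#'*) continue ;;          # skip blank lines and comments
        esac
        dir_only=no
        case $pattern in
            */) dir_only=yes; pattern=${pattern%/} ;;
        esac
        rest=$path
        while [ -n "$rest" ]; do
            component=${rest%%/*}
            case $rest in
                */*) rest=${rest#*/}; is_dir=yes ;;
                *)   rest='';         is_dir=no ;;   # last component is the file
            esac
            if [ "$dir_only" = yes ] && [ "$is_dir" = no ]; then
                continue
            fi
            # $pattern is deliberately unquoted so it acts as a glob.
            case $component in
                $pattern) return 0 ;;
            esac
        done
    done < "$2"
    return 1
}

# filter_ignored PATTERN_FILE
# Copies the file list on stdin to stdout, dropping ignored paths.
filter_ignored() {
    while IFS= read -r file; do
        is_ignored "$file" "$1" || printf '%s\n' "$file"
    done
}

# Demo with made-up patterns and paths (no git required).
tmp=$(mktemp)
printf '%s\n' '# build output' '*.o' 'build/' > "$tmp"
survivors=$(printf '%s\n' './src/main.c' './src/main.o' './build/gen.c' './docs/build' |
    filter_ignored "$tmp")
rm -f "$tmp"
printf '%s\n' "$survivors"
```

In the demo, ./src/main.o is dropped by *.o and ./build/gen.c by build/, while ./docs/build survives because build/ only applies to directories and the input is a list of files.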

mbovel commented on June 17, 2024

Hi guys,

I had exactly this problem today (counting lines of code in a git repo) and came across this issue.

As said earlier in this discussion

cloc $(git ls-files)

works like a charm, except with files containing a space in their name.

However, this seems to work, even with spaces in names:

git ls-files > list.txt
cloc --list-file=list.txt

So, +1 for @rstacruz's idea: just being able to pass the file list via stdin.

Parsing .gitignore, communicating with a VCS, or anything in this direction doesn't seem like cloc's job to me.

rstacruz commented on June 17, 2024

also, maybe this might be as simple as having a flag to read filenames from stdin?

git ls-files | cloc --stdin-files

zbeekman commented on June 17, 2024

also, maybe this might be as simple as having a flag to read filenames from stdin?

You can use xargs too, shorter than adding an option no one will remember to read file names from stdin:

mkdir empty
cd empty
cloc .
git ls-files .. | xargs cloc

rstacruz commented on June 17, 2024

Wouldn't that suffer from the spaces problem too?

zbeekman commented on June 17, 2024

First of all, my apologies, I did miss that point; I'm quite tired at the moment. However, how do you propose that the --stdin-files flag will fix the word splitting issue? Force one file per line? I'm not convinced that it's cloc's job to figure out what an appropriate IFS is... What if I want to do something like echo */*.c | cloc and the file names or directories have spaces in them? Are we supposed to assume the IFS will be limited to \n?

As a workaround for this particular case, git ls-files -z | xargs -0 cloc definitely works when paths have spaces in them...
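The difference between the unquoted $(git ls-files) form and the -z / -0 pipeline can be reproduced without git or cloc. The sketch below is only an illustration: count_args is a hypothetical stand-in for cloc that reports how many arguments it received, and the two file names are made up.

```shell
#!/bin/sh
# count_args is a hypothetical stand-in for cloc: it prints how many
# separate arguments (that is, file names) it was given.
count_args() {
    echo "$#"
}

# Two made-up file names, one of which contains a space, separated by
# newlines the way plain `git ls-files` would print them.
newline_list=$(printf '%s\n' 'src/main.c' 'docs/user guide.md')

# Unquoted expansion: the shell splits on spaces as well as newlines,
# so the second name is broken into two arguments.
split_count=$(count_args $newline_list)

# NUL-separated, the way `git ls-files -z` would print them. xargs -0
# splits only on NUL bytes, so each name arrives as exactly one argument.
nul_count=$(printf '%s\0' 'src/main.c' 'docs/user guide.md' |
    xargs -0 sh -c 'echo "$#"' sh)

echo "word splitting: $split_count arguments"
echo "xargs -0:       $nul_count arguments"
```

With two files on disk, word splitting hands the stand-in three arguments (two of which do not exist as files), whereas the NUL-delimited pipeline hands it two.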

zbeekman commented on June 17, 2024

(I do agree that adding support for .gitignore and other VCS ignore files is certainly worthwhile, however.)

zbeekman commented on June 17, 2024

Language preference for said implementation?

(To be clear, I'm not necessarily volunteering, I am so slammed right now, but I think I can find a pretty straightforward way to do this in bash... The operative word here is "think", since I haven't dug into the implementation details yet...)

Also, would it make sense to rely on the VCS to determine what to include and what not to include? Rather than parsing .gitignore (and please note that users may also have global ignore files), the VCS could return a list of files for the project (which would take .gitignore into account), and cloc could then use that list to determine what to count. This would ease the implementation for SVN, CVS, hg, etc.

On Fri, Jan 15, 2016 at 11:54 PM AlDanial wrote:

I first started looking at obeying .gitignore in July 2015. [...]

AlDanial commented on June 17, 2024

Ideally the implementation would be in Python or Perl, but bash is plenty good enough.

Re: relying on an external VCS--that's what the "git ls-files" work-around earlier in the thread is all about. I'm not keen on making cloc do system calls unless there's no other way to do it (cloc currently does system calls to archive tools like tar and zip).

zbeekman commented on June 17, 2024

I get the desire to avoid system calls, but that means much more code will be needed, and less of it can be shared between VCSs. Also, it complicates the user's interaction: Is cloc expected to respect settings in global VCS config files? How does cloc find these? Otherwise cloc --vcs-files-only will potentially produce different results from git ls-files -z | xargs -0 cloc, which has the potential to be a source of great confusion.

zbeekman commented on June 17, 2024

I guess an alternative implementation would be to explicitly pass the ignore file to cloc:

cloc --vcs-ignore-file=.gitignore

and then it should be clear that cloc is only parsing the ignore file. I think the syntax is pretty similar among VCSs for ignore files too...

AlDanial commented on June 17, 2024

Or perhaps cloc --vcs="system call to VCS to list versioned files" which in git would amount to cloc --vcs="git ls-files" and in Subversion (I think) cloc --vcs="svn ls -R".

Again I'm not keen on the system calls but am coming around to the thinking that this may be the right solution here. The documentation will explicitly state that whatever the user puts in quotes will be invoked as a system call and the output treated as a list of files for cloc to consider. Subsequent filters like --match-d, --not-match-d, --match-f, --not-match-f, would still apply.
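The proposed flow (run a user-supplied command, treat its output as the file list, then apply directory filters) can be sketched in a few lines of shell. This is only an illustration under stated assumptions: cloc itself is written in Perl and may work differently, list_files is a hypothetical helper, and its second argument is just a rough stand-in for --not-match-d.

```shell
#!/bin/sh
# Sketch only: cloc itself is written in Perl and may work differently.
# list_files GENERATOR [DIR_REGEX]
#   GENERATOR  a command line whose output is one file name per line,
#              e.g. 'git ls-files' or 'svn ls -R'
#   DIR_REGEX  optional extended regex; files whose directory part matches
#              it are dropped (a rough stand-in for --not-match-d)
list_files() {
    generator=$1
    dir_regex=${2:-}
    # The string is handed to the shell verbatim, so it must be trusted input.
    sh -c "$generator" | while IFS= read -r file; do
        case $file in
            */*) dir=${file%/*} ;;
            *)   dir=. ;;
        esac
        if [ -n "$dir_regex" ] && printf '%s\n' "$dir" | grep -Eq -- "$dir_regex"; then
            continue
        fi
        printf '%s\n' "$file"
    done
}

# Demo with printf as a made-up generator, so neither git nor svn is needed.
kept=$(list_files "printf '%s\n' src/a.c vendor/lib/b.c 'my docs/c.md'" '^vendor(/|$)')
printf '%s\n' "$kept"
```

Reading the generator's output line by line keeps names with spaces intact. With git installed, something like list_files 'git ls-files' '^vendor(/|$)' > list.txt followed by cloc --list-file=list.txt would approximate the behaviour described above.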

rstacruz commented on June 17, 2024

I like that. I've hacked up a script that looks like this:

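# git-cloc: count tracked files, skipping generated and vendored directories
# "$@" (rather than $*) keeps path arguments containing spaces intact
git ls-files "$@" | \
  grep -v -E '(coverage|log|tmp|temp|vendor|fixture|fixtures|dist|cassettes)/' | \
  tr '\n' '\0' | \
  xargs -0 cloc ...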

to be able to use --not-match-d would be great.

controversial commented on June 17, 2024

The way I do this personally is by pushing to remote (git push respects .gitignore of course), then cloning the remote into a separate folder, and then running cloc on that.

Example (... is output I excluded for clarity):

$ git push origin master
... To https://github.com/The-Penultimate-Defenestrator/wikipedia-map.git ...
$ cd ~/Desktop
$ git clone https://github.com/The-Penultimate-Defenestrator/wikipedia-map.git
...
$ cloc wikipedia-map

It's not pretty, but it's certainly a reasonable workaround.

AlDanial commented on June 17, 2024

That's a nice tip, thanks.

My planned implementation will hopefully be simpler (namely a single step), but I've been traveling lately and haven't had time to work on cloc. I'm shooting for a commit to the dev branch for this feature within two weeks.

controversial commented on June 17, 2024

Cool, thanks.

On Mon, Feb 15, 2016 at 11:47 PM AlDanial wrote:

That's a nice tip, thanks. [...]

AlDanial commented on June 17, 2024

That is, in fact, how I plan to implement cloc --vcs=git. Under the hood it calls git ls-files and works with that file list. Similarly --vcs=svn will invoke svn ls -R to get a file list.

This coming Friday I'll have time to implement this.

AlDanial commented on June 17, 2024

Also: if you crank the verbose level to 2 (with -v 2) or more, you'll see exactly which files the --vcs XX command has generated.

I don't know of git repos that have files with spaces in them but I don't expect these to be an issue.

mbovel commented on June 17, 2024

Tested with --vcs=git and --vcs='find . -name *.js', with and without spaces in file names.

Works great, thank you very much!

rstacruz commented on June 17, 2024

neat! curious:

I'm not keen on making cloc do system calls unless there's no other way to do it (cloc currently does system calls to archive tools like tar and zip).

considering --vcs=git uses git ls-files under the hood, what made you change your mind on the point above? there actually /is/ another way (manually do a gitignore-aware directory traversal), though cumbersome.

not that i suggest you take that route (imho outsourcing the work to git ls-files is preferable), just curious on your thought process.

AlDanial commented on June 17, 2024

It's a matter of practicality. The time I have to work on cloc is quite limited. I'd weighed different approaches to parsing .gitignore for six months (July-Dec. 2015) without progress. Other requests for new language support and bug fixes naturally keep coming in and must be attended to. Bottom line was that I saw no feasible alternative to doing the system calls.

@mbovel -- thanks for testing!

AlDanial commented on June 17, 2024

The latest release of cloc is 1.70, haven't gotten to 2.0 yet.
If you run cloc like so

  cloc --version

it will tell you which version you're using.

Get the latest release from https://github.com/AlDanial/cloc/releases

On Fri, Jul 15, 2016 at 11:16 AM, Fernando Montoya wrote:

I have installed cloc 2.0, and I am getting this error:

👉 cloc --vcs git
Unknown option: vcs

bernardoadc commented on June 17, 2024

@suweller I've tried with no success

# ~/.gitconfig
[alias]
  cloc = !cloc --vcs=git

Any idea why? It throws:

Can't create unknown regex: $RE{comment}{C++} at (..)/cloc/lib/cloc line 9619.
...propagated at (..)/cloc/lib/cloc line 4789.

Running it directly does work, though.

suweller commented on June 17, 2024

Both methods work for me now so I can't reproduce your error.
You could try the way I suggested (using ls-files); maybe that solves your issue.

sohailsomani commented on June 17, 2024

This is sufficient. Just putting it in for the next time I need to search for it ;-)

git ls-tree -r master --name-only -- path/you/want | grep -v anything | grep -v ignored | xargs cloc
