gh-language's Introduction

GitHub Language Analyzer

This is an extension to the gh command-line tool for analyzing the count of programming languages used in repositories across a GitHub organization. It retrieves a list of repositories and their associated languages, and then aggregates the data to produce a report of language frequency.
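The aggregation step described above can be illustrated with a minimal Python sketch. This is not the extension's actual code: the count_languages helper, the shape of its input, and the sample data are all hypothetical and exist only to show the idea of counting each language once per repository.

```python
from collections import Counter

def count_languages(repo_languages):
    """Count how many repositories include each language.

    repo_languages maps a repository name to the list of languages
    detected in it (hypothetical input shape).
    """
    counts = Counter()
    for languages in repo_languages.values():
        # Use a set so a language is counted at most once per repository.
        counts.update(set(languages))
    return counts

# Invented sample data, not real output from the extension.
sample = {
    "api": ["Go", "Shell"],
    "web": ["TypeScript", "Shell"],
    "docs": ["Shell"],
}
counts = count_languages(sample)
```

A Counter returns 0 for languages it has never seen, so looking up a language that no repository uses does not raise an error.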

Prerequisites

  1. Install the GitHub CLI: https://github.com/cli/cli#installation
  2. Confirm that you are authenticated with an account that has access to the org you would like to analyze:
gh auth status

Installation

To install this extension, run the following command:

gh extension install CallMeGreg/gh-language

Usage

Count command

Display the count of programming languages used in repos across an organization.

gh language count YOUR_ORG_NAME

Optionally specify the maximum number of repositories to evaluate (--limit, default 100) and/or the number of languages to return (--top, default 10):

gh language count YOUR_ORG_NAME --limit 1000 --top 20
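As a rough illustration of how the two options interact, the sketch below caps the number of repositories evaluated and then keeps only the most common languages. The top_languages function and its input shape are assumptions made for this example; only the default values (100 and 10) come from the extension's help output.

```python
from collections import Counter

def top_languages(repos, limit=100, top=10):
    """Evaluate at most `limit` repos and return the `top` most common languages.

    repos is a list of (name, languages) pairs (hypothetical shape).
    The defaults mirror the documented flag defaults.
    """
    counts = Counter()
    for _name, languages in repos[:limit]:
        # Count each language once per repository.
        counts.update(set(languages))
    return counts.most_common(top)

# Invented sample data for illustration.
repos = [("a", ["Go", "Shell"]), ("b", ["Go"]), ("c", ["Python"])]
result = top_languages(repos, limit=2, top=1)
```

With limit=2 the third repository is never evaluated, so Python does not appear in the counts at all.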

Optionally filter by a specific language (--language); --top is ignored when a language is specified:

gh language count YOUR_ORG_NAME --language Java

Note

The --language flag values are case-sensitive.
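To show what case sensitivity means in practice, here is a small sketch that assumes the filter is an exact string comparison against each repository's language list. That is one way a case-sensitive flag could behave, not a confirmed description of the extension; the repos_with_language helper and the sample data are hypothetical.

```python
def repos_with_language(repo_languages, language):
    """Return the names of repos that list `language`, using exact matching."""
    return sorted(
        name
        for name, languages in repo_languages.items()
        if language in languages  # exact, case-sensitive list membership
    )

# Invented sample data for illustration.
sample = {"billing": ["Java", "Shell"], "site": ["JavaScript"]}
matches = repos_with_language(sample, "Java")
no_matches = repos_with_language(sample, "java")
```

Under this assumption, "java" matches nothing, and "Java" does not match a repository that only contains JavaScript.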

Trend command

Display the breakdown of programming languages used in repos across an organization per year, based on the repo creation date.

gh language trend YOUR_ORG_NAME

Optionally specify the maximum number of repositories to evaluate (--limit, default 100) and/or the number of languages to return (--top, default 10):

gh language trend YOUR_ORG_NAME --limit 1000 --top 20

Optionally filter by a specific language (--language); --top is ignored when a language is specified:

gh language trend YOUR_ORG_NAME --language Java

Note

The --language flag values are case-sensitive.
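The per-year breakdown that the trend command reports can be sketched as a grouping on each repository's creation year. The code below is illustrative only: the languages_by_year helper, the created_at and languages keys, and the sample records are assumptions, not the extension's implementation.

```python
from collections import Counter, defaultdict
from datetime import datetime

def languages_by_year(repos):
    """Group per-language repo counts by the year each repo was created.

    repos is a list of dicts with "created_at" (ISO 8601 string) and
    "languages" keys; this shape is an assumption for illustration.
    """
    by_year = defaultdict(Counter)
    for repo in repos:
        # fromisoformat rejects a trailing "Z" before Python 3.11,
        # so normalize it to an explicit UTC offset first.
        created = datetime.fromisoformat(repo["created_at"].replace("Z", "+00:00"))
        by_year[created.year].update(set(repo["languages"]))
    return by_year

# Invented sample data for illustration.
sample = [
    {"created_at": "2023-05-01T12:00:00Z", "languages": ["Go"]},
    {"created_at": "2023-09-10T08:30:00Z", "languages": ["Go", "Shell"]},
    {"created_at": "2021-01-15T00:00:00Z", "languages": ["Python"]},
]
trend = languages_by_year(sample)
```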

Help

For help, run:

gh language -h
Usage:
  language [command]

Available Commands:
  count       Analyze the count of programming languages used in repos across an organization
  help        Help about any command
  trend       Analyze the trend of programming languages used in repos across an organization over time

Flags:
  -h, --help              help for language
  -L, --language string   The language to filter on
  -l, --limit int         The maximum number of repositories to evaluate (default 100)
  -t, --top int           Return the top N languages (ignored when a language is specified) (default 10)

Use "language [command] --help" for more information about a command.

License

This tool is licensed under the MIT License. See the LICENSE file for details.

gh-language's Issues

trend command output is not in chronological order

Thanks for the new feature in 1.3.0 that adds a trend command to visualize the frequency of languages per year (based on repo creation date).

Is it expected behavior for the output to be in non-chronological order? See screenshot for the output when I ran

gh language trend microsoft

...where the data goes in this order: 2023, 2021, 2018, 2017, 2015, 2024, 2022, 2020, 2019, 2016, 2014

[Screenshot: output of gh language trend microsoft, captured 2024-04-24]
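The report does not identify the cause of the ordering. Whatever the cause, sorting the year keys before printing would give chronological output. A minimal sketch, using a hypothetical years_in_order helper and placeholder values:

```python
def years_in_order(by_year, newest_first=False):
    """Return the year keys of a per-year mapping in chronological order."""
    return sorted(by_year, reverse=newest_first)

# Years in the order reported in the issue; the values are placeholders.
reported = [2023, 2021, 2018, 2017, 2015, 2024, 2022, 2020, 2019, 2016, 2014]
by_year = {year: {} for year in reported}
ordered = years_in_order(by_year)
```

Passing newest_first=True would list the most recent year first instead.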

How many repos were scanned in the analysis?

The output currently displays the maximum number of repos scanned...

Limiting to 100 repositories.

...but doesn't tell us how many were actually scanned.

Could we change it so that instead of output like this:

Limiting to 100 repositories.
Returning the top 10 languages.
3 repos (60%) that include the language: Python
3 repos (60%) that include the language: Shell
3 repos (60%) that include the language: C
2 repos (40%) that include the language: Makefile
2 repos (40%) that include the language: Batchfile
1 repos (20%) that include the language: Assembly
1 repos (20%) that include the language: AMPL
1 repos (20%) that include the language: FreeMarker
1 repos (20%) that include the language: Roff
1 repos (20%) that include the language: XSLT

it would share how many were actually scanned and look like this:

Limiting to 100 repositories.
5 repositories identified.
Returning the top 10 languages.
3 repos (60%) that include the language: Python
3 repos (60%) that include the language: Shell
3 repos (60%) that include the language: C
2 repos (40%) that include the language: Makefile
2 repos (40%) that include the language: Batchfile
1 repos (20%) that include the language: Assembly
1 repos (20%) that include the language: AMPL
1 repos (20%) that include the language: FreeMarker
1 repos (20%) that include the language: Roff
1 repos (20%) that include the language: XSLT

I know it's not much work to figure out that in this example 3 is 60% of 5, but users would probably appreciate being spared that calculation.
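The requested output could be produced along these lines. This is a sketch of the proposal, not the extension's code: format_count_report and its parameters are hypothetical, and the message wording is copied from the example above.

```python
def format_count_report(limit, top, scanned, counts):
    """Build report lines that include the number of repos actually scanned.

    counts is a list of (language, repo_count) pairs, already sorted.
    """
    lines = [
        f"Limiting to {limit} repositories.",
        f"{scanned} repositories identified.",
        f"Returning the top {top} languages.",
    ]
    for language, n in counts[:top]:
        # Guard against division by zero when no repos were found.
        percent = round(100 * n / scanned) if scanned else 0
        lines.append(f"{n} repos ({percent}%) that include the language: {language}")
    return lines

report = format_count_report(100, 10, 5, [("Python", 3), ("Makefile", 2), ("Roff", 1)])
```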

What's the breakdown by year?

The output currently displays the frequency of languages since an organization's inception.

Would it be possible to add an option that displays the frequency of languages used by year? When this option is invoked, the output might look something like the example below. The numbers in this example don't add up; they just show one possible way to communicate the relative frequencies over time.

Analyzing organization: AcmeCo
Limiting to 100 repositories.
Returning the top 10 languages.
-----------------------------------------------------
For repos created in 2023
 88 repos (88%) that include the language: HTML
 77 repos (77%) that include the language: Ruby
 17 repos (17%) that include the language: Shell
 16 repos (16%) that include the language: Python
  4 repos ( 4%) that include the language: SCSS
-----------------------------------------------------
For repos created in 2022
 88 repos (88%) that include the language: HTML
 77 repos (77%) that include the language: Dockerfile
 17 repos (17%) that include the language: CSS
 16 repos (16%) that include the language: Java
-----------------------------------------------------
For repos created in 2021
 88 repos (88%) that include the language: HTML
 77 repos (77%) that include the language: C
 17 repos (17%) that include the language: Shell
 16 repos (16%) that include the language: Python
 11 repos (11%) that include the language: C++

This would be valuable because it would show an organization's decisions about the languages they're invested in over time.
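A layout like the one proposed above could be rendered with right-aligned number columns. The sketch below is hypothetical throughout: format_trend_report, its input shapes, and the separator width are chosen for illustration, and percentages are computed against the number of repos created in each year.

```python
def format_trend_report(by_year, totals):
    """Render per-year language counts, newest year first.

    by_year maps year -> list of (language, repo_count) pairs;
    totals maps year -> number of repos created that year.
    Both shapes are assumptions for illustration.
    """
    lines = []
    for year in sorted(by_year, reverse=True):
        lines.append("-" * 53)
        lines.append(f"For repos created in {year}")
        for language, n in by_year[year]:
            percent = round(100 * n / totals[year])
            # Right-align the numbers so the columns line up.
            lines.append(
                f"{n:>3} repos ({percent:>2}%) that include the language: {language}"
            )
    return lines

lines = format_trend_report(
    {2022: [("HTML", 88)], 2023: [("HTML", 88), ("SCSS", 4)]},
    {2022: 100, 2023: 100},
)
```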
