adamlkl / githubdatavisualisation

License: MIT License

Python 98.49% HTML 0.91% CSS 0.10% JavaScript 0.11% Tcl 0.38%
python d3 dc flask mongodb rest-api bootstrap crossfilter html javascript css

githubdatavisualisation's Introduction

GithubDataVisualisation

Demo Video

Setting up

You will need Flask, PyGithub, and PyMongo for this program to work.

I have removed the access token I used to extract JSON information from GitHub. If you want to try the program, you can request your own token on the GitHub Access Token Generator page.
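A minimal sketch of how a token might be supplied without committing it to the repository, assuming it is stored in an environment variable named GITHUB_TOKEN. That variable name and the helper names read_token, repo_to_record and fetch_org_records are illustrative and not taken from the original app.py. PyGithub is imported lazily so that the pure helpers work without it.

```python
import os


def read_token(env_var="GITHUB_TOKEN"):
    """Return the access token from the environment, or raise if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to a GitHub access token first")
    return token


def repo_to_record(repo):
    """Flatten a PyGithub-style repository object into a plain dict."""
    return {
        "name": repo.name,
        "size": repo.size,  # size as reported by the GitHub API, in kilobytes
        "languages": dict(repo.get_languages()),  # language -> bytes of code
    }


def fetch_org_records(org_name="bloomberg"):
    """Fetch one record per repository in an organization (needs network access)."""
    from github import Github  # PyGithub, imported here so it stays optional

    # Newer PyGithub releases prefer Github(auth=Auth.Token(...)).
    client = Github(read_token())
    org = client.get_organization(org_name)
    return [repo_to_record(repo) for repo in org.get_repos()]
```

Calling fetch_org_records() requires a valid token and network access; repo_to_record can be tried on any object that has name, size and get_languages().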

Running

python app.py

However, simply cloning my repo and running it won't work. As mentioned, you need to install the tools listed above first (for example with pip).

An alternative that I have used is to build the assignment in PyCharm, which avoids downloading large libraries by hand. PyCharm provides convenient interpreters, so you don't have to do much work to set up your working directory and environment.

When a remote Python interpreter is added, the PyCharm helpers are first copied to the remote host. The helpers are needed to run packaging tasks, the debugger, tests and other PyCharm features remotely. Next, the skeletons for binary libraries are generated and copied locally. All the Python library sources are also collected from the Python paths on the remote host and copied locally along with the generated skeletons. Storing the skeletons and all Python library sources locally is required for code resolution and completion to work correctly. PyCharm checks the version of the remote helpers on every remote run, so if you update PyCharm, the new helpers are uploaded automatically and you don't need to recreate the remote interpreter. SFTP support is required for copying the helpers to the server.

Python interpreters can be configured on the following levels:

Current project: selected Python interpreter will be used for the current project.

New project: selected Python interpreter will be used for the new project instead of the default one.

More explanation can be found in the PyCharm documentation.

Assignment Explanation

For this assignment, I decided to compare the size of each repository in the Bloomberg GitHub account, and then the languages each repository uses.

The tools I have used for this assignment are d3.js, dc.js, dc.css, and crossfilter.js.

D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG, and CSS. D3's emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualization components and a data-driven approach to DOM manipulation.

Dc.js is a JavaScript charting library with native crossfilter support, allowing highly efficient exploration of large multi-dimensional datasets (inspired by crossfilter's demo). It leverages d3 to render charts in CSS-friendly SVG format. Charts rendered using dc.js are data-driven and reactive, and therefore provide instant feedback to user interaction.

Dc.js is an easy yet powerful JavaScript library for data visualization and analysis in the browser and on mobile devices.

Crossfilter.js is a JavaScript library for exploring large multivariate datasets in the browser. Crossfilter supports extremely fast (<30ms) interaction with coordinated views, even with datasets containing a million or more records. It was built by Square to power analytics for Square Register, allowing merchants to slice and dice their payment history fluidly.

By leveraging crossfilter's filtering capabilities, I can show the size of each repository (limited here to the top 10, since the Bloomberg GitHub account has far too many repositories to display at once). Clicking a segment of the pie chart shows the languages used in the corresponding repository. (I am trying to make this work with the lines of code per language, but that is not easy with crossfilter.) You can also select languages to display the repositories that use them.
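The filtering described above can be sketched in plain Python. This is only an illustration of the logic, not the crossfilter.js code used by the page. It assumes each record is a dict with name, size and languages keys, that "top 10" means the largest by size, and that selecting several languages matches repositories using at least one of them. The function names and sample data are made up.

```python
def top_repos_by_size(records, limit=10):
    """Return the `limit` largest repositories, biggest first."""
    return sorted(records, key=lambda r: r["size"], reverse=True)[:limit]


def languages_of(records, repo_names):
    """Languages used by the selected repositories (like clicking pie segments)."""
    selected = set(repo_names)
    found = set()
    for record in records:
        if record["name"] in selected:
            found.update(record["languages"])
    return sorted(found)


def repos_using(records, languages):
    """Names of repositories that use at least one of the selected languages."""
    wanted = set(languages)
    return [r["name"] for r in records if wanted & set(r["languages"])]


# Made-up sample data, only to show the shape of the records.
sample = [
    {"name": "alpha", "size": 300, "languages": {"Python": 10, "C++": 5}},
    {"name": "beta", "size": 120, "languages": {"JavaScript": 7}},
    {"name": "gamma", "size": 900, "languages": {"Python": 3}},
]
```

For example, repos_using(sample, ["Python"]) selects alpha and gamma, mirroring the view where a language is picked and the matching repositories are shown.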

If the pie chart comes out empty, chances are that access was blocked while I was extracting the data. I have found two repositories with this problem: Chromium.bb and TypeScript. I have also removed Chromium.bb because its size is too big, more than 10 times that of the second-largest repository, so taking it out restores the balance between the arcs in the pie chart.
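A small sketch of this kind of clean-up, assuming the same record shape (dicts with name, size and languages). The helper names are hypothetical. Dropping records with no language data is one possible way to handle repositories that could not be read, not necessarily what the original code does.

```python
def drop_unusable(records, excluded=("Chromium.bb",)):
    """Remove explicitly excluded repositories and those with no language data."""
    skip = set(excluded)
    return [
        r for r in records
        if r["name"] not in skip and r.get("languages")
    ]


def size_shares(records):
    """Fraction of the pie each repository would occupy, based on its size."""
    total = sum(r["size"] for r in records)
    if total == 0:
        return {}
    return {r["name"]: r["size"] / total for r in records}
```

Comparing size_shares before and after drop_unusable shows how a single very large repository squeezes every other arc in the chart.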

Pie Chart Display

http://127.0.0.1:5000/

These are snapshots of the resulting program.

  • Pie chart displaying size and languages used across all GitHub repos.

  • Pie chart displaying size and languages used in one GitHub repo.

  • Pie chart displaying size and languages used in several GitHub repos combined.

  • Pie chart displaying repositories that use the selected language.

  • Pie chart displaying repositories that use the selected languages.

Repo_Data2.json Link

I have also set up a MongoDB database for the project to store repo_data2.json. The advantage of a database is that I eventually want to display more than just one GitHub repository, which would require a large number of fast queries. These are better left to the database than to the program itself, which could otherwise crash.
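A minimal sketch of how the stored records might be served, assuming a local MongoDB instance with a database named bloombergdata and a collection named repo_data2. Both names are guesses based on the URL path used in this README, and the helper names are hypothetical. Flask and PyMongo are imported lazily so that the query helper can be tried on its own.

```python
def all_documents(collection):
    """Return every stored document without MongoDB's internal _id field."""
    # The projection {"_id": 0} drops ObjectId values, which are not JSON serializable.
    return list(collection.find({}, {"_id": 0}))


def create_app(mongo_uri="mongodb://localhost:27017"):
    """Build a Flask app exposing the stored records (needs Flask and PyMongo)."""
    from flask import Flask, jsonify
    from pymongo import MongoClient

    app = Flask(__name__)
    # Assumed database and collection names, inferred from the URL path.
    collection = MongoClient(mongo_uri)["bloombergdata"]["repo_data2"]

    @app.route("/bloombergdata/repo_data2")
    def repo_data():
        return jsonify(all_documents(collection))

    return app
```

With MongoDB running, create_app().run(port=5000) would start a development server that returns the stored documents as JSON.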

To view the stored JSON values, open the following URL in your browser:

http://127.0.0.1:5000/bloombergdata/repo_data2

The result should come as below.

Remarks

This program is not yet what I originally envisioned, so there will be more upgrades, such as showing it on a dashboard, displaying more data such as contributors, and showing multiple repos together. So stay tuned.

githubdatavisualisation's People

Contributors

adamlkl
