nogibjj / aad64_individual-project1


Continuous Integration using GitHub Actions of Python Data Science Project

Dockerfile 0.68% Makefile 0.31% Python 3.44% Jupyter Notebook 95.56%

aad64_Individual-Project1

Summary

This project is an example of Continuous Integration (CI) for Python script and project development on GitHub. It also serves as a template for future projects, since it outlines the steps a project needs to follow to keep its code consistent, robust, and high quality. On every git push, the CI workflow installs dependencies, lints, formats, and tests the entire project (both the Python script and the Jupyter notebook).

[Badges: four GitHub Actions workflow status badges]

YouTube Video:

Click Here

Contents:

  • Jupyter Notebook with:
    • Cells that perform descriptive statistics using Polars or Pandas.
    • Tested with the nbval plugin for pytest.
  • Python Script performing the same descriptive statistics using Polars or Pandas.
  • lib.py file that shares the common code between the script and notebook
  • Makefile with the following:
    • Run all tests (must test notebook and script and lib)
    • Formats code with Python black
    • Lints code with Ruff
    • Installs code via: pip install -r requirements.txt
  • test_script.py to test script
  • test_lib.py to test library
  • Pinned requirements.txt
  • GitHub Actions performs all four Makefile commands with badges for each one in the README.md

More about the Functionality of this Project:

For this project, I created descriptive statistics and visualization functions in the lib.py file as follows:

  • Calculating the mean (rounded to two decimal places),
  • Calculating the median,
  • Calculating the standard deviation (rounded to two decimal places),

[Image: example output of the mean, median, and standard deviation functions]

  • Displaying the overall summary statistics of a dataset.

[Image: summary statistics output]

  • Visualizing data as a violin plot. It plots individuals' Risk Preferences (y-axis) against their Socioeconomic Status (x-axis), split by Gender (1: Male, 2: Female).

The main.py file then runs these descriptive statistics and the visualization.
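As a rough illustration of the statistics helpers described above, here is a minimal sketch using pandas. The function names (`column_mean`, `column_median`, `column_std`, `summary_statistics`), the column names, and the sample values are all hypothetical; the actual lib.py may be organised differently or use Polars instead.

```python
import pandas as pd


def column_mean(df: pd.DataFrame, column: str) -> float:
    """Mean of a numeric column, rounded to two decimal places."""
    return round(float(df[column].mean()), 2)


def column_median(df: pd.DataFrame, column: str) -> float:
    """Median of a numeric column."""
    return float(df[column].median())


def column_std(df: pd.DataFrame, column: str) -> float:
    """Sample standard deviation (pandas default, ddof=1), rounded to two decimals."""
    return round(float(df[column].std()), 2)


def summary_statistics(df: pd.DataFrame) -> pd.DataFrame:
    """Overall summary statistics (count, mean, std, quartiles) per numeric column."""
    return df.describe()


if __name__ == "__main__":
    # Made-up values purely for illustration; this is not the project's dataset.
    data = pd.DataFrame({"risk": [1, 2, 3, 4, 10], "ses": [1, 1, 2, 2, 3]})
    print(column_mean(data, "risk"))    # 4.0
    print(column_median(data, "risk"))  # 3.0
    print(column_std(data, "risk"))     # 3.54
    print(summary_statistics(data))
```

Keeping these helpers in one module is what lets the script and the notebook import the same code rather than duplicating it.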

My test files then check the functionality of the defined functions (in test_lib.py) and the validity of the inputs to each function (in test_script.py).
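The two kinds of checks mentioned above might look roughly like the following pytest-style sketch. The helper `column_mean` is a hypothetical stand-in for a lib.py function, and the data is made up; the project's real test files may assert different things.

```python
import pandas as pd


def column_mean(df: pd.DataFrame, column: str) -> float:
    """Hypothetical stand-in for a lib.py helper: mean rounded to two decimals."""
    return round(float(df[column].mean()), 2)


def test_column_mean_rounds_to_two_decimals():
    """Functionality check: the result is rounded as documented."""
    df = pd.DataFrame({"risk": [1, 2, 2]})
    # 5 / 3 = 1.666..., which rounds to 1.67
    assert column_mean(df, "risk") == 1.67


def test_column_mean_rejects_unknown_column():
    """Input-validity check: a column that does not exist is an error."""
    df = pd.DataFrame({"risk": [1, 2, 2]})
    try:
        column_mean(df, "missing")
    except KeyError:
        return  # pandas raises KeyError when the column is absent
    raise AssertionError("expected KeyError for an unknown column")
```

pytest collects any function whose name starts with `test_`, so a file like this runs under the same Makefile test target as the notebook checks.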

Workflows:

Below are screenshots showing that my project passes all formatting, linting, and testing checks. Each workflow's status can also be seen in the badges at the top of the README.md file.

Linting

  • With Ruff:

[Screenshot: Ruff linting output]

  • With Pylint (to check score):

[Screenshot: Pylint score output]

Formatting

[Screenshot: formatting output]

Testing

[Screenshot: test run output]

Contributors

aaryadesai1
