aad64_Individual-Project1

Continuous Integration of a Python Data Science Project using GitHub Actions

Summary

This project is an example of using Continuous Integration (CI) in Python script and project development on GitHub. It also serves as a template for future projects, since it outlines the steps a project needs to follow to keep its code consistent, robust, and high quality. The CI workflow ensures that on every git push the code goes through dependency installation, formatting, linting, and testing of the entire project (both the Python script and the Jupyter notebook).

YouTube Video:

Click Here

- Jupyter Notebook with:
  - Cells that perform descriptive statistics using Polars or Pandas
  - Tests run with the nbval plugin for pytest
- Python script performing the same descriptive statistics using Polars or Pandas
- lib.py file that shares the common code between the script and the notebook
- Makefile with the following:
  - Runs all tests (must test notebook, script, and lib)
  - Formats code with Python black
  - Lints code with Ruff
  - Installs code via: pip install -r requirements.txt
- test_script.py to test the script
- test_lib.py to test the library
- Pinned requirements.txt
- GitHub Actions performs all four Makefile commands, with a badge for each one in the README.md

More about the Functionality of this Project:

For this project, I created descriptive statistics and visualization functions in the lib.py file as follows:

- Calculating the mean (rounded to two decimal places)
- Calculating the median
- Calculating the standard deviation (rounded to two decimal places)
- Displaying the overall summary statistics of a dataset
- Visualizing data as a violin plot. It plots individuals' Risk Preferences (y-axis) by their Socioeconomic Status (x-axis), split by Gender (1: Male, 2: Female).
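As a rough illustration of the statistics helpers listed above, here is a minimal sketch using Pandas. The function names (`calc_mean`, `calc_median`, `calc_std`, `summary_stats`), the `risk` column, and the toy data are all hypothetical and are not taken from the project's actual lib.py. The violin plot is left out to keep the sketch dependency-light.

```python
import pandas as pd


def calc_mean(df: pd.DataFrame, column: str) -> float:
    """Return the mean of a column, rounded to two decimal places."""
    return round(float(df[column].mean()), 2)


def calc_median(df: pd.DataFrame, column: str) -> float:
    """Return the median of a column."""
    return float(df[column].median())


def calc_std(df: pd.DataFrame, column: str) -> float:
    """Return the sample standard deviation (pandas default, ddof=1), rounded to two decimals."""
    return round(float(df[column].std()), 2)


def summary_stats(df: pd.DataFrame) -> pd.DataFrame:
    """Return count, mean, std, min, quartiles, and max for each numeric column."""
    return df.describe()


# Hypothetical toy data, not the project's dataset.
df = pd.DataFrame({"risk": [1.0, 2.0, 3.0, 4.0, 10.0]})

print(calc_mean(df, "risk"))    # 4.0
print(calc_median(df, "risk"))  # 3.0
print(calc_std(df, "risk"))     # 3.54
print(summary_stats(df))
```

Keeping helpers like these in one shared module is what lets the script and the notebook call the same code.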

These descriptive statistics and visualization are then performed in my main.py file.

My test files then check the functionality of the defined functions (in test_lib.py) and the validity of the inputs to each function (in test_script.py).
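The two kinds of checks described here might look roughly like the sketch below: one test for a function's result and one for how it handles an invalid input. The `calc_mean` stand-in, its `ValueError` on an unknown column, and the test names are assumptions made for illustration. The project's real tests and validation rules may differ.

```python
import pandas as pd


def calc_mean(df: pd.DataFrame, column: str) -> float:
    """Hypothetical stand-in for a lib.py helper: validate the column, then return the rounded mean."""
    if column not in df.columns:
        raise ValueError(f"unknown column: {column}")
    return round(float(df[column].mean()), 2)


def test_calc_mean_value():
    # Functionality check: (1 + 2 + 4) / 3 rounds to 2.33.
    df = pd.DataFrame({"risk": [1, 2, 4]})
    assert calc_mean(df, "risk") == 2.33


def test_calc_mean_rejects_unknown_column():
    # Input-validity check: a missing column should raise ValueError.
    df = pd.DataFrame({"risk": [1, 2, 4]})
    try:
        calc_mean(df, "missing")
    except ValueError:
        return
    raise AssertionError("expected ValueError for an unknown column")
```

Because the functions start with `test_` and use plain `assert`, pytest would collect and run them without any extra setup.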

Workflows:

Below are screenshots to show that my project is passing all formatting, linting, and tests. However, each workflow's validity can also be seen in the badges at the top of the README.md file.
