This project is an example of using Continuous Integration (CI) for Python script and project development on GitHub. It also acts as a template for future projects, since it provides a clear outline of the steps a project needs to follow to maintain consistency, robustness, and quality in the code. The CI workflow ensures that on every git push the project goes through dependency installation, linting, formatting, and testing of the entire codebase (both the Python script and the Jupyter notebook).
- Jupyter Notebook with:
  - Cells that perform descriptive statistics using Polars or Pandas.
  - Tested by using nbval plugin for pytest
- Python Script performing the same descriptive statistics using Polars or Pandas.
- lib.py file that shares the common code between the script and notebook
- Makefile with the following:
  - Run all tests (must test notebook and script and lib)
  - Formats code with Python black
  - Lints code with Ruff
  - Installs code via: pip install -r requirements.txt
- test_script.py to test script
- test_lib.py to test library
- Pinned requirements.txt
- GitHub Actions performs all four Makefile commands with badges for each one in the README.md
For this project, I created descriptive statistics and visualization functions in the lib.py file as follows:
- Calculating the mean (rounded to two decimal places),
- Calculating the median,
- Calculating the standard deviation (rounded to two decimal places),
- Displaying the overall summary statistics of a dataset.
- Visualizing data in the form of a violin plot. It plots individuals' Risk Preferences (y-axis) against their Socioeconomic Status (x-axis), split by Gender (1: Male, 2: Female).
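As a rough illustration of the descriptive statistics functions listed above, here is a minimal sketch assuming Pandas is the chosen library. The function names (calculate_mean, calculate_median, calculate_std, summary_statistics), the column name risk, and the example data are all hypothetical and are not taken from the actual lib.py. The violin plot is left out to keep the sketch free of plotting dependencies.

```python
import pandas as pd


def calculate_mean(df: pd.DataFrame, column: str) -> float:
    """Return the mean of `column`, rounded to two decimal places."""
    return round(float(df[column].mean()), 2)


def calculate_median(df: pd.DataFrame, column: str) -> float:
    """Return the median of `column`."""
    return float(df[column].median())


def calculate_std(df: pd.DataFrame, column: str) -> float:
    """Return the sample standard deviation (pandas default, ddof=1),
    rounded to two decimal places."""
    return round(float(df[column].std()), 2)


def summary_statistics(df: pd.DataFrame) -> pd.DataFrame:
    """Return count, mean, std, min, quartiles, and max for numeric columns."""
    return df.describe()


# Tiny made-up dataset, purely for illustration (not the project's data).
example = pd.DataFrame({"risk": [1.0, 2.0, 4.0, 7.0]})

print(calculate_mean(example, "risk"))    # 3.5
print(calculate_median(example, "risk"))  # 3.0
print(calculate_std(example, "risk"))     # 2.65
print(summary_statistics(example))
```

Keeping these helpers in one module is what lets the script and the notebook call the same code instead of duplicating it.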
These descriptive statistics and the visualization are then run in my main.py file.
My test files are then used to test the functionality of the defined functions (in the test_lib.py file) and the validity of the inputs to each function (in the test_script.py file).
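The two kinds of checks described above, functionality and input validity, could look roughly like the sketch below. It is self-contained and uses only the standard library, so rounded_mean is a hypothetical stand-in rather than a function from the real lib.py, and its empty-input rule is an assumption made for illustration. The test functions use plain assert statements, which pytest collects by their test_ prefix. With pytest installed, pytest.raises would be the more idiomatic way to write the second test.

```python
from statistics import mean


def rounded_mean(values):
    """Hypothetical stand-in for a lib.py function:
    the mean of `values`, rounded to two decimal places."""
    if not values:
        # Assumed validation rule, for illustration only.
        raise ValueError("values must not be empty")
    return round(mean(values), 2)


def test_rounded_mean_value():
    # Functionality check, in the spirit of test_lib.py.
    assert rounded_mean([1, 2, 4]) == 2.33


def test_rounded_mean_rejects_empty_input():
    # Input-validity check, in the spirit of test_script.py.
    try:
        rounded_mean([])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for empty input")
```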
Below are screenshots showing that my project passes all formatting, linting, and testing checks. Each workflow's status can also be seen in the badges at the top of the README.md file.
- With Ruff:
- With Pylint (to check score):