Code for reproducing the results of the paper Fair Generalized Linear Models with a Convex Penalty, published at ICML 2022. [paper/arxiv]
Install the requirements (`requirements.txt`) with `pip`, not `conda`.
git clone https://github.com/hyungrok-do/fair-glm-cvx
cd fair-glm-cvx
pip install -r requirements.txt
The code was developed with the following package versions:
Python==3.x
scipy==1.7.3
numpy==1.21.5
pandas==1.3.5
fairlearn==0.7.0
scikit-learn==1.0.2
matplotlib==3.5.1
pyyaml==6.0
wget==3.2
cvxpy==1.2.1
dccp==1.0.3
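
Because results may depend on these exact versions, it can help to compare the active environment against the list before running anything. The following is a minimal sketch, not part of the repository: the `PINNED` dictionary and the `check_versions` helper are illustrative names introduced here, and the version strings are copied from the list above (the Python entry is left out).

```python
from importlib import metadata

# Versions copied from the list above; the Python entry is omitted.
# PINNED and check_versions are illustrative names, not part of the repository.
PINNED = {
    "scipy": "1.7.3",
    "numpy": "1.21.5",
    "pandas": "1.3.5",
    "fairlearn": "0.7.0",
    "scikit-learn": "1.0.2",
    "matplotlib": "3.5.1",
    "pyyaml": "6.0",
    "wget": "3.2",
    "cvxpy": "1.2.1",
    "dccp": "1.0.3",
}


def check_versions(pinned):
    """Return {package: (expected, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_versions(PINNED).items():
        found = installed if installed is not None else "not installed"
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means every listed package matches its pinned version.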
Note that cvxpy
is required for
and
dccp
is required for
Running `bash reproduce.sh` reproduces the paper results, except for the discretization plots. To reproduce Figure 2 in Appendix B, run `python discretization.py`.
If you are working on an HPC cluster with `slurm`, you may use `sbatch reproduce.s`. Before executing it, you may want to create a directory for the log files: `mkdir -p ./logs`.
You may substitute a dataset's name (see the Argument column of the table below) for `DATASET`:
python experiment.py --dataset DATASET
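
To run the same command for several datasets in a row, a small driver script is one option. This is a sketch under stated assumptions rather than a script shipped with the repository: `DATASETS`, `build_command`, and `run_all` are names introduced here, the dataset arguments are copied from the Argument column of the table, and the script is assumed to be started from the repository root. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Dataset arguments copied from the Argument column of the dataset table.
DATASETS = [
    "adult", "arrhythmia", "compas", "drug_consumption", "german_credit",
    "crime", "lsac", "parkinsons_updrs", "student_performance",
    "drug_consumption_multi", "obesity", "hrs",
]


def build_command(dataset):
    """Build the `python experiment.py --dataset DATASET` command as a list."""
    return [sys.executable, "experiment.py", "--dataset", dataset]


def run_all(datasets, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for dataset in datasets:
        cmd = build_command(dataset)
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the loop at the first failing experiment.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(DATASETS, dry_run=True)
```

Calling `run_all(DATASETS, dry_run=False)` executes the experiments one after another.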
The configuration for each experiment can be found in the `yaml` files in the `configs` folder.
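
To inspect a configuration before running an experiment, the files can be listed and loaded with PyYAML, which is already among the requirements. The sketch below makes no assumption about file names or keys inside the files; `find_config_files` and `load_config` are hypothetical helpers introduced here, and only `yaml.safe_load` comes from PyYAML.

```python
from pathlib import Path


def find_config_files(config_dir="configs"):
    """Return the sorted .yaml/.yml files under config_dir (empty if it is missing)."""
    root = Path(config_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*") if p.suffix in {".yaml", ".yml"})


def load_config(path):
    """Parse one configuration file into plain Python objects."""
    import yaml  # PyYAML; imported here so listing files works without it

    with open(path, "r") as f:
        return yaml.safe_load(f)


if __name__ == "__main__":
    for path in find_config_files():
        config = load_config(path)
        # Only the container type is shown; the keys depend on the repository.
        print(path, type(config).__name__)
```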
We use 11 datasets in our experiments: 8 from the UCI Machine Learning Repository, plus COMPAS, LSAC, and HRS.
Outcome | Dataset | Dataset Class Name | Argument | Sensitive Attribute | #instances | #features |
---|---|---|---|---|---|---|
Binary | Adult | AdultDataset | `adult` | Gender (2) | 45,222 | 34 |
Binary | Arrhythmia | ArrhythmiaDataset | `arrhythmia` | Gender (2) | 418 | 80 |
Binary | COMPAS | COMPASDataset | `compas` | Race (4) | 6,172 | 11 |
Binary | Drug Consumption | DrugConsumptionBinaryDataset | `drug_consumption` | Race (2) | 1,885 | 25 |
Binary | German Credit | GermanCreditDataset | `german_credit` | Gender (2) | 1,000 | 46 |
Continuous | Communities and Crime | CrimeDataset | `crime` | Race (3) | 1,993 | 97 |
Continuous | Law School | LSACDataset | `lsac` | Race (5) | 20,715 | 7 |
Continuous | Parkinsons Telemonitoring | ParkinsonsUPDRSDataset | `parkinsons_updrs` | Gender (2) | 5,875 | 25 |
Continuous | Student Performance | StudentPerformanceDataset | `student_performance` | Gender (2) | 649 | 39 |
Multiclass | Drug Consumption | DrugConsumptionMultiDataset | `drug_consumption_multi` | Race (2) | 1,885 | 25 |
Multiclass | Obesity | ObesityDataset | `obesity` | Gender (2) | 2,111 | 23 |
Count | Health & Retirement Study | HRSDataset | `hrs` | Race (4) | 12,774 | 23 |
We provide implementations of several linear model-based fair approaches (or their linear versions).
Method | Model Class Name | Reference |
---|---|---|
Fair Constraint | FairnessConstraintModel | Zafar et al., 2017 (AISTATS) [code] |
Disparate Mistreatment | DisparateMistreatmentModel | Zafar et al., 2017 (WWW) [code] |
Squared Difference Penalizer | SquaredDifferenceFairLogistic | Bechavod et al., 2017 |
Group Fairness / Individual Fairness | ConvexFrameworkModel | Berk et al., 2017 |
Independence Measured by HSIC | HSICLinearRegression | Perez-Suay et al., 2017 |
Fair Empirical Risk Minimization | LinearFERM | Donini et al., 2018 [code] |
Reductions Approach | ReductionsApproach | Agarwal et al., 2018, 2019 [code] |
General Fair Empirical Risk Minimization | GeneralFairERM | Oneto et al., 2020 |
Fair Generalized Linear Models | FairGeneralizedLinearModel | Do et al., 2022 |
Please cite as:
@InProceedings{pmlr-v162-do22a,
title = {Fair Generalized Linear Models with a Convex Penalty},
author = {Do, Hyungrok and Putzel, Preston and Martin, Axel S and Smyth, Padhraic and Zhong, Judy},
booktitle = {Proceedings of the 39th International Conference on Machine Learning},
pages = {5286--5308},
year = {2022},
editor = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
volume = {162},
series = {Proceedings of Machine Learning Research},
month = {17--23 Jul},
publisher = {PMLR},
url = {https://proceedings.mlr.press/v162/do22a.html}
}