Lessons_for_adversarial_debiasing

This repository contains code and models for the paper: Disentangling Document Topic and Author Gender in Multiple Languages: Lessons for Adversarial Debiasing.

@inproceedings{dayanik21:_disen_docum_topic_author_gender_multip_languag,
  author = {Dayanik, Erenay and Padó, Sebastian},
  biburl = {https://puma.ub.uni-stuttgart.de/bibtex/29f3e2e70efa78c0dd97ae2f4b2f071ac/sp},
  booktitle = {Proceedings of the EACL WASSA workshop},
  note = {To appear},
  title = {Disentangling Document Topic and Author Gender in Multiple Languages: Lessons for Adversarial Debiasing},
  year = 2021
}

Installation

$ git clone https://github.com/wassa21/adv.git
$ cd adv
$ pip install -r requirements.txt

Data

See ./data for information about the dataset.

Experiments

Table 3: F1 scores for topic classification

$ cd src/topic_classification
$ bash run.sh

Table 4: F1 scores for gender classification

$ cd src/gender_classification
$ bash run.sh

Figure 3 (Right): F1 scores for topic classification with adversarial author gender training

$ cd src/topic_classification_with_adv_gender
$ bash run.sh

Figure 4 (Right): F1 scores for author gender classification with adversarial topic training

$ cd src/gender_classification_with_adv_topic
$ bash run.sh

Evaluation

Each run.sh script above saves the model with the best weighted F-score to lessons_for_adversarial_debiasing/models and writes its predictions on the test set to lessons_for_adversarial_debiasing/outputs.
By default, prediction file names follow this template:

{LANG}_{ix}_BERT_SUM_MLP_{DATE}_best_model_outputs.csv

LANG: 'de', 'es', 'fr' or 'tr'.

ix: 0, 1, 2, 3 or 4, identifying one of the five randomly generated test sets.

DATE: the system date and time at which the script was run.
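
The template above can be parsed to recover the language, test set index and run time from a prediction file name. The following is a minimal sketch, not code from this repository: the function name parse_prediction_filename is hypothetical, and the DATE layout (YYYY-MM-DD_HH-MM-SS) is an assumption inferred from the example file name given for src/evaluate.py.

```python
import re
from datetime import datetime

# Mirrors the template {LANG}_{ix}_BERT_SUM_MLP_{DATE}_best_model_outputs.csv.
# Assumption: DATE looks like 2020-05-25_21-45-03, as in the README example.
PREDICTION_FILE = re.compile(
    r"^(?P<lang>de|es|fr|tr)_(?P<ix>[0-4])_BERT_SUM_MLP_"
    r"(?P<date>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})_best_model_outputs\.csv$"
)


def parse_prediction_filename(name):
    """Return (lang, ix, timestamp), or None if the name does not fit the template."""
    match = PREDICTION_FILE.match(name)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("date"), "%Y-%m-%d_%H-%M-%S")
    return match.group("lang"), int(match.group("ix")), timestamp
```

Returning None for non-matching names makes it easy to skip unrelated files when scanning the outputs directory.
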

To compute the weighted F-score for these prediction files, use src/evaluate.py. It expects three arguments:

argv[1]: path of prediction file (Example: de_1_BERT_SUM_MLP_2020-05-25_21-45-03_best_model_outputs.csv)

argv[2]: task type (either gender or topic)

argv[3]: is_adv (either true or false)

For example, to evaluate the predictions of a gender classifier (Table 4) one can use the following command:

$ python evaluate.py GenderPredictor_BERT_SUM_MLP_2020-05-25_21-45-03 gender false

This command evaluates the gender classifiers trained on the DE, ES, FR and TR transcripts of TED talks.
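
For readers unfamiliar with the metric, a weighted F-score is the average of the per-class F1 scores, each weighted by the number of gold instances of that class. The sketch below illustrates that definition in plain Python. It is not the code in src/evaluate.py, and the function name weighted_f1 and its list-based inputs are assumptions made for illustration. The definition matches what scikit-learn's f1_score computes with average="weighted"; whether the repository uses that function is not stated here.

```python
from collections import Counter


def weighted_f1(gold, pred):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(gold)  # number of gold instances per class
    total = len(gold)
    score = 0.0
    for label, n_gold in support.items():
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        n_pred = sum(1 for p in pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_gold
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        score += f1 * n_gold / total  # weight each class by its share of the gold labels
    return score
```

Because each class is weighted by its support, a classifier that ignores a rare class is penalized less than it would be under a macro-averaged F-score.
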

Use src/evaluate_mb.py to evaluate the majority baseline; it takes the same command-line arguments.
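
A majority baseline assigns the single most frequent label to every test item. The sketch below shows the idea only: the function name majority_baseline is hypothetical, and taking the majority label from a separate list of training labels is an assumption, since this README does not say which labels src/evaluate_mb.py uses to determine the majority class.

```python
from collections import Counter


def majority_baseline(train_labels, n_test):
    """Predict the most frequent training label for each of n_test items."""
    # Assumption: the majority class is estimated from the training labels.
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    return [majority_label] * n_test
```

Scoring these constant predictions with the same weighted F-score as the trained models gives the reference point against which the classifiers can be compared.
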
