This project forked from spring-epfl/disparate-vulnerability

Accompanying code for "Disparate Vulnerability to Membership Inference Attacks"

Home Page: https://arxiv.org/abs/1906.00389

License: MIT License

Disparate Vulnerability to Membership Inference Attacks

This is the accompanying code to the paper "Disparate Vulnerability to Membership Inference Attacks", which appears in PETS 2022.

The code reproduces all the experiments in the paper, along with the corresponding plots and tables.

Setup

Manual

System Requirements. You need Python 3.8 and poetry 1.1.8. You can install poetry, e.g., as follows:

pip install --user poetry==1.1.8

To reproduce the plots exactly, you also need a LaTeX distribution with certain packages. On a Debian-based system, these can be installed as follows:

sudo apt install texlive-latex-extra cm-super texlive-science dvipng

Environment. Use poetry to set up the Python environment:

poetry install

Data. We use the ADULT and Texas Hospital Discharge datasets. The ADULT dataset is checked into the repository; to download the Texas Hospital Discharge data, run:

make data

Notebooks sync. We use jupytext to automatically convert Jupyter notebooks to Python scripts and keep them in sync. To generate the notebooks initially:

make sync

Docker

Alternatively, you can build and run the provided Docker image:

docker build -t dv .
docker run -it --rm dv <command such as 'make tests'>

The Docker image already includes all the steps of the manual setup.

Testing the setup

To test that the setup works as expected, you can use:

make tests

This runs the same scripts that are used for the paper experiments, but with fewer models (3 instead of the 200 needed for the full reproduction). The tests can take several minutes to run. They have failed only if the command terminates unsuccessfully; warnings are OK.
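The pass/fail rule above depends only on the exit status of the command, not on what it prints. If you want to script that check, a minimal sketch in Python could look like the following. The helper name `command_succeeded` is hypothetical and not part of the repository.

```python
import subprocess


def command_succeeded(cmd):
    """Return True if `cmd` exits with status 0.

    Output on stdout/stderr (including warnings) is captured and ignored;
    only the exit status decides success. `cmd` is a list of arguments.
    """
    completed = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return completed.returncode == 0
```

For example, calling `command_succeeded(["make", "tests"])` from the repository root would return True when the test run exits cleanly, even if warnings were printed along the way.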

Modules and scripts

The repo contains the following relevant modules and directories:

  • mia.py - Implementations of Membership Inference Attacks.
  • model_zoo.py - Definitions of target ML models used in experiments.
  • plot_params.py - Setup of the plot style parameters.
  • plotting.py - Plotting utilities.
  • utils.py - Misc. utilities.
  • data/ - The directory in which make data stores the data files.
  • loaders/ - The directory that contains modules for loading the datasets.
  • results/ - The directory where the experiment data are saved.
  • images/ - The directory where the plots are saved.

The Jupyter notebooks are committed as Python scripts in the root folder (see Notebooks sync above).

Reproducing the paper results

Launching the Jupyter server

To reproduce the paper results, you need to launch the Jupyter server:

poetry run jupyter-notebook .

The command will output instructions on accessing the server.

If using the Docker container, you can launch the Jupyter server like so:

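docker run -it --rm -p <port>:<port> dv jupyter-notebook --ip=0.0.0.0 --port=<port>

This runs the Jupyter server in the container and publishes it on the given port (e.g., 8888) of your host machine. Note that the -p option must come before the image name; anything after the image name is passed to the container as the command.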

Full reproduction

To reproduce all the experimental data, tables, and plots, you need to execute each of the notebooks in the root folder using the launched Jupyter server. If you have not used Jupyter before, you can check this tutorial to learn how to do this.

Executing the notebooks re-runs the experiments, which can take anywhere from 20 minutes to about 6 hours depending on the experiment and the hardware.

Using experimental data from the paper

The reproduction will not be exact, as scikit-learn is not deterministic. For this reason, the results/ folder contains the experimental data used in the paper. Using this data, you can reproduce the tables and plots from the paper without re-running the experiments.

To do so, run the relevant notebook (see Modules and scripts) using the Jupyter server, with one modification: set RESTORE_SAVED_DATA to True (if the notebook uses this flag). The plots and tables will be displayed inline in the respective cells.
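To illustrate the idea behind such a flag, here is a minimal sketch of a restore-or-recompute pattern. It is not the notebooks' actual code: the `get_results` helper, the JSON file format, and the example path are all assumptions made for illustration. Only the flag name RESTORE_SAVED_DATA comes from the repository.

```python
import json
import os

RESTORE_SAVED_DATA = True  # set to False to re-run the experiment


def get_results(path, run_experiment, restore=RESTORE_SAVED_DATA):
    """Load saved results from `path` if `restore` is set, else recompute.

    `run_experiment` is a hypothetical zero-argument callable returning
    JSON-serializable results. Fresh results are saved to `path` so that a
    later call with restore=True can reuse them.
    """
    if restore and os.path.exists(path):
        with open(path) as f:
            return json.load(f)

    # Either restoring was not requested or no saved file exists yet.
    results = run_experiment()
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w") as f:
        json.dump(results, f)
    return results
```

With this pattern, a cell that plots a table or figure can call `get_results(...)` unconditionally, and the flag alone decides whether the expensive experiment is executed or the saved data is read back.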

Contributors

m-yaghini, bogdan-kulynych, moyaghini
