This project forked from euranova/daema

From the paper "DAEMA: Denoising Autoencoder with Mask Attention"

License: MIT License

DAEMA: Denoising Autoencoder with Mask Attention

This repository contains the code used for the paper DAEMA: Denoising Autoencoder with Mask Attention. The code documentation, generated with Sphinx, is available here.

Please cite as

@article{tihon2021daema,
  title={DAEMA: Denoising Autoencoder with Mask Attention},
  author={Tihon, Simon and Javaid, Muhammad Usama and Fourure, Damien and Posocco, Nicolas and Peel, Thomas},
  journal={arXiv preprint arXiv:2106.16057},
  year={2021}
}

How to set up the environment

On a Local Machine

Create and activate a conda environment with Python 3.8.2:

conda create --name <env-name> python=3.8.2
conda activate <env-name>

Install the libraries listed in requirements.txt:

pip install -r requirements.txt

Run the code

cd src
python run.py

With Docker

The repository also contains a Dockerfile for running the code in a container:

docker build -t <image_name>:<tag> .
docker run -t --name <container_name> <image_name>:<tag> <experiment-to-run>

Example:

docker build -t daema:latest .
docker run -t --name daema_container daema:latest python run.py

Test your installation

You can test your installation by running the test suite from the repository root:

PYTHONPATH=src/ pytest tests

How to reproduce the results of the paper

MCAR state-of-the-art comparison:

  • DAEMA: python run.py
  • DAE: python run.py --daema_attention_mode no --daema_ways 1
  • AimNet: python run.py --model Holoclean --batch_size 0 --lr 0.05 --metric_steps 18 19 20 21 22
  • MIDA: python run.py --model MIDA --batch_size -1 --metric_steps 492 494 496 498 500 --scaler MinMax
  • MissForest: python run.py --model MissForest --metric_steps 0 --scaler MinMax
  • Mean: python run.py --model Mean --metric_steps 0
  • Real: python run.py --model Real --metric_steps 0
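
To run all seven configurations in sequence, the commands above can be collected in a small driver script. The following is only a minimal sketch and is not part of the repository: `CONFIGS`, `build_command` and `run_all` are hypothetical names, and it assumes the script is launched from the `src` directory so that `run.py` resolves. The flags themselves are copied verbatim from the list.

```python
import subprocess
import sys

# Extra arguments for each method, copied from the list above.
CONFIGS = {
    "DAEMA": [],
    "DAE": ["--daema_attention_mode", "no", "--daema_ways", "1"],
    "AimNet": ["--model", "Holoclean", "--batch_size", "0", "--lr", "0.05",
               "--metric_steps", "18", "19", "20", "21", "22"],
    "MIDA": ["--model", "MIDA", "--batch_size", "-1",
             "--metric_steps", "492", "494", "496", "498", "500",
             "--scaler", "MinMax"],
    "MissForest": ["--model", "MissForest", "--metric_steps", "0",
                   "--scaler", "MinMax"],
    "Mean": ["--model", "Mean", "--metric_steps", "0"],
    "Real": ["--model", "Real", "--metric_steps", "0"],
}


def build_command(name, extra_args=()):
    """Return the argument list for one method, plus optional shared flags."""
    return [sys.executable, "run.py", *CONFIGS[name], *extra_args]


def run_all(extra_args=()):
    """Run every configuration in turn and stop at the first failure."""
    for name in CONFIGS:
        print(f"Running {name}")
        subprocess.run(build_command(name, extra_args), check=True)
```

Calling `run_all()` from `src` launches the runs one after another. The `extra_args` parameter is there so that a flag shared by all methods can be appended once instead of being repeated in every entry.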

MNAR state-of-the-art comparison:

  • Same as above, but with an additional argument: --ms_setting mnar

Missingness proportions:

  • Same as above, but with an additional argument (e.g. for 10% missingness): --ms_prop 0.1
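
For intuition about what a missingness proportion means under MCAR: every entry is dropped with the same probability, independently of the data values. The sketch below shows one common way to simulate this using only the standard library. It is not the repository's masking code, `mcar_mask` is a hypothetical name, and the repository's own procedure (in particular for MNAR) may differ.

```python
import random


def mcar_mask(n_rows, n_cols, ms_prop, seed=0):
    """Return a mask in which True marks a missing entry.

    Each entry is dropped independently with probability ms_prop,
    so missingness does not depend on the values themselves.
    """
    rng = random.Random(seed)
    return [[rng.random() < ms_prop for _ in range(n_cols)]
            for _ in range(n_rows)]


mask = mcar_mask(n_rows=1000, n_cols=10, ms_prop=0.1)
observed_rate = sum(map(sum, mask)) / (1000 * 10)
# The realised fraction is close to ms_prop but generally not equal to it.
print(f"fraction missing: {observed_rate:.3f}")
```

Because each entry is sampled independently, the realised fraction of missing values only approximates the requested proportion on a finite table.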

Ablation study part 1 (ultimately not included in the paper):

  • Full: python run.py
  • Classic: python run.py --daema_attention_mode classic
  • Sep.: python run.py --daema_attention_mode sep

Ablation study part 2 (ultimately not included in the paper):

  • DAEMA: python run.py
  • Reduced loss: python run.py --daema_loss_type dropout_only
  • Full loss: python run.py --daema_loss_type full
  • No art. miss.: python run.py --daema_pre_drop 0

How to add a dataset

To test the code on a local dataset:

  • put the dataset in files/data/<name>.csv;
  • update the src/pipeline/datasets/DATASETS variable to add your dataset;
  • run the tests;
  • use the --datasets argument to select it for the experiments (e.g. python run.py --datasets <name>).
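
Before registering a new file, it can help to check that it is rectangular and numeric. The sketch below is a hypothetical helper, not part of the repository. It assumes the CSV has a single header row, which this README does not confirm; the layout the loader actually expects should be checked against an existing dataset in files/data.

```python
import csv
import os
import tempfile


def summarize_csv(path):
    """Return (n_rows, n_cols, n_non_numeric) for a CSV with one header row."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        n_rows = 0
        n_non_numeric = 0
        for row in reader:
            if len(row) != len(header):
                raise ValueError(
                    f"row {n_rows + 1} has {len(row)} cells, expected {len(header)}"
                )
            n_rows += 1
            for cell in row:
                try:
                    float(cell)
                except ValueError:
                    # Empty cells and text labels both end up here.
                    n_non_numeric += 1
    return n_rows, len(header), n_non_numeric


# Demo on a throwaway file; a real dataset would live in files/data/<name>.csv.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", newline="") as handle:
        csv.writer(handle).writerows([["a", "b"], [1.0, 2.5], [3.0, "x"]])
    summary = summarize_csv(demo_path)

print(summary)  # (2, 2, 1)
```

A non-zero third value points at cells that would need encoding or cleaning before the file is used in an experiment.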

How to add a model

To test the code on a custom model:

  • implement the model following the expected interface (see src/models/baseline_imputations/Identity for the basic structure);
  • update the src/models/__init__/MODELS variable to add your model;
  • run the tests;
  • use the --model argument to select it for the experiments (e.g. python run.py --model <Name>).
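
The interface itself is defined by the repository (see the Identity model mentioned above) and is not reproduced here. As a hedged illustration of the kind of computation an imputation model wraps, the sketch below fills each missing entry with its column mean. `mean_impute` is a hypothetical standalone function that uses `None` to mark missing values. It does not follow the repository's interface and is not its Mean baseline.

```python
def mean_impute(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # Fall back to 0.0 when a column has no observed value at all.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if row[j] is None else row[j] for j in range(n_cols)]
            for row in rows]


data = [[1.0, None], [3.0, 4.0], [None, 8.0]]
imputed = mean_impute(data)
print(imputed)  # [[1.0, 6.0], [3.0, 4.0], [2.0, 8.0]]
```

A real model would put the same logic behind whatever methods the expected interface requires.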
