bkmargetts / medgan

This project forked from mp2893/medgan

Generative adversarial network for generating electronic health records.

License: BSD 3-Clause "New" or "Revised" License

Python 15.24% Jupyter Notebook 84.76%

medgan's Introduction

medGAN

medGAN is a generative adversarial network for generating multi-label discrete patient records. It can generate both binary and count variables (i.e. medical codes such as diagnosis codes, medication codes or procedure codes).

Relevant Publications

medGAN implements the algorithm introduced in the following paper:

Generating Multi-label Discrete Patient Records using Generative Adversarial Networks
Edward Choi, Siddharth Biswal, Bradley Malin, Jon Duke, Walter F. Stewart, Jimeng Sun  
Machine Learning for Healthcare (MLHC) 2017

Code Description

This code trains a generative adversarial network to generate patient records. It currently handles patient records that are aggregated over time and represented as a matrix, where each row corresponds to a patient and each column to a specific medical code (e.g. diagnosis code, medication code, or procedure code). Each matrix entry is either binary (i.e. whether a specific medical code occurred in the longitudinal patient record) or a count (i.e. how many times a specific medical code occurred in the longitudinal patient record).
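The matrix layout described above can be illustrated with a minimal sketch. The function name `build_matrix`, the toy records, and the code labels such as `D_401` are hypothetical and are not taken from the medGAN source; the sketch only shows how per-patient code lists might be aggregated into a binary or count matrix.

```python
import numpy as np

def build_matrix(patient_codes, data_type="binary"):
    """Aggregate per-patient code lists into a (patients x codes) matrix.

    Hypothetical helper for illustration; not part of medGAN.
    """
    if data_type not in ("binary", "count"):
        raise ValueError("data_type must be 'binary' or 'count'")

    # Give every distinct code a column index (sorted for reproducibility).
    vocab = sorted({code for codes in patient_codes for code in codes})
    column = {code: j for j, code in enumerate(vocab)}

    matrix = np.zeros((len(patient_codes), len(vocab)), dtype=np.float32)
    for i, codes in enumerate(patient_codes):
        for code in codes:
            if data_type == "binary":
                matrix[i, column[code]] = 1.0   # code occurred at least once
            else:
                matrix[i, column[code]] += 1.0  # number of occurrences
    return matrix, vocab

# Toy longitudinal records: one list of codes per patient (made-up labels).
records = [["D_401", "D_250", "D_401"], ["D_250"], []]
binary, vocab = build_matrix(records, "binary")
count, _ = build_matrix(records, "count")
```

In this toy example the first patient has `D_401` twice, so the binary matrix stores 1 in that column while the count matrix stores 2.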

Running medGAN

STEP 1: Installation

  1. medGAN was implemented to run on TensorFlow 1.2. On Ubuntu, TensorFlow can be installed by following the official TensorFlow installation instructions.

  2. Download/clone the medGAN code

STEP 2: Fast way to test medGAN with MIMIC-III
This step describes how to train medGAN on MIMIC-III in a minimal number of steps.

  1. You will first need to request access to MIMIC-III, a publicly available collection of electronic health records gathered from ICU patients over 11 years.

  2. Use "process_mimic.py" to process the MIMIC-III dataset and generate a training dataset suitable for medGAN. Place the script in the same directory as the MIMIC-III CSV files and run it. The execution command is python process_mimic.py ADMISSIONS.csv DIAGNOSES_ICD.csv <output file> <"binary"|"count">. The last argument decides whether the script constructs a binary matrix or a count matrix. This command extracts ICD9 diagnosis codes from MIMIC-III. Note that the script uses only the first 3 digits of each ICD9 diagnosis code. If you want to use all 5 digits, please see the source code of "process_mimic.py".

  3. Train medGAN using the ".matrix" file generated by process_mimic.py. The command is: python medgan.py <matrix file> <output path> --data_type=["binary", "count"]. Choose the --data_type value that matches the matrix type you built in the previous step.

  4. After training, generate synthetic records with this command: python medgan.py <matrix file> <generated output path> --model_file=<trained output path> --generate_data=True. Note that <matrix file> is not actually used when generating synthetic records, so it serves only as a dummy input.
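The 3-digit truncation mentioned in step 2 can be sketched as follows. This is an illustration under stated assumptions, not the code in "process_mimic.py": the function name `to_3digit_icd9` is hypothetical, the input is assumed to be an ICD9 code stored without a decimal point (as in MIMIC-III's DIAGNOSES_ICD.csv), and E codes are assumed to keep the letter plus three digits because that is their category length in ICD9.

```python
def to_3digit_icd9(code):
    """Reduce an undotted ICD9 diagnosis code to its category prefix.

    Hypothetical helper; the actual logic in process_mimic.py may differ.
    """
    code = code.strip()
    if code.startswith("E"):
        # External-cause codes: category is 'E' followed by three digits.
        return code[:4]
    # Numeric and V codes: category is the first three characters.
    return code[:3]

# Example codes written without the decimal point.
examples = ["4019", "25000", "V1582", "E8497", "038"]
truncated = [to_3digit_icd9(c) for c in examples]
```

Truncating codes this way shrinks the number of matrix columns, which trades diagnostic detail for a denser, lower-dimensional training matrix.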

medgan's People

Contributors

mp2893, bkmargetts, didayolo
