
riadd.aucmedi's Introduction

Multi-Disease Detection in Retinal Imaging based on Ensembling Heterogeneous Deep Learning Models


Preventable or undiagnosed visual impairment and blindness affect billions of people worldwide. Automated multi-disease detection models offer great potential to address this problem via clinical decision support in diagnosis. In this work, we proposed an innovative multi-disease detection pipeline for retinal imaging which utilizes ensemble learning to combine the predictive power of several heterogeneous deep convolutional neural network models. Our pipeline includes state-of-the-art strategies like transfer learning, class weighting, real-time image augmentation and focal loss utilization. Furthermore, we integrated ensemble learning techniques like heterogeneous deep learning models, bagging via 5-fold cross-validation and stacked logistic regression models.
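As a rough, hedged illustration of the stacking idea (synthetic data and shapes, not the actual implementation in this repository), out-of-fold predictions of several CNNs can be concatenated and fed into one binary logistic regression per disease label:

# Hedged sketch: per-label stacking of CNN predictions with logistic regression.
# Shapes, sample counts and variable names are illustrative, not taken from this repository.
import numpy as np
from sklearn.linear_model import LogisticRegression

n_samples, n_labels, n_models = 1920, 45, 4
rng = np.random.default_rng(0)

# Out-of-fold probabilities of each CNN (n_samples x n_labels per model),
# concatenated into one meta-feature matrix per sample.
oof_preds = [rng.random((n_samples, n_labels)) for _ in range(n_models)]
X_meta = np.concatenate(oof_preds, axis=1)            # (n_samples, n_models * n_labels)
y_true = rng.integers(0, 2, size=(n_samples, n_labels))

# Fit one binary logistic regression per disease label (stacked ensembling).
stackers = []
for label in range(n_labels):
    lr = LogisticRegression(max_iter=1000)
    lr.fit(X_meta, y_true[:, label])
    stackers.append(lr)

# Ensemble prediction: probability of each label from its own meta-model.
ensemble_pred = np.column_stack(
    [lr.predict_proba(X_meta)[:, 1] for lr in stackers]
)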

Participation at the Retinal Image Analysis for multi-Disease Detection Challenge (RIADD):
https://riadd.grand-challenge.org/

The models, predictions, metadata and evaluation results (scores, figures) are available under the following Zenodo repository:
https://doi.org/10.5281/zenodo.4573990

Reproducibility

Requirements:

  • Ubuntu 18.04
  • Python 3.6
  • NVIDIA TITAN RTX or a GPU with equivalent performance

Step-by-Step workflow:

Download Git repository:

git clone https://github.com/frankkramer-lab/riadd.aucmedi.git
cd riadd.aucmedi/

Install our in-house developed image classification framework AUCMEDI and all other required module dependencies:

pip install -r requirements.txt

Adjust the RFMiD image directory path in the 'configuration' section of all scripts.
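As a hypothetical illustration of the kind of line to adjust (variable names and paths are placeholders, not verbatim from the scripts):

# Configuration section (placeholder names, adjust to your local setup)
path_riadd = "/path/to/RFMiD"                 # root directory containing the RFMiD data
path_images = path_riadd + "/Training_Set"    # hypothetical image subfolder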

# Train detector models
python scripts/detector_DenseNet201.py
python scripts/detector_EfficientNetB4.py

# Train classifier models
python scripts/classifier_DenseNet201.py
python scripts/classifier_InceptionV3.py
python scripts/classifier_ResNet152.py
python scripts/classifier_EfficientNetB4.py

# Train Logistic Regression models
python scripts/ensemble_training.py

# Run Inference
python scripts/inference.py
python scripts/ensemble.py

# Perform result evaluation
python scripts/evaluation.py

Based on Framework: AUCMEDI


The open-source software AUCMEDI allows fast setup of medical image classification pipelines with state-of-the-art methods via an intuitive, high-level Python API or via an AutoML deployment through Docker/CLI.

https://github.com/frankkramer-lab/aucmedi

This pipeline was based on AUCMEDI, an in-house developed open-source framework for setting up complete medical image classification pipelines with deep learning models on top of TensorFlow/Keras. The framework supports extensive preprocessing, image augmentation, class imbalance strategies, state-of-the-art deep learning models and ensemble learning techniques. The experiments were performed in parallel on multiple NVIDIA TITAN RTX GPUs.
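For orientation only, the underlying transfer-learning pattern behind such a pipeline can be sketched in plain TensorFlow/Keras. This is not the AUCMEDI API; input size, label count, learning rates and the two-phase schedule are illustrative assumptions.

# Hedged sketch of the general transfer-learning pattern (plain tf.keras, not AUCMEDI).
# Preprocessing and image augmentation are omitted for brevity.
import tensorflow as tf

n_labels = 45  # illustrative: number of disease label columns in RFMiD

base = tf.keras.applications.DenseNet201(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3)
)
x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
out = tf.keras.layers.Dense(n_labels, activation="sigmoid")(x)  # multi-label head
model = tf.keras.Model(inputs=base.input, outputs=out)

# Phase 1: train only the new classification head on a frozen ImageNet backbone.
base.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4), loss="binary_crossentropy")
# model.fit(train_generator, epochs=..., class_weight=...)

# Phase 2: unfreeze and fine-tune the full model with a lower learning rate.
base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5), loss="binary_crossentropy")
# model.fit(train_generator, epochs=...)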

Dataset: RFMiD

Reference: https://riadd.grand-challenge.org/

Pachade S, Porwal P, Thulkar D, Kokare M, Deshmukh G, Sahasrabuddhe V, Giancardo L, Quellec G, Mériaudeau F.
Retinal Fundus Multi-Disease Image Dataset (RFMiD): A Dataset for Multi-Disease Detection Research.
Data. 2021; 6(2):14.
https://doi.org/10.3390/data6020014

The new Retinal Fundus Multi-Disease Image Dataset (RFMiD) consists of 3200 fundus images and covers 46 retinal conditions, including various rare and challenging-to-detect diseases.
The dataset was published in association with the Retinal Image Analysis for Multi-Disease Classification (RIADD) challenge at ISBI 2021. The aim was multi-label classification of differently sized retinal microscope images.

Microscope distribution (image shape (height, width, channels): number of images):
{(1424, 2144, 3): 1493, (1536, 2048, 3): 150, (2848, 4288, 3): 277}
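Such a shape distribution can be tallied with a few lines of Python; the directory path and file pattern below are placeholders:

# Hedged sketch: tally the (height, width, channels) shapes of all fundus images.
from collections import Counter
from pathlib import Path
import numpy as np
from PIL import Image

shape_counts = Counter()
for img_path in Path("/path/to/RFMiD/images").glob("*.png"):   # placeholder path/pattern
    with Image.open(img_path) as img:
        shape_counts[np.asarray(img).shape] += 1
print(dict(shape_counts))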

Figure: Label frequency distribution in the RFMiD dataset.

This dataset consists of diseases/abnormalities (diabetic retinopathy (DR), age-related macular degeneration (ARMD), media haze (MZ), drusen (DN), myopia (MYA), branch retinal vein occlusion (BRVO), tessellation (TSLN), epiretinal membrane (ERM), laser scar (LS), macular scar (MS), central serous retinopathy (CSR), optic disc cupping (ODC), central retinal vein occlusion (CRVO), tortuous vessels (TV), asteroid hyalosis (AH), optic disc pallor (ODP), optic disc edema (ODE), shunt (ST), anterior ischemic optic neuropathy (AION), parafoveal telangiectasia (PT), retinal traction (RT), retinitis (RS), chorioretinitis (CRS), exudation (EDN), retinal pigment epithelium changes (RPEC), macular hole (MHL), retinitis pigmentosa (RP), cotton wool spots (CWS), coloboma (CB), optic disc pit maculopathy (ODPM), preretinal hemorrhage (PRH), myelinated nerve fibers (MNF), hemorrhagic retinopathy (HR), central retinal artery occlusion (CRAO), tilted disc (TD), cystoid macular edema (CME), post traumatic choroidal rupture (PTCR), choroidal folds (CF), vitreous hemorrhage (VH), macroaneurysm (MCA), vasculitis (VS), branch retinal artery occlusion (BRAO), plaque (PLQ), hemorrhagic pigment epithelial detachment (HPED) and collateral (CL)) based on their visual characteristics as shown in the Figure below.

Figure: Example fundus images for the retinal disease classes in RFMiD.

Methods

The implemented medical image classification pipeline can be summarized in the following core steps:

  • Class Weighted Focal Loss and Upsampling to counter Class Imbalance (see the loss sketch after this list)
  • Stratified Multi-label 5-fold Cross-Validation
  • Extensive real-time image augmentation
  • Multiple Deep Learning Model Training
  • Distinct Training for Multi-Disease Labels and Disease Risk Detection
  • Ensemble Learning Strategy: Bagging & Stacking
  • Stacked Binary Logistic Regression Models for Distinct Classification
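The class-weighted focal loss referenced above can be sketched as follows in plain TensorFlow; alpha/gamma values are illustrative and this is not necessarily the exact AUCMEDI implementation:

# Hedged sketch of a class-weighted binary focal loss for multi-label targets.
import tensorflow as tf

def focal_loss(class_weights, gamma=2.0):
    # class_weights: one alpha weight per label column (illustrative values).
    alpha = tf.constant(class_weights, dtype=tf.float32)

    def loss(y_true, y_pred):
        y_true = tf.cast(y_true, tf.float32)
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0 - 1e-7)
        # Binary cross-entropy per label.
        ce = -(y_true * tf.math.log(y_pred) + (1.0 - y_true) * tf.math.log(1.0 - y_pred))
        # Probability assigned to the true class and alpha-balancing per label.
        p_t = y_true * y_pred + (1.0 - y_true) * (1.0 - y_pred)
        weight = y_true * alpha + (1.0 - y_true) * (1.0 - alpha)
        # Focusing term (1 - p_t)^gamma down-weights well-classified examples.
        return tf.reduce_mean(weight * tf.pow(1.0 - p_t, gamma) * ce)

    return loss

# Usage (weights purely illustrative):
# model.compile(optimizer="adam", loss=focal_loss(class_weights=[0.25] * 45))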

Figure: Overview of the implemented multi-disease detection pipeline.

Results & Discussion

Figure: Receiver operating characteristic (ROC) curves for each model type applied in our pipeline. The ROC curves show the individual model performance measured by the true positive rate and false positive rate. The cross-validation models were macro-averaged for each model type to reduce illustration complexity.

In our participation, we reached rank 7 out of a total of 58 teams. In the independent evaluation by the challenge organizers, we achieved an AUROC of 0.95 for the disease risk classification. For multi-label scoring, they computed the average of the macro-averaged AUROC and the mAP, for which we reached a score of 0.70. The top-performing ranks were separated by only marginal scoring differences, which is why our final score was only 0.05 below that of the first-ranked team.
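For reference, the multi-label challenge score described above (the average of macro-averaged AUROC and mAP) can be computed with scikit-learn; the toy data below is for illustration only:

# Hedged sketch: average of macro-averaged AUROC and mean average precision (mAP).
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

def challenge_score(y_true, y_prob):
    # y_true, y_prob: arrays of shape (n_samples, n_labels)
    auroc_macro = roc_auc_score(y_true, y_prob, average="macro")
    map_macro = average_precision_score(y_true, y_prob, average="macro")
    return 0.5 * (auroc_macro + map_macro)

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(100, 45))   # random toy labels
y_prob = rng.random((100, 45))                # random toy predictions
print(challenge_score(y_true, y_prob))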

We were able to validate and demonstrate the high accuracy and reliability of our pipeline, as well as its comparability with other state-of-the-art pipelines for retinal disease prediction.

Author

Dominik Müller
Email: [email protected]
IT-Infrastructure for Translational Medical Research
University Augsburg
Bavaria, Germany

How to cite / More information

Dominik Müller, Iñaki Soto-Rey and Frank Kramer. (2021)
Multi-Disease Detection in Retinal Imaging based on Ensembling Heterogeneous Deep Learning Models.
PubMed: https://pubmed.ncbi.nlm.nih.gov/34545816/
arXiv e-print: https://arxiv.org/abs/2103.14660

@Article{riaddMUELLER2021,
  title={Multi-Disease Detection in Retinal Imaging based on Ensembling Heterogeneous Deep Learning Models},
  author={Dominik Müller and Iñaki Soto-Rey and Frank Kramer},
  year={2021},
  journal={Studies in Health Technology and Informatics},
  volume={283},
  url={https://doi.org/10.3233/shti210537},
  doi={10.3233/shti210537},
  eprint={2103.14660},
  archivePrefix={arXiv},
  primaryClass={eess.IV}
}

Thank you for citing our work.

License

This project is licensed under the GNU GENERAL PUBLIC LICENSE Version 3.
See the LICENSE.md file for license rights and limitations.

riadd.aucmedi's People

Contributors

muellerdo


riadd.aucmedi's Issues

Ensemble

weighted average according to validation AUC
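A minimal sketch of such an AUC-weighted average, assuming one validation AUROC and one prediction array per model (values and shapes are illustrative):

# Hedged sketch: weighted average of model predictions, with weights
# proportional to each model's validation AUROC.
import numpy as np

rng = np.random.default_rng(0)
val_auroc = np.array([0.75, 0.70, 0.61, 0.53])          # one AUROC per model (illustrative)
preds = [rng.random((640, 45)) for _ in val_auroc]      # per-model prediction arrays

weights = val_auroc / val_auroc.sum()                   # normalize weights to sum to 1
ensemble = sum(w * p for w, p in zip(weights, preds))   # (640, 45) weighted average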

Error while executing the project- Threading error

I tried executing the detector DenseNet script, but it gives a strange error:
File "C:\Users\DELL\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle _thread.lock objects.

This happens when epoch 1 starts running

Plots

Evaluation for paper:

  • Fitting plot via the logs.csv
  • Implement Train/Test evaluation approach on ensemble_train
  • Plot AUC of each model's inferences and of the ensemble learning
  • Compute performance scores -> Table
  • Create draw.io pipeline
  • Copy & Paste RIADD classification image
  • Table with label distributions before and after upsampling

Upsampling seems to be stuck / going on forever without changes

Upon trying to upsample using the upsampling.py script, after a while the process just stops making progress, and the script seems to be stuck in an infinite loop.

The output of the analyse_classes function looks the same every time and the number of files in the Upsample_Set folder isn't increasing anymore.

Disease_Risk 4384 1.0
DR 752 0.17153284671532848
ARMD 275 0.06272810218978102
MH 500 0.11405109489051095
DN 343 0.07823905109489052
MYA 198 0.045164233576642336
BRVO 219 0.049954379562043794
TSLN 495 0.11291058394160584
ERM 128 0.029197080291970802
LS 139 0.03170620437956204
MS 105 0.023950729927007298
CSR 110 0.02509124087591241
ODC 719 0.16400547445255476
CRVO 100 0.02281021897810219
TV 102 0.023266423357664233
AH 101 0.02303832116788321
ODP 349 0.07960766423357664
ODE 362 0.08257299270072993
ST 101 0.02303832116788321
AION 101 0.02303832116788321
PT 101 0.02303832116788321
RT 104 0.023722627737226276
RS 187 0.042655109489051095
CRS 104 0.023722627737226276
EDN 196 0.04470802919708029
RPEC 102 0.023266423357664233
MHL 115 0.026231751824817517
RP 100 0.02281021897810219
CWS 102 0.023266423357664233
CB 100 0.02281021897810219
ODPM 0 0.0
PRH 100 0.02281021897810219
MNF 101 0.02303832116788321
HR 0 0.0
CRAO 100 0.02281021897810219
TD 102 0.023266423357664233
CME 115 0.026231751824817517
PTCR 101 0.02303832116788321
CF 101 0.02303832116788321
VH 100 0.02281021897810219
MCA 100 0.02281021897810219
VS 100 0.02281021897810219
BRAO 100 0.02281021897810219
PLQ 100 0.02281021897810219
HPED 100 0.02281021897810219
CL 100 0.02281021897810219

I notice that a number of disease categories are zero-valued...
Could this be the problem?
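For context, zero-count label columns can be spotted directly from the training label CSV before upsampling. A minimal diagnostic sketch; the file name and the dropped ID column are placeholders:

# Hedged sketch: flag label columns with zero positive samples.
import pandas as pd

labels = pd.read_csv("/path/to/RFMiD_Training_Labels.csv")          # placeholder path
counts = labels.drop(columns=["ID"], errors="ignore").sum(axis=0)   # placeholder ID column
print("Zero-count labels:", counts[counts == 0].index.tolist())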

Baseline results

Parameters for augmentation:

aug = Image_Augmentation(flip=True, rotate=True, brightness=True, contrast=True,
                         saturation=False, hue=False, scale=True, crop=False,
                         grid_distortion=True, compression=False,
                         gaussian_noise=False, gaussian_blur=False,
                         downscaling=False, gamma=False,
                         elastic_transform=False)
Mean AUROC:
                                         auroc
arch                                          
DenseNet121                           0.553166
DenseNet169                           0.604234
DenseNet169.augmenting_mean           0.747839
DenseNet169.augmenting_softmax        0.700156
DenseNet169.simple                    0.735617
EfficientNetB2                        0.477691
EfficientNetB4                        0.495157
EfficientNetB4.augmenting_mean        0.512000
EfficientNetB4.augmenting_softmax     0.548385
EfficientNetB4.simple                 0.498329
InceptionResNetV2.augmenting_mean     0.600779
InceptionResNetV2.augmenting_softmax  0.630981
InceptionResNetV2.simple              0.613342
ResNet101                             0.603005
ResNet101.augmenting_mean             0.590247
ResNet101.augmenting_softmax          0.614355
ResNet101.simple                      0.584725
ResNet152                             0.601131
ResNet152.augmenting_mean             0.658980
ResNet152.augmenting_softmax          0.609400
ResNet152.simple                      0.620327
VGG16.augmenting_mean                 0.591308
VGG16.augmenting_softmax              0.575181
VGG16.simple                          0.579012
Xception                              0.510792
Xception.augmenting_mean              0.506821
Xception.augmenting_softmax           0.527212
Xception.simple                       0.534921

To-do

  • Implement directory interface in AUCMEDI
  • Implement inference augmenting in AUCMEDI
  • Implement augmenting predictions in baseline as well for comparison with normal baseline prediction
  • Run baseline again
  • Evaluate baseline
  • Compute baseline predictions for evaluation set with some model (upload and see score on online evaluation set)
  • Implement up-sampling approach for rare label combinations
  • Run classifier 5-fold CV with selected architectures
  • Implement ensembler & ensembler Training
  • Run ensembler training
  • Implement prediction
  • Test out single-model-per-class ML algorithms
  • Implement sample weights for RF label approach
  • Implement disease risk binary model and train it
  • Implement focal loss in AUCMEDI
  • Implement multi-label class weights in AUCMEDI
  • Implement ensemble learning for disease risk
  • Implement single architecture inference for Focal Loss EfficientNetB4 and upload
  • Implement focal loss usage for all models

Further:

Final:

  • Implement Evaluation
  • Re-run all classifiers and detector trainings
  • Re-run ensemble training
  • Re-run inferences
  • Re-run ensemble
  • Upload
  • Run Evaluation on final data
  • Write Manuscript
  • Revise Manuscript
  • Update README
  • Upload Data to Zenodo
  • Make this repo public

Upload:

  • Simple LR approach
  • Simple RF approach
  • Augmenting LR approach
  • Augmenting RF approach

Idea:

  • Run predictions on training dataset (detector & label)
  • Collect all predictions (detector & label)
  • Train a LR/RF Model for each class

Image classification framework gives error message "keras.engine not found" when working in Colab

I am using Colab for my image classification model. If I install the aucmedi 0.1.0 version, many modules like evaluate.dataset, evaluate.fitting and the 2D neural networks are not working (a "module does not exist" error is shown), and if I install the latest aucmedi I get a "module keras.engine not found" error.
Please can you tell me how to work with AUCMEDI on Colab?


ModuleNotFoundError Traceback (most recent call last)
in <cell line: 2>()
1 from aucmedi import *
----> 2 from aucmedi.evaluation.fitting import *
3
4 evaluate_fitting(
5 train_history = history,

ModuleNotFoundError: No module named 'aucmedi.evaluation.fitting'

Structure

Architectures: [VGG16, DenseNet169, EfficientNetB4, Xception] ?

Single class and multi-label individual

For each arch:

  • Run 5-fold CV training
  • Use Inference Augmentation for prediction?
