scjjb / dras-mil

Efficient subtyping of ovarian cancer histopathology whole slide images using active sampling in multiple instance learning

Home Page: https://doi.org/10.1117/12.2653869

License: GNU General Public License v3.0

DRAS-MIL

Discriminative Region Active Sampling for Multiple Instance Learning

DRAS-MIL is a sampling approach to improve the efficiency of evaluating histopathology slides while minimising the loss of classification accuracy.
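To give a rough feel for what sampling-based evaluation means, the sketch below scores a small random subset of patches and then spends the remaining budget on unscored patches near the current highest-scoring one. This is a generic illustration only and is not the DRAS-MIL algorithm from the paper: `active_sample`, `toy_score`, the grid of coordinates, and every parameter value are assumptions made for the example.

```python
import random


def active_sample(coords, score_fn, num_random=10, num_rounds=5,
                  per_round=4, radius=256, seed=0):
    """Score a random subset of patches, then repeatedly score unseen
    patches close to the current highest-scoring patch.

    Hypothetical helper for illustration; not taken from the repository.
    """
    rng = random.Random(seed)
    # Initial exploration: score a few randomly chosen patches.
    scores = {c: score_fn(c) for c in rng.sample(coords, num_random)}
    for _ in range(num_rounds):
        best = max(scores, key=scores.get)
        # Unscored patches within `radius` pixels of the best patch.
        neighbours = [c for c in coords
                      if c not in scores
                      and abs(c[0] - best[0]) <= radius
                      and abs(c[1] - best[1]) <= radius]
        if not neighbours:
            break
        for c in rng.sample(neighbours, min(per_round, len(neighbours))):
            scores[c] = score_fn(c)
    return scores


def toy_score(coord):
    """Stand-in score: highest near an arbitrary hot spot at (1280, 1280)."""
    return -(abs(coord[0] - 1280) + abs(coord[1] - 1280))


# A 10 x 10 grid of 256-pixel patches (100 candidate patches).
coords = [(x, y) for x in range(0, 2560, 256) for y in range(0, 2560, 256)]
scores = active_sample(coords, toy_score)
print(f"scored {len(scores)} of {len(coords)} patches")
```

In a real setting the score would come from the trained model rather than a fixed function, and only the scored patches would need features.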

Workflows

Baseline model training

  1. Whole slide tissue detection and patching (create_patches_fp.py)
  2. Feature extraction (extract_features_fp.py)
  3. Creation of cross-validation folds (create_splits_seq.py)
  4. Model training (main.py)
  5. Slide evaluation (eval.py)
  6. Model evaluation (other_metrics.py)

DRAS-MIL evaluation experiments

Here all features are pre-computed. Multiple experiments are run on the same slides, so computing only the relevant features for each experiment would not save time.

  1. Whole slide tissue detection and patching (create_patches_fp.py)
  2. Feature extraction (extract_features_fp.py)
  3. Slide evaluation (eval.py with --sampling)
  4. Model evaluation (other_metrics.py)

DRAS-MIL evaluation in practice

Here features are computed only when needed, during slide evaluation, so there is no separate feature extraction step.

  1. Whole slide tissue detection and patching (create_patches_fp.py)
  2. Slide evaluation (eval.py with --sampling and --eval_features)
  3. Model evaluation (other_metrics.py)
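The idea of computing features only when needed can be sketched as a small cache that runs the feature extractor the first time a patch is requested. This is a minimal illustration, not the code behind `--eval_features`: `extract_features`, `LazyFeatureStore`, and the toy coordinates are all hypothetical names and values.

```python
import random


def extract_features(patch_coord):
    """Hypothetical stand-in for an expensive feature extractor."""
    x, y = patch_coord
    return [float(x), float(y)]


class LazyFeatureStore:
    """Computes features on first request and caches them afterwards."""

    def __init__(self, extractor):
        self.extractor = extractor
        self.cache = {}
        self.num_extractions = 0

    def get(self, patch_coord):
        if patch_coord not in self.cache:
            self.cache[patch_coord] = self.extractor(patch_coord)
            self.num_extractions += 1
        return self.cache[patch_coord]


# A 10 x 10 grid of 256-pixel patches (100 candidate patches).
coords = [(x, y) for x in range(0, 2560, 256) for y in range(0, 2560, 256)]
rng = random.Random(0)
sampled = rng.sample(coords, 20)  # only these patches are ever processed

store = LazyFeatureStore(extract_features)
features = [store.get(c) for c in sampled]
store.get(sampled[0])  # a repeated request reuses the cached features

print(f"extracted {store.num_extractions} of {len(coords)} patches")
```

With pre-computed features the extractor would instead run on all 100 patches, whether or not they are sampled.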

Example code runs

Create 3 cross-validation folds:

python create_splits_seq.py --task custom_714 --seed 1 --label_frac 1 --val_frac 0.33 --test_frac 0.33 --k 3
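As a rough picture of what this command produces, the sketch below draws three random train/validation/test partitions using the same fractions. It is an assumption-laden illustration: how `create_splits_seq.py` actually assigns slides (for example, whether it stratifies by class) is not shown here, and `make_splits` and the slide IDs are hypothetical.

```python
import random


def make_splits(slide_ids, k=3, val_frac=0.33, test_frac=0.33, seed=1):
    """Return k random train/val/test partitions of slide_ids.

    Hypothetical helper: each fold is an independent shuffle, with no
    class stratification.
    """
    rng = random.Random(seed)
    n_val = round(len(slide_ids) * val_frac)
    n_test = round(len(slide_ids) * test_frac)
    splits = []
    for _ in range(k):
        shuffled = list(slide_ids)
        rng.shuffle(shuffled)
        splits.append({
            "val": shuffled[:n_val],
            "test": shuffled[n_val:n_val + n_test],
            "train": shuffled[n_val + n_test:],
        })
    return splits


slides = [f"slide_{i:03d}" for i in range(12)]  # made-up slide IDs
folds = make_splits(slides)
for i, fold in enumerate(folds):
    print(i, len(fold["train"]), len(fold["val"]), len(fold["test"]))
```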

Baseline model training with 500 random hyperparameter tuning experiments:

python main.py --coords_path "../../../../MULTIX/DATA/coords" --tuning --no_inst_cluster --num_tuning_experiments 500 --tuning_output_file tuning_results/main_custom1vsall_714_ABMILsb_ce_finaltuning.csv --split_dir /workspace/CLAM-private/splits/custom_714_100 --k 1 --results_dir /workspace/CLAM-private/results --exp_code main_custom1vsall_714_ABMILsb_ce_finaltuning --weighted_sample --bag_loss ce --task ovarian_1vsall --min_epochs 50 --max_epochs 500 --model_type clam_sb --log_data --subtyping --data_root_dir "/MULTIX/DATA/" --csv_path 'dataset_csv/set_all_714.csv' --features_folder "ovarian_dataset_features_256_patches_20x"
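Random hyperparameter tuning of this kind can be sketched as repeatedly drawing a configuration and keeping the one with the lowest validation loss. The search ranges, `sample_hyperparameters`, `random_search`, and `toy_objective` below are assumptions for illustration; the ranges and selection criterion used by `main.py --tuning` are not stated here.

```python
import math
import random


def sample_hyperparameters(rng):
    """Draw one random configuration; the ranges are illustrative assumptions."""
    return {
        "lr": 10 ** rng.uniform(-5, -2),    # log-uniform learning rate
        "reg": 10 ** rng.uniform(-5, -2),   # log-uniform regularisation strength
        "drop_out": rng.uniform(0.0, 0.5),  # uniform dropout rate
    }


def random_search(evaluate, num_experiments, seed=0):
    """Return the sampled configuration with the lowest loss."""
    rng = random.Random(seed)
    best_config, best_loss = None, math.inf
    for _ in range(num_experiments):
        config = sample_hyperparameters(rng)
        loss = evaluate(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


def toy_objective(config):
    """Hypothetical stand-in; a real run would train a model and
    return its validation loss."""
    return (math.log10(config["lr"]) + 2.5) ** 2 + config["drop_out"]


best_config, best_loss = random_search(toy_objective, num_experiments=500)
print(best_config, best_loss)
```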

Baseline model training with the best hyperparameters found during tuning:

python main.py --early_stopping --use_all_samples --no_inst_cluster --reg 0.00079 --drop_out 0.02 --lr 0.0038 --split_dir /workspace/CLAM-private/splits/custom_714_100 --k 3 --results_dir /workspace/CLAM-private/results --exp_code main_nosampling_reg00079_dropout02_lr0038_1vsall_714_ABMILsb_ce_last10best_mean20stopper --weighted_sample --bag_loss ce --task ovarian_1vsall --max_epochs 500 --model_type clam_sb --log_data --subtyping --data_root_dir "/MULTIX/DATA/" --csv_path 'dataset_csv/set_all_714.csv' --features_folder "ovarian_dataset_features_256_patches_20x"

Slide evaluation processing all possible patches:

python eval.py --drop_out 0.02 --split val --splits_dir /workspace/CLAM-private/splits/custom_714_100 --k 3 --models_exp_code main_nosampling_reg00079_dropout02_lr0038_1vsall_714_ABMILsb_ce_last10best_mean20stopper_s1 --save_exp_code main_nosampling_reg00079_dropout02_lr0038_1vsall_714_ABMILsb_ce_last10best_mean20stopper_VALIDSET --task ovarian_1vsall --model_type clam_sb --results_dir /workspace/CLAM-private/results/ --data_root_dir "/MULTIX/DATA/" --csv_path 'dataset_csv/set_all_714.csv' --features_folder "ovarian_dataset_features_256_patches_20x"

Bootstrapping to estimate model performance:

python bootstrapping.py --num_classes 2 --model_names main_nosampling_reg00079_dropout02_lr0038_1vsall_714_ABMILsb_ce_last10best_mean20stopper_VALIDSET --bootstraps 1000 --run_repeats 1 --folds 3
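For readers unfamiliar with bootstrapping, the sketch below resamples slide-level predictions with replacement 1000 times and reports the mean accuracy with an approximate 95% percentile interval. It illustrates the general technique only: the metrics and procedure in `bootstrapping.py` may differ, and `bootstrap_accuracy` and the toy labels are hypothetical.

```python
import random


def bootstrap_accuracy(labels, predictions, num_bootstraps=1000, seed=0):
    """Return (mean, lower, upper) accuracy over bootstrap resamples.

    Hypothetical helper; lower and upper bound an approximate 95%
    percentile interval.
    """
    rng = random.Random(seed)
    n = len(labels)
    scores = []
    for _ in range(num_bootstraps):
        # Resample slide indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        correct = sum(labels[i] == predictions[i] for i in idx)
        scores.append(correct / n)
    scores.sort()
    mean = sum(scores) / len(scores)
    lower = scores[int(0.025 * num_bootstraps)]
    upper = scores[int(0.975 * num_bootstraps) - 1]
    return mean, lower, upper


# Toy two-class data: 100 slides, 10 of them misclassified.
labels = [0] * 50 + [1] * 50
predictions = list(labels)
for i in range(0, 100, 10):
    predictions[i] = 1 - predictions[i]

mean, lower, upper = bootstrap_accuracy(labels, predictions)
print(f"accuracy {mean:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```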

Reference

This code is forked from the CLAM repository, which has a corresponding paper. This repository and the original CLAM repository are both available for non-commercial academic purposes under the GPLv3 License.

Breen, J., Allen, K., Zucker, K., Hall, G., Orsi, N.M. and Ravikumar, N., 2023, April. Efficient subtyping of ovarian cancer histopathology whole slide images using active sampling in multiple instance learning. In Proceedings of SPIE 12471 (Vol. 12471). SPIE. https://doi.org/10.1117/12.2653869.

@inproceedings{breen2023efficient,
  title={Efficient subtyping of ovarian cancer histopathology whole slide images using active sampling in multiple instance learning},
  author={Breen, Jack and Allen, Katie and Zucker, Kieran and Hall, Geoff and Orsi, Nicolas M and Ravikumar, Nishant},
  booktitle={Proceedings of SPIE 12471},
  volume={12471},
  year={2023},
  organization={SPIE}
}
