
This project forked from openmedlab/medfm


Official Repository of NeurIPS 2023 - MedFM Challenge

Home Page: https://medfm2023.grand-challenge.org/


NeurIPS 2023 - MedFM: Foundation Model Prompting for Medical Image Classification Challenge 2023

A naive baseline and submission demo for the Foundation Model Prompting for Medical Image Classification Challenge 2023 (MedFM).

✨ Notification

Please check out the master branch. A third-party implementation of the MedFMC baseline is supported! It is based on MMPreTrain, with ViT-cls, ViT-eva02, ViT-dinov2, Swin-cls and ViT-clip backbones. More details can be found in its documentation. Thanks to Ezra-Yu for this excellent work.

🛠️ Installation

Install the requirements with:

$ conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.1 -c pytorch
$ pip install mmcls==0.25.0 openmim scipy scikit-learn ftfy regex tqdm
$ mim install mmcv-full==1.6.0

We suggest installing PyTorch first and confirming that it works, then installing the OpenMMLab packages and their dependencies.

You can also use other computer vision or foundation models, such as EVA and CLIP.

📊 Results

The results for ChestDR, ColonPath and Endo in the MedFMC dataset, together with the corresponding config for each task, are shown below.

Few-shot Learning Results

We use Visual Prompt Tuning with a Swin Transformer backbone as the few-shot learning baseline. The results are shown below:

ChestDR

N Shot  Crop Size  Epoch  mAP    AUC    Config
1       384x384    20     13.14  56.49  config
5       384x384    20     17.05  64.86  config
10      384x384    20     19.01  66.68  config

ColonPath

N Shot  Crop Size  Epoch  Acc    AUC    Config
1       384x384    20     77.60  84.69  config
5       384x384    20     89.29  96.07  config
10      384x384    20     91.21  97.14  config

Endo

N Shot  Crop Size  Epoch  mAP    AUC    Config
1       384x384    20     19.70  62.18  config
5       384x384    20     23.88  67.48  config
10      384x384    20     25.62  71.41  config

Transfer Learning on 20% (Fully Supervised Task)

Note that MedFMC mainly focuses on few-shot learning, i.e., the transfer learning task. The fully supervised tasks below therefore use only 20% of the training data for comparison.

ChestDR

Backbone         Crop Size  Epoch  mAP    AUC    Config
DenseNet121      384x384    20     24.48  75.25  config
EfficientNet-B5  384x384    20     29.08  77.21  config
Swin-B           384x384    20     31.07  78.56  config

ColonPath

Backbone         Crop Size  Epoch  Acc    AUC    Config
DenseNet121      384x384    20     92.73  98.27  config
EfficientNet-B5  384x384    20     94.04  98.58  config
Swin-B           384x384    20     94.68  98.35  config

Endo

Backbone         Crop Size  Epoch  mAP    AUC    Config
DenseNet121      384x384    20     41.13  80.19  config
EfficientNet-B5  384x384    20     36.95  78.23  config
Swin-B           384x384    20     41.38  79.42  config

🎫 License

This project is released under the Apache 2.0 license.

🙌 Usage

Data preparation

Prepare the data following MMClassification. The data structure looks like this:

data/
├── MedFMC
│   ├── chest
│   │   ├── images
│   │   ├── chest_X-shot_train_expY.txt
│   │   ├── chest_X-shot_val_expY.txt
│   │   ├── train_20.txt
│   │   ├── val_20.txt
│   │   ├── trainval.txt
│   │   ├── test_WithLabel.txt
│   ├── colon
│   │   ├── images
│   │   ├── colon_X-shot_train_expY.txt
│   │   ├── colon_X-shot_val_expY.txt
│   │   ├── train_20.txt
│   │   ├── val_20.txt
│   │   ├── trainval.txt
│   │   ├── test_WithLabel.txt
│   ├── endo
│   │   ├── images
│   │   ├── endo_X-shot_train_expY.txt
│   │   ├── endo_X-shot_val_expY.txt
│   │   ├── train_20.txt
│   │   ├── val_20.txt
│   │   ├── trainval.txt
│   │   ├── test_WithLabel.txt

Note that the .txt files contain the data split information for the fully supervised and few-shot learning tasks. The public dataset is split into trainval.txt and test_WithLabel.txt, and trainval.txt is further split into train_20.txt and val_20.txt, where 20 means the training data makes up 20% of trainval.txt. The test_WithoutLabel.txt of each dataset is the validation set.
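The relationship between trainval.txt, train_20.txt and val_20.txt can be illustrated with a minimal sketch. It assumes that each line of a list file describes one sample and that the split is a seeded random shuffle with the remainder used for validation. The helper name split_train_val, the seed and the shuffling strategy are illustrative assumptions, not the procedure used to create the official split files, which are already provided.

```python
import random


def split_train_val(lines, train_fraction=0.2, seed=0):
    """Split list-file lines into a training part and a validation part.

    Hypothetical helper: the official train_20.txt / val_20.txt may have
    been produced differently (for example with a patient-level split).
    """
    samples = [line.strip() for line in lines if line.strip()]
    rng = random.Random(seed)  # seeded so the split is reproducible
    rng.shuffle(samples)
    n_train = round(len(samples) * train_fraction)
    return samples[:n_train], samples[n_train:]


if __name__ == "__main__":
    # Toy stand-in for the contents of trainval.txt (line format is assumed).
    toy = [f"img_{i:03d}.png 0" for i in range(10)]
    train, val = split_train_val(toy)
    print(len(train), "training lines,", len(val), "validation lines")
```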

The corresponding .txt files are stored in the ./data_backup/ folder. The few-shot learning data split files {dataset}_{N_shot}-shot_train/val_exp{N_exp}.txt can also be generated as follows:

python tools/generate_few-shot_file.py

Here N_shot is 1, 5 or 10. A shot refers to a patient, not an image: 1-shot means that all images of one patient are counted together as a single shot.
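To make the patient-level meaning of a shot concrete, here is a minimal sketch. It assumes that every sample can be described as a (patient_id, image_name, label) tuple and that N patients are drawn per class; both the record layout and the per-class sampling are assumptions for illustration, and the function sample_n_shot is hypothetical rather than the logic of tools/generate_few-shot_file.py.

```python
import random
from collections import defaultdict


def sample_n_shot(records, n_shot, seed=0):
    """Pick n_shot patients per class and return all of their images.

    records: iterable of (patient_id, image_name, label) tuples. This
    layout is assumed for illustration; the real list files may be
    organised differently.
    """
    patients_by_label = defaultdict(set)
    images_by_patient = defaultdict(list)
    for patient_id, image_name, label in records:
        patients_by_label[label].add(patient_id)
        images_by_patient[patient_id].append(image_name)

    rng = random.Random(seed)
    chosen = set()
    for label in sorted(patients_by_label):
        candidates = sorted(patients_by_label[label])
        chosen.update(rng.sample(candidates, min(n_shot, len(candidates))))

    # All images of a chosen patient are kept, so one shot can
    # contribute several images to the training set.
    return {patient: images_by_patient[patient] for patient in sorted(chosen)}


if __name__ == "__main__":
    toy = [
        ("p1", "a.png", 0), ("p1", "b.png", 0), ("p2", "c.png", 0),
        ("p3", "d.png", 1), ("p3", "e.png", 1),
    ]
    print(sample_n_shot(toy, n_shot=1))
```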

The images folder in each dataset directory contains that dataset's images, which can be obtained from the original dataset.

Training and evaluation using OpenMMLab codebases.

This repository provides config files for the fully supervised task (which uses only 20% of the original training set; see the .txt files that define the split) and for the few-shot learning task.

The config files of fully supervised transfer learning task are stored at ./configs/densenet, ./configs/efficientnet, ./configs/vit-base and ./configs/swin_transformer folders, respectively. The config files of few-shot learning task are stored at ./configs/ablation_exp and ./configs/vit-b16_vpt folders.

Use the commands below to train and test a model:

# you need to export path in terminal so the `custom_imports` in config would work
export PYTHONPATH=$PWD:$PYTHONPATH
# Training
# you can choose a config file like `configs/vit-b16_vpt/in21k-vitb16_vpt1_bs4_lr6e-4_1-shot_chest.py` to train its model
python tools/train.py $CONFIG

# Evaluation
# Endo and ChestDR utilize mAP as metric
python tools/test.py $CONFIG $CHECKPOINT --metrics mAP
python tools/test.py $CONFIG $CHECKPOINT --metrics AUC_multilabel
# Colon utilizes accuracy as metric
python tools/test.py $CONFIG $CHECKPOINT --metrics accuracy --metric-options topk=1
python tools/test.py $CONFIG $CHECKPOINT --metrics AUC_multiclass

The repository is built upon MMClassification/MMPretrain. More details can be found in its documentation.
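For readers unfamiliar with the mAP metric used for Endo and ChestDR, the following pure-Python sketch shows one common rank-based definition: the average precision of each class is the mean of the precision values at the ranks of its positive samples, and mAP is the mean over classes. The function names are hypothetical, ties between scores are not treated specially, and the scaling to a 0-100 range is assumed to match the tables above; the value reported by tools/test.py may be computed differently in detail.

```python
def average_precision(scores, targets):
    """Rank-based average precision for one class.

    scores: predicted scores, one per sample.
    targets: 1 for a positive sample, 0 for a negative one.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if targets[idx] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(score_rows, target_rows):
    """Mean of per-class average precision, scaled to 0-100 (assumed)."""
    n_classes = len(score_rows[0])
    per_class = [
        average_precision(
            [row[c] for row in score_rows],
            [row[c] for row in target_rows],
        )
        for c in range(n_classes)
    ]
    return 100.0 * sum(per_class) / n_classes


if __name__ == "__main__":
    # Three samples, two classes (toy multi-label predictions).
    scores = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.1]]
    targets = [[1, 0], [0, 1], [1, 0]]
    print(mean_average_precision(scores, targets))
```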

Generating Submission results of Validation Phase

Note:

  • The order of filenames in every CSV file must follow the order in the provided colon_val.csv, chest_val.csv and endo_val.csv. See the files in ./data_backup/result_sample for details.
  • The CSV files in result.zip must use the xxx_N-shot_submission.csv names shown below.
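The ordering requirement can be checked before uploading. The sketch below assumes that the filename is stored in the first column of both the submission CSV and the provided reference CSV, and that both files treat header rows the same way; these are assumptions about the file layout, so compare against the samples in ./data_backup/result_sample first. The helper names are hypothetical.

```python
import csv


def read_first_column(csv_path):
    """Return the first field of every non-empty row in a CSV file."""
    with open(csv_path, newline="") as f:
        return [row[0] for row in csv.reader(f) if row]


def same_filename_order(submission_csv, reference_csv):
    """True if both files list the same filenames in the same order.

    Assumes the filename is the first column in both files.
    """
    return read_first_column(submission_csv) == read_first_column(reference_csv)
```

For example, same_filename_order("endo_10-shot_submission.csv", "endo_val.csv") should return True before the file is added to result.zip.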

Run

python tools/test_prediction.py $DATASETPATH/test_WithoutLabel.txt $DATASETPATH/images/ $CONFIG $CHECKPOINT --output-prediction $DATASET_N-shot_submission.csv

For example:

python tools/test_prediction.py data/MedFMC/endo/test_WithoutLabel.txt data/MedFMC/endo/images/ $CONFIG $CHECKPOINT --output-prediction endo_10-shot_submission.csv

Generate the prediction results endo_N-shot_submission.csv, colon_N-shot_submission.csv and chest_N-shot_submission.csv, zip them into a result.zip file, and upload it to the Grand Challenge website.

result/
├── endo_1-shot_submission.csv
├── endo_5-shot_submission.csv
├── endo_10-shot_submission.csv
├── colon_1-shot_submission.csv
├── colon_5-shot_submission.csv
├── colon_10-shot_submission.csv
├── chest_1-shot_submission.csv
├── chest_5-shot_submission.csv
├── chest_10-shot_submission.csv

Then use zip to pack them into a .zip file (see result_sample.zip in the ./data_backup folder for an example) and upload it to the submission site of the Grand Challenge MedFMC Validation Phase.
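The packaging step can also be scripted. The sketch below uses Python's standard zipfile module to verify that all nine expected CSV files exist and then write them into result.zip. Placing the CSV files at the root of the archive is an assumption; compare with result_sample.zip in ./data_backup before uploading. The function names are hypothetical.

```python
import zipfile
from pathlib import Path

DATASETS = ("endo", "colon", "chest")
SHOTS = (1, 5, 10)


def expected_names():
    """File names required in the submission archive."""
    return [f"{d}_{n}-shot_submission.csv" for d in DATASETS for n in SHOTS]


def build_result_zip(result_dir, zip_path="result.zip"):
    """Zip the nine submission CSVs found in result_dir."""
    result_dir = Path(result_dir)
    missing = [n for n in expected_names() if not (result_dir / n).is_file()]
    if missing:
        raise FileNotFoundError(f"missing submission files: {missing}")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in expected_names():
            # Assumption: files are stored at the archive root, not under result/.
            zf.write(result_dir / name, arcname=name)
    return Path(zip_path)


if __name__ == "__main__":
    print(build_result_zip("result"))
```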

🏗️ Using MedFMC repo with Docker (TO BE DONE)

More details about Docker can be found in this tutorial.

Preparation of Docker

We provide a Dockerfile to build an image. Ensure that your Docker version is >= 19.03.

# build an image with PyTorch 1.11, CUDA 11.3
# If you prefer other versions, just modify the Dockerfile
docker build -t medfmc docker/

Run it with

docker run --gpus all --shm-size=8g -it -v {DATA_DIR}:/medfmc/data medfmc

Build the Docker image and run a sanity test

The submitted docker will be evaluated by the following command:

docker container run --gpus all --shm-size=8g -m 28G -it --name teamname --rm -v $PWD:/medfmc_exp -v $PWD/data:/medfmc_exp/data teamname:latest /bin/bash -c "sh /medfmc_exp/run.sh"
  • --gpus: specify the GPUs available during inference
  • -m: specify the maximum RAM
  • --name: container name while running
  • --rm: remove the container after it exits
  • -v $PWD:/medfmc_exp: map the local codebase folder to the medfmc_exp folder in the container.
  • -v $PWD/data:/medfmc_exp/data: map the local data folder to the medfmc_exp/data folder in the container.
  • teamname:latest: Docker image name (should be teamname) and its version tag. The version tag should be latest. Please do not use v0, v1... as the version tag.
  • /bin/bash -c "sh /medfmc_exp/run.sh": start the prediction command.

Assuming the team name is baseline, the Docker build command is

docker build -t baseline .

During inference, please monitor the GPU memory consumption using watch nvidia-smi. The GPU memory consumption should stay below 10 GB; otherwise the run will hit an OOM error on the official evaluation server.

Save Docker

docker save baseline | gzip -c > baseline.tar.gz

🖊️ Citation

@article{wang2023medfmc,
  title={MedFMC: A Real-world Dataset and Benchmark For Foundation Model Adaptation in Medical Image Classification},
  author={Wang, Dequan and Wang, Xiaosong and Wang, Lilong and Li, Mengzhang and Da, Qian and Liu, Xiaoqiang and Gao, Xiangyu and Shen, Jun and He, Junjun and Shen, Tian and others},
  journal={arXiv preprint arXiv:2306.09579},
  year={2023}
}

Contributors

mengzhangli, xiaosongwang
