audiowiz / 3d-transformer-med-classification

This project forked from kathrin229/3d-transformer-med-classification

Repository for Master thesis project investigating classification of 3D chest CT scans using Vision Transformer.

Home Page: https://link.springer.com/chapter/10.1007/978-3-031-34344-5_27#chapter-info

Python 100.00%

3d-transformer-med-classification

This repository contains the implementations for the thesis "Covid-19 Diagnosis In 3D Chest CT Scans With Attention-Based Models". The goal of the thesis is to investigate 3D image classification with attention-based models in a medical context, specifically COVID-19 classification on 3D lung CT scans. The models used are DenseNet121 and TimeSformer. As part of this research, new attention schemes for TimeSformer are proposed.

Information about the project

The three-dimensional information in CT scans reveals many medically relevant findings, including symptoms of COVID-19 in chest CT scans (Shamshad et al., 2022). However, because large-scale 3D datasets are scarce, applying attention-based models in this field has proven difficult. Using transfer learning and domain adaptation, this thesis tackles the problem: it investigates the performance of attention-based models on two-class and multi-class COVID-19 classification in 3D chest CT scans and compares the results with those of classical Convolutional Neural Networks. The CC-CCII dataset (Zhang et al., 2020) is a publicly available 3D chest CT scan dataset that we modify for our research purposes.

We propose the first application of a pure vision-transformer-based model for COVID-19 CT scan classification that uses the 3D information in the CT scans. This is done by applying a pre-trained TimeSformer model (Bertasius et al., 2021) to the modified dataset. The attention-based model outperforms our DenseNet121 (Huang et al., 2018) baseline in accuracy by up to 4% for two classes and up to 7% for three classes. We also propose three new attention schemes in addition to those originally developed for TimeSformer. One of them, the "space-limited" attention scheme, outperforms the other attention schemes by around 2% in accuracy and reduces runtime by almost one third compared to the original attention scheme.

Setup

To set up the environments, follow the package installation instructions in readme.txt. Once the environment for the desired model is set up, DenseNet121 can be trained and evaluated by running dense_net_call.py, and TimeSformer by running timesformer_call.py. These two files also contain the parameter setup for the models. If a different configuration of the new attention schemes is needed, it can be adjusted directly in vit.py.

Models and Data

For our experiments we use the CC-CCII dataset. We resize and crop the 3D CT scans to a size of 160x128 (width x height), using bounding boxes so that as little of the lungs as possible is cut off. We then reduce each scan to 32 slices with a symmetrical downsampling strategy. The pre-processing code is available in the folder preprocess. The file data_set_creation.py starts the pre-processing pipeline.
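The README does not spell out the symmetrical downsampling strategy; the authoritative version is the code in the preprocess folder. As a minimal sketch of one possible reading, the hypothetical helper below (not taken from the repository) picks 32 slice indices that are evenly spread over the scan and mirrored around its centre. The function name and the rounding rule are assumptions.

```python
import numpy as np


def symmetric_slice_indices(n_slices: int, n_target: int = 32) -> np.ndarray:
    """Pick n_target slice indices mirrored around the centre of the scan.

    Hypothetical helper: the real pipeline may sample differently.
    """
    if n_target % 2 != 0:
        raise ValueError("n_target must be even for this sketch")
    if n_slices < n_target:
        raise ValueError("scan has fewer slices than requested")

    half = n_target // 2
    # Evenly spaced positions across the whole scan.
    positions = np.linspace(0, n_slices - 1, n_target)
    # Take the lower half and mirror it, so that index i and index
    # (n_target - 1 - i) always sum to n_slices - 1.
    first_half = np.floor(positions[:half]).astype(int)
    second_half = (n_slices - 1) - first_half[::-1]
    return np.concatenate([first_half, second_half])


# Example: a scan stored as (depth, height, width) with 100 slices.
scan = np.zeros((100, 128, 160), dtype=np.float32)
reduced = scan[symmetric_slice_indices(scan.shape[0])]
print(reduced.shape)  # (32, 128, 160)
```

Mirroring the lower half, rather than rounding all positions independently, is a design choice that guarantees exact symmetry regardless of how individual positions round.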

This is an example of one pre-processed scan from our dataset:

[Figure: Example scan]

For further comparisons, we created two more datasets with every dimension halved, i.e. 80x64x16 (width x height x depth). One dataset takes the middle 16 slices of the scan (called mid) and the other takes every second slice (called 2nd).
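The two slice selections can be sketched with NumPy slicing. The sketch assumes volumes are stored as (depth, height, width) and that "every second slice" starts at the first slice; neither is stated in the README. The in-plane halving is shown as plain stride-2 subsampling purely as a stand-in, since the actual resizing method is not described here.

```python
import numpy as np


def middle_slices(volume: np.ndarray, n_keep: int = 16) -> np.ndarray:
    """Keep the n_keep central slices along the depth axis ("mid")."""
    depth = volume.shape[0]
    if n_keep > depth:
        raise ValueError("cannot keep more slices than the volume has")
    start = (depth - n_keep) // 2
    return volume[start:start + n_keep]


def every_second_slice(volume: np.ndarray) -> np.ndarray:
    """Keep slices 0, 2, 4, ... along the depth axis ("2nd")."""
    return volume[::2]


def halve_in_plane(volume: np.ndarray) -> np.ndarray:
    """Assumption: stride-2 subsampling stands in for the real resizing."""
    return volume[:, ::2, ::2]


full = np.zeros((32, 128, 160), dtype=np.float32)  # depth, height, width
mid = halve_in_plane(middle_slices(full))
second = halve_in_plane(every_second_slice(full))
print(mid.shape, second.shape)  # (16, 64, 80) (16, 64, 80)
```

For a 32-slice scan, mid covers slices 8 to 23 and therefore a shorter physical extent at the original slice spacing, while 2nd covers the whole scan at half the depth resolution.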

The datasets are available here:
https://drive.google.com/drive/folders/1xHaQyZCGjSYgd11fktD22Ue2WhFPPvwd?usp=sharing
The names of the datasets are structured as follows: dataset_<class>_<train/valid/test>_<image_slice_size>.npz
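As a rough illustration of the naming scheme, the sketch below builds a file name from its three parts and loads a downloaded archive with NumPy. The helper names and the example values ("covid", "160x128x32") are hypothetical; the actual class labels, size strings and array names inside each .npz file should be checked against the shared folder.

```python
import numpy as np

VALID_SPLITS = {"train", "valid", "test"}


def dataset_filename(class_name: str, split: str, slice_size: str) -> str:
    """Build dataset_<class>_<train/valid/test>_<image_slice_size>.npz."""
    if split not in VALID_SPLITS:
        raise ValueError(f"split must be one of {sorted(VALID_SPLITS)}")
    return f"dataset_{class_name}_{split}_{slice_size}.npz"


def load_arrays(path: str) -> dict:
    """Load every array stored in an .npz archive into a dict.

    The array names are not documented in the README, so this simply
    returns whatever the archive contains.
    """
    with np.load(path) as archive:
        return {name: archive[name] for name in archive.files}


# Hypothetical example values, for illustration only.
print(dataset_filename("covid", "train", "160x128x32"))
```

Calling `load_arrays` on a downloaded file and inspecting the keys and shapes of the returned dict is a quick way to find out how the data is laid out before wiring it into a training script.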

We fine-tune a pre-trained TimeSformer model on our datasets. For this, we modify the TimeSformer architecture so that it accepts grayscale 3D images:

[Figure: TimeSformer modification]
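The README does not describe how the pre-trained weights are adapted to single-channel input; the repository code is the reference for that. One common technique, assumed here and not confirmed as the thesis's method, is to sum the RGB filters of the pre-trained patch-embedding layer into a single channel. The NumPy sketch below shows why this works: the collapsed filters give the same embedding for a grayscale patch as the original filters give for that patch replicated across three channels. All function names are hypothetical.

```python
import numpy as np


def collapse_rgb_filters(weight: np.ndarray) -> np.ndarray:
    """Turn (out_channels, 3, kh, kw) filters into (out_channels, 1, kh, kw).

    Assumption: summing over the colour axis; not confirmed by the README.
    """
    if weight.ndim != 4 or weight.shape[1] != 3:
        raise ValueError("expected weights of shape (out_channels, 3, kh, kw)")
    return weight.sum(axis=1, keepdims=True)


def embed_patch(weight: np.ndarray, patch: np.ndarray) -> np.ndarray:
    """Apply patch-embedding filters to one patch of shape (c, kh, kw)."""
    return np.einsum("ockl,ckl->o", weight, patch)


rng = np.random.default_rng(0)
rgb_weight = rng.normal(size=(8, 3, 16, 16))   # stand-in for pre-trained filters
gray_patch = rng.normal(size=(1, 16, 16))      # one grayscale CT patch
replicated = np.repeat(gray_patch, 3, axis=0)  # same patch as fake RGB

gray_weight = collapse_rgb_filters(rgb_weight)
same = np.allclose(embed_patch(rgb_weight, replicated),
                   embed_patch(gray_weight, gray_patch))
print(same)  # True
```

Averaging instead of summing is another common variant; it scales the embedding by one third relative to the replicated-RGB case.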

In the thesis, we propose three new attention schemes for TimeSformer: space-limited attention sl (a), time-limited attention tl (b) and space-and-time-limited attention stl (c).

[Figure: New attention schemes]

Fine-tuned models can be found here:
https://drive.google.com/drive/folders/1Wf8sRcY-h5JbQsLkou1s--QE-uAMSr-g?usp=sharing
The names of the models are structured as follows:
<model>_<attention_scheme><attention_scheme_param>_<n_classes>_<dataset_size>.py
