mr7495 / covid-ct-code

Fully automated code for Covid-19 detection from CT scans from paper: https://doi.org/10.1016/j.bspc.2021.102588

deep-learning deep-neural-networks convolutional-neural-networks computer-vision covid-19 covid19-data ct-scans radiology automated-machine-learning covid-ct ct-scan-images covid-dataset ctscan-dataset

covid-ct-code's Introduction

A Fully Automated Deep Learning-based Network For Detecting COVID-19 from a New And Large Lung CT Scan Dataset

COVID-19 is a severe global problem, and AI can play a significant role in preventing losses by monitoring and detecting infected persons at an early stage. This paper proposes a high-speed, accurate, fully automated method to detect COVID-19 from a patient's CT scan images. We introduce a new dataset that contains 48260 CT scan images from 282 normal persons and 15589 images from 95 patients with COVID-19 infections. At the first stage, the system runs our proposed image-processing algorithm to discard those CT images in which the inside of the lung is not properly visible. This helps to reduce processing time and false detections. At the next stage, we introduce a novel method for increasing the classification accuracy of convolutional networks. We implemented our method using the ResNet50V2 network and a modified feature pyramid network alongside our designed architecture, classifying the selected CT images as COVID-19 or normal with higher accuracy than other models. After these two phases, the system determines the patient's condition using a selected threshold. We are the first to evaluate our system in two different ways. In the single-image classification stage, our model achieved 98.49% accuracy on more than 7996 test images. At the patient identification phase, the system correctly identified almost 234 of 245 patients with high speed. We also investigate the classified images with the Grad-CAM algorithm to indicate the areas of infection and evaluate the correctness of our model's classifications.

The details about our dataset are available at COVID-CTset
The trained models are available at https://drive.google.com/drive/u/1/folders/1pXOkJe15qDeqa-sgSQ116yxMyC0tpmFm
Our paper is available at https://doi.org/10.1016/j.bspc.2021.102588

The general view of our work in this paper is represented in the next figure.

photo not available
General view of our proposed fully automated network

CT scans Selection

The lung HRCT scan device takes a sequence of consecutive images (essentially a video of consecutive frames) from the chest of a patient being screened for COVID-19. In an image sequence, the infection points may appear in some images and not in others.

The clinical expert analyzes these consecutive images and, upon finding infections in some of them, marks the patient as infected.

Consider a neural network trained to classify COVID-19 cases on selected data in which the inside of the lung is clearly visible. If we test that network on every image of the sequence that belongs to a patient, the network may fail, because the lung is closed at the beginning and the end of each CT scan image sequence, as depicted in the next figure. Since the network has not seen such cases during training, it may produce wrong detections and thus not work well.

photo not available
This figure shows the difference between an open lung and a closed lung

We propose a technique to discard the images in which the inside of the lung is not visible. Doing this also greatly reduces processing time, because the network now only sees the selected images.

The main difference between an open lung and a closed lung is that the open-lung image has lower pixel values (closer to black) in the middle of the lung. First, we set a region in the middle of the images for analyzing their pixel values. This region should be at the center of the lung in all the images, so that open and closed lungs show their differences in this area. Unfortunately, the images of the dataset were not all on one scale, and the lung's position differed between patients; so, after experiments and analysis, as the images have 512×512 pixel resolution, we set the region to pixels 120 to 370 on the x-axis and 240 to 340 on the y-axis ([120,240] to [370,340]). This area should contain the information of the middle of the lung in all the images.

The images of our dataset are 16-bit grayscale images. The maximum pixel value across all the images is almost equal to 5000, and this maximum differs considerably between images. As the next step, to discard some images and select the rest from a patient's image sequence, we count the pixels of each image in the indicated region whose value is below 300; we call these dark pixels. This cutoff was chosen based on our experiments.

For all the images in the sequence, we count the number of pixels in the region with a value below 300. We then divide the difference between the maximum and minimum counts by 1.5; this number is our threshold. For example, if one image in a patient's CT scan sequence has 3030 pixels below 300 in the region, and another has 30 such pixels, the threshold becomes (3030 − 30) / 1.5 = 2000. An image with fewer dark pixels in the region than the threshold is one in which the lung is almost closed, and an image with more dark pixels is one in which the inside of the lung is visible.

We calculate the threshold this way so that the images in a sequence (the CT scans of one patient) are analyzed together, because within one sequence the imaging scale does not differ. We then discard the images whose dark-pixel count is below the calculated threshold; the images with more dark pixels than the threshold are selected and given to the network for classification.
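The selection rule described above can be sketched in a few lines of numpy. This is an illustrative sketch only; the repository's actual implementation is in CT_selection_algorithm.py:

```python
import numpy as np

DARK_VALUE = 300                              # pixels below this count as "dark"
REGION = (slice(240, 340), slice(120, 370))   # y: 240-340, x: 120-370

def select_open_lung_images(images):
    """Return the indices of images whose lung appears open (visible).

    images: sequence of 512x512 16-bit grayscale arrays from one patient.
    """
    # Count dark pixels in the central region of every image in the sequence
    counts = np.array([int(np.sum(img[REGION] < DARK_VALUE)) for img in images])
    # Per-sequence threshold: (max count - min count) / 1.5
    threshold = (counts.max() - counts.min()) / 1.5
    # Keep only images with more dark pixels than the threshold
    return [i for i, c in enumerate(counts) if c > threshold]
```

For the worked example above, counts of 3030 and 30 give a threshold of 2000, so only the image with 3030 dark pixels is kept.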

In the next figure, the image sequence of one patient is depicted; you can observe which images the algorithm discards and which it selects.

photo not available
The output of the selection algorithm. The highlighted images are the ones rejected by the algorithm

The CT selection algorithm is shared at CT_selection_algorithm.py

Neural Networks

In this research, at the next stage of our work, we used deep convolutional networks to classify the selected images from the first stage as normal or COVID-19. We utilized Xception, ResNet50V2, and our proposed model for the classification.

A feature pyramid network (FPN) helps when objects appear at different scales in an image. Although we investigate image classification here, the network must learn the infection points and classify the image based on them, and these infections vary in scale; using an FPN can therefore help us classify the images better in our case.

In the next figure, you can see the architecture of the proposed network. Based on our experience, we used concatenation layers instead of the addition layers of the default feature pyramid network. At the end of the network, we concatenated the five classification results of the feature pyramid outputs (each output produces a classification based on features at one scale) and fed them to the classifier, so that the network can use all of them for better classification.

photo not available
Architecture of ResNet50V2 with FPN

The evaluation results based on single-image classification are reported in the next table:

| Average between five folds | Overall Accuracy | COVID sensitivity | Normal sensitivity |
|---|---|---|---|
| ResNet50V2 with FPN | 98.49 | 94.96 | 98.7 |
| Xception | 96.55 | 98.02 | 96.47 |
| ResNet50V2 | 97.52 | 97.99 | 97.49 |

photo not available
Visualized features by the Grad-CAM algorithm, showing that the network operates correctly and indicating the infection regions in the COVID-19 CT scans

photo not available
In the normal images, as the network does not see any infections, the highlighted features are at the center, showing that no infections have been found
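The heatmap computation behind Grad-CAM can be summarized in a short numpy sketch. This is a generic illustration, not the repository's code; it assumes you have already extracted the last convolutional layer's activations and the gradient of the predicted class score with respect to them:

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Generic Grad-CAM heatmap for one image.

    feature_maps, gradients: arrays of shape (H, W, C) holding a conv
    layer's activations and the gradient of the class score w.r.t. them.
    """
    weights = gradients.mean(axis=(0, 1))                  # channel importance
    cam = np.tensordot(feature_maps, weights, axes=([2], [0]))
    cam = np.maximum(cam, 0)                               # keep positive influence
    if cam.max() > 0:
        cam = cam / cam.max()                              # normalize to [0, 1]
    return cam
```

The resulting map is then upsampled to the input resolution and overlaid on the CT image to highlight the suspected infection regions.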

The developed code for training and validation is available at COVID_Train&Validation.ipynb

Fully automated Network

In Automated_covid_detector_validation.ipynb you can find the developed code for validating our fully automated networks on our dataset.

By using Automated_covid_detector_with CT selection algorithm, you can apply the automated network with the CT selection algorithm to a patient's CT scan folder to determine whether the patient is infected with COVID-19.

Automated_covid_detector_without_selection_algorithm presents the code for running our system on custom CT images without utilizing the CT selection algorithm.

The fully automated network takes the CT scan images of a person as input and runs the selection algorithm on them to select only the proper ones. The selected images are then given to the network for classification, and if some percentage of a patient's selected images (an adjustable value) are classified as COVID-19, that person is considered infected with COVID-19.
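The patient-level decision described above can be sketched as follows. The `covid_ratio` cutoff is a hypothetical placeholder for the adjustable value mentioned above, not a number taken from the repository:

```python
def classify_patient(image_predictions, covid_ratio=0.3):
    """Decide a patient's condition from per-image predictions.

    image_predictions: labels ('covid' or 'normal') for the images that
    passed the selection algorithm.
    covid_ratio: fraction of COVID-classified images needed to flag the
    patient as infected (an adjustable value; 0.3 here is illustrative).
    """
    if not image_predictions:
        raise ValueError("no selected images to classify")
    covid_fraction = image_predictions.count('covid') / len(image_predictions)
    return 'covid' if covid_fraction >= covid_ratio else 'normal'
```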

The evaluation results of the fully automated network on more than 230 patients are shown in the next table:

| Average between five folds | Correct Identified Patients | Wrong Identified Patients | Correct COVID Identified | Wrong Identified as Normal | Normal Correct Identified | Wrong Identified as COVID |
|---|---|---|---|---|---|---|
| Our model | 233.8 | 10.8 | 17.6 | 1.4 | 216.2 | 9.4 |
| Xception | 218.8 | 25.8 | 18.8 | 0.2 | 200 | 25.6 |
| ResNet50V2 | 225.2 | 19.4 | 18.2 | 0.8 | 207 | 18.6 |

The published version of our paper is available at:
https://doi.org/10.1016/j.bspc.2021.102588

If you use our data or code, please cite it as:

@article{RAHIMZADEH2021102588,
title = {A fully automated deep learning-based network for detecting COVID-19 from a new and large lung CT scan dataset},
journal = {Biomedical Signal Processing and Control},
pages = {102588},
year = {2021},
issn = {1746-8094},
doi = {https://doi.org/10.1016/j.bspc.2021.102588},
url = {https://www.sciencedirect.com/science/article/pii/S1746809421001853},
author = {Rahimzadeh, Mohammad and Attar, Abolfazl and Sakhaei, Seyed Mohammad},
}

If you have any questions, contact me at this email: [email protected]

covid-ct-code's People

Contributors

mr7495

covid-ct-code's Issues

Can you provide your dataset like folder style?

Dear Mohamad,
First, I thank you for your great job.
I am also interested in the topic of COVID-19 CT scans now. I have very promising models for predicting COVID-19 from CT scans, so I am trying to expand my experiments via external validation. Can you give me your ENTIRE dataset in folder style, e.g., a Covid-19 folder containing only images from COVID-19 patients, and a Non-Covid-19 folder consisting of scans from healthy subjects?
Best regards.
Linh

Automated_covid_detector_validation: img_name is not defined

This notebook doesn't run because of this error. I can't figure out what you meant by img_name here, as it has indices.

NameError Traceback (most recent call last)
in ()
15 full_add=os.path.join(r,file)
16 if 'SR_2' in full_add: #Select only SR_2 folder
---> 17 index1=img_name.index('patient')
18 index2=img_name.index('_SR')
19 if 'covid' in full_add:

NameError: name 'img_name' is not defined

Function call stack: keras_scratch_graph

I really like your code and I am trying to run it with the given dataset. I am trying to use my NVIDIA gpu, but whenever I do, it runs into memory problems and raises the following error:

2020-11-22 21:40:39.870589: W tensorflow/core/common_runtime/bfc_allocator.cc:245] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.09GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.

Function call stack: keras_scratch_graph at the end.

However, when I reduce the batchsize, the error stops, but I would want to run this code with a larger batchsize. Do you know what may be the problem?

numpy.AxisError: axis 2 is out of bounds for array of dimension 3

Additionally, I am receiving the following error:

This is for the COVID and Validation code, when it gets the predicted class index.

The error is for the following line:
pred_ind=np.argmax(net.predict(np.expand_dims(np.expand_dims(img,axis=0),axis=3))[0])

Do you know what might be the problem? Thanks!

How do you select the layers? If backbone model is InceptionV3 which layers to select?

Greetings. Thank you for your excellent work.
I am curious: in your code in COVID_Train&Validation.ipynb, you selected 3 layers from the ResNet50V2 framework ["conv4_block1_preact_relu", "conv5_block1_preact_relu", "post_relu"]. What are the reasons for selecting these layers? And if I want to replace the backbone model with InceptionV3, which layers should be selected in that model?
If you could kindly help me out I would be most grateful.
