jchen42703 / pneumothorax-seg-cnn

Using CNNs for the 2019 SIIM-ACR Pneumothorax Segmentation Kaggle Challenge

Python 100.00%
cnn tf-keras tensorflow pneumothorax segmentation kaggle convolutional-neural-networks unet

pneumothorax-seg-cnn's Introduction

Convolutional Neural Networks for Pneumothorax Segmentation

This repository contains code for my approach to the 2019 SIIM-ACR Pneumothorax Segmentation Kaggle Challenge. Implemented using tensorflow.keras.

Credits to Amazingly Helpful Kernels

Preprocessing

The only preprocessing that was done was CLAHE and rescaling the intensities to [0, 255]. The output images were converted to .png files.
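
As a rough illustration of the rescaling step only, here is a minimal NumPy sketch. The function name `rescale_to_uint8` is hypothetical and not taken from the repository, and min-max scaling is an assumption about how the intensities were mapped to [0, 255]. CLAHE itself is not reimplemented here; a library routine such as OpenCV's `cv2.createCLAHE` would typically handle that part.

```python
import numpy as np


def rescale_to_uint8(image):
    """Min-max rescale an image to the range [0, 255] as uint8.

    Hypothetical helper: the repository may rescale differently.
    """
    image = np.asarray(image, dtype=np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # A constant image has no contrast to stretch; return all zeros.
        return np.zeros(image.shape, dtype=np.uint8)
    scaled = (image - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)


# Example with made-up 12-bit intensities.
raw = np.array([[0, 2048], [4095, 1024]])
rescaled = rescale_to_uint8(raw)
```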

Solution

[0.8523 Public LB // 0.8980 Private (1% of test data)]
Note: Models were not retrained on the stage 1 test labels.
My final solution was essentially the Unet Plus Plus with EfficientNet Encoder kernel run 10 separate times, with the SWA snapshots ensembled. Horizontal flipping was used for test-time augmentation, and predictions with small ROIs were zeroed out (pneumothorax_seg.inference.segmentation_only.py). The ensembled models varied somewhat (mainly in the data augmentation), but the variations had little impact on overall performance across folds. Here are the model weights I used.
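
The two post-processing ideas above (horizontal-flip test-time augmentation and zeroing out small ROIs) can be sketched roughly as follows. This is not the code from pneumothorax_seg.inference.segmentation_only.py; the function names, the `predict_fn` callable, and the `threshold` and `min_pixels` values are all assumptions chosen for illustration.

```python
import numpy as np


def predict_with_hflip_tta(predict_fn, images):
    """Average predictions on the original and horizontally flipped batch.

    images: array of shape (N, H, W, C).
    predict_fn: hypothetical callable returning probabilities of shape (N, H, W, 1).
    """
    preds = predict_fn(images)
    flipped_preds = predict_fn(images[:, :, ::-1, :])
    # Flip the second prediction back so both align with the original images.
    return (preds + flipped_preds[:, :, ::-1, :]) / 2.0


def zero_small_rois(probs, threshold=0.5, min_pixels=2000):
    """Binarize probabilities and blank out masks with too few positive pixels.

    threshold and min_pixels are placeholder values, not the author's settings.
    """
    masks = (probs > threshold).astype(np.uint8)
    for i in range(len(masks)):
        if masks[i].sum() < min_pixels:
            masks[i] = 0
    return masks


# Toy example: an identity "model" and two 4x4 probability maps.
probs = np.zeros((2, 4, 4, 1))
probs[0, :2, :2] = 0.9  # 4 positive pixels
probs[1] = 0.9          # 16 positive pixels
averaged = predict_with_hflip_tta(lambda x: x, probs)
masks = zero_small_rois(averaged, threshold=0.5, min_pixels=5)
```

With these toy inputs the first mask is emptied because it has fewer than 5 positive pixels, while the second is kept.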

What about the other models?

Classification-Segmentation Cascade

[Best: 0.8500 Public LB]
With this approach, I planned to replicate something similar to the 1st Place Solution to the RSNA Pneumonia Detection Challenge, which used ensembles of classification models (five 10-fold CV ensembles) in conjunction with ensembles of detection models. Both that challenge and this one used the NIH ChestXray14 Dataset, and they also released pre-trained weights on said dataset.

  • Training used binary cross-entropy, Adam, SWA, and a cosine annealing learning rate schedule.
  • Data augmentation was horizontal flipping, color inversion, Gaussian smoothing, rotations, zooms, and random gamma (as shown here).
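
For readers unfamiliar with cosine annealing, a minimal sketch of the standard schedule follows. The function name and the `lr_max` and `lr_min` defaults are assumptions; the source does not state the actual values or whether warm restarts were used.

```python
import math


def cosine_annealing_lr(epoch, total_epochs, lr_max=1e-3, lr_min=1e-5):
    """Standard cosine annealing: decay from lr_max to lr_min over total_epochs.

    lr_max and lr_min are placeholder values, not the author's settings.
    """
    cos_term = 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
    return lr_min + (lr_max - lr_min) * cos_term


# Learning rate at each epoch boundary of a hypothetical 10-epoch run.
schedule = [cosine_annealing_lr(e, total_epochs=10) for e in range(11)]
```

A function like this could be passed to a per-epoch learning rate callback; the rate starts at `lr_max`, falls slowly at first, fastest in the middle, and flattens out near `lr_min`.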

I couldn't get any worthwhile performance improvement from combining a small ensemble of 4 classification models on a single fold (EfficientNetB4 with ImageNet weights; DenseNet169, InceptionResNetV2, and Xception with NIH weights) with the UEfficientNetB4 trained on pneumothorax-positive patients only, so I reverted to segmentation-only approaches. (Classification F-scores were around 0.7-0.76 and AUCs around 0.82-0.87, depending on the thresholds.) Larger ensembles would probably do better, but time was a serious constraint given the limited computing power.

However, I did write a cleaner version of the code from the original repository that loads the NIH weights into the classification models. It is in pneumothorax_seg.training.utils.load_pretrained_classification_model.

Misc. Approaches

  • Regular U-Net: A basic recursive tf.keras implementation with conv, LeakyReLU, and Instance Normalization. It didn't do well (CV < 0.75), so I opted to use the U-Net++ instead. (This was done on 512x512 inputs.)
  • VAE CNN: This architecture is from an unofficial implementation of the 1st place solution to the 2018 MICCAI Brain Tumour Segmentation Challenge. I made a 2D version that fits the code style in this repository and works with tensorflow.keras and channels_last. I didn't experiment with it much, but it did about as well as the vanilla U-Net while consuming a lot more compute time. (This was also done on 512x512 inputs.) Better hyperparameter tuning might have made a bigger difference, but I didn't have much compute to work with.

pneumothorax-seg-cnn's Issues

Generators will shuffle the same way every time if the session is seeded.

This is because:

    def on_epoch_end(self):
        """Updates indexes after each epoch"""
        self.indexes = np.arange(len(self.fpaths))
        if self.shuffle:
            np.random.shuffle(self.indexes)

Since it resets the indices, self.indexes will be the same shuffled list if np.random.seed is set. I haven't tested this though.
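
Since the behavior above was reported as untested, here is a small standalone check one could run. It imitates `on_epoch_end` with the global NumPy RNG; the function `epoch_orders` and its parameters are hypothetical and not part of the repository. With a single `np.random.seed` call at the start, the per-epoch orders repeat exactly from one run to the next, while successive epochs within a run still draw different permutations because the RNG state advances between shuffles.

```python
import numpy as np


def epoch_orders(seed, n_items=20, n_epochs=3):
    """Imitate on_epoch_end: reset the indices, then shuffle with the global RNG."""
    np.random.seed(seed)  # seeded once, at the start of the "session"
    orders = []
    for _ in range(n_epochs):
        indexes = np.arange(n_items)
        np.random.shuffle(indexes)
        orders.append(indexes.tolist())
    return orders


run_a = epoch_orders(seed=0)
run_b = epoch_orders(seed=0)

# Same seed, same sequence of shuffles across runs.
same_across_runs = run_a == run_b
# Number of distinct orders seen across the epochs of one run.
distinct_orders_in_run = len({tuple(order) for order in run_a})
```

If an identical order every epoch does show up in practice, the seed is probably being reset somewhere inside the training loop rather than once per session.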

Output `submission_final.csv` from `create_submission` still contains 1s.

This is presumably the cause of my low LB scores (0.6640 and 0.6656). The bug is caused by:

    test_ids = np.array([Path(fpath).stem for fpath in test_fpaths])
    seg_ids = test_ids[np.where(sub_df["EncodedPixels"] == 1)[0]].tolist()

There is no guarantee that test_ids is ordered the same way as sub_df["ImageId"]. As a result, the positional indices select many wrong patient IDs and miss many correct ones, which leaves lots of 1s in submission_final.csv.
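
One possible fix, sketched under assumptions: take the IDs from the same DataFrame that produced the mask, so ordering no longer matters. The helper name `positive_image_ids` is hypothetical, and the toy `sub_df` below only mimics the two columns named in this issue. It also assumes that 1 marks a row still awaiting a segmentation mask and "-1" marks a negative case.

```python
import pandas as pd


def positive_image_ids(sub_df):
    """Return the ImageIds whose EncodedPixels is still the placeholder 1.

    Selecting from sub_df itself avoids relying on the order of test_fpaths.
    """
    mask = sub_df["EncodedPixels"] == 1
    return sub_df.loc[mask, "ImageId"].tolist()


# Toy submission table; the IDs are deliberately not in sorted order.
sub_df = pd.DataFrame(
    {"ImageId": ["c", "a", "b"], "EncodedPixels": [1, "-1", 1]}
)
seg_ids = positive_image_ids(sub_df)
```

Here `seg_ids` holds "c" and "b", the two rows flagged with 1, regardless of how the test files happen to be listed on disk.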

Allow user to specify the test directory

    test_fpaths = glob.glob('./test/*') # assumes this directory for now

The hard-coded path above needs to be fixed. How about adding an extra argument to create_submission.create_submission that specifies the path to the test directory?
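
A minimal sketch of that proposal, assuming the file listing is factored into a helper. The name `list_test_fpaths` and the `test_dir` parameter are hypothetical and not part of the current create_submission API. The default keeps today's behavior.

```python
import glob
import os


def list_test_fpaths(test_dir="./test"):
    """List every file path in test_dir, sorted for a deterministic order.

    Hypothetical helper: create_submission could accept test_dir and call this.
    """
    return sorted(glob.glob(os.path.join(test_dir, "*")))


# The default preserves the existing behavior; callers may pass any directory.
default_fpaths = list_test_fpaths()
```

Sorting is an addition beyond the original one-liner. `glob.glob` makes no ordering guarantee, so a fixed order makes downstream ID matching easier to reason about.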
