flamingofugang / deeplearningcamelyon

This project forked from 3dimaging/deeplearningcamelyon

DeepLearning - Camelyon16 dataset

License: Other

Python 28.92% Shell 0.20% Jupyter Notebook 70.88%

deeplearningcamelyon's Introduction

Deep Learning Pipeline for Camelyon 2016 dataset

Weizhe Li, Weijie Chen

0 - Preparation

0.1 Set up deep learning environment.

0.2 ASAP installation and image display

ASAP compared with OpenSlide: ASAP has a GUI but no detailed manual describing its commands; OpenSlide has much better documentation for its commands but no GUI.

0.3 Mask file generation

  • The mask file is the ground truth for model training. It has exactly the same dimensions as its corresponding WSI and is binary: for each pixel of the WSI, normal tissue is coded as ‘0’ and tumor tissue as ‘1’.
  • Mask file generation
  • The code provided by the organizers is misleading. The mask file generated by the code from their paper is all ‘0’, and the mask file generated by another suggested piece of code can be opened only in the ASAP GUI, not through the ASAP command line or OpenSlide.
  • Mask generation is time consuming.

Figure: WSI and mask file for tumor_026

Figure: WSI and mask file for tumor_005

1 - WSI Visualization with Annotation

Annotation Visualization

2 - Image Preprocessing

2.1 Image Segmentation

To reduce computation, blank regions of the slide (no tissue) are excluded.

  • Convert the color space to HSV
  • Segment the tissue region (Otsu’s method for foreground segmentation)
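The two steps above can be sketched with NumPy alone. This is a minimal illustration, not the repository's code: it assumes the threshold is applied to the HSV saturation channel (the source does not say which channel is used), and the function names `saturation`, `otsu_threshold` and `tissue_mask` are hypothetical.

```python
import numpy as np

def saturation(rgb):
    """HSV saturation channel of an RGB uint8 image, scaled to [0, 1]."""
    rgb = rgb.astype(np.float64) / 255.0
    mx = rgb.max(axis=-1)
    mn = rgb.min(axis=-1)
    # S = (max - min) / max, defined as 0 for pure black pixels.
    return np.where(mx > 0, (mx - mn) / np.maximum(mx, 1e-12), 0.0)

def otsu_threshold(values, bins=256):
    """Threshold that maximizes the between-class variance (Otsu's method)."""
    hist, edges = np.histogram(values, bins=bins, range=(0.0, 1.0))
    hist = hist.astype(np.float64)
    centers = (edges[:-1] + edges[1:]) / 2.0
    weight_bg = np.cumsum(hist)              # pixels in bins 0..k
    weight_fg = hist.sum() - weight_bg       # pixels in bins k+1..end
    cum_mean = np.cumsum(hist * centers)
    mean_bg = cum_mean / np.maximum(weight_bg, 1e-12)
    mean_fg = (cum_mean[-1] - cum_mean) / np.maximum(weight_fg, 1e-12)
    between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
    # The right edge of the best bin separates background from foreground.
    return edges[1:][np.argmax(between)]

def tissue_mask(rgb):
    """Boolean mask that is True where the slide is likely to contain tissue."""
    sat = saturation(rgb)
    return sat >= otsu_threshold(sat)
```

On a real slide this would normally run on a low-resolution level of the WSI, since the full-resolution image is too large to threshold in one pass.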

2.2 Patch Extraction

Step 1 : Randomly extract 256 x 256 patches from the tissue region at 40x magnification

	Tumor slide : 1K positive and 1K negative patches from each slide

	Normal slide: 1K negative patches from each slide
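A possible way to draw the random patch locations for Step 1 is sketched below. It assumes a patch is assigned to a region when its center pixel lies inside that region's mask, which is an illustrative rule and not necessarily the one used in this project; `sample_patch_origins` is a hypothetical helper.

```python
import numpy as np

def sample_patch_origins(region_mask, n_patches, patch_size=256, seed=0):
    """Return top-left (row, col) corners of patches centered inside region_mask."""
    rng = np.random.default_rng(seed)
    half = patch_size // 2
    h, w = region_mask.shape
    # Keep only centers for which the whole patch fits inside the image.
    valid = np.zeros((h, w), dtype=bool)
    valid[half:h - half + 1, half:w - half + 1] = (
        region_mask[half:h - half + 1, half:w - half + 1] > 0
    )
    rows, cols = np.nonzero(valid)
    if rows.size == 0:
        return np.empty((0, 2), dtype=int)
    picks = rng.choice(rows.size, size=min(n_patches, rows.size), replace=False)
    return np.stack([rows[picks] - half, cols[picks] - half], axis=1)
```

Under these assumptions, positive patches would be sampled with the tumor mask as `region_mask`, and negative patches with the tissue mask minus the tumor mask.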

Step 2 : Crop 224 x 224 patches and apply image augmentation

  • stain normalization (Method II)

    Figure: color variation among patches

    Figure: patches before and after stain normalization

  • flip
  • adding color noise (Method II)
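The crop, flip and color-noise steps might look like the sketch below. Stain normalization is left out because "Method II" is not described here. The random crop position, the 50% flip probability and the Gaussian per-channel offset (`noise_std`) are assumptions chosen for illustration.

```python
import numpy as np

def augment_patch(patch, rng, crop=224, noise_std=0.02):
    """Randomly crop, flip and color-jitter one RGB uint8 patch.

    Returns a float32 array of shape (crop, crop, 3) with values in [0, 1].
    """
    h, w, _ = patch.shape
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    out = patch[top:top + crop, left:left + crop].astype(np.float32) / 255.0
    if rng.random() < 0.5:
        out = out[:, ::-1]   # horizontal flip
    if rng.random() < 0.5:
        out = out[::-1, :]   # vertical flip
    # One random offset per color channel, shared by all pixels of the patch.
    offset = rng.normal(0.0, noise_std, size=(1, 1, 3)).astype(np.float32)
    return np.clip(out + offset, 0.0, 1.0)
```

When training a segmentation model, the same crop and flips would also have to be applied to the ground-truth mask so that image and label stay aligned.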

Step 3 : Image Generator

Figure: patches

Figure: ground truth

3 - Training Neural Network

3.1 FCN

Lambda, Normalize input (x / 255.0 - 0.5), outputs 256x256x3 
  1. Convolution1, 5 x 5 kernel, stride 2, outputs 128x128x100
  2. Maxpooling1, 2 x 2 window, stride 2, outputs 64x64x100
  3. Convolution2, 5 x 5 kernel, stride 2, outputs 32x32x200
  4. Maxpooling2, 2 x 2 window, stride 2, outputs 16x16x200
  5. Convolution3, 3 x 3 kernel, stride 1, outputs 16x16x300
  6. Convolution4, 3 x 3 kernel, stride 1, outputs 16x16x300
  7. Dropout, 0.1 rate
  8. Convolution5, 1x1 kernel, stride 1, outputs 16x16x2
  9. Deconvolution, 31 x 31 kernel, stride 16, outputs 256x256x2
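The output sizes listed above can be checked with a little arithmetic. The sketch assumes "same" padding everywhere, so a strided convolution or pooling layer maps a side length n to ceil(n / stride) and the transposed convolution maps n to n * stride. The padding mode is not stated in the list, but the listed sizes are consistent with it.

```python
import math

# (layer name, direction, stride, output channels or None if unchanged)
LAYERS = [
    ("Convolution1", "down", 2, 100),
    ("Maxpooling1", "down", 2, None),
    ("Convolution2", "down", 2, 200),
    ("Maxpooling2", "down", 2, None),
    ("Convolution3", "down", 1, 300),
    ("Convolution4", "down", 1, 300),
    ("Convolution5", "down", 1, 2),
    ("Deconvolution", "up", 16, 2),
]

def trace_shapes(size=256, channels=3):
    """Return (name, height, width, channels) after each layer."""
    shapes = []
    for name, direction, stride, out_channels in LAYERS:
        if direction == "down":
            size = math.ceil(size / stride)
        else:
            size = size * stride
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, size, size, channels))
    return shapes

for name, h, w, c in trace_shapes():
    print(f"{name}: {h}x{w}x{c}")
```

The dropout layer is omitted because it does not change the tensor shape. The four stride-2 layers reduce 256 to 16, and the stride-16 deconvolution restores the 256 x 256 resolution with one channel per class.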

FCN training

FCN prediction

3.2 U-net

Figure: learning curve

Figure: prediction by the trained U-net

3.3 GoogleNet

Step 1 : Model Training

Training GoogleNet

  • Optimization method: Stochastic gradient descent

  • Weight initialization: Random sampling from a Gaussian distribution

  • Batch size: 32

  • Batch normalization: No

  • Regularization: L2-regularization (0.0005) and 50% dropout

  • Learning rate: 0.01, multiplied by 0.5 every 50,000 iterations (0.01, multiplied by 0.1 per epoch)

  • Activation function: ReLU

  • Loss function: Cross-entropy

  • Number of training epochs/iterations: 300,000 iterations
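The first learning-rate schedule in the list (start at 0.01, multiply by 0.5 every 50,000 iterations) is a step decay. A minimal sketch follows; the function name `learning_rate` is hypothetical, and the alternative per-epoch schedule given in parentheses is not covered.

```python
def learning_rate(iteration, base_lr=0.01, factor=0.5, step=50_000):
    """Step decay: multiply base_lr by factor once every `step` iterations."""
    return base_lr * factor ** (iteration // step)

# Learning rate at the start of each 50,000-iteration block of training.
for it in range(0, 300_000, 50_000):
    print(it, learning_rate(it))
```

Over 300,000 iterations the rate is halved five times, so training would end at 0.01 * 0.5^5 = 0.0003125.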

Step 2 : Negative Mining

Extract additional training patches from false positive regions
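One way to locate false positive regions is to compare a predicted probability map with the ground-truth mask. The sketch below assumes both arrays have the same shape and that a pixel counts as a false positive when its probability is at least 0.5 while the mask is 0. The threshold and the helper name `false_positive_coords` are illustrative.

```python
import numpy as np

def false_positive_coords(prob_map, truth_mask, threshold=0.5,
                          max_points=None, seed=0):
    """Return (row, col) coordinates predicted as tumor but labeled normal."""
    false_pos = (prob_map >= threshold) & (truth_mask == 0)
    rows, cols = np.nonzero(false_pos)
    coords = np.stack([rows, cols], axis=1)
    if max_points is not None and len(coords) > max_points:
        # Subsample so one slide does not dominate the extra training set.
        rng = np.random.default_rng(seed)
        keep = rng.choice(len(coords), size=max_points, replace=False)
        coords = coords[keep]
    return coords
```

Patches centered on these coordinates would then be added to the training set as additional negative examples before the model is retrained.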

4 - Prediction and Evaluation

4.1 Make predictions and construct heatmaps

Each test image is divided into non-overlapping small patches. For each patch, the model outputs a predicted image in which every pixel is assigned a tumor probability. A heatmap is a way to display these probabilities.

The patch predictions are then put together to obtain the prediction for the whole slide (code for heatmap generation based on predicted values).
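Assembling a whole-slide heatmap from non-overlapping patch predictions can be sketched as follows. The helpers `grid_origins` and `stitch_heatmap` are hypothetical, and the patch predictions are assumed to be 2D arrays of per-pixel tumor probabilities.

```python
import numpy as np

def grid_origins(slide_shape, patch_size):
    """Top-left corners of a non-overlapping patch grid, in row-major order."""
    rows = range(0, slide_shape[0] - patch_size + 1, patch_size)
    cols = range(0, slide_shape[1] - patch_size + 1, patch_size)
    return [(r, c) for r in rows for c in cols]

def stitch_heatmap(patch_probs, origins, slide_shape):
    """Place each patch's probability map at its origin in the slide heatmap."""
    heatmap = np.zeros(slide_shape, dtype=np.float32)
    for probs, (row, col) in zip(patch_probs, origins):
        h, w = probs.shape
        heatmap[row:row + h, col:col + w] = probs
    return heatmap
```

Regions that no patch covers, such as background skipped during tissue segmentation, stay at probability 0 in this sketch.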

Results

Figure: heatmap for tumor_026

Figure: overview of the heatmap for tumor_026

Figure: comparison of the prediction with the ground truth for tumor_005

4.2a Slide-based Classification

Extracting Features for whole-slide image classification task

Global Features Extraction

  1. The ratio between the area of metastatic regions and the tissue area.
  2. The sum of all cancer metastasis probabilities detected in the metastasis identification task, divided by the tissue area.

Both features are calculated at 5 different thresholds (0.5, 0.6, 0.7, 0.8, 0.9), giving 10 global features in total.
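A sketch of the two global features evaluated at the five thresholds is given below. It assumes that, for each threshold, the probability sum is taken over the pixels at or above that threshold. The source does not spell this out, so treat it as one plausible reading.

```python
import numpy as np

THRESHOLDS = (0.5, 0.6, 0.7, 0.8, 0.9)

def global_features(heatmap, tissue_mask, thresholds=THRESHOLDS):
    """Return [area ratio, probability sum / tissue area] for each threshold."""
    tissue_area = float(np.count_nonzero(tissue_mask))
    if tissue_area == 0:
        raise ValueError("tissue mask is empty")
    features = []
    for t in thresholds:
        detected = heatmap >= t
        # Feature 1: metastatic area relative to the tissue area.
        features.append(np.count_nonzero(detected) / tissue_area)
        # Feature 2: summed probability of detected pixels per tissue pixel.
        features.append(float(heatmap[detected].sum()) / tissue_area)
    return features
```

With five thresholds and two features per threshold, the returned list has the 10 entries mentioned above.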

Local Features Extraction

Local features are based on the 2 largest metastatic candidate regions, which are selected using a threshold of 0.5.

9 features were extracted from the 2 largest regions:

  1. Area: the area of the connected region
  2. Eccentricity: the eccentricity of the ellipse that has the same second moments as the region
  3. Extent: the ratio of the region area to the total bounding box area
  4. Bounding box area
  5. Major axis length: the length of the major axis of the ellipse that has the same normalized second central moments as the region
  6. Max/mean/min intensity: the max/mean/min probability value in the region
  7. Aspect ratio of the bounding box
  8. Solidity: the ratio of the region area to the surrounding convex area
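Some of the region features above can be computed with NumPy alone, as in the sketch below. It assumes 4-connectivity and defines the aspect ratio as bounding box width divided by height; both choices are illustrative. Eccentricity, major axis length and solidity are omitted because they need moment and convex hull computations that are better taken from a library such as scikit-image's `regionprops`.

```python
from collections import deque
import numpy as np

def connected_regions(binary):
    """Return a list of (rows, cols) index arrays, one per 4-connected region."""
    h, w = binary.shape
    visited = np.zeros((h, w), dtype=bool)
    regions = []
    for r0, c0 in zip(*np.nonzero(binary)):
        if visited[r0, c0]:
            continue
        visited[r0, c0] = True
        queue = deque([(r0, c0)])
        pixels = []
        while queue:  # breadth-first flood fill from the seed pixel
            r, c = queue.popleft()
            pixels.append((r, c))
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= nr < h and 0 <= nc < w and binary[nr, nc] and not visited[nr, nc]:
                    visited[nr, nc] = True
                    queue.append((nr, nc))
        rows, cols = zip(*pixels)
        regions.append((np.array(rows), np.array(cols)))
    return regions

def region_features(heatmap, rows, cols):
    """Area, bounding box and intensity features of one region."""
    height = int(rows.max() - rows.min() + 1)
    width = int(cols.max() - cols.min() + 1)
    values = heatmap[rows, cols]
    return {
        "area": len(rows),
        "bbox_area": height * width,
        "extent": len(rows) / (height * width),
        "aspect_ratio": width / height,
        "max_intensity": float(values.max()),
        "mean_intensity": float(values.mean()),
        "min_intensity": float(values.min()),
    }

def largest_region_features(heatmap, threshold=0.5, top_k=2):
    """Features of the top_k largest regions with probability >= threshold."""
    regions = connected_regions(heatmap >= threshold)
    regions.sort(key=lambda region: len(region[0]), reverse=True)
    return [region_features(heatmap, rows, cols) for rows, cols in regions[:top_k]]
```

The pure-Python flood fill is slow on large heatmaps, so this is only meant to make the feature definitions concrete.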

4.2b Lesion-based Detection

4.3 ROC and FROC Generation

Teams using GoogleNet

HMS&MIT, HMS&MGH(model I), Smart Imaging(model II), Osaka University, CAMP-TUM(model II), Minsk Team, DeepCare

deeplearningcamelyon's People

Contributors

3dimaging, brandon-gallas, sarahndudgeon
