ispras / endometrium-dataset-analysis

This repository is dedicated to the analysis of the EndoNuke dataset

License: Mozilla Public License 2.0

Python 12.12% Jupyter Notebook 87.88%

endometrium-dataset-analysis's Introduction

Description

This is a supplementary repository for the EndoNuke dataset. The dataset can be accessed at endonuke.ispras.ru. The code here implements the methods described in the corresponding paper and also provides some useful tools for working with the dataset. It is up to the user to decide whether to use them or to develop their own methods.

Installation

To install the package, make sure that python3 and pip are installed. We recommend using python3.8 and pip v21.1.1. To run the dip test on staining distributions, R must also be installed.

It's also necessary to install the cv2 dependencies correctly. The following is sufficient for Ubuntu 20.04:

apt install libopencv-dev python3-opencv

For all subsequent actions we strongly recommend using the venv module to keep the system Python unspoiled.
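As an illustration of the venv recommendation, here is a minimal sketch that creates an isolated environment with the standard-library venv module. The helper name create_env and the directory name .venv are assumptions made for this example; they are not part of the repository.

```python
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment in env_dir and return the path to its config file.

    `create_env` is a hypothetical helper for illustration only.
    """
    env_dir = Path(env_dir)
    # Equivalent to running `python3 -m venv <env_dir>` from the shell.
    venv.create(env_dir, with_pip=with_pip)
    return env_dir / "pyvenv.cfg"


# Example usage (assumed directory name):
#     create_env(".venv")
# The environment still has to be activated from the shell afterwards,
# e.g. `source .venv/bin/activate` in bash.
```

Running `python3 -m venv .venv` directly from the shell achieves the same result; the Python form is mainly useful inside setup scripts.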

After all aforementioned packages are set up, run the following lines to install the package endoanalysis:

git clone git@github.com:ispras/endometrium-dataset-analysis.git
cd endometrium-dataset-analysis
pip install .

To test the installation run:

pip list | grep endometrium

The result must contain the line starting with endometrium-dataset-analysis
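
The same check can be done from Python. The sketch below uses the standard importlib.metadata module (available since Python 3.8); the helper name installed_version is an assumption made for this example, and the distribution name is taken from the expected pip list output above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution name as it is expected to appear in `pip list`.
    name = "endometrium-dataset-analysis"
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```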

R installation

We use the R implementation of the dip test for unimodality in the notebook staining.ipynb. To install R, follow the R installation instructions for your platform. After R is installed, install the diptest package inside the R environment:

install.packages("diptest")

and then install rpy2 via pip (don't forget to return to bash first):

pip install rpy2
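
To confirm that the R bridge works, something like the following sketch may help. It calls the dip.test function of the R diptest package through rpy2. The helper names rpy2_available and dip_test_pvalue are assumptions made for this example, and staining.ipynb may call the test differently.

```python
import importlib.util


def rpy2_available():
    """Return True if the rpy2 package can be imported in this environment."""
    return importlib.util.find_spec("rpy2") is not None


def dip_test_pvalue(values):
    """Run R's diptest::dip.test on a sequence of floats and return the p-value.

    Requires R with the diptest package, plus rpy2, as installed above.
    """
    # Imported lazily so that this module loads even without rpy2.
    from rpy2.robjects.packages import importr
    from rpy2.robjects.vectors import FloatVector

    diptest = importr("diptest")
    # rpy2 exposes the R function `dip.test` under the name `dip_test`.
    result = diptest.dip_test(FloatVector(list(values)))
    # `dip.test` returns an "htest" list; "p.value" holds the p-value.
    return float(result.rx2("p.value")[0])


if __name__ == "__main__":
    print("rpy2 importable:", rpy2_available())
```

A small p-value from dip_test_pvalue suggests that the sample is not unimodal.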

Methods from the paper

Before following the instructions presented here, it's highly recommended to read the paper. To reproduce the analysis presented in the paper, follow these steps:

  1. Download the dataset from endonuke.ispras.ru and extract the archive. We assume that the dataset is extracted to the directory endometrium-dataset-analysis/data/dataset and the master yml files are extracted to endometrium-dataset-analysis/data/master_ymls. Then go to the project root directory, endometrium-dataset-analysis.

  2. Run the script to resize the dataset images and annotations (note that the resize is done in place):
    resize_dataset --master data/master_ymls/everything.yaml --size 256,256
    
  3. Run the script to generate the masks without any size filtering:
    generate_masks --master data/master_ymls/unique.yaml --workers 8 --window 100 --avg_area 20  --new_master_dir data/masks/masks_raw --compress
    

    These masks will be saved to endometrium-dataset-analysis/data/masks/masks_raw dir.

  4. Go through the mean_raduis.ipynb notebook to obtain the mean radius and area thresholds, or just use the following values: 18 for the small-outlier threshold, 667 for the large-outlier threshold and 163 for the average area.

  5. Run the script to generate filtered full masks (masks of fixed size):
    generate_masks --master data/master_ymls/unique.yaml --workers 8 --window 100 --avg_area 163 --min_area 18 --max_area 667  --new_master_dir data/masks/masks_full --compress
    

    These masks will be saved to endometrium-dataset-analysis/data/masks/masks_full dir.

  6. Run the script to generate "probes" masks (masks of fixed size):
    generate_masks --master data/master_ymls/unique.yaml --workers 8 --window 100 --avg_area 20 --min_area 1000000000  --new_master_dir data/masks/masks_probes --compress
    

    These masks will be saved to endometrium-dataset-analysis/data/masks/masks_probes dir.

  7. Run the scripts to calculate the DAB values for the probe and full mask methods:
    dab_values --master data/masks/masks_full/unique_with_masks.yml --bin_out data/dab_values/full.npy
    dab_values --master data/masks/masks_probes/unique_with_masks.yml --bin_out data/dab_values/probes.npy
    
  8. Go through the staining.ipynb notebook to perform the dip tests and the Kolmogorov-Smirnov test (note that R must be installed for this step so that the rpy2 package is operational).

  9. Finally, go through the agreement.ipynb notebook to reproduce the agreement study.
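
The command-line steps above (2, 3, 5, 6 and 7) can be collected into a small driver. The following is a sketch only: the function names mask_command, pipeline_commands and run_pipeline are assumptions made for this example, while the commands and flags are copied from the steps above. It defaults to a dry run that only prints the commands, since the resize step works in place and should not be repeated by accident.

```python
import shlex
import subprocess

MASTER = "data/master_ymls/unique.yaml"


def mask_command(out_dir, avg_area, min_area=None, max_area=None):
    """Build one generate_masks call with the flags used in the steps above."""
    cmd = ["generate_masks", "--master", MASTER, "--workers", "8",
           "--window", "100", "--avg_area", str(avg_area)]
    if min_area is not None:
        cmd += ["--min_area", str(min_area)]
    if max_area is not None:
        cmd += ["--max_area", str(max_area)]
    return cmd + ["--new_master_dir", out_dir, "--compress"]


def pipeline_commands(min_area=18, max_area=667, avg_area=163):
    """Return the commands of steps 2, 3, 5, 6 and 7 in order.

    The defaults are the threshold values quoted in step 4.
    """
    return [
        ["resize_dataset", "--master", "data/master_ymls/everything.yaml",
         "--size", "256,256"],
        mask_command("data/masks/masks_raw", 20),
        mask_command("data/masks/masks_full", avg_area, min_area, max_area),
        mask_command("data/masks/masks_probes", 20, min_area=1000000000),
        ["dab_values", "--master", "data/masks/masks_full/unique_with_masks.yml",
         "--bin_out", "data/dab_values/full.npy"],
        ["dab_values", "--master", "data/masks/masks_probes/unique_with_masks.yml",
         "--bin_out", "data/dab_values/probes.npy"],
    ]


def run_pipeline(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in pipeline_commands():
        print(shlex.join(cmd))
        if not dry_run:
            # Stop at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Call run_pipeline(dry_run=False) from the project root directory, inside the environment where the package is installed, to execute the steps. The notebooks in steps 4, 8 and 9 still have to be run separately.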
