RegRCNN

Copyright © German Cancer Research Center (DKFZ), Division of Medical Image Computing (MIC). Please make sure that your usage of this code is in compliance with the code license.

Release Notes

0.0.2 - Updated torch (>=1.3) and torchvision (>=0.4) dependencies; now based on Python 3.7.5.

Introduction

This repository holds the code framework used in the paper Reg R-CNN: Lesion Detection and Grading under Noisy Labels [1]. The framework is a fork of MIC's medicaldetectiontoolkit with added regression capabilities.

As the figure below shows, the regression capability allows the training signal to preserve ordinal relations among class labels, as opposed to a standard categorical classification loss such as cross entropy (see the publication for details).


Reg R-CNN is a version of Mask R-CNN [2], but with a regressor in place of the object-class head (see figure below). In this setup, the first stage makes foreground (fg) vs. background (bg) detections, and the regression head then determines the class on an ordinal scale. Consequently, prediction confidence scores are taken from the first stage, as opposed to the class head as in the original Mask R-CNN.


In a data set's configs file, you may set the attribute self.prediction_tasks = ["task"], where "task" is one of ["class", "regression_bin", "regression"]. "class" reproduces the behavior of the original framework, i.e., standard object detection. "regression", on the other hand, swaps the class head of Mask R-CNN [2] for a regression head: objects are identified as fg/bg, and the class is then decided by the regressor. For comparability, "regression_bin" produces similar behavior but with a classification head. Both methods should be evaluated with the (implemented) Average Viewpoint Precision instead of only Average Precision.
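For instance, a minimal excerpt of a data set's configs file might look as follows (the attribute name and allowed values are as described above; the surrounding class is illustrative):

    # datasets/your_dataset/configs.py (illustrative excerpt)
    class Configs:
        def __init__(self):
            # one of: ["class"], ["regression"], ["regression_bin"]
            self.prediction_tasks = ["regression"]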

Below you will find a description of the general framework operations and handling. Basic framework functionality and descriptions are for the most part identical to the original medicaldetectiontoolkit.


[1] Ramien, Gregor et al., "Reg R-CNN: Lesion Detection and Grading under Noisy Labels". In: UNSURE Workshop at MICCAI, 2019.
[2] He, Kaiming, et al., "Mask R-CNN". ICCV, 2017.

Overview

This is a comprehensive framework for object detection featuring:

  • 2D + 3D implementations of common object detectors, e.g., Mask R-CNN [2], Retina Net [3], Retina U-Net [4].
  • Modular and lightweight structure ensuring sharing of all processing steps (incl. backbone architecture) for comparability of models.
  • Training with bounding box and/or pixel-wise annotations.
  • Dynamic patching and tiling of 2D + 3D images (for training and inference).
  • Weighted consolidation of box predictions across patch-overlaps, ensembles, and dimensions [4], or standard non-maximum suppression.
  • Monitoring + evaluation simultaneously on object and patient level.
  • 2D + 3D output visualizations.
  • Integration of the COCO mean average precision metric [5].
  • Integration of MIC-DKFZ batch generators for extensive data augmentation [6].
  • Optional evaluation of instance and/or semantic segmentation via Dice scores.

[3] Lin, Tsung-Yi, et al. "Focal Loss for Dense Object Detection" TPAMI, 2018.
[4] Jaeger, Paul et al., "Retina U-Net: Embarrassingly Simple Exploitation of Segmentation Supervision for Medical Object Detection", 2018.

[5] https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/cocoeval.py
[6] https://github.com/MIC-DKFZ/batchgenerators

How to cite this code

Please cite the Reg R-CNN publication [1] or the original publication [4] depending on what features you use.

Installation

Setup package in virtual environment

git clone https://github.com/MIC-DKFZ/RegRCNN.git
cd RegRCNN
virtualenv -p python3.7 regrcnn_env
source regrcnn_env/bin/activate
python setup.py install
Custom Extensions

This framework uses two custom mixed C++/CUDA extensions: non-maximum suppression (NMS) and RoIAlign. Both are adapted from the original pytorch extensions (under torchvision.ops.boxes and ops.roialign). The extensions are automatically compiled from the provided source files under RegRCNN/custom_extensions by the above setup.py. However, the extensions need to be compiled for the specific GPU architectures you intend to use. Hence, please ensure that the architectures you need are included in your shell's environment variable TORCH_CUDA_ARCH_LIST before compilation.

Example: You want to use the modules with a TITAN RTX GPU, which has Compute Capability 7.5 (Turing architecture), but sometimes you also want to use a TITAN Xp (6.1, Pascal). Before installation you then need to export TORCH_CUDA_ARCH_LIST="6.1;7.5". A list mapping GPU model names to Compute Capabilities can be found here: https://developer.nvidia.com/cuda-gpus. Note: If you'd like to import the raw extensions (not the wrapper modules), be sure to import torch first.

Prepare the Data

This framework is meant for you to be able to train models on your own data sets.

In order to include a data set in the framework, create a new folder in RegRCNN/datasets, for instance "example_data". Your data set needs to have a config file in the style of the provided example data sets "lidc" and "toy". It also needs a data loader meeting the same requirements as the provided examples. Likely, you will also need a preprocessing script that transforms your data (once per data set creation, i.e., not a repetitive operation) into a suitable and easily processable format. Important requirements:

  • The framework expects numpy arrays as data and segmentation ground truth input.
  • Segmentations need to be suited for object detection, i.e., Regions of Interest (RoIs) need to be marked by integers (RoI-IDs) in the segmentation volume (0 is background). Corresponding properties of a RoI, e.g., the "class_targets", need to be provided in a separate array or list, with (RoI-ID - 1) giving the index of the property in the list (-1 due to zero-indexing). Example: a data volume contains two RoIs; the second RoI is marked in the segmentation by the number 2, and the "class_targets" list associated with the volume is [2, 3]. Hence, RoI-ID 2 is assigned class 3 (see the sketch after this list).
  • This framework uses a modified version of MIC's batchgenerators' segmentation-to-bounding-box conversion tool. In this version, "class_targets", i.e., object classes, start at 1; 0 is reserved for background. Thus, if you use "ConvertSegToBoundingBoxCoordinates", classes in your preprocessed data need to start at 1, not 0.
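The following self-contained sketch illustrates the expected input format (shapes and values are illustrative, not prescribed by the framework):

    import numpy as np

    # Data volume as numpy array (channels, y, x); shapes are illustrative.
    data = np.random.rand(1, 128, 128).astype(np.float32)

    # Segmentation: RoIs marked by integer RoI-IDs, 0 is background.
    seg = np.zeros((128, 128), dtype=np.uint8)
    seg[20:40, 20:40] = 1    # RoI-ID 1
    seg[60:90, 60:90] = 2    # RoI-ID 2

    # Per-RoI properties: index (RoI-ID - 1) holds the property of that RoI.
    # Class targets start at 1, since 0 is reserved for background.
    class_targets = [2, 3]   # RoI-ID 1 -> class 2, RoI-ID 2 -> class 3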

Two example data loaders are provided in RegRCNN/datasets. The intended pattern is a preprocessing script that saves the data, whatever its original format, as numpy arrays (this runs just once); during training and testing, the data loader then loads these numpy arrays dynamically. Please note that the data input side is meant to be customized to your own needs, and the provided data loaders are merely examples: LIDC has a powerful data loader that handles 2D/3D inputs and is optimized for patch-based training and inference, but due to the large data volumes of LIDC, this loader is slow. The provided toy data set, however, is lightweight and a good starting point to get familiar with the framework. It is fully creatable from scratch within a few minutes with RegRCNN/datasets/toy/generate_toys.py.
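A minimal version of this save-once, load-dynamically pattern could look as follows (the output path and file naming are assumptions for illustration; the provided data loaders define their own conventions):

    import os
    import numpy as np

    out_dir = "datasets/your_dataset/preprocessed"   # hypothetical path
    os.makedirs(out_dir, exist_ok=True)

    # Preprocessing (run once): save each patient's data as numpy arrays.
    img = np.random.rand(1, 128, 128).astype(np.float32)
    np.save(os.path.join(out_dir, "patient_0_img.npy"), img)

    # Training / testing: the data loader reads the arrays dynamically;
    # mmap_mode avoids loading whole volumes into memory at once.
    img = np.load(os.path.join(out_dir, "patient_0_img.npy"), mmap_mode="r")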

Execute

  1. Set I/O paths, model and training specifics in the configs file: RegRCNN/datasets/your_dataset/configs.py

  2. i) Train the model:

    python exec.py --mode train --dataset_name your_dataset --exp_dir path/to/experiment/directory       
    

    This copies snapshots of the configs and model to the specified exp_dir, where all outputs will be saved. By default, the data is split into 60% training, 20% validation, and 20% testing data to perform a 5-fold cross-validation (this can be changed to a hold-out test set in the configs), and all folds are trained iteratively. To train a single fold, specify it using the folds argument:

    python exec.py --folds 0 1 2 .... # specify any combination of folds [0-configs.n_cv_splits]
    

    ii) Alternatively, train and test consecutively:

    python exec.py --mode train_test --dataset_name your_dataset --exp_dir path/to/experiment/directory       
    
  3. Run inference:

    python exec.py --mode test --exp_dir path/to/experiment/directory 
    

    This runs the prediction pipeline and saves all results to exp_dir.

  4. Additional settings:

    • Check the args parser in exec.py to see which arguments and modes are available.
    • E.g., you may pass -d or --dev to enable a short development run of the whole train_test procedure (small batch size, only one epoch, two folds, one test patient, etc.).

Models

This framework features the models explored in [4] (implemented in 2D + 3D): the proposed Retina U-Net, a simple but effective architecture fusing state-of-the-art semantic segmentation with object detection,


as well as implementations of prevalent object detectors, such as Mask R-CNN, Faster R-CNN+ (Faster R-CNN with RoIAlign), Retina Net, and Detection U-Net (a U-Net-like segmentation architecture with heuristics for object detection).



Training annotations

This framework supports training with pixel-wise and/or bounding-box annotations. To overcome the issue of transforming box coordinates under data augmentation, we feed the annotation masks through the augmentation pipeline (creating a pseudo-mask if only bounding-box annotations are provided) and derive the boxes afterwards.


The framework further handles two types of pixel-wise annotations:

  1. A label map with individual RoIs identified by increasing label values, accompanied by a vector containing, at each position, the class target for the lesion with the corresponding label (for this mode, set get_rois_from_seg_flag = False when calling ConvertSegToBoundingBoxCoordinates in your data loader). This is the usual use case, as explained in the section "Prepare the Data".
  2. A binary label map. There is only one foreground class, and single lesions are not identified. All lesions have the same class target (foreground). In this case, the data loader runs a connected-component labelling algorithm on the fly to create processable lesion/class-target pairs (for this mode, set get_rois_from_seg_flag = True when calling ConvertSegToBoundingBoxCoordinates in your data loader; see the sketch after this list).
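As a sketch of how this flag might be passed (the transform name and the get_rois_from_seg_flag argument are named above; the import path and call signature follow the original batchgenerators/medicaldetectiontoolkit version and may differ slightly in this framework's modified copy):

    # Illustrative; this framework ships a modified version of the transform.
    from batchgenerators.transforms.utility_transforms import ConvertSegToBoundingBoxCoordinates

    # Case 1: label map with individual RoI-IDs and per-RoI class targets.
    transform = ConvertSegToBoundingBoxCoordinates(dim=3, get_rois_from_seg_flag=False)

    # Case 2: binary label map; lesions are found via connected components.
    transform = ConvertSegToBoundingBoxCoordinates(dim=3, get_rois_from_seg_flag=True)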

Prediction pipeline

This framework provides an inference module which automatically handles patching of inputs as well as tiling, ensembling, and weighted consolidation of output predictions:




Consolidation of predictions

Weighted Box Clustering

Multiple predictions of the same image (from test-time augmentations, tested epochs, and overlapping patches) result in a large number of boxes (or cubes), which need to be consolidated. In semantic segmentation, the final output would typically be obtained by averaging every pixel over all predictions. As described in [4], weighted box clustering (WBC) does the analogous thing for box predictions:





To enable WBC, set self.clustering = "wbc" in your configs file.
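As a greatly simplified, self-contained illustration of the underlying idea (not this repository's implementation, which lives in predictor.py and additionally weights predictions by factors such as overlap and patch location, see [4]), consider a confidence-weighted average over a cluster of overlapping boxes:

    import numpy as np

    def fuse_box_cluster(boxes, scores):
        """Confidence-weighted average of boxes assigned to one cluster.

        boxes:  (n, 4) array of [y1, x1, y2, x2] coordinates.
        scores: (n,) confidence scores used as weights.
        """
        boxes, scores = np.asarray(boxes, float), np.asarray(scores, float)
        fused_box = np.average(boxes, axis=0, weights=scores)
        return fused_box, scores.mean()

    box, score = fuse_box_cluster([[10, 10, 50, 50], [12, 8, 52, 48]], [0.9, 0.6])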

Non-Maximum Suppression

Test-time predictions can alternatively be aggregated with standard non-maximum suppression. In your configs file, simply set self.clustering = "nms" instead of "wbc".

As a further alternative you may also choose no test-time aggregation by setting self.clustering = None.
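In summary, the aggregation mode is a single attribute in your configs file:

    # in datasets/your_dataset/configs.py
    self.clustering = "wbc"   # weighted box clustering
    # self.clustering = "nms" # standard non-maximum suppression
    # self.clustering = None  # no test-time aggregation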

Visualization / Monitoring

In contrast to the original framework, this fork uses tensorboard, via the third-party package tensorboardX, for monitoring training and validation progress.

You can set an applicable choice of implemented metrics, like "ap" for Average Precision or "auc" for patient-level ROC-AUC, in the configs under self.metrics = [...]. Metrics are then evaluated by evaluator.py and recorded in monitor_metrics. logger.metrics2tboard sends monitor_metrics to your tensorboard logfiles at the end of each epoch. You need to separately start a tensorboard server, pass it your experiment directory (or directories, though it tends to crash beyond roughly five experiments), and navigate to the server address. (You can also read up on tensorboard usage in the original tensorboard documentation.)
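For example (the metric names "ap" and "auc" are as described above; the exact combination is illustrative):

    # in datasets/your_dataset/configs.py
    self.metrics = ["ap", "auc"]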

Example:

  1. Activate your virtualenv where tensorboard is installed.
  2. Start the tensorboard server. For instance, if your experiment directory is yourexpdir:
    tensorboard --port 6007 --logdir yourexpdir
  3. Navigate to localhost:6007 in your browser.

Output monitoring

For qualitative monitoring, example plots are saved to yourexpdir/plots for training and validation and to yourexpdir/test/example_plots for testing. Note that test-time example plots may contain unconsolidated predictions over test-time augmentations, and may therefore show many overlapping and/or noisy predictions. You may adapt or use the separate file RegRCNN/inference_analysis.py to create clean plots of (consolidated) test-time predictions.

Balancing Mechanism of Example Data Loader

The data loaders of the provided example data sets employ a custom mechanism aimed at assembling target-balanced batches or training sequences, i.e., the number of examples shown per target class should be close to balanced.

The mechanism creates a sampling-likelihood distribution over all available patients (PIDs), as shown below. At batch generation, some patients are drawn according to this distribution, while others are drawn uniformly at random across all patients. The ratio of uniformly to target-dependently drawn patients is set in your configs file by configs.batch_random_ratio; configs.balance_target determines which targets are considered for the balancing distribution.

The balancing distribution assigns probabilities such that the expected occurrences of fg and bg RoIs among all classes are as similar as possible. The achievable balance is naturally limited by multiple RoIs occurring in the same patient (e.g., if each patient has 4 RoIs of class 1 and 1 RoI of class 2, the best achievable ratio is still 4:1). See utils/dataloader_utils.BatchGenerator.balance_target_distribution.

Experience has shown that showing at least one foreground example in each batch is the most critical factor; other properties have less impact.
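As a hedged, self-contained sketch of such a balancing distribution (a strong simplification of utils/dataloader_utils.BatchGenerator.balance_target_distribution, which additionally accounts for fg/bg balance and multiple RoIs per patient):

    import numpy as np

    def balance_distribution(patient_classes):
        """Per-patient sampling likelihoods, inversely proportional to
        how frequent each patient's target class is in the data set."""
        classes, counts = np.unique(patient_classes, return_counts=True)
        class_weight = {c: 1.0 / n for c, n in zip(classes, counts)}
        p = np.array([class_weight[c] for c in patient_classes])
        return p / p.sum()

    # Class 2 is rare, so its single patient gets a higher likelihood.
    print(balance_distribution([1, 1, 1, 2]))  # -> [0.167 0.167 0.167 0.5]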



Unittests

unittests.py contains some verification and testing procedures which, however, require you to adjust paths in the TestCase classes before execution. The tests can be used, for instance, to verify that your cross-validation folds have been created correctly, or that separate experiments use the same fold splits.

License

This framework is published under the Apache License 2.0.


Issues

LIDC experiment preprocessing: target shape error

I tried to run preprocessing.py for the LIDC experiment and got the following error:

...
processing 0335a with GT(s) single_annotator and merged, spacing (0.71484375, 0.71484375, 1.0) and img shape (300, 512, 512).
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "preprocessing.py", line 63, in resample_array
    assert target_shape[i] > 0
AssertionError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "preprocessing.py", line 394, in pp_patient
    img_arr = resample_array(img_arr, img.GetSpacing(), self.cf.target_spacing)
  File "preprocessing.py", line 65, in resample_array
    raise AssertionError("AssertionError:", src_imgs.shape, src_spacing, target_spacing)
AssertionError: ('AssertionError:', (1, 512, 512), array([ 2., 30.,  1.]), (0.7, 0.7, 1.25))
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "preprocessing.py", line 470, in <module>
    pp.iterate_patients(processes=8)
  File "preprocessing.py", line 412, in iterate_patients
    pool.map(self.pp_patient, self.paths, chunksize=1)
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
AssertionError: ('AssertionError:', (1, 512, 512), array([ 2., 30.,  1.]), (0.7, 0.7, 1.25))

I obtained the preprocessed data with https://github.com/MIC-DKFZ/LIDC-IDRI-processing/tree/v1.0.1.

What am I doing wrong?

Are there any requirements for gcc version?

I compiled this project with gcc-4.8, but errors occur in nms-extension==0.0.0.

/media/userdisk3/gjzhao/M20/RegRCNN-master/regrcnn_env/lib/python3.7/site-packages/torch/include/ATen/core/dispatch/Dispatcher.h: In member function ‘Return c10::Dispatcher::doCallUnboxed(const c10::DispatchTable&, const c10::LeftRight<ska::flat_hash_map<c10::TensorTypeId, c10::KernelFunction> >&, Args ...) const [with Return = bool; Args = {}]’:
/media/userdisk3/gjzhao/M20/RegRCNN-master/regrcnn_env/lib/python3.7/site-packages/torch/include/ATen/core/dispatch/Dispatcher.h:191:1: warning: control reaches end of non-void function [-Wreturn-type]
}
^
error: command 'gcc' failed with exit status 1
----------------------------------------
ERROR: Command errored out with exit status 1: /media/userdisk3/gjzhao/M20/RegRCNN-master/regrcnn_env/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-we5qua6l/setup.py'"'"'; file='"'"'/tmp/pip-req-build-we5qua6l/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-bdv9b78x/install-record.txt --single-version-externally-managed --compile --install-headers /media/userdisk3/gjzhao/M20/RegRCNN-master/regrcnn_env/include/site/python3.7/RoIAlign-extension-3D Check the logs for full command output.
Could not install custom extension custom_extensions/roi_align/3D from source due to error:
Command '['/media/userdisk3/gjzhao/M20/RegRCNN-master/regrcnn_env/bin/python', '-m', 'pip', 'install', 'custom_extensions/roi_align/3D']' returned non-zero exit status 1.
Trying to install from pre-compiled wheel.
FAILED to install custom extension custom_extensions/roi_align/3D due to Error:

[Errno 2] No such file or directory: 'custom_extensions/roi_align/3D/dist'
removed directory: ‘./build/bdist.linux-x86_64’
removed directory: ‘./build’
removed ‘./dist/RegRCNN-0.0.2-py3.6.egg’
removed ‘./dist/RegRCNN-0.0.2-py3.7.egg’
removed directory: ‘./dist’
removed ‘./RegRCNN.egg-info/SOURCES.txt’
removed ‘./RegRCNN.egg-info/PKG-INFO’
removed ‘./RegRCNN.egg-info/top_level.txt’
removed ‘./RegRCNN.egg-info/requires.txt’
removed ‘./RegRCNN.egg-info/dependency_links.txt’
removed directory: ‘./RegRCNN.egg-info’

AssertionError: deltas have nans

I have run the code with my own data, similar to the toy example, and I get the following error. Does anyone have an idea what could cause this?

Traceback (most recent call last):
  File "exec.py", line 270, in <module>
    test(cf, logger)
  File "exec.py", line 198, in test
    cf, "eval_test_separately") or not cf.eval_test_separately)
  File "/Documents/RegRCNN/predictor.py", line 892, in predict_test_set
    results_dict = self.predict_patient(batch) #only holds "boxes", "seg_preds"
  File "/Documents/RegRCNN/predictor.py", line 816, in predict_patient
    results_dict = self.data_aug_forward(batch)
  File "/Documents/RegRCNN/predictor.py", line 628, in data_aug_forward
    results_list = [self.spatial_tiling_forward(batch, patch_crops)]
  File "/Documents/RegRCNN/predictor.py", line 550, in spatial_tiling_forward
    patches_dict = self.batch_tiling_forward(batch)
  File "/Documents/RegRCNN/predictor.py", line 509, in batch_tiling_forward
    chunk_dicts += [self.net.test_forward(b, return_masks=self.cf.return_masks_in_test)]
  File "/Documents/RegRCNN/models/mrcnn.py", line 749, in test_forward
    _, _, _, detections, detection_masks = self.forward(img)
  File "/Documents/RegRCNN/models/mrcnn.py", line 427, in forward
    proposal_count, self.anchors, self.cf)
  File "/Documents/RegRCNN/utils/model_utils.py", line 402, in refine_proposals
    assert torch.all(non_nans), "deltas have nans: {}".format(deltas[~non_nans])
AssertionError: deltas have nans: tensor([nan, nan, nan, ..., nan, nan, nan], device='cuda:0')

Example data loader format

To train on RGB images I created a custom dataloader, though it seems not to be possible to use 3 input channels in parallel. Slight modifications of the dataloader and config files produced errors revealing deeper dependencies on a single input channel.
Is there a regular way to use RGB input, or did I miss something?
