rmsandu / segmentation-eval

Extract and evaluate radiomics for liver cancer tumors from DICOM segmentation masks. Using SimpleITK, PyRadiomics and PyDicom.

License: Other

Jupyter Notebook 29.92% Python 69.97% Shell 0.11%
dicom segmentation-eval pydicom simpleitk radiomics-feature-extraction tumor pyradiomics ablation segmentation liver-cancer

segmentation-eval's Introduction

Radiomics Extraction and Evaluation from Binary 3D (DICOM) Segmentation Masks

The segmentations used in this project come in tumor-ablation pairs for liver cancer tumors. The ablation segmentation can be replaced with another type of segmentation, such as a tumor segmentation predicted by a separate algorithm.

The evaluation metrics include:

Requirements

The following non-standard libraries are required to use the full functionality of the project for DICOM image reading and processing.

Main Functions and Operations Performed

  1. A_read_files_info.py (optional)
    • DicomReader.py
  2. B_ResampleSegmentations.py
  3. C_mainDistanceVolumeMetrics.py
    • VolumeMetrics.py
    • DistanceMetrics.py
    • scripts/ellipsoid_inner_outer.py
    • scripts/plot_ablation_margin_hist.py
  4. D_compile_population_radiomics.py (optional)
    • if batch processing (for multiple segmentations) has been enabled, this script compiles all the features from the output CSV files into a single file.
  5. E_radiomics_stats.py (optional)
    • various plots and statistics for the features that have been previously extracted

Usage

The scripts are called (internally) in alphabetical order using the following logic:

Read Images --> Resample --> Extract Distance Metrics  --> Extract Volume Metrics  --> Plot Distance Metrics --> Output Metrics to Xlsx file in Tabular format

PyRadiomics automatically checks whether the source CT image and the derived mask have the same dimensions. If not, resampling is performed in the background. This function only needs the path to the patient folder, which can hold the source CT images, the segmentation masks, and other files all together. It builds a dictionary of paths from the metadata encoded in the ReferencedImageSequenceTag and SourceImageSequence. I previously encoded this mapping information (in a pipeline that also anonymizes the data) in the DICOM-Anonymization-Segmentation-Mapping project.

Input

import argparse

# command-line interface: single patient folder (-i) or batch file (-b), plus output folder (-o)
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--rootdir", required=False, help="path to the patient folder to be processed")
ap.add_argument("-o", "--plots_dir", required=True, help="path to the output images")
ap.add_argument("-b", "--input_batch_proc", required=False, help="input csv file for batch processing")
args = vars(ap.parse_args())  # dictionary keyed by the long option names

The main entry point of the program is A_read_files_info.py. It can process a single patient image folder when called like this:

python A_read_files_info.py -i "C:\Users\MyUser\MyPatientFolderwithDicomAndSegmentationImages" -o "C:\OutputFilesandImages"
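
A minimal sketch of how the two modes could be told apart once the arguments are parsed. The helper names `build_parser` and `select_mode` are hypothetical, and giving the batch file precedence over the single folder is an assumption, not something the project documents.

```python
import argparse


def build_parser():
    """Recreate the three options shown above."""
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--rootdir", required=False,
                    help="path to the patient folder to be processed")
    ap.add_argument("-o", "--plots_dir", required=True,
                    help="path to the output images")
    ap.add_argument("-b", "--input_batch_proc", required=False,
                    help="input csv file for batch processing")
    return ap


def select_mode(args):
    """Return 'batch' or 'single' (assumption: the batch file wins if both are given)."""
    if args["input_batch_proc"] is not None:
        return "batch"
    if args["rootdir"] is not None:
        return "single"
    raise ValueError("provide either --rootdir or --input_batch_proc")


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = vars(build_parser().parse_args(["-i", "patient_folder", "-o", "plots"]))
print(select_mode(args))  # prints "single"
```

Note that argparse matches unambiguous prefixes of long options, so with these three options `--i` would be read as `--input_batch_proc`, not as `--rootdir`. The short forms `-i`, `-o` and `-b` avoid that ambiguity.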

For the batch processing option, the input is an Excel (.xlsx) file with the following headers:

Patient_ID Ablation_IR_Date Nr_Lesions Patient_Dir_Paths
C001 20160103 2 ['D:\Users\User1\Pats\Pat_C001']
C002 20181108 1 ['D:\Users\User1\Pats\Pat_C002']

The algorithm starts by iterating through all the patient folders provided in the column "Patient_Dir_Paths". Of course, it is not strictly necessary to structure the data the way I did. Moreover, A_read_files_info.py can be skipped altogether when you already know the mapping Source (original) CT image -> Segmentation1 -> Segmentation2, that is, which CT source image each segmentation was derived from. In my case Segmentation1 is called "tumor_segmentation" and Segmentation2 is called "ablation_segmentation"; they are 2 separate structures/tissues within my organ of interest (the liver). With the mapping known, you can move on to the next steps, which are B_ResampleSegmentations.py and C_mainDistanceVolumeMetrics.py.
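
The iteration over "Patient_Dir_Paths" can be sketched as follows. In the example table each cell looks like a Python list written out as text, so the sketch pulls the quoted paths out with a regular expression. The helper `parse_dir_paths` is hypothetical and the rows are typed in by hand rather than read from the Excel file. `ast.literal_eval` is avoided on purpose, because it rejects the unescaped `\U` in a path such as `D:\Users`.

```python
import re


def parse_dir_paths(cell):
    """Return the folder paths stored as text in one 'Patient_Dir_Paths' cell."""
    # Grab everything between matching single or double quotes.
    return re.findall(r"""['"]([^'"]+)['"]""", str(cell))


# Rows as they appear in the example table above (raw strings keep the backslashes).
rows = [
    {"Patient_ID": "C001", "Patient_Dir_Paths": r"['D:\Users\User1\Pats\Pat_C001']"},
    {"Patient_ID": "C002", "Patient_Dir_Paths": r"['D:\Users\User1\Pats\Pat_C002']"},
]

for row in rows:
    for patient_dir in parse_dir_paths(row["Patient_Dir_Paths"]):
        # In the real pipeline each folder would be handed to the file reader here.
        print(row["Patient_ID"], patient_dir)
```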

Resampling (Resizing)

In my case resampling was necessary because my 2 segmentation masks not only had different sizes [x, y, z], but also different spacings between the slices. If your images differ only in size, you can use the SimpleITK Paste Image function. However, if the spacing differs, resampling is necessary; otherwise the 2 images cannot be compared using the Distance and Volume metrics.

The file that does the resampling and resizing is B_ResampleSegmentations.py. First the images need to be read and loaded using the SimpleITK Python library. My segmentations and CT images were each saved as multiple DICOM slices (images) in a DICOM folder. The script DicomReader.py reads all the DICOM slices of an image/segmentation and returns a single SimpleITK image object. The segmentation masks are resampled to the same dimensions and spacing (which remains anisotropic) before DistanceMetrics.py and VolumeMetrics.py are called.

This resampling script can be called like this:
import SimpleITK as sitk
import DicomReader as Reader
from B_ResampleSegmentations import ResizeSegmentation  # assumption: the class is defined in this script

# tumor_path and ablation_path are the DICOM folders of the two segmentations
tumor_segmentation_sitk, tumor_sitk_reader = Reader.read_dcm_series(tumor_path, True)
ablation_segmentation_sitk, ablation_sitk_reader = Reader.read_dcm_series(ablation_path, True)
resizer = ResizeSegmentation(ablation_segmentation_sitk, tumor_segmentation_sitk)  # object instantiation
tumor_segmentation_resampled = resizer.resample_segmentation()  # SimpleITK image object

The actual resampling operation is performed using the SimpleITK ResampleImageFilter and NearestNeighbour Interpolation such that no new segmentation mask labels are generated.

Segmentation Evaluation Metrics

The segmentation Evaluation Metrics are called from the script C_mainDistanceVolumeMetrics.py which calls:

  • DistanceMetrics.py
  • VolumeMetrics.py
  • RadiomicsMetrics - features extracted using the Python library PyRadiomics. A list of all the features that PyRadiomics can compute is available in its documentation. I used only the volumes, sphericity, intensity and axis values.

As with resampling, these scripts take SimpleITK image objects as input arguments. They can be called, for example, like this:

surface_distance_metrics = DistanceMetrics(ablation_segmentation, tumor_segmentation_resampled)
ablation_radiomics_metrics = RadiomicsMetrics(source_ct_ablation, ablation_segmentation)
evaloverlap = VolumeMetrics()
evaloverlap.set_image_object(ablation_segmentation, tumor_segmentation_resampled)
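
To illustrate the kind of overlap measure a volume metric reports, here is a small NumPy sketch of the Dice and Jaccard coefficients for two binary masks on the same grid. It is not the code in VolumeMetrics.py; the function name `overlap_metrics` and the toy masks are assumptions for illustration.

```python
import numpy as np


def overlap_metrics(mask_a, mask_b):
    """Dice and Jaccard coefficients of two binary masks with identical shape."""
    a = np.asarray(mask_a).astype(bool)
    b = np.asarray(mask_b).astype(bool)
    intersection = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    total = a.sum() + b.sum()
    # Both measures are undefined when the masks are empty.
    dice = 2.0 * intersection / total if total else float("nan")
    jaccard = intersection / union if union else float("nan")
    return {"dice": float(dice), "jaccard": float(jaccard)}


# Toy example: an 8-voxel tumor fully covered by a 64-voxel ablation.
ablation = np.zeros((6, 6, 6), dtype=np.uint8)
ablation[1:5, 1:5, 1:5] = 1
tumor = np.zeros((6, 6, 6), dtype=np.uint8)
tumor[2:4, 2:4, 2:4] = 1

metrics = overlap_metrics(ablation, tumor)
print(metrics)  # dice = 16/72, jaccard = 8/64
```

With SimpleITK images, `sitk.GetArrayFromImage` gives the arrays this function expects, provided both masks were resampled to the same grid first.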

Inner and Outer Ellipsoidal Approximations

[image: inner and outer ellipsoidal approximations of a segmented object]

The inner (green) and outer (orange) ellipsoidal approximations of a segmented object (blue) are calculated using convex optimization according to "S. P. Boyd and L. Vandenberghe, Convex optimization. Cambridge, UK ; New York: Cambridge University Press, 2004." Their implementation in CVXPY was employed to compute the ellipsoids. The outer volume was computed using SVD and the inner volume was computed as the sqrt(det(B)) * Ball(0,1).
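
The two volume formulas can be sketched in NumPy under an explicit assumption about the parametrization, which the text above does not spell out: the ellipsoid is taken as {x : (x - c)^T B^{-1} (x - c) <= 1} with B symmetric positive definite, so that A = B^{-1} is its shape matrix. Under that assumption the volume is sqrt(det(B)) times the volume of the unit ball, and the SVD of A gives the same number through the semi-axis lengths. The function names are hypothetical.

```python
import numpy as np

UNIT_BALL_VOLUME = 4.0 / 3.0 * np.pi  # volume of Ball(0, 1) in 3D


def volume_from_svd(shape_matrix):
    """Volume of {x : (x - c)^T A (x - c) <= 1} from the singular values of A."""
    singular_values = np.linalg.svd(np.asarray(shape_matrix, dtype=float), compute_uv=False)
    radii = 1.0 / np.sqrt(singular_values)  # semi-axis lengths
    return UNIT_BALL_VOLUME * np.prod(radii), radii


def volume_from_det(b_matrix):
    """Volume of {x : (x - c)^T B^{-1} (x - c) <= 1} as sqrt(det(B)) * vol(Ball(0, 1))."""
    return np.sqrt(np.linalg.det(np.asarray(b_matrix, dtype=float))) * UNIT_BALL_VOLUME


# Axis-aligned ellipsoid with semi-axes 2, 3 and 4 (arbitrary units).
B = np.diag([4.0, 9.0, 16.0])
A = np.linalg.inv(B)

volume_svd, radii = volume_from_svd(A)
volume_det = volume_from_det(B)
print(radii, volume_svd, volume_det)
```

If the optimization returns the ellipsoid in another form, for example {B u + d : ||u|| <= 1}, the determinant enters differently, so the parametrization should be checked before reusing either formula.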

Output

The output is an Excel (.xlsx) file in tabular format that contains the radiomics (feature values) per patient and per lesion. Additionally, a histogram of the Euclidean distances between the tumor and the ablation (my 2 segmentation files) is generated. The script that plots the histogram, scripts/plot_ablation_margin_hist.py, uses the surface Euclidean distances extracted with SimpleITK using the algorithm of Maurer et al. The distances for the histogram are computed in the following steps:

  1. compute the contour surface of the object (face+edge+vertex connectivity: fullyConnected=True; face connectivity only: fullyConnected=False, the default mode)
  2. convert from SimpleITK format to a NumPy array
  3. remove the zeros from the contour of the object, NOT from the distance map
  4. count the pixels with value 1 in the contour
  5. instantiate the Signed Maurer distance map for the object (it also contains negative numbers)
  6. multiply the binary surface segmentations with the distance maps. The resulting distance maps contain non-zero values only on the surface (they can also contain zero on the surface)

histogram_example

Inner and Outer Ellipsoids Fitted around a segmentation (orange points).

inner outer ellipsoid 2

Patient Data Structure for my case

The patient data is organized in the following folder structure:

`Patient_C001  
        |Series_1  
          |CAS-Recordings (* .xml files)   
          | 2016-09-13_07-15-32 ...
                                          |Segmentations
                                            |Segmentation_No3
                                                        |0001.dcm
                                                        |0002.dcm
                                                        |003.dcm
          |CT.1.2.392...dcm
          |CT.1.2.393...dcm
          |CT.1.2.3..dcm  
        |Series_2
        |Series_3`  
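
For a layout like the one above, the segmentation folders can be located with a short directory walk. This sketch only assumes that every segmentation lives in its own subfolder of a folder named "Segmentations"; the function name `find_segmentation_series` is hypothetical and the folder tree is created in a temporary directory so the example is self-contained.

```python
import tempfile
from pathlib import Path


def find_segmentation_series(patient_dir):
    """Map each subfolder of a 'Segmentations' folder to its sorted .dcm files."""
    series = {}
    for seg_root in sorted(Path(patient_dir).rglob("Segmentations")):
        if not seg_root.is_dir():
            continue
        for seg_dir in sorted(p for p in seg_root.iterdir() if p.is_dir()):
            slices = sorted(seg_dir.glob("*.dcm"))
            if slices:
                series[seg_dir] = slices
    return series


with tempfile.TemporaryDirectory() as tmp:
    # Build a miniature copy of the folder structure with empty placeholder files.
    seg_dir = Path(tmp, "Patient_C001", "Series_1", "2016-09-13_07-15-32",
                   "Segmentations", "Segmentation_No3")
    seg_dir.mkdir(parents=True)
    for name in ("0001.dcm", "0002.dcm"):
        (seg_dir / name).touch()

    found = find_segmentation_series(Path(tmp, "Patient_C001"))
    summary = {folder.name: [f.name for f in files] for folder, files in found.items()}
    print(summary)  # {'Segmentation_No3': ['0001.dcm', '0002.dcm']}
```

Sorting by file name is only a convenience here. The slice order of a real series should come from the DICOM metadata, which a series reader such as the one in DicomReader.py takes care of.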

segmentation-eval's People

Contributors

dependabot[bot], ipa, rmsandu

segmentation-eval's Issues

Correlations scatter plot

  • correlation matrix between all features
  • include energy
  • LTP extraction bar plots percentage of surface margin distance scatter correlation plots
  • add it all into one PPT
  • scatter plot for ablation axis extracted in all directions (coronal, sagittal, axial)

Tumor Volume Coverage Ratio

Doesn't take into account that the optimal coverage is at a distance of [5-10mm] . Considers perfect score 1 if all the tumor is covered and residual tumor volume is 0.

Read Files from Disk

Options for computing the patients segmentations:

  • batch processing
  • single patient processing

Deep Lesions vs Subcapsular markers marking

  • EAV vs. PAV
  • Dice Score (or any other overlap measure) vs. the Lateral Target Error
  • Effective largest axis ablation vs Predicted largest axis
  • Effective largest axis ablation vs. Energy [kJ]

Predict Axes with Random Forest Model:

  • predict major axis ablation
  • predict minor axis ablation
  • 2 separate RF models
  • train only on non-subcapsular
  • train on all lesions
  • [ ] RF per device
  • [ ] add lateral error.
  • play with the number of trees, etc.
  • test on the manufacturer's brochure
  • perform interpolation

Add ArgParse commands

  • add argparse arguments for the input to avoid modifying the script every single time

DICOM Anonymization

  • Anonymize source data (CT from which segmentations were derived) as well
  • Anonymize XML and other files

Apply Random Forest

  • predict Ablation Volume
  • predict Ablation Volume using the formula
  • extract the OOB error

Cross-match with RedCAP lesion info

  • use the CSV with the paths downloaded from RedCap
  • this way we filter out which patients we want to analyze. Remember we want to focus solely on 50 lesions deemed analyzable in the beginning.
  • combine the RedCap file that has patient and lesion information with path references from the metatags

Incorrect GT segmentation origin

  • the mask is not placed at the correct location
  • direction/ origin might be wrong
  • paste before re-sample and resize?
  • identify where the issue appears : in DicomWriter or in Resize_Resample

Extract centroids from the segmentations

  • extract the centroid coordinates of the tumor
  • extract the centroid coordinates of the ablation
  • bug: Setting Sequence to an array. still haven't figured out how to add the tuple (x, y, z) to an array value

Extract TPEs :

  • extract TPEs from maverric (update xml-recording extraction algorithm)
  • validated needles
  • import function LIT

Sir/Madam, what should I do if I have three labels?

For example, I have the result of a brain tumor segmentation. To distinguish different areas of the tumor, we have 3 labels.

In this case, what should I do to use your code to evaluate the result?

Thanks!

Ellipsoid creation

  • create ellipsoid using the radii from the manufacturer ablation devices
  • use the needle trajectory to find the angle around which to orientate the ablation

The ablation ellipsoid needs to be placed with respect to the ellipsoid.

Naming, saving, storing conventions incoming data

  • All Filepaths into CSV from GT 2017-2018?
  • One CSV file per patient saved into the patients folder?
  • The CSV should be generated automatically from CAS ONE IR Segmentation Software
  • What additional information should be saved? Pathology type? Trajectory?
  • Problem: sometimes multiple trajectories have been used for the same lesion

Change labels histogram surface distance

  • different coloring scheme (0-1 mm yellow, light green 1-5 mm, 5-10 mm green, deep green 10-15 mm)
  • change the division to: optimal x > 5 mm, sufficient 0 < x < 5 mm, and not covered x < 0 mm
  • send condition for the margin ranges as input to the function

Create Predicted DICOM Ellipsoid

  • Extract power, time, type of needle from MWA database
  • MWA database useless, extract from REDCAP
  • compute coordinates of the simulated ablation ellipsoid
  • extract origin of the ablation zone
  • add the needle offset
  • re-create DICOM mask of the predicted ablation
  • compare the simulated ablation with the resulted ablation segmentation

Compute Metric for 177 Lesions

  • how many lesions actually treated percutaneously (175)
  • how many available complete datasets
  • how many treated with MWA
  • how many treated with RFA
  • how many re-ablated
  • how many had multiple needles in/parallel ablations
  • how many subcapsular
  • how many vicinity vessels
