rapidsai / cucim

cuCIM - RAPIDS GPU-accelerated image processing library

Home Page: https://docs.rapids.ai/api/cucim/stable/

License: Apache License 2.0

CMake 0.97% C 0.22% C++ 4.29% Python 15.02% Shell 0.27% Jupyter Notebook 79.11% Dockerfile 0.01% Cuda 0.12%
image-processing computer-vision medical-imaging microscopy digital-pathology cuda gpu nvidia image-analysis image-data segmentation multidimensional-image-processing

cucim's Introduction

 cuCIM

RAPIDS cuCIM is an open-source, accelerated computer vision and image processing software library for multidimensional images used in biomedical, geospatial, material and life science, and remote sensing use cases.

cuCIM offers:

  • Enhanced image processing capabilities for large and n-dimensional Tagged Image File Format (TIFF) files
  • Accelerated performance through Graphics Processing Unit (GPU)-based image processing and computer vision primitives
  • A straightforward Pythonic interface with a matching Application Programming Interface (API) to OpenSlide

cuCIM supports the following formats:

  • Aperio ScanScope Virtual Slide (SVS)
  • Philips TIFF
  • Generic Tiled, Multi-resolution RGB TIFF files with the following compression schemes:
    • No Compression
    • JPEG
    • JPEG2000
    • Lempel-Ziv-Welch (LZW)
    • Deflate

NOTE: For the latest stable README.md ensure you are on the main branch.

Developer Page

Blogs

Webinars

Documentation

Release notes are available on our wiki page.

Install cuCIM

Conda

Stable release (from the rapidsai channel):

conda create -n cucim -c rapidsai -c conda-forge cucim cuda-version=`<CUDA version>`

Nightly builds (from the rapidsai-nightly channel):

conda create -n cucim -c rapidsai-nightly -c conda-forge cucim cuda-version=`<CUDA version>`

<CUDA version> should be 11.2+ (e.g., 11.2, 12.0, etc.)

PyPI

Install for CUDA 12:

pip install cucim-cu12

Alternatively install for CUDA 11:

pip install cucim-cu11

Notebooks

Please check out our Welcome notebook (NBViewer)

Downloading sample images

To download the images used in the notebooks, execute the following commands from the repository root to copy the sample input images into the notebooks/input folder:

(You will need Docker installed on your system.)

./run download_testdata

or

mkdir -p notebooks/input
tmp_id=$(docker create gigony/svs-testdata:little-big)
docker cp $tmp_id:/input notebooks
docker rm -v ${tmp_id}

Build/Install from Source

See build instructions.

Contributing Guide

Contributions to cuCIM are more than welcome! Please review the CONTRIBUTING.md file for information on how to contribute code and issues to the project.

Acknowledgments

Without awesome third-party open source software, this project wouldn't exist.

Please see LICENSE-3rdparty.md for the third-party open source software used in this project.

License

Apache-2.0 License (see LICENSE file).

Copyright (c) 2020-2022, NVIDIA CORPORATION.

cucim's People

Contributors

ajaythorve, ajschmidt8, alxndrkalinin, ayodeawe, bdice, benlansdell, charlesbluca, chirayug-nvidia, dillon-cullinan, galipremsagar, gigony, gputester, grlee77, hcho3, hesaneasycoder, jakirkham, jameslamb, jjacobelli, joohyunglee0106, kylefromnvidia, monzelr, quasiben, raydouglass, sevagh, shekhardw, shwina, vyasr


cucim's Issues

[FEA] Support Zarr-based image format (such as NGFF)

Is your feature request related to a problem? Please describe.

We need to look at next-generation file formats (NGFF) (https://ngff.openmicroscopy.org/latest/), which use the Zarr format, for general microscopy images with distributed computing.

We want to support Zarr/NGFF after supporting major digital pathology formats (including .svs format).

Describe the solution you'd like

We can support the Zarr or OME-Zarr format by 1) reusing an existing library (such as ome-zarr-py or z5) or 2) implementing it from scratch.

Reusing a C++ library such as z5 may be the preferred option.

With GDS (GPUDirect Storage, https://docs.nvidia.com/gpudirect-storage/overview-guide/index.html), the performance of loading the chunks (files) of a Zarr file (folder) could be greatly improved.

Other useful libraries that cuCIM can exploit may be available (we need to collect the information).

Additional context

Do you plan to support images with greater depth than 8 bits? More than 3 channels? (For multiple stains)

The CuImage class holds DLPack structures that can represent various data types, strides, and shapes (including a channel dimension), so supporting them is possible. CuImage already has a public API for that (dtype, dims, shape, and channel_names; see this).

We also plan to support microscopy-related image formats, which usually store images with data types such as float and int16/32 and multiple channels (such as NGFF, which is based on the Zarr format), once we address the need to support digital pathology image formats.

scikit-image API functions often support more than 3 channels (with exceptions for things like rgb2gray where the input must be a 3 channel image).
Many operations involve internal conversion to floating-point. We made an effort to preserve single-precision computation when the input is single-precision and have started pushing those same modifications back upstream to scikit-image itself (which traditionally did most operations in double precision).
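The dtype policy described above can be illustrated with a minimal sketch. This is a hypothetical helper operating on dtype names; the real scikit-image/cuCIM utilities work on NumPy/CuPy dtype objects:

```python
def supported_float_type(dtype_name):
    """Pick the float type used for internal computation.

    Hypothetical helper mirroring the policy described above:
    half/single-precision inputs compute in float32, while all other
    inputs (including integers) promote to float64.
    """
    return "float32" if dtype_name in ("float16", "float32") else "float64"
```

For example, a float32 image stays in float32 throughout a filter, while a uint8 image is promoted to float64, matching the traditional double-precision behavior.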

References

Articles

Data

Python Implementation

C/C++ Implementation

[FEA] Prevent memory leak during the development

Is your feature request related to a problem? Please describe.

Since the project doesn't have a mechanism for preventing memory leaks, it is easy to introduce memory leaks during development.

Describe the solution you'd like

  • Make use of a safe pointer (smart pointer) or scoped deleter for 3rdparty library/internal memory allocation.
  • Run unit/integration tests with memory sanitizer/tracker (e.g., valgrind/asan)

Additional context

There have been many cases where we introduced memory leaks.

[FEA] Better support for plugins and their configuration (Carbonite SDK)

Is your feature request related to a problem? Please describe.

Currently, we don't use Carbonite's dependent plugin-loading mechanism.

Instead, we manage plugins separately, making some of them built-in plugins:

std::vector<std::string> plugin_names{ std::string("cucim.kit.cuslide@" XSTR(CUCIM_VERSION) ".so"),
                                       std::string("cucim.kit.cumed@" XSTR(CUCIM_VERSION) ".so") };

And commented on some methods from Carbonite code that are not ported yet:

// void load_plugins(const PluginLoadingDesc& desc = PluginLoadingDesc::get_default());

// static void load_plugins(const PluginLoadingDesc& desc)
// {
//     CUCIM_ASSERT(g_framework);
//     return g_framework->load_plugins(desc);
// }

void CuCIMFramework::load_plugins(const PluginLoadingDesc& desc)
{
    (void)desc;
}

We need to revisit current Carbonite integration someday.

Describe the solution you'd like

Revisit the Carbonite SDK and use the (minimal) SDK as-is, once it is available.

[FEA] Update to use high-performance malloc and memory pool implementation

Is your feature request related to a problem? Please describe.

It would be great to have a foundation for malloc and memory-pool implementations.

malloc upgrade

cuCIM C++ implementation currently tries to use a separate call (cucim_malloc()) to abstract malloc functionality.


return cucim_malloc(n);

memory pool

It currently does not use a memory pool for the read_region() method or the cache.

We can make use of memory pool implementation for that.

Describe the solution you'd like

According to https://lwn.net/SubscriberLink/872869/0e62bba2db51ec7a/, mimalloc(https://github.com/microsoft/mimalloc) seems to be a good candidate.
We can also consider other implementations like tcmalloc (https://github.com/google/tcmalloc).

For the memory pool, we can exploit PMR (std::pmr::monotonic_buffer_resource) and RMM.
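The memory-pool idea can be sketched in a few lines of Python. This is a hypothetical free-list of fixed-size buffers, not cuCIM's actual allocator (which would live in C++ on top of mimalloc/PMR/RMM):

```python
class BufferPool:
    """Toy fixed-size buffer pool: reuse released buffers instead of
    paying a fresh allocation on every request."""

    def __init__(self, buffer_size, capacity):
        self._buffer_size = buffer_size
        # pre-allocate `capacity` buffers up front
        self._free = [bytearray(buffer_size) for _ in range(capacity)]

    def acquire(self):
        # hand out a pooled buffer when available, else allocate a new one
        return self._free.pop() if self._free else bytearray(self._buffer_size)

    def release(self, buf):
        # return the buffer to the free list for later reuse
        self._free.append(buf)
```

A tile reader would acquire() a buffer before decoding and release() it when the tile leaves the cache, amortizing allocation cost across many read_region() calls.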

Enable GDS with public package for v0.19.0

The GDS feature is disabled for now because no public cuFile package is available.
Once the GDS package is released publicly (expected this week), we need to update the code to use the cufile.h header from the public package and/or update the conda package to use the public GDS package.

[FEA] Update API for consistency with upcoming scikit-image 0.19

New functions introduced in 0.19 that are easy to port to cuCIM (via CuPy)

scikit-image/scikit-image#5158: Add normalized mutual information metric
scikit-image/scikit-image#5308: New illuminants were added to the color conversions
scikit-image/scikit-image#5382: Added ND butterworth filter
scikit-image/scikit-image#5420: Add no-reference perceptual blur metric.

Other easy functions are skimage.feature.blob_dog and skimage.feature.blob_log. Those two existed in v0.18 but had bug fixes and/or a new argument added in v0.19.

channel_axis support:

scikit-image 0.19 adds a channel_axis argument that should now be used instead of the multichannel boolean. In scikit-image 1.0, the multichannel argument will likely be removed. We should start supporting channel_axis in cuCIM. Corresponding upstream PRs are:

scikit-image/scikit-image#5228: Decorators for helping with the multichannel->channel_axis transition
scikit-image/scikit-image#5284: multichannel to channel_axis (1 of 6): features and draw
scikit-image/scikit-image#5285: multichannel to channel_axis (2 of 6): transform functions
scikit-image/scikit-image#5286: multichannel to channel_axis (3 of 6): filters
scikit-image/scikit-image#5287: multichannel to channel_axis (4 of 6): metrics and measure
scikit-image/scikit-image#5288: multichannel to channel_axis (5 of 6): restoration
scikit-image/scikit-image#5289: multichannel to channel_axis (6 of 6): segmentation
scikit-image/scikit-image#5462: Add a channel_axis argument to functions in the skimage.color module
scikit-image/scikit-image#5427: residual multichannel->channel_axis fixes
scikit-image/scikit-image#5348: channel_as_last_axis decorator fix

Single-precision support

We did much of this already but need to review for consistency with upstream now that it has been implemented there. We can also expand the test cases using parameterization over dtypes, as was done upstream.

scikit-image/scikit-image#4880: Richardson-Lucy deconvolution: allow single-precision computation
scikit-image/scikit-image#5200: Add float32 support to moments_hu
scikit-image/scikit-image#5204: single precision support in skimage.registration
scikit-image/scikit-image#5219: single precision support in skimage.restoration
scikit-image/scikit-image#5220: single precision support in skimage.metrics
scikit-image/scikit-image#5344: single precision support in moments functions
scikit-image/scikit-image#5353: single precision support in skimage.features
scikit-image/scikit-image#5354: single precision support in skimage.filters
scikit-image/scikit-image#5372: improved single precision support in skimage.transform
scikit-image/scikit-image#5373: single precision support in skimage.segmentation
scikit-image/scikit-image#5443: support single precision in skimage.color

API/Deprecations

  • The selem argument has been renamed to footprint throughout the library. The selem argument is now deprecated. (scikit-image/scikit-image#5445)
  • Deprecate in_place in favor of the use of an explicit out argument in skimage.morphology.remove_small_objects, skimage.morphology.remove_small_holes and skimage.segmentation.clear_border
  • The input argument of skimage.measure.label has been renamed label_image. The old name is deprecated.
  • Standardize on num_iter for parameters describing the number of iterations and max_num_iter for parameters specifying an iteration limit.
  • The names of several parameters in skimage.measure.regionprops have been updated so that properties are better grouped by the first word(s) of the name. The old names will continue to work for backwards compatibility.
  • In measure.label, the deprecated neighbors parameter has been removed (use connectivity instead).
  • The deprecated skimage.color.rgb2grey and skimage.color.grey2rgb functions have been removed (use skimage.color.rgb2gray and skimage.color.gray2rgb instead).
  • skimage.color.rgb2gray no longer allows grayscale or RGBA inputs.
  • The deprecated alpha parameter of skimage.color.gray2rgb has now been removed. Use skimage.color.gray2rgba for conversion to RGBA.
  • Attempting to warp a boolean image with order > 0 now raises a ValueError.
  • When warping or rescaling boolean images, setting anti_aliasing=True will raise a ValueError.
  • The default value of the bg_label parameter of skimage.color.label2rgb is now 0.
  • The deprecated skimage.feature.register_translation function has been removed (use skimage.registration.phase_cross_correlation instead).
  • The deprecated skimage.feature.masked_register_translation function has been removed (use skimage.registration.phase_cross_correlation instead).
  • The default mode in skimage.filters.hessian is now 'reflect'.
  • The default boundary mode in skimage.filters.sato is now 'reflect'.

Bug fixes

  • Input labels argument renumbering in skimage.feature.peak_local_max is avoided (scikit-image/scikit-image#5047).
  • Nonzero values at the image edge are no longer incorrectly marked as a boundary when using find_boundaries with mode='subpixel' (scikit-image/scikit-image#5447).
  • Fix return dtype of _label2rgb_avg function.
  • Ensure skimage.color.separate_stains does not return negative values.
  • Handle 1D arrays properly in skimage.filters.gaussian.
  • Fix Laplacian matrix size bug in skimage.segmentation.random_walker.
  • Regionprops table (skimage.measure.regionprops_table) dtype bugfix.
  • Fix skimage.transform.rescale when using a small scale factor.
  • Fix multichannel intensity_image extra_properties in regionprops.
  • Fix the error message for skimage.metrics.structural_similarity when the image is too small.
  • Do not mark image edges in 'subpixel' mode of skimage.segmentation.find_boundaries.
  • Fix behavior of skimage.exposure.is_low_contrast for boolean inputs.
  • Fix wrong syntax for the string argument of ValueError in skimage.metrics.structural_similarity.
  • Fixed NaN issue in skimage.filters.threshold_otsu.
  • Fix skimage.feature.blob_dog docstring example and normalization.
  • Fix uint8 overflow in skimage.exposure.adjust_gamma.
  • Fix broken doctests in skimage.exposure.histogram and skimage.measure.regionprops_table. (scikit-image/scikit-image#5522)
  • Corrected phase correlation in skimage.registration.phase_cross_correlation. (scikit-image/scikit-image#5461)

[FEA] Faster morphology operations via footprint decomposition

Is your feature request related to a problem? Please describe.
I recently proposed a PR upstream in scikit-image that decomposes a binary footprint into a series of smaller footprints: scikit-image/scikit-image#5482. On the CPU this gives large performance improvements when working with larger footprints.

Describe the solution you'd like
This would be good to adapt here as morphology operations on the GPU are currently much faster on small sized elements than large ones, likely due in part to much more favorable memory access patterns.

Describe alternatives you've considered
There are a number of alternative ways to do the decomposition, but this one is straightforward to implement and wouldn't require any changes upstream to CuPy itself.

Additional context
I have not yet benchmarked this approach on the GPU, but am confident it will generally be faster here as well.
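The decomposition idea can be illustrated in 1-D with plain Python (a sketch, not the upstream implementation): dilating with a flat footprint of odd length k is equivalent to (k - 1) / 2 successive dilations with a length-3 footprint, which is what makes large footprints cheap.

```python
def dilate1d(a, k):
    """Binary dilation of a 0/1 list with a flat footprint of odd length k."""
    r = k // 2
    n = len(a)
    return [1 if any(a[max(0, i - r):min(n, i + r + 1)]) else 0
            for i in range(n)]

signal = [0, 0, 0, 1, 0, 0, 0, 0, 0]

# one big dilation with a length-7 footprint...
big = dilate1d(signal, 7)

# ...equals three small ones: (7 - 1) / 2 = 3 passes with k = 3
small = signal
for _ in range(3):
    small = dilate1d(small, 3)

assert big == small
```

On the GPU the same equivalence trades one pass with poor memory locality for several passes with very favorable access patterns.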

[BUG] There is no mechanism for user to know the availability of cucim.CuImage

Describe the bug
Since v21.06.00, cuCIM has silenced the ImportError exception from `from .clara import CuImage, __version__, cli`
to support the cucim.skimage package-only use case.

from .clara import CuImage, __version__, cli

try:
    import cupy
except ImportError:
    pass

try:
    from .clara import CuImage, __version__, cli
except ImportError:
    from ._version import get_versions
    __version__ = get_versions()['version']
    del get_versions
    del _version

That caused an issue in MONAI's tests because cuCIM wouldn't raise ImportError even if loading libcucim.so failed, which causes MONAI's optional_import() method to return True.

MONAI team (@wyli @drbeh) has the following workaround to check cucim.CuImage's availability.

drbeh/MONAI@a102ab7#diff-fcffca442c7cee4391ab58c3d2b71b80134da7621262457c69a3741308e156cd

Steps/Code to reproduce bug

  • Install cuCIM v21.08.02 from PyPI
  • import cucim
  • img = cucim.CuImage("input.tif")

Expected behavior

  • Users may want to see an error on import cucim if cucim.CuImage is not available.

Environment details (please complete the following information):

  • Environment location: Docker
  • Method of cuCIM install: PyPI

Additional context

We may want to expose a method in the cucim package to check whether the image loader (cucim.clara) or the image processors (cucim.skimage, cucim.core) are available so that users can check real availability.

  • such as:
    • cucim.is_available() : check if all modules are available.
    • cucim.is_available("clara") : check if image loader-related modules are available.
    • cucim.is_available("skimage")
    • cucim.is_available("core")
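Until such an API exists, the proposed check can be approximated with the standard library. This is a hedged sketch: module_available is a hypothetical name, and find_spec only proves the module can be located, not that libcucim.so will actually load.

```python
import importlib.util

def module_available(name):
    """Return True when module `name` (e.g. 'cucim.clara') can be located.

    Hypothetical helper approximating the proposed cucim.is_available();
    a real implementation would also need to attempt the import to catch
    shared-library loading failures.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False
```

Note that find_spec raises ModuleNotFoundError when a parent package is missing, which is why the lookup is wrapped in try/except.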

[BUG] Compression scheme 33003 tile decoding is not implemented.

Describe the bug

For Aperio svs files, I run into a cryptic error:

/slide.svs: Compression scheme 33003 tile decoding is not implemented.

However, cuCIM still loads a tile and doesn't raise, but the colors are wrong or the tile is entirely black. See the following grid of tiles:

[screenshot: grid of tiles]

Are all OpenSlide vendors supported? Is there a list of all supported vendors somewhere? I could not find it.

Steps/Code to reproduce bug

# Download the TCGA-F4-6459-01Z-00-DX1 slide from https://portal.gdc.cancer.gov/cases/9fd08502-355b-4f5a-a25c-73ec7184f6d3?bioId=57f4136d-d6aa-4537-ba73-f21fe0374005 

import numpy as np

from PIL import Image
from cucim import CuImage

slide = CuImage("TCGA-F4-6459-01Z-00-DX1")

tile = slide.read_region((0, 0), (224, 224), 0)

Image.fromarray(np.asarray(tile))

[DOC] Update cuCIM documentation to use PyData Sphinx theme

As summarized by @isVoid in rapidsai/cudf#7098 , Python API documentation for RAPIDS projects has historically used the Sphinx Read The Docs theme and served a single large HTML page. This has caused the docs to load slowly as API surfaces have grown and caused users to experience potentially significant latency/delays when searching for functions.

In rapidsai/cudf#8746 , @galipremsagar updated the cuDF documentation to the more responsive PyData Sphinx theme. In rapidsai/cugraph#1793, @BradReesWork made a similar change.

Updating the cuCIM documentation to the PyData Sphinx theme should improve the user experience for our documentation.

[FEA] - GPU Weighted Box Fusion (WBF) for Object Detection BBoxes

Background
Image Object Detection models predict bounding boxes (bboxes). A single model will usually predict more candidate bboxes than are necessary, therefore practitioners need techniques to combine the many predicted bboxes into fewer (and more accurate) predicted bboxes.

Additionally if we use 5-Fold models and/or multiple image object detection models, we need a way to ensemble (i.e. combine) the many bboxes from the many models.

Current State-Of-The-Art is CPU
Both of these tasks can be solved with NMS (Non-maximum Suppression) and/or WBF (Weighted Box Fusion). Most people use the GitHub repository here which has a CPU implementation of both
https://github.com/ZFTurbo/Weighted-Boxes-Fusion

PyTorch has a batched GPU version of NMS, but there are currently no GPU versions of WBF. WBF is currently the most accurate state-of-the-art method for combining bboxes into fewer, more accurate predictions. Research paper:
https://arxiv.org/abs/1910.13302

When using multiple 5-Fold models with thousands of images and many bbox per image, we can easily obtain millions of bboxes! Applying WBF on CPU can take up to 1 hour!

Describe the solution you'd like
It would be wonderful to implement a GPU version of WBF, and perhaps another GPU version of NMS too.
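For reference, the core of WBF fits in a few lines of plain Python. This is a simplified sketch of the algorithm in the paper above, not the ZFTurbo implementation and not a GPU kernel: boxes are greedily clustered by IoU, then each cluster collapses into one box whose coordinates are the confidence-weighted average.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def weighted_box_fusion(boxes, scores, iou_thr=0.55):
    """Greedy WBF sketch: cluster boxes by IoU against each cluster's
    first (highest-scoring) member, then average the coordinates of
    each cluster weighted by confidence."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    clusters = []
    for i in order:
        for members in clusters:
            if iou(boxes[members[0]], boxes[i]) > iou_thr:
                members.append(i)
                break
        else:
            clusters.append([i])
    fused = []
    for members in clusters:
        total = sum(scores[m] for m in members)
        fused.append(tuple(
            sum(scores[m] * boxes[m][k] for m in members) / total
            for k in range(4)))
    return fused
```

A GPU version would vectorize the pairwise IoU matrix and the weighted reductions instead of looping over clusters.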

Installation via conda doesn't work

Hi all,

I was just trying to setup cucim on a freshly installed Windows computer with mini-conda installed. I copied the command from the readme and I'm receiving this error during installation.

(bio1) C:\Users\rober>conda create -n cucim -c rapidsai -c conda-forge/label/cupy_rc -c conda-forge cucim cudatoolkit=11.2
Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - cucim

Current channels:

  - https://conda.anaconda.org/rapidsai/win-64
  - https://conda.anaconda.org/rapidsai/noarch
  - https://conda.anaconda.org/conda-forge/label/cupy_rc/win-64
  - https://conda.anaconda.org/conda-forge/label/cupy_rc/noarch
  - https://conda.anaconda.org/conda-forge/win-64
  - https://conda.anaconda.org/conda-forge/noarch
  - https://repo.anaconda.com/pkgs/main/win-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/win-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://repo.anaconda.com/pkgs/msys2/win-64
  - https://repo.anaconda.com/pkgs/msys2/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

Am I missing a conda channel or something?

Any hint is appreciated!

Thanks,
Robert

Wrapping resource(memory) handling code with RAII

Is your feature request related to a problem? Please describe.

The current implementation does not handle error cases well -- allocated memory is manually freed by calling the free() or cucim_free() functions.

We need to improve the whole code to use RAII.

Describe the solution you'd like

Create a utility class to handle resources.

[QST] Compression scheme 33003 tile decoding is not implemented.

For some svs files, I run into a cryptic "error":

/slide.svs: Compression scheme 33003 tile decoding is not implemented.

"Error" is in quote as it doesn't seem to have any effect, the tile is still loaded. It seems to be linked to the Aperio vendor.

Are all OpenSlide vendors not supported yet? If so, why is the tile correctly loaded? Is there any flag I can set to stop this error from spamming my logs?

[FEA] Adding HoG

Is your feature request related to a problem? Please describe.

It would be great if cuCIM implemented skimage's HOG algorithm. The HOG feature is really useful, and it would be great to have a GPU implementation of it.

Describe the solution you'd like
Implement skimage's HOG algorithm.
(https://scikit-image.org/docs/dev/api/skimage.feature.html?highlight=hog#skimage.feature.hog)

Describe alternatives you've considered
None

Additional context
None

[FEA] Support Aperio SVS format (using CPU Decoder)

Is your feature request related to a problem? Please describe.

The Aperio SVS format is the most popular digital pathology image format available publicly, and we see many open data sets using it.

Describe the solution you'd like

Additional context

skimage.segmentation.watershed implementation?

Is your feature request related to a problem? Please describe.
It would be great if cuCIM implemented skimage's watershed algorithm. Watershed is really useful for segmentation, and it would be great to have a GPU implementation of it.

Describe the solution you'd like
Implement skimage's watershed algorithm.

Describe alternatives you've considered
None

Additional context
None

[QST] Is it available for OS Win?

I have tried installing in the Conda prompt:
pip install cucim
ERROR: Could not find a version that satisfies the requirement cucim (from versions: none)
ERROR: No matching distribution found for cucim

Is it available for Windows, or have I done something wrong? Thank you for assisting.

[FEA] Support BioFormats library through cuCIM

Is your feature request related to a problem? Please describe.

We would like to start thinking about supporting BioFormats.

Bioformats is a pathway to access so many formats that we'll never want to implement ourselves.

Even though it's in Java, we may as well find a way to hook into it as an option. There are far too many formats that we would have a hard time supporting; Bio-Formats handles lots of them, including proprietary ones.

Describe the solution you'd like
TBD

Describe alternatives you've considered

We can make it easy to load various microscopy image formats while we support the formats in an optimized way.

One concern is that for such an adaptation layer (plugin) without optimization, there is already a solution in Python, so doing it in cuCIM could be duplicated effort:

Do we also want to provide such integration without optimization?

Additional context
N/A

[FEA] Add Digital Pathology-related transformations

Is your feature request related to a problem? Please describe.

We have implemented Digital Pathology-related transformation operations with CuPy and we would like to make them part of cuCIM's built-in image-processing operations.

Describe the solution you'd like

First, we need to discuss how to structure the package layout for non-scikit-image APIs.
Then, we can expose the existing implementations as part of the cuCIM package.

Describe alternatives you've considered
N/A

Additional context

We have been working on Digital Pathology-specific operations and need to make them available so that MONAI can make use of them.

[FEA] Support Cache Mechanism in cuCIM

This issue explains a feature that is already being implemented; we would like to use it to show the feature and gather feedback until the implementation is merged and available.

Is your feature request related to a problem? Please describe.

In many deep learning use cases, small image patches need to be extracted from the large image and they are fed into the neural network.

If the patch size doesn't align with the underlying tile layout of a TIFF image (e.g., an AI model such as ResNet may accept a particular image size [e.g., 224x224] that is smaller than the underlying tile size [256x256]), redundant loads of a tile are needed.

This results in lower performance for unaligned cases, as shown in our GTC 2021 presentation.

To improve image loading performance for general use cases, cuCIM needs a cache mechanism implemented.
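The redundancy is easy to quantify with a small sketch using the patch/tile sizes mentioned above: a 224x224 patch that starts on a tile boundary reads a single 256x256 tile, while an unaligned patch can touch up to four tiles.

```python
def tiles_touched(x, y, patch, tile=256):
    """Number of tiles a patch read starting at pixel (x, y) overlaps."""
    def span(offset):
        first = offset // tile                 # first tile index covered
        last = (offset + patch - 1) // tile    # last tile index covered
        return last - first + 1
    return span(x) * span(y)

# aligned 224x224 patch: a single tile load suffices
assert tiles_touched(0, 0, 224) == 1

# unaligned patch straddling tile boundaries: four tile loads
assert tiles_touched(128, 128, 224) == 4
```

Without a cache, each of those overlapping tile loads decodes the full tile again, which is exactly the waste the cache mechanism avoids.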

The cache feature also needs to be highly configurable (e.g., allow disabling the cache): #17 (comment)

Will we be able to disable the cache when doing random tile reads? We generally disable the OpenSlide one when training.

(As shown in Access Pattern 3 in https://github.com/rapidsai/cucim/blob/branch-0.20/notebooks/File-access_Experiments_on_TIFF.ipynb)

Describe the solution you'd like

Make the cache memory/strategy configurable

Other libraries have the following strategies for cache.

  • OpenSlide
    • 1024 x 1024 x 30 bytes (30MiB) per file handle for cache ==> 160 (RGB) or 120 (ARGB) 256x256 tiles
    • Not configurable
  • rasterio
    • 5% of available system memory by default (e.g., 32 GB of free memory => 1.6 GB of cache memory allocated)
    • Configurable through environment module

We would like to support three cache strategies (can be extended).

  • nocache
  • per_process
  • shared_memory (interprocess)

We want to let users select a cache strategy and its configuration through a configuration file (.cucim.json) or function calls (the CuImage.cache() method).

Use high-performance libraries/algorithm

  • Make use of libcuckoo for cache-item handling (hash map).
    • Known to be better than Intel TBB's concurrent hash map. The project is fairly stable.
  • Use a circular queue (std::vector with a FIFO strategy) with a mutex pool, instead of a linked list (with an LRU strategy), for better handling of concurrent environments.
  • For the shared-memory strategy, use the Boost interprocess module for shared memory management.
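In Python terms, the per_process strategy amounts to something like the following toy sketch (a FIFO tile cache guarded by a single lock; the real implementation is C++ with libcuckoo and a mutex pool, so treat this only as an illustration of the eviction policy):

```python
import threading
from collections import deque

class FifoTileCache:
    """Toy per-process tile cache with FIFO eviction."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._tiles = {}        # (level, tile_x, tile_y) -> decoded tile
        self._order = deque()   # insertion order for FIFO eviction
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._tiles.get(key)

    def put(self, key, tile):
        with self._lock:
            if key in self._tiles:
                return
            if len(self._order) >= self._capacity:
                # evict the oldest entry (FIFO, not LRU)
                del self._tiles[self._order.popleft()]
            self._tiles[key] = tile
            self._order.append(key)
```

FIFO with a circular queue avoids the per-access bookkeeping an LRU linked list needs, which is why it behaves better under heavy concurrency.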

Results

cuCIM achieves a performance gain similar to the aligned case even when patch and tile layouts are not aligned.

Demo


You can test the feature via the following command

pip install -i https://test.pypi.org/simple/ --pre --upgrade cucim

Room for improvement

Use of a memory pool

The per_process strategy performs better than the shared_memory strategy, and both perform worse than the nocache strategy when the underlying tiles and patches are aligned.

  • The shared_memory strategy does some additional operations compared with the per_process strategy, and both strategies have some cache overhead (such as memory allocation for cache items and indirect function calls).

=> All three strategies (including nocache) can benefit if we allocate CPU/GPU memory for tiles from a fixed-size cache memory pool (using RMM and/or PMR) instead of calling malloc() for each allocation.

Supporting a generator (iterator)

When the patches to read from an image can be determined in advance (the inference use case), we can load/prefetch the entire compressed/decompressed image data into memory and provide a Python generator (iterator) that yields a series of patches efficiently.


[Tracking] Implement Watershed Algorithm

This issue is related to #49 and would like to discuss and track the issue here.
(@grlee77 Please feel free to edit/update this description).

Tracking

  • Currently collecting information for deciding a proper approach/implementation.

Problem

The watershed algorithm in scikit-image (https://scikit-image.org/docs/dev/api/skimage.segmentation.html?highlight=watershed#skimage.segmentation.watershed) is a popular segmentation algorithm. However, cuCIM doesn't support it, so we would like to provide it through a cucim.skimage.segmentation.watershed method.

It turns out that using the same algorithm that scikit-image uses is either infeasible or tricky to implement with CuPy.

Candidate Implementations

There are some GPU implementations:

1. 2D implementation based on cellular automata in recent NPP

2. A different GPU-based algorithm with corresponding citations at watershed-cuda

3. CLIJ's approach

[DOC] Document Transforms for Digital Pathology

With #100 in, how should we document these new functions? Should we simply add cucim.skimage.operations, or break them out per subsection (cucim.core.operations.color, cucim.core.operations.expose.transform, etc.)?

[FEA] DICOM Image Toolkits and Pipeline for CT, MR, etc.

For medical imaging inference, DICOM loading, resampling, morphology, and window location/width adjustment are commonly used in both training and inference. However, most of these live in CPU packages like ITK/VTK and NumPy. There is neither a toolkit set nor an example pipeline/tutorial to support these tasks on GPU.

The following features are expected:
1. A toolkit package for pre- and post-processing including, but not limited to, DICOM loading, volume resampling, window location/width adjustment, batching, morphology filtering, connected component extraction, etc.
2. Demo/example code that uses the above features to build an inference pipeline on GPU, together with AI inference, so that the whole inference pipeline runs on GPU.
3. Compilation tools that could deploy the pipeline as a compiled engine.

[FEA] Provide GPU utilization metric for benchmarks

I would like to see GPU utilization for all benchmarks. This would show how much compute and memory each benchmark used. Currently there is no way to get this from the benchmark script. Eventually this could be used to create benchmarks designed to "max out" the throughput available on the hardware.

It would be nice to see additional metrics reported for each of the measured functions that are written out by each benchmark script (example).

[QST] isort style: remove force_single_line = True from `setup.cfg`?

What is your question?

When we originally ran isort on the skimage module, we used the setting multi_line_output = 0, which results in a fairly compact import style. For example:

from .binary import (binary_closing, binary_dilation, binary_erosion,
                     binary_opening)
from .grey import (black_tophat, closing, dilation, erosion, opening,
                   white_tophat)
from .greyreconstruct import reconstruction
from .misc import remove_small_holes, remove_small_objects
from .selem import (ball, cube, diamond, disk, octagon, octahedron, rectangle,
                    square, star)

but if I rerun isort on that file now, it seems a force_single_line = True setting has been added to setup.cfg, which results in the following more verbose style:

from ._skeletonize import thin
from .binary import binary_closing
from .binary import binary_dilation
from .binary import binary_erosion
from .binary import binary_opening
from .grey import black_tophat
from .grey import closing
from .grey import dilation
from .grey import erosion
from .grey import opening
from .grey import white_tophat
from .greyreconstruct import reconstruction
from .misc import remove_small_holes
from .misc import remove_small_objects
from .selem import ball
from .selem import cube
from .selem import diamond
from .selem import disk
from .selem import octagon
from .selem import octahedron
from .selem import rectangle
from .selem import square
from .selem import star

Can we switch back to force_single_line = False, or is there a reason to prefer the single-line style? I looked at a few other RAPIDS projects and the style is not uniform across them. cuml does seem to use the single-line style, but cudf and cusignal do not.

The clara package here uses the single-line style, but the skimage package does not. Whichever one we choose, we should make the two consistent!

[FEA] Supporting multi bands (multi-channel) images for geospatial (remote sensing) and microscopy

Is your feature request related to a problem? Please describe.

cuCIM currently supports only JPEG/deflate-compressed RGB images (which are prevalent in digital pathology) and doesn't support multi-channel TIFF images yet.

Supporting multiple channels would help load geospatial (remote sensing, multi-band) or microscopy (multi-channel) images.

Describe the solution you'd like

Supporting multiple bands (channels) is feasible to implement in cuCIM, as the existing code can already handle the IFDs (Image File Directories) of the TIFF format (each band image is stored in an IFD), and, AFAIK, GeoTIFF is a TIFF file with domain-specific metadata.
It would be a significant effort to implement the full GeoTIFF feature set. However, we can start by providing pixel array data and some important metadata used in deep learning use cases.

Describe alternatives you've considered

For parsing/recognizing metadata of GeoTIFF images, we may want to look at GDAL's implementation, which is also based on the libtiff library, or at libgeotiff, as a reference.

Additional context

[BUG] - Info messages appearing as warnings in Jupyter notebooks

Describe the bug
When running cuCIM-accelerated data loading in Jupyter, a lot of pink warning-style output can be generated that looks like an error.

Steps/Code to reproduce bug
Run the following code in a notebook with a suitably large image:

from cucim import CuImage
from dask.distributed import as_completed
from dask.distributed import Client, LocalCluster
import os
import numpy as np

cluster = LocalCluster(dashboard_address=8789, processes=True)
client = Client(cluster)

# iterate over a set of regions from which to threshold
def process_chunk(params):
    start_loc_list = params[0]
    inp_file = params[1]
    patch_size = params[2]
    slide = CuImage(inp_file)
    res = []
    for start_loc in start_loc_list:
        region = np.array(slide.read_region(start_loc, [patch_size, patch_size], 0))
        if region.flatten().var() > 100:
            res.append(start_loc)
        
    return res

# As the results are processed, put them into a list
def compile_results(futures):
    patches = []

    for future in as_completed(futures):
        res1 = future.result()
        if res1:
            for patch in res1:
                patches.append(patch)
                
    return patches

input_file = "patient_100_node_0.tif"
wsi = CuImage(input_file)
sizes=wsi.metadata["cucim"]["resolutions"]
w = sizes["level_dimensions"][0][0]
h = sizes["level_dimensions"][0][1]
patch_size = 256
num_processes = os.cpu_count()

# compute the coordinates of the image patches
start_loc_data = [(sx, sy)
    for sy in range(0, h, patch_size)
        for sx in range(0, w, patch_size)]

# calculate the number of patches per process/thread
chunk_size = len(start_loc_data) // num_processes

# create list of patches to process
start_loc_list_iter = [(start_loc_data[i:i+chunk_size],input_file,patch_size)  for i in range(0, len(start_loc_data), chunk_size)]

# Threshold each patch asynchronously and return a future
future_result1 = list(client.map(process_chunk, start_loc_list_iter))
patches = compile_results(future_result1)

Expected behavior
Ideally no such output should be produced unless the user specifies that they want info/warning output to be shown.

Environment details (please complete the following information):
Jupyter notebook running NGC PyTorch container 21.07
cucim v21.8.2

Additional context
Here is some example output when loading an image with 10 processes:

[Plugin: cucim.kit.cuslide] Loading...
[Plugin: cucim.kit.cuslide] Loading the dynamic library from: /opt/conda/lib/python3.8/site-packages/cucim/clara/[email protected]
[Plugin: cucim.kit.cuslide] loaded successfully. Version: 0
Initializing plugin: cucim.kit.cuslide (interfaces: [cucim::io::IImageFormat v0.1]) (impl: cucim.kit.cuslide)
[Plugin: cucim.kit.cuslide] Loading...
[Plugin: cucim.kit.cuslide] Loading the dynamic library from: /opt/conda/lib/python3.8/site-packages/cucim/clara/[email protected]
[Plugin: cucim.kit.cuslide] loaded successfully. Version: 0
Initializing plugin: cucim.kit.cuslide (interfaces: [cucim::io::IImageFormat v0.1]) (impl: cucim.kit.cuslide)
[Plugin: cucim.kit.cuslide] Loading...
[Plugin: cucim.kit.cuslide] Loading the dynamic library from: /opt/conda/lib/python3.8/site-packages/cucim/clara/[email protected]
[Plugin: cucim.kit.cuslide] Loading...
[Plugin: cucim.kit.cuslide] Loading the dynamic library from: /opt/conda/lib/python3.8/site-packages/cucim/clara/[email protected]
[Plugin: cucim.kit.cuslide] loaded successfully. Version: 0
Initializing plugin: cucim.kit.cuslide (interfaces: [cucim::io::IImageFormat v0.1]) (impl: cucim.kit.cuslide)
[Plugin: cucim.kit.cuslide] loaded successfully. Version: 0
Initializing plugin: cucim.kit.cuslide (interfaces: [cucim::io::IImageFormat v0.1]) (impl: cucim.kit.cuslide)
[Plugin: cucim.kit.cuslide] Loading...
[Plugin: cucim.kit.cuslide] Loading the dynamic library from: /opt/conda/lib/python3.8/site-packages/cucim/clara/[email protected]
[Plugin: cucim.kit.cuslide] Loading...
[Plugin: cucim.kit.cuslide] Loading the dynamic library from: /opt/conda/lib/python3.8/site-packages/cucim/clara/[email protected]
[Plugin: cucim.kit.cuslide] Loading...
[Plugin: cucim.kit.cuslide] Loading the dynamic library from: /opt/conda/lib/python3.8/site-packages/cucim/clara/[email protected]
[Plugin: cucim.kit.cuslide] loaded successfully. Version: 0
Initializing plugin: cucim.kit.cuslide (interfaces: [cucim::io::IImageFormat v0.1]) (impl: cucim.kit.cuslide)
[Plugin: cucim.kit.cuslide] loaded successfully. Version: 0
Initializing plugin: cucim.kit.cuslide (interfaces: [cucim::io::IImageFormat v0.1]) (impl: cucim.kit.cuslide)
[Plugin: cucim.kit.cuslide] loaded successfully. Version: 0
Initializing plugin: cucim.kit.cuslide (interfaces: [cucim::io::IImageFormat v0.1]) (impl: cucim.kit.cuslide)
[Plugin: cucim.kit.cuslide] Loading...
[Plugin: cucim.kit.cuslide] Loading the dynamic library from: /opt/conda/lib/python3.8/site-packages/cucim/clara/[email protected]
[Plugin: cucim.kit.cuslide] loaded successfully. Version: 0
Initializing plugin: cucim.kit.cuslide (interfaces: [cucim::io::IImageFormat v0.1]) (impl: cucim.kit.cuslide)
[Plugin: cucim.kit.cuslide] Loading...
[Plugin: cucim.kit.cuslide] Loading the dynamic library from: /opt/conda/lib/python3.8/site-packages/cucim/clara/[email protected]
[Plugin: cucim.kit.cuslide] loaded successfully. Version: 0
Initializing plugin: cucim.kit.cuslide (interfaces: [cucim::io::IImageFormat v0.1]) (impl: cucim.kit.cuslide)

[FEA] Provide a way/example to use cuCIM with Dask for DataLoading(Pytorch's Dataloader-like API)

Is your feature request related to a problem? Please describe.

PyTorch's DataLoader class is used in many deep learning training applications to load and pre-process training data before feeding it to the AI model.

Since PyTorch's DataLoader runs in multiple processes, it is hard to use cuCIM's scikit-image APIs (which make use of CUDA) in the DataLoader's pre-transformations due to CUDA context issues.

It would be nice to provide a way/example to use cuCIM with deep learning frameworks such as PyTorch.

Describe the solution you'd like

PyTorch's DataLoader works like this. It would be nice to have a DataLoader-like utility class in Dask that mimics PyTorch's DataLoader behavior but is implemented with Dask (dask-cuda) for parallelized data loading (i.e., a generator/iterator that yields batches of processed image data).
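The intended interface could look roughly like this batching generator (a plain-Python sketch; the name and signature are hypothetical, and the Dask/dask-cuda parallel dispatch that motivates the feature is omitted):

```python
def batch_iterator(dataset, batch_size, transform=None):
    """Yield lists of up to batch_size items, optionally transformed.

    Sketch of a DataLoader-like generator; a real implementation would
    dispatch chunks to Dask workers instead of iterating serially.
    """
    batch = []
    for item in dataset:
        batch.append(transform(item) if transform else item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly short, batch
        yield batch
```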

Describe alternatives you've considered

  • To use cuCIM in the training pipeline, we currently move the GPU-accelerated pre-transforms out of the PyTorch DataLoader's transformations (using Compose) and into the main thread (placing the GPU-based batch pre-transformation right before feeding the AI model, right after receiving the CPU-loaded/pre-transformed training data from the DataLoader) to avoid CUDA context issues.
  • It would be good if we also provided an example of that approach.

Additional context

Relevant information regarding CuPy+PyTorch.

Using Numba to get the CUDA context.

`color_jitter` changes dtype and shape

Describe the bug
When calling color_jitter on an individual 2D image, for instance a CuPy array with shape (3, 2, 2) and dtype float32, the output changes dtype and gains a batch dimension.

Function call:

OUTPUT = color_jitter(INPUT, brightness=1.0, contrast=1.0, saturation=1.0, hue=0.0)

INPUT:

type = cupy.ndarray
dtype = float32
shape = (3, 2, 2)

OUTPUT:

type = cupy.ndarray
dtype = uint8
shape = (1, 3, 2, 2)

[QST]RuntimeError: This format (compression: 1, sample_per_pixel: 3, planar_config: 1, photometric: 2) is not supported yet!.

How can I solve this error? For example, by changing the TIFF type?

RuntimeError Traceback (most recent call last)
/tmp/ipykernel_11392/2436574467.py in
7 level_count = resolutions["level_count"]
8
----> 9 region = img.read_region([0,0], level_dimensions[level_count - 1], level_count - 1, device="cuda")
10
11 print(region.device)

RuntimeError: This format (compression: 1, sample_per_pixel: 3, planar_config: 1, photometric: 2) is not supported yet!.

[QST] GDS, dlpack and cuCIM

Hi cuCIM team,

I am deploying RAPIDS with cuCIM v21.06 on the ppc64le architecture (Summit cluster) and on our NV-ARM Developer Kit cluster, and noted during compilation that it grabs GDS. So my questions are:

Is GDS a hard dependency? If not, is there an environment variable to disable it?

Thanks,

[FEA] Multi-GPU support for cupyimg

This is a question migrated over from cupyimg: will there be multi-GPU support for GPU-enabled scikit-image functions?

phase_cross_correlation and affine_transform are the two functions I have in mind. These are probably among the most ubiquitously used functions for image registration in biomedical imaging.

The problem is, for certain applications where image resolution is key and down-sampling is not possible (e.g., single-molecule spatial transcriptomics), there is a need to register large 3D volumes that cannot be loaded onto a single GPU. Being able to "chunk" the image, not unlike dask, would be extremely helpful. I realize this is easier said than done!

I'm actually very keen on learning how to implement this, both for personal growth and to benefit the community, but I need some guidance. Would the authors be able to comment on this utility?
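The dask-style "chunk with halo" idea can be sketched in one dimension (a minimal pure-Python sketch; a multi-GPU version would send each padded chunk to a different device, and `func` would be the GPU kernel, e.g., a piece of affine_transform):

```python
def map_overlap_1d(data, chunk, overlap, func):
    """Apply func to overlapping chunks and stitch the results (1-D sketch).

    Each chunk is padded with `overlap` neighbor elements (a halo) so
    stencil-style operations are correct at chunk borders; the halo is
    trimmed off again before stitching.
    """
    n = len(data)
    out = []
    for start in range(0, n, chunk):
        lo = max(0, start - overlap)
        hi = min(n, start + chunk + overlap)
        piece = func(data[lo:hi])
        # trim the halo back off before stitching
        keep = min(chunk, n - start)
        out.extend(piece[start - lo:start - lo + keep])
    return out
```

With a halo at least as wide as the stencil radius, the stitched result matches applying `func` to the whole array at once, which is the property that makes chunked multi-GPU registration plausible.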

[FEA] Add Stain Normalization operation (in Python)

Is your feature request related to a problem? Please describe.

We need to add a stain normalization algorithm as part of cuCIM's operations:
https://developer.nvidia.com/blog/accelerating-digital-pathology-pipelines-with-nvidia-clara-deploy

Describe the solution you'd like

Need to move the implementation to cuCIM's package: Project-MONAI/MONAI#2666

@drbeh helped push a draft PR

We would like to update and merge it.

Describe alternatives you've considered
N/A

Additional context

We also have a C++ implementation.
Once the Operator interface is available, we can provide the operation with the C++ implementation.

[DOC] document how to do an editable install (for Python developers)

Report needed documentation


It would be useful to expand the contributor documentation to explain how the python package can be installed as an "editable install".

When I tried a naive pip install -e . -v call from within the python/cucim folder where setup.py is located, it ran fine, but I was unable to import the package. I think the issue is that the cucim module folder is within a src subfolder rather than directly in the folder containing setup.py.

Describe the documentation you'd like
A workaround that fixes it for me is to add the following to setup.cfg:

[egg_info]
egg_base = src

which causes the cucim.egg-info folder to be placed in the src subfolder rather than in the same folder as setup.py itself.

Steps taken to search for needed documentation
read CONTRIBUTING.md

[BUG] module 'cupy' has no attribute 'byte'

Describe the bug
While importing
"from cucim.skimage.transform import resize", I get the error
"AttributeError: module 'cupy' has no attribute 'byte'"

Steps/Code to reproduce bug
I am running the RAPIDS container: nvcr.io/nvidia/rapidsai/rapidsai:21.08-cuda11.2-runtime-ubuntu20.04
I have:
cucim version: 21.8.1
cupy version: 9.0.0
CUDA version: 11.4

Expected behavior
I want to test the difference between skimage and cucim from this video:
https://www.youtube.com/watch?v=G46kOOM9xbQ

Environment details (please complete the following information):

  • Environment location: Docker(20.10.8 version)
  • Method of cuCIM install: Docker RAPIDS 21.08 container
    • docker pull nvcr.io/nvidia/rapidsai/rapidsai:21.08-cuda11.2-runtime-ubuntu20.04

Additional context
Just need to resolve this error.

Add/improve README.md, examples and notebooks for v0.19.0

  • Update README.md
    • Add a link to GTC session
  • Update notebooks
Current notebooks are from cuClaraImage, which is based on a .whl file. We need to improve the existing notebooks, assuming that the cucim package is already installed via the conda environment.
  • Update examples
    • Refine existing examples. ==> not needed for now.

RuntimeError: This format has more than one image with Subfile Type 0 so cannot be loaded!

Hi cuCIM team,

My company does image processing, and the image format provided is '.svs'.
Since we are interested in this toolkit, we converted the .svs to .tif format with the tifffile package.

I got this error message when I ran the code below:

input_file = "1M09_1.tif"
slide = cucim.CuImage(input_file)
OS: Ubuntu 18.04

The point is, I really don't know how to fix this error.

BTW, I did try another image (obtained online) at this step, and it works normally. So I guess my cuCIM is installed correctly.
