shaziaakbar / deep-openslide

A library for training deep neural networks using pathology data by accessing patches on-the-fly

Python 100.00%
pathology openslide

deep-openslide


Prerequisites:

  • Python (tested on version 2.7)
  • openslide
  • h5py

Description:

Extract.py provides functionality for extracting patches from pathology slides using openslide. Slides should be provided as .svs files; the location of these files is set when you create an instance of the TissueLocator class.

Extracted patches are saved, by default, as compressed .h5 files containing one variable, 'x'. By default patches are extracted on a regular grid defined by the tile size (tile_size). Three extraction modes are available:

Modes:

  • ["all"] (default): all tiles are extracted from the slide
  • ["random"]: a random subset of tiles is extracted from the slide; num_tiles_per_slide must be provided.
  • ["mask"]: a mask is provided which determines where tiles should be extracted; mask must be provided
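
The three modes differ only in which tile positions are selected. The following is a minimal, library-independent sketch of that selection logic, not the code in Extract.py. The function name select_tile_origins, its parameters, and the representation of the mask as one boolean per grid cell are all assumptions made for illustration.

```python
import random


def select_tile_origins(slide_size, tile_size, mode="all",
                        num_tiles=None, mask=None, seed=None):
    """Return the top-left (x, y) origins of the tiles to extract.

    Hypothetical helper for illustration; not part of deep-openslide.
    slide_size and tile_size are (width, height) in pixels. mask, when used,
    is assumed to be a nested list with one boolean per grid cell (row-major).
    """
    slide_w, slide_h = slide_size
    tile_w, tile_h = tile_size

    # Regular grid of tiles that fit entirely inside the slide.
    grid = [(x, y)
            for y in range(0, slide_h - tile_h + 1, tile_h)
            for x in range(0, slide_w - tile_w + 1, tile_w)]

    if mode == "all":
        return grid
    if mode == "random":
        if num_tiles is None:
            raise ValueError("mode='random' requires num_tiles")
        rng = random.Random(seed)
        # Sample without replacement; cap at the number of available tiles.
        return rng.sample(grid, min(num_tiles, len(grid)))
    if mode == "mask":
        if mask is None:
            raise ValueError("mode='mask' requires a mask")
        # Keep only the grid cells that the mask marks as True.
        return [(x, y) for (x, y) in grid if mask[y // tile_h][x // tile_w]]
    raise ValueError("unknown mode: %r" % (mode,))


origins = select_tile_origins((2048, 1024), (512, 512))
print(len(origins))  # 8 tiles: 4 columns x 2 rows
```

In this sketch the "random" and "mask" modes are both filters applied to the same regular grid, which is why each needs one extra argument beyond the tile size.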

Usage:

To use TissueLocator, pass the constructor the location of the slide to be processed and the size of the tiles to extract (i.e. tile_size). You then have two options for extracting patches:

  • ["extract_patches_and_save"]: Use this function if you would like to store the patches externally. By default .h5 files are generated but you can override this to save "numpy" or "jpg" files instead.
  • ["get_tissue_patches"]: Use this function if you would like the patches to be returned as a numpy array; useful for a pipeline in which you want to call this method multiple times.

Here is a very simple example of how to use the code:

    import extract

    # svsfile is the path to the .svs slide; save_location is where output is written
    einst = extract.TissueLocator(svsfile, tile_size=(512, 512), mode="random", num_tiles_per_slide=102)
    einst.extract_patches_and_save(out_location=save_location)

To read the tiles back in again, simply load the .h5 file as follows:

    import h5py

    # Read the 'x' dataset into memory; the file is closed on leaving the block
    with h5py.File('name of .h5 file', 'r') as meta:
        patches = meta['x'][:]

Using Tissue Finder code:

Functionality also exists for extracting regions containing tissue only. This is switched off by default. Set use_tissue_finder to True to enable this.

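Note: if you are extracting random patches which contain tissue, i.e. mode == "random" AND use_tissue_finder = True, then you must perform an extra step after constructing the object: retrieve the coordinates with get_coordinates_as_list and pass them to extract_patches_and_save as list_points. An example is given below.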

    # Construct the locator, then fetch the tile coordinates explicitly
    einst = extract.TissueLocator(filename, tile_size, mode="random", num_tiles_per_slide=num_patches, use_tissue_finder=True)
    extracted_points = einst.get_coordinates_as_list(dims=(512, 512))

    # Pass the coordinates to the extraction call via list_points
    einst.extract_patches_and_save(out_location=save_location, workers=1, list_points=extracted_points)
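
The idea of selecting coordinates first and sampling only from tissue can be illustrated without the library. The sketch below filters candidate points with a tissue test and then samples from the survivors. Both sample_tissue_points and the is_tissue predicate are hypothetical names, and this is not a description of what get_coordinates_as_list does internally.

```python
import random


def sample_tissue_points(candidates, is_tissue, num_points, seed=None):
    """Pick up to num_points candidates for which is_tissue(point) is true.

    Hypothetical helper; is_tissue stands in for whatever tissue test is used.
    """
    tissue_points = [p for p in candidates if is_tissue(p)]
    rng = random.Random(seed)
    # Fewer tissue points than requested: return all of them rather than fail.
    return rng.sample(tissue_points, min(num_points, len(tissue_points)))


# Toy example: pretend tissue occupies the left half of a 2048-pixel-wide slide.
candidates = [(x, y) for y in range(0, 1024, 512) for x in range(0, 2048, 512)]
points = sample_tissue_points(candidates, lambda p: p[0] < 1024, num_points=3, seed=0)
print(points)
```

Filtering before sampling guarantees that every returned point lies on tissue, at the cost of possibly returning fewer points than requested.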

Additional parameters:

  • [level]: To extract at full resolution, leave scale=1.0 (other scales have not been tested yet).
  • [offset]: If you don't want to extract tiles on a regular grid, set the step size for both the x and y directions.
  • [export_format]: format of images/files to be saved (default: "h5").
  • [workers]: defines how many parallel processes are used when saving externally. Increase this to extract patches faster.
  • [MAX_SAMPLE_PER_BATCH_FILE]: determines the maximum number of patches to be stored in a single .npz file. By default this is 500.
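
To make the effect of a step size and a per-file cap concrete, here is a small standalone sketch. The helpers grid_origins and split_into_batches are hypothetical and only mirror the behaviour described for offset and MAX_SAMPLE_PER_BATCH_FILE; how the library implements either parameter is not shown here.

```python
def grid_origins(slide_size, tile_size, step):
    """Top-left origins of tiles placed every `step` pixels in x and y."""
    slide_w, slide_h = slide_size
    tile_w, tile_h = tile_size
    step_x, step_y = step
    return [(x, y)
            for y in range(0, slide_h - tile_h + 1, step_y)
            for x in range(0, slide_w - tile_w + 1, step_x)]


def split_into_batches(items, max_per_file=500):
    """Split items into consecutive chunks of at most max_per_file entries."""
    return [items[i:i + max_per_file]
            for i in range(0, len(items), max_per_file)]


# A step smaller than the tile size gives overlapping tiles.
origins = grid_origins((4096, 4096), (512, 512), step=(256, 256))
batches = split_into_batches(origins, max_per_file=100)
# 225 origins, split into chunks of 100, 100 and 25
print(len(origins), [len(b) for b in batches])
```

A cap per file bounds how much has to be held in memory or rewritten at once, so only the last file is partially filled.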

Save in alternative formats:

If you don't wish to save the patches as .h5 files, there is an option to save them as numpy files instead. If you would like to implement your own method (e.g. to create images), override the save() function in the TileWorker class.
