
reconstrue / brightfield

Reconstruction of biocytin-stained neurons detected in brightfield microscopy image stacks

Home Page: http://reconstrue.com

License: Apache License 2.0

Languages: Jupyter Notebook 99.75%, Ruby 0.01%, Makefile 0.01%, TeX 0.01%, HTML 0.04%, CSS 0.15%, JavaScript 0.03%, Python 0.01%

Topics: microscopy, neuroscience, reconstruction

brightfield's Issues

Viz: depth coded MinIP

Novelty:

  • Depth coded brightfield stack (not fluorescent)
  • Turbo not Jet
  • ?
import reconstrue.brightfield.depth_coder
  • Colored Depth coding (grayscale in, z-axis rainbow out)
  • 2 projection views, one from each side of slide: stacked 1 to N, and stacked N to 1
  • Colormap to use should be a variable/dropdown
  • #74
  • Scale for eyes to map color to depth, axis labeled 0--[z_stack.depth]
  • Want this running in client-side JS for when rotating a volume

Algorithm:

  1. Start with the z-stack's 2D images, grayscale 8-bit
  2. Turn each image into color
  • All pixels in an image get set to the same RGB color
  • That RGB color comes from the z-index scaled/binned to 0-255, used to index into a matplotlib colormap
  • Input pixel intensity (0-255), inverted (255 - intensity), becomes each pixel's opacity, i.e. RGB => RGBA
  • Save those images to disk
  3. Now do a minimum intensity projection on the colorized z-stack, but it's actually a maximum opacity projection (keeping the color in the accumulator image).

Finally, just show the color accumulator image, with or without transparency (dunno).
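A minimal sketch of that algorithm, collapsed into one vectorized NumPy/matplotlib pass (the function name and array layout are assumptions for illustration; the real reconstrue.brightfield.depth_coder may differ):

import numpy as np
import matplotlib.pyplot as plt

def depth_coded_projection(stack, cmap_name="turbo"):
    """Depth-code a grayscale (z, y, x) uint8 stack into a 2D RGBA projection.

    Each slice gets one colormap color from its z-index; inverted intensity
    (255 - pixel) acts as opacity. Per (x, y) column the color of the most
    opaque (darkest) z-pixel wins, i.e. a maximum opacity projection.
    """
    z_depth = stack.shape[0]
    cmap = plt.get_cmap(cmap_name)                # e.g. Turbo (matplotlib >= 3.3), not Jet
    opacity = 255 - stack                         # darker stain => more opaque
    winner_z = opacity.argmax(axis=0)             # (y, x) index of most opaque slice
    rgba = cmap(winner_z / max(z_depth - 1, 1))   # color encodes depth
    rgba[..., 3] = opacity.max(axis=0) / 255.0    # keep inverse intensity as alpha
    return rgba                                   # (y, x, 4) floats in [0, 1]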

Bonus: take those images saved to disk (depth coded with inverse intensity as opacity), start with a pure white brightfield image, then merge that with each colored z-index image. That's what it would look like with light shining through but color-filtered by depth. Then animate that as a GIF/movie. Pair that with the depth-colored 2D projection mugshot.

Challenge response review notebook: 10 SWC juxtapositions

This issue describes a Jupyter notebook which is an image gallery of the ten neurons in the test set. The neuron images have SWCs superimposed upon them. There may be one or two SWCs per neuron (gold standards versus candidates). Note, this is an eyeballing tool; there is no programmatic quality evaluator that scores a candidate SWC against the corresponding gold standard.

If both SWCs are provided (gold standard and "other") then they can be juxtaposed by simultaneously rendering them in the same plot. This is how judges can review a submission.

If the gold standard SWC is not provided, then this same notebook is essentially how a computer vision researcher can evaluate a model's performance on the 10-specimen test set. For those ten specimens, this notebook visually overlays the skeleton atop the MinIP image (inverted TIFF stack).

The challenge is very specific: there are 10 neurons that a contestant needs to generate SWC files for. In its simplest form, a submission could really be just 10 SWC files in a folder on the web. Each of the SWCs will be named according to its neuron ID, e.g. 651806289.swc, and those 10 are:

  • 665856925.swc
  • 715953708.swc
  • 751017870.swc
  • 687730329.swc
  • 850675694.swc
  • 827413048.swc
  • 761936495.swc
  • 691311995.swc
  • 741428906.swc
  • 878858275.swc

But the human judges will want more than just 10 SWC files. They will want those submitted SWCs juxtaposed with the gold standard SWCs, so as to compare.

This notebook is simply told the folder's URL and it renders the 10 neuron stacks and 20 SWCs (10 from human, 10 from the submitted challenge folder).

The goal is to have a Jupyter notebook already run on some set of 10 SWC files in a folder. I.e. it's a pre-run notebook showing what the layout will look like when done properly.

This artifact is the template (a Jupyter notebook) that challengers fill out to submit their work to the challenge. That URL will be forwarded to the judges, one copy of the 10-SWC template notebook per submission.

So, the evaluation notebook simply repeats the same thing for each of the ten TEST_DATA_SET neurons:

  • show 2D projection of image stack
  • show 2-part mugshot of gold standard SWC
  • show 2-part mugshot of contestant's SWC

Basically, a contestant needs to submit a list of 10 URLs to SWC files, and this notebook should be able to lay out the results.
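A hedged sketch of how the notebook might turn the submitted folder URL into the 10 SWC URLs and loop the gallery (the rendering helpers named here are placeholders, not existing code):

from urllib.parse import urljoin

NEURON_IDS = [
    665856925, 715953708, 751017870, 687730329, 850675694,
    827413048, 761936495, 691311995, 741428906, 878858275,
]

def submission_swc_urls(folder_url):
    """Given the contestant's folder URL, return {neuron_id: SWC URL} for the 10 test neurons."""
    if not folder_url.endswith("/"):
        folder_url += "/"
    return {nid: urljoin(folder_url, f"{nid}.swc") for nid in NEURON_IDS}

# Gallery loop, per neuron: 2D projection, gold standard mugshot, contestant mugshot.
# render_minip and render_swc_mugshot are hypothetical notebook plotting functions.
# for nid, swc_url in submission_swc_urls("https://example.com/my-submission/").items():
#     render_minip(nid)
#     render_swc_mugshot(nid, gold_standard=True)
#     render_swc_mugshot(nid, swc_url)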

This also means that each of the ten SWCs can be generated sequentially on Colab, and then the ten SWCs viewed together finally, in this notebook.

Note: having the 10 neurons hardwired is bad. Want an arbitrary array of neurons. That way a computer vision coder can juxtapose a model run against the 105 specimens in the training set. I.e. it can be a tool during training.

Viz: MinIP colorized four-up

This is 4 images in a 2x2 Colab Grid:
[Screenshot, 2019-10-25: 2x2 Colab grid of colorized MinIPs]

See if that padding can be removed; it might be out of code control on Colab, dunno.

It should also be available as a matplotlib figure with 4 subplots. That would be a desirable plot/rendering.
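A minimal sketch of that matplotlib version (the colormap names are just examples; minip is assumed to be a 2D grayscale array already in memory):

import matplotlib.pyplot as plt

def minip_four_up(minip, cmaps=("gray", "viridis", "turbo", "magma")):
    """Show one MinIP four ways, in a 2x2 grid of subplots."""
    fig, axes = plt.subplots(2, 2, figsize=(8, 8))
    for ax, cmap in zip(axes.ravel(), cmaps):
        ax.imshow(minip, cmap=cmap)
        ax.set_title(cmap)
        ax.axis("off")
    fig.tight_layout()   # keeps padding between subplots tight
    return fig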

Viz: Render reconstructions in CCF space

In the end, the topmost view of the cells in the dataset would be to plot them together in The Allen's CCF i.e. position them in the brain, to whatever accuracy is possible. 10 test neurons would be ok, 105 training neurons might get a bit crowded/overplotted.

  • BrainRender seems like a good tool. It's not interactive but it can render purdy.

Method: SVM classify by z-pixel shadow

This SVM tech is some simple stuff that should be checked out before going full on deep and convolutional. SVM is what was state-of-the-art a decade ago before deep learning stole the spotlight.

Identification of individual cells from z-stacks of bright-field microscopy images

Really nice 2018 paper out of France. Simple, fast techniques (an SVM classifier) simply looking at "z-pixels" (i.e. the array of pixel intensities along the thin Z axis of a microscope slide; the colored curves). Basically, the SVM classifies based on the light absorption profile around a cell.
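A hedged scikit-learn sketch of that idea (feature extraction and labels are stand-ins, not the paper's actual preprocessing):

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def z_profiles(stack, coords):
    """stack: (z, y, x) array; coords: list of (y, x). Returns an (n, z) feature matrix."""
    return np.stack([stack[:, y, x] for y, x in coords])

def train_z_pixel_svm(stack, coords, labels):
    """Classify (y, x) locations as cell / not-cell from their z-pixel intensity profile."""
    X = z_profiles(stack, coords).astype(np.float32)
    X = (X - X.mean(axis=1, keepdims=True)) / (X.std(axis=1, keepdims=True) + 1e-6)
    X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.2)
    clf = SVC(kernel="rbf")
    clf.fit(X_train, y_train)
    print("held-out accuracy:", clf.score(X_test, y_test))
    return clf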

Viz: initial dataset visualizations

So, the Brightfield Challenge dataset is basically just a raw z-stack of 2D images (plus SWC files), each neuron's data in a separate directory. This project seeks to explore "doing neuroscience on Colab." So, crank out some code that does basic image processing on the image stack.

  • Basic TIFF view with pan and zoom, working elegantly on Colab (#34)
  • Animated GIF of stack: all images into a single GIF (#20)
  • Z-stack explorer: a slider for z-index to a single image (#22)
  • HTML5 movie: Folks can play the movie, or manipulate time to move up/down stack (#199)
  • Maximal Image Projection (MIP, Minimal technically, MinIP): 3D z-stack to 2D (#28)
  • MinIP colorized 4-up (#45)
  • Depth coded colorized MinIP (#32)

U-Net

Ideally want:

  • commercially friendly license
  • GPU
  • Colab-able
  • pre-trained models for transfer learning

See also

Freiburg

University of Freiburg seems to be the originator of U-Net:

TensorFlow as platform:

This deep neural network is implemented with Keras functional API, which makes it extremely easy to experiment with different interesting architectures... runs seamlessly on CPU and GPU.

Pytorch

PyTorch implementation of U-Net, R2U-Net, Attention U-Net, Attention R2U-Net:
https://github.com/LeeJunHyun/Image_Segmentation
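For scale, here is a hedged sketch of a tiny U-Net in the Keras functional API (far smaller than the published architecture, no pre-trained weights; layer sizes are arbitrary):

from tensorflow import keras
from tensorflow.keras import layers

def conv_block(x, filters):
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def tiny_unet(input_shape=(256, 256, 1)):
    """Two-level encoder/decoder with skip connections, outputting a neurite mask."""
    inputs = keras.Input(shape=input_shape)
    c1 = conv_block(inputs, 16)
    p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, 32)
    p2 = layers.MaxPooling2D()(c2)
    b = conv_block(p2, 64)                                      # bottleneck
    u2 = layers.Conv2DTranspose(32, 2, strides=2, padding="same")(b)
    c3 = conv_block(layers.concatenate([u2, c2]), 32)           # skip connection
    u1 = layers.Conv2DTranspose(16, 2, strides=2, padding="same")(c3)
    c4 = conv_block(layers.concatenate([u1, c1]), 16)           # skip connection
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)     # per-pixel foreground probability
    return keras.Model(inputs, outputs)

Something this small runs on Colab's free GPU without changes, which covers the "GPU" and "Colab-able" bullets; the license and pre-trained-model bullets depend on which existing implementation is adopted.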

Viz: Minimum Image Projection

This work is being done on Colab, datatset_get_eyes_on_image_stack.ipynb.

  • Basic MinIP (grayscale in, grayscale out)

Maximal Image Projection (MIP) is the basic 3D-z-stack-to-2D overview viz for microscopy slides. Technically it's "Minimal" here because brightFIELD: the stain is dark on a light background. This seems to be a pretty fundamental overview visualization of slide image stacks. The algorithm to make the 2D projection just sifts through the z-stack and, for each (x, y) column, keeps the lowest (darkest) pixel value for the 2D projection.
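A minimal NumPy sketch of that (assuming the stack is already loaded as a (z, y, x) array, e.g. via tifffile):

import numpy as np

def minip(stack):
    """Minimum intensity projection: darkest value down each (x, y) column of the z-stack."""
    return stack.min(axis=0)   # (y, x) 2D image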

Viz: Dataset Z-stack explorer, single image

This was implemented in datatset_get_eyes_on_image_stack.ipynb. It's just a slider which sets the z-index of the specific slide from the input dataset to show.

So, this project is recapitulating a bunch of basic microscopy software, on Colab. One of the first software utilities that every toolset makes is a Z-stack explorer:

  • A single image view
  • A draggable slider to move up/down the stack
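A hedged sketch of such an explorer with ipywidgets (assumes the stack is a (z, y, x) NumPy array already in memory; Colab may need widget support enabled):

import matplotlib.pyplot as plt
from ipywidgets import interact, IntSlider

def explore_stack(stack):
    """Single-image view plus a slider over the z-index."""
    def show(z):
        plt.figure(figsize=(6, 6))
        plt.imshow(stack[z], cmap="gray")
        plt.title(f"z = {z} / {stack.shape[0] - 1}")
        plt.axis("off")
        plt.show()
    interact(show, z=IntSlider(min=0, max=stack.shape[0] - 1, step=1, value=0))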

Viz: MinIP colorized as animated GIF

This would be relatively small compared to the animated GIFs already being produced (20 MB to 70 MB), and would be an interesting use of simple web tech to make a visualization.

Might want to have a transition color. This might best be done in JavaScript.

Viz: use skeleton to set crop bbox of MinIP

A full MinIP has to be provided for anyone who might want to wander around the slide. But usually the action all happens around the biocytin-stained cell. The MinIP cropped to the interesting part is a good cover image or mug shot for a specimens gallery catalog.

  1. Skeletons, by definition, provide a cropping bbox (see the sketch below).
  2. Simply doing low-level stats on the MinIP should be able to find the main ROI. This is great just for setting the initial crop view. This could lead to more pipeline automation (non-stop notebook "Run all"): process a specimen's stack and know how to crop for the mug shot.
  3. It's also a good warm-up for deep object recognition.
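A minimal sketch of option 1 (assuming the SWC x/y coordinates are already in the image's pixel space; in practice a scale/offset step may be needed):

import numpy as np

def swc_crop_bbox(swc_path, pad=50):
    """Bounding box (x0, x1, y0, y1) of an SWC skeleton, padded by `pad` pixels."""
    # Standard SWC columns: id, type, x, y, z, radius, parent
    nodes = np.loadtxt(swc_path, comments="#")
    xs, ys = nodes[:, 2], nodes[:, 3]
    return (int(xs.min()) - pad, int(xs.max()) + pad,
            int(ys.min()) - pad, int(ys.max()) + pad)

# x0, x1, y0, y1 = swc_crop_bbox("651806289.swc")
# mug_shot = minip_image[y0:y1, x0:x1]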

For animation optimization:

Don't need to animate outside the ROI. So show the MinIP as background and have an animated GIF of just the ROI, laid out to align with the MinIP background PNG.

ShuTu SWC generator: make smooth Colab deploy

This is done in brightfield_neuron_swc_by_shutu.ipynb, which essentially translates ShuTu's build.sh into a Colab code cell, thereby customizing the build for Colab.

On 2019-10-12, it was shown that ShuTu can be installed on Colab. But the stock install instructions raise errors on Colab: harmless yet confusing errors.

It would be better to rewrite the build.sh as Colab code s.t. there are no errors. It's pretty simple: basically calling make twice. Also, there is no need to install the demo data because we will be using Allen Institute data.

ShuTu reconstruct a single Allen neuron

Latest progress is on Colab: ShuTu process Allen.

To Do:

  • Test if ShuTu runs on Colab (#4)
  • Demo ShuTu working on its own demo data (#6)
  • Pick a single small neuron to reconstruct: 651806289 is 6GB (#7)
  • Test downloading dataset from Wasabi to Colab
  • Package Allen data for ShuTu to read (#13)
  • Have ShuTu do its thing on 651806289

ShuTu: analyzeSWC.py

ShuTu tutorial says:

To get basic statistics of the neuron after scaling, use the Python script analyzeSWC.py. The command is python analyzeSWC.py scaledSwcFilename

Might as well; CLIs work on Colab, so it's something to do after SWC generation.

Vaa3D: repro whatever pipeline they use inside the Allen

Allen Institute's Morphology technical white paper:

Serial images (63X magnification) through biocytin-filled neurons were evaluated for quality, and cells that passed a quality threshold entered a detailed morphological analysis workflow. Reconstructions of cell dendrites and the initial segment (spiny neurons) or complete axon (aspiny neurons) were generated for a subset of neurons using a 3D Visualization-Assisted Analysis (Vaa3D) workflow. The automated 3D reconstruction results were then manually curated using the Mozak extension of Vaa3D.

Whelp, can Vaa3D be spun-up somehow on Colab?

Judging: implement as Google Form

In the end, a submission to the challenge consists of 10 SWC files, one for each of the 10 neurons in TEST_DATA_SET. That's the result of a lot of work but not that much data.

It would be interesting to see how well Google Forms could be used to actually perform an evaluation. This would dump the evaluation answers (quality score ranging from 1 to 10) directly into a Google Sheet spreadsheet.

This would be a low-overhead way to administer a challenge and its evaluation.

Dataset: generate image stack as animated gif

Each neuron has an image stack.

  • The whole image stack is on the order of 6 to 60 GB
  • Each stack has a few hundred images
  • Each image file is on the order of 10s of MB each

So, how about "compressing" the image stack into an animated GIF file of no more than a few MB, for easy viewing over the web? Looks like ImageJ already does this. That's Java; it would be nice to find a Python implementation.
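A hedged Python sketch using tifffile + imageio (the downsample factor and frame duration are guesses aimed at keeping the GIF to a few MB):

import imageio.v2 as imageio
import tifffile

def stack_to_gif(tif_paths, gif_path, downsample=8, duration=0.05):
    """Read the slide TIFFs, crudely downsample, and write one looping animated GIF."""
    frames = []
    for path in sorted(tif_paths):
        img = tifffile.imread(path)
        frames.append(img[::downsample, ::downsample])   # stride-based spatial downsample
    imageio.mimsave(gif_path, frames, duration=duration, loop=0)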

Dataset: formulate plan of attack

This work is implemented in a Jupyter notebook, challenge_dataset.ipynb.

  • The bucket where the dataset is stored is called brightfield-auto-reconstruction-competition.
  • Total dataset is about 2 terabytes of data.
  • Each neuron's data is between ~6GB and ~60GB
  • There is no index file in each neuron's data directory, just:
    • one SWC (except TEST_DATA_SET)
    • a series of .tif files

Google Colab's allocated file system is 50 GB, so pulling down one neuron at a time will probably work.
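A hedged sketch of walking the bucket with boto3 (the Wasabi endpoint and anonymous access are assumptions; credentials or the endpoint URL may need adjusting):

import boto3
from botocore import UNSIGNED
from botocore.config import Config

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.wasabisys.com",    # Wasabi's S3-compatible endpoint (assumed)
    config=Config(signature_version=UNSIGNED),  # assumes the bucket allows anonymous reads
)

def list_specimen_files(prefix):
    """List keys (the SWC plus the .tif series) under one neuron's directory prefix."""
    keys = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket="brightfield-auto-reconstruction-competition",
                                   Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys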

Viz: histogram the whole stack

If there were a simple matplotlib (or fancier) histogram of the grayscale pixels, that might enable some intensity filtering a la brightness/contrast tweaking.
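A minimal sketch (assumes the whole stack fits in RAM as a (z, y, x) uint8 array; 256 bins for 8-bit grayscale):

import matplotlib.pyplot as plt

def stack_histogram(stack):
    """Histogram every pixel in the z-stack, log-scaled so the sparse dark stain stays visible."""
    plt.figure(figsize=(8, 3))
    plt.hist(stack.ravel(), bins=256, range=(0, 255), log=True)
    plt.xlabel("pixel intensity")
    plt.ylabel("count (log)")
    plt.show()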

GPU: Intensity projections: quintessential candidate for Numba speed-up

So, The Allen (and the rest of the scientific computing community(ies)) have decided that Python is the main language for users of the Allen stack. Of course Python would not have won out if not for the fact that mathematically intense Python code can be accelerated. Numba is the latest, greatest way to accelerate Python.

Part of the Reconstrue message is that neuroscience can really benefit from deploying to the cloud.

  • Colab is Jupyter on the cloud for free.
  • Colab has free Nvidia T4 GPUs available
  • Numba can compile to CUDA.
  • Ergo, this neuroscience-on-colab demo should use Numba

One of the very simplest things to Numba-ify would be intensity projections of 3D image stacks to 2D. This involves marching down the z-index one 2D image at a time, keeping the highest/lowest value for each "z-pixel" in a composite image/array.
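A hedged sketch of that loop under Numba's @njit (CPU JIT here; numba.cuda could take the same idea onto the T4 later):

import numpy as np
from numba import njit

@njit
def minip_numba(stack):
    """Minimum intensity projection of a (z, y, x) uint8 stack, written as explicit loops."""
    z, h, w = stack.shape
    out = stack[0].copy()
    for k in range(1, z):
        for i in range(h):
            for j in range(w):
                v = stack[k, i, j]
                if v < out[i, j]:
                    out[i, j] = v
    return out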

Dataset: migrate 2.5TB to Google (GCS)

Consider

  • Google was a sponsor of BioImage 2019
  • The dataset resides on Wasabi, which mimics AWS's S3's APIs
  • This project implements all code as Python on Colab
  • The dataset would load quicker on Colab if the dataset were in Google Cloud Storage (GCS)

That's a pretty legitimate use of Google subsidized research credits.

Docs: Leveraging the TPU/GPUs available on Colab

On Colab, you have to explicitly request a GPU. As of late 2019, three models have been seen recently:

  • Tesla P100
  • Tesla T4
  • Tesla K80

The first two work with Rapids, the K80 does not. Let's make sure we have one that works with Rapids.
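A quick way to check which GPU the runtime got (assumes a GPU runtime, where nvidia-smi is on the PATH):

import subprocess

# RAPIDS needs Pascal or newer, so P100/T4 are fine and the K80 is not;
# if a K80 shows up, factory-reset the runtime and try again.
gpu_name = subprocess.run(
    ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
    capture_output=True, text=True,
).stdout.strip()
print(gpu_name or "No GPU allocated")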

In the first pass of getting ShuTu working on Colab, there is no focus on the Nvidia T4 GPU that is made freely available on Colab. ShuTu is not written to leverage GPUs, rather it uses multiple CPU cores, of which there is only one on Colab (cue sad trombone).

Nonetheless, the whole imaging pipeline should run on RAPIDS and CUDA etc. Just chuck the image stack onto the GPU and out comes a SWC. Yup, just that simple :)

Viz: Dataset multi-scale renderings optimized for web

The original image stacks are 6 GB to ~60 GB. That's too big for the web and to cram into laptop GPUs. But if the resolution is taken down by half in each dimension, that's 1/8 the original size. The smallest datasets would then already be under 1 GB.

Basically, want whole 3D stack at some resolution being manipulated by WebGL.

  • max pixel, above which not shown
  • shadow effects?
  • smoothing anisotropic layers

A much slicker implementation would be to translate the image stack into a Neuroglancer precomputed file set. The precomputed rendering is just flat files for 3D volumetric rendering over the web. That's about as optimized as it's going to get.
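A minimal sketch of one halving step (2x2x2 block averaging; a real multi-scale pyramid, e.g. Neuroglancer precomputed, would repeat this per level and also chunk the output):

import numpy as np

def downsample_by_two(stack):
    """Halve a (z, y, x) stack in every dimension, roughly 1/8 the bytes."""
    z, y, x = (d - d % 2 for d in stack.shape)   # trim odd edges
    s = stack[:z, :y, :x].astype(np.float32)
    s = s.reshape(z // 2, 2, y // 2, 2, x // 2, 2).mean(axis=(1, 3, 5))
    return s.astype(stack.dtype)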

Dataset: specimens_manifest.json

The code for generating specimens_manifest.json is in dataset_manifest.ipynb.

The manifest is a JSON file. The cache is a Python object for downloading files in the manifest and caching them on the Colab local file system. This is handy for re-running notebooks after frequent disconnects.

Specimen features

  • ID/name
  • Bucket prefix
  • array of file names in image stack
  • SWC filename

The cache presents the 115 specimens in a unified way:

  • specimens.all
  • specimens.testing
  • specimens.training.

These are just objects in the manifest.json.
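For illustration only, one specimen's entry in the manifest might look something like this (the field names and file naming are assumptions, not the notebook's actual schema):

specimen_entry = {
    "specimen_id": "651806289",
    "bucket_prefix": "651806289/",
    "image_stack": ["651806289_0001.tif", "651806289_0002.tif"],  # illustrative names; hundreds of files in reality
    "swc": "651806289.swc",   # absent for TEST_DATA_SET specimens
    "dataset": "training",    # or "testing"
}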

[These ideas are for later in the game, not the early "just make it work" phase of this project.]

Once this is in a module, could check if manifest file exists and if not then crawl the dataset bucket.

The Allen is big on NWB:N, this dataset seems like a perfect use of NWB file. This project uses Python as the main programming language and there is PyNWB for working with NWB files. Additionally, the AllenSDK works with NWB files.

On the other hand, having the files loose in FS does make it clear what's in the challenge dataset.

Actually, the sweet spot might be Exdir: HDF5 structure but just file system and .yaml files.

Physical layout of file system

Perhaps we could reorg the dataset. It would be nice to have sub-directories labeled:

Package code as module

Great 2015 talk: Python Packaging from Init to Deploy (26 min)

A while-deving, pre-module hack/mocking: If "modules are a thin wrapper around dictionaries" then mock a module and hang all the utilities off it.

I guess the name of the module is used like:

from reconstrue.brightfield import colab_display

Since using allensdk requires a !pip install allensdk, we're going to need to do some pip-ing anyway. So, might as well package up a bunch of the code as a git repo on GitHub and just pip install from the GitHub URL. This will simplify the notebooks, and should make it easier for others to reimplement.

  • A minimal setup.py (e.g.)
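A hedged sketch of what that minimal setup.py could look like (metadata and dependencies are guesses, not the repo's actual configuration); with something like this at the repo root, notebooks could pip install straight from the GitHub URL:

# setup.py
from setuptools import setup, find_packages

setup(
    name="reconstrue",
    version="0.0.1",
    packages=find_packages(),                    # picks up reconstrue.brightfield, etc.
    install_requires=["numpy", "matplotlib"],    # assumed dependencies
    license="Apache License 2.0",
)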

TReMAP:

TReMAP, from out of the Allen, seems to be somewhat similar to ShuTu in that both project to 2D before tracing and then map back to 3D space.

CellTypes_Morph_Overview.pdf

After multi-scale enhancement, an image was then ready for automatic 3D reconstruction. Automatic 3D neuron reconstruction for very large 3D images remains a challenge in neuroscience. For these datasets, an automatic 3D neuron tracing method called TReMAP was developed (Zhou et al., 2016). The key advancement of TReMAP was to utilize 3D Virtual Finger (a reverse-mapping technique) to detect 3D neuron structures based on tracing results on 2D projection planes. TReMAP was used for all data generated after 07/21/2016, while NeuronCrawler was used prior to that date (Zhou et al., 2015).
Tracing the 2D Enhanced Plane
For each labeled group in the 2D enhanced image, 2D projection trees were traced. TReMAP uses All-PathPruning 2 (APP2) as the basic tracing module, as APP2 has been shown to be relatively fast and accurate based on pruning a dense initial reconstruction of a neuron to generate a compact representation of the neuron (Peng et al., 2011; Xiao and Peng, 2013).

SWC skeleton viewers on Colab

Goal is to view SWCs on Colab.

[Viewing in desktop apps is nice but this project has an explicit goal of realizing all UI in Jupyter. But the ShuTu desktop app runs on Windows, macOS, Linux – if one wishes to go desktop-app old-school.]

Various potential types of tools, simplest first:

Note: The gold standard SWCs distinguish between axon and dendrites. What of somas?

Styling

See #62.

Initially, following the style of the paper handout from the first Challenge meeting, the viewer should render:

  • Dendrites: blue
  • Axon: red
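A minimal matplotlib sketch along those lines (it scatters node positions rather than drawing parent-child segments, which a real viewer would do; type codes follow the common SWC convention):

import numpy as np
import matplotlib.pyplot as plt

# Standard SWC columns: id, type, x, y, z, radius, parent.
# Common type codes: 1 soma, 2 axon, 3 basal dendrite, 4 apical dendrite.
TYPE_COLORS = {1: "black", 2: "red", 3: "blue", 4: "blue"}

def plot_swc(swc_path, ax=None):
    """XY scatter of an SWC skeleton: axon red, dendrites blue, soma black."""
    nodes = np.loadtxt(swc_path, comments="#")
    ax = ax or plt.gca()
    for t, color in TYPE_COLORS.items():
        sel = nodes[nodes[:, 1] == t]
        ax.scatter(sel[:, 2], sel[:, 3], s=1, c=color)
    ax.set_aspect("equal")
    ax.invert_yaxis()   # image-style coordinates: y grows downward
    return ax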

Various existing skeleton viewer codebases:

Viz: Twitter friendly exports

This work was done in initial_dataset_vizualization.ipynb.

  • Find out max size 5MB or 15MB?
  • Get an animated gif going
  • A movie is valuable on Twitter because the timeline slider acts as a z-stack navigation control.

Note that Twitter says:

What are the size and file type requirements?

Photos can be up to 5MB; animated GIFs can be up to 5MB on mobile, and up to 15MB on web.
We accept GIF, JPEG, and PNG files.
We DO NOT accept BMP, TIFF or other file formats.
Your photo will be automatically scaled for display in your expanded Tweet and in your gallery.

Web catalog, artifacts_manifest.json: each specimen dir has one

There should be a script that walks a built/ dirtree and manifests all files, dumping the info to artifacts_manifest.json. A catalog of specimens would have one artifacts_manifest.json per specimen directory. An HTML page with a specimens_manifest.json could then walk the catalog: just a dropdown to select a specimen, and all resources from that dir are loaded into the page.
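A hedged sketch of that manifest-writing script (the built/ layout and any fields beyond the file list are assumptions):

import json
from pathlib import Path

def write_artifacts_manifest(specimen_dir):
    """Walk one specimen's directory and dump its file list to artifacts_manifest.json."""
    specimen_dir = Path(specimen_dir)
    files = sorted(str(p.relative_to(specimen_dir))
                   for p in specimen_dir.rglob("*") if p.is_file())
    manifest = {"specimen": specimen_dir.name, "files": files}
    (specimen_dir / "artifacts_manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest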

Part of the web catalog is simply the downsampled slides. So there are relative links to those. But the originals could be behind an absolute URL to another site. That site could be Wasabi, but it's not configured for that, currently. That's 2.5 TB of storage.

Viz: image stack as animated GIF

Demo: datatset_get_eyes_on_image_stack.ipynb

To do

  • Get minimally working
  • MinIP as slide 0 i.e. the preview. What people see before animation. Use libgif
  • MinIP as last slide. This way after playing, MinIP is left on the screen.
  • Get Colab to play animation.
  • Add progress bar to bottom of each image, showing depth of stack

Goal, animated GIF of stack i.e. all images into a single GIF. This is an initial "eyes on data" thing, which does not involve any automated reconstructions, just visualizing the input dataset.

The Brightfield Challenge dataset is basically just a raw z-stack of 2D images, plus SWC files. This project seeks to explore "doing neuroscience on Colab." So, crank out some code that does basic image processing on the image stack

