
bioio-devs / bioio


Image reading, metadata management, and image writing for Microscopy images in Python

Home Page: https://bioio-devs.github.io/bioio/OVERVIEW.html

License: BSD 3-Clause "New" or "Revised" License

dask image-metadata microscopy python scientific-computing scientific-formats xarray aicsimageio

bioio's Introduction

BioIO


Image Reading, Metadata Conversion, and Image Writing for Microscopy Images in Pure Python


Documentation

See the full documentation on our GitHub pages site

Example Usage (see full documentation for more examples)

Install bioio alongside OME TIFF and OME ZARR plug-ins with pip (this example won't use the OME ZARR plug-in):

pip install bioio bioio-ome-tiff bioio-ome-zarr

from bioio import BioImage

# Get a BioImage object
img = BioImage("my_file.tiff")  # selects the first scene found
img.data  # returns 5D TCZYX numpy array
img.xarray_data  # returns 5D TCZYX xarray data array backed by numpy
img.dims  # returns a Dimensions object
img.dims.order  # returns string "TCZYX"
img.dims.X  # returns size of X dimension
img.shape  # returns tuple of dimension sizes in TCZYX order
img.get_image_data("CZYX", T=0)  # returns 4D CZYX numpy array

# Get the id of the current operating scene
img.current_scene

# Get a list of valid scene ids
img.scenes

# Change scene using name
img.set_scene("Image:1")
# Or by scene index
img.set_scene(1)

# Use the same operations on a different scene
# ...

Plug-in Registry

BioIO handles a variety of image types through format-specific plug-ins. The plug-ins supported by bioio-devs are listed in this registry.

Plug-in               Extension            Repository
arraylike             ArrayLike            Built-In
bioio-czi             .czi                 Repo
bioio-dv              .dv, .r3d            Repo
bioio-imageio         .jpg, .png, Full List  Repo
bioio-lif             .lif                 Repo
bioio-nd2             .nd2                 Repo
bioio-ome-tiff        .ome.tiff, .tiff     Repo
bioio-ome-tiled-tiff  .tiles.ome.tif       Repo
bioio-ome-zarr        .zarr                Repo
bioio-sldy            .sldy, .dir          Repo
bioio-tifffile        .tif, .tiff          Repo
bioio-tiff-glob       .tiff (glob)         Repo
bioio-bioformats      Full List            Repo

Each reader plugin should closely follow the specification laid out in bioio-base. As such, reader plugins will commonly not distribute their own documentation, and users should instead review bioio_base.reader.Reader for API documentation of the underlying Reader API. We encourage plugin authors to publish their own documentation if they change or add features in their published image readers.

Issues

Click here to view all open issues in bioio-devs organization at once

bioio's People

Contributors

brianwhitneyai, dependabot[bot], evamaxfield, seanleroy, toloudis


Forkers

armavica

bioio's Issues

Read from public `s3://` paths without authentication, without requiring any code change from users

Feature Description

If a file is hosted publicly on S3, a user without AWS credentials set up must pass fs_kwargs: BioImage("s3://bucketname/path/to/file", fs_kwargs=dict(anon=True)). (I'm thinking specifically about OME ZARRs, but this is likely relevant to all readers.)

Instead, bioio should be able to handle this internally and let the user write BioImage("s3://bucketname/path/to/file").

Solution

As far as I can tell, the proper way to check if a user is authenticated to read a file is to attempt to read it and see if there's an error, so the solution I think is to try to read files twice with logic similar to the following.

try:
   # __init__ with user's fs_kwargs
except SomethingSpecific as e:
   if protocol == "s3://":
       # __init__ with user's fs_kwargs plus {anon: True}
   else:
       raise e
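
A minimal sketch of the retry logic above, not bioio's implementation. The helper name open_with_anon_fallback, the injected opener callable, and the use of PermissionError as the "SomethingSpecific" exception are all assumptions for illustration; the exception a given reader actually raises on missing credentials would need to be confirmed.

```python
from typing import Any, Callable, Dict, Optional, Tuple, Type


def open_with_anon_fallback(
    path: str,
    opener: Callable[..., Any],
    fs_kwargs: Optional[Dict[str, Any]] = None,
    retry_on: Tuple[Type[BaseException], ...] = (PermissionError,),
) -> Any:
    """Try the user's fs_kwargs first; retry s3:// paths once with anon=True.

    `opener` is a hypothetical stand-in for whatever constructs the reader.
    """
    user_kwargs = dict(fs_kwargs or {})
    try:
        return opener(path, fs_kwargs=user_kwargs)
    except retry_on:
        # Only retry for S3 paths, and only if anonymous access
        # was not already requested by the user.
        if not path.startswith("s3://") or user_kwargs.get("anon"):
            raise
        return opener(path, fs_kwargs={**user_kwargs, "anon": True})
```

The user's own fs_kwargs are always tried first, so authenticated users keep their current behavior and only a failed S3 read triggers the anonymous retry.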

Alternatives

It looks like this was tried in s3fs but was later reverted: “unfortunately, it led to far more problems than it solved. I’d be happy to see a more solid implementation, if some wants to try.”

Add AICSImage

Feature Description

Add AICSImage from aicsimageio. Need a new name though... maybe BioImage?

Full documentation

Feature Description

RTFM

Use Case

We want to maintain at a minimum the level of documentation standards begun in aicsimageio.

Solution

Bring docs over from aicsimageio, rewrite as necessary. First version, probably the api/examples will be similar but the installation will be different.

Alternatives

Rewrite from scratch?

can't instantiate BioImage in v1.0

Describe the Bug

Exception has occurred: TypeError
Can't instantiate abstract class BioImage with abstract methods current_resolution_level, resolution_levels, set_resolution_level
  File "D:\src\aics\cellbrowser-tools\cellbrowser_tools\bin\make_zarr_timeseries_segs.py", line 236, in <module>
    im = BioImage(filepath, reader=CziReader)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Can't instantiate abstract class BioImage with abstract methods current_resolution_level, resolution_levels, set_resolution_level

Expected Behavior

class BioImage should instantiate without error

Reproduction

im = BioImage(filepath, reader=CziReader)

Environment

  • bioio Version: 1.0
  • bioio-base Version: 1.0
  • bioio-czi Version: 1.0

bioio-tifffile and bioio-ome-tiff parse ome-tiff metadata differently

Describe the Bug

bioio-tifffile.Reader does not properly parse ome-tiff metadata, but bioio-ome-tiff does. No notice is given to the user about which reader is being used, and because bioio-tifffile is required for bioio-ome-tiff to work (see: bioio-devs/bioio-ome-tiff#11), the result is bad default behavior where bioio-tifffile is used as the default reader.

Expected Behavior

I would expect bioio_ome_tiff to be the preferred reader if the given file is an ome-tiff. If it is not the default reader, I would also expect bioio-tifffile to parse the metadata correctly.

Reproduction

from bioio import BioImage
import bioio_ome_tiff
import bioio_tifffile

img = BioImage(
    img_path,
)

print(img.metadata)
print(f'channels: {img.channel_names}')
print(f'pixel sizes: {img.physical_pixel_sizes}')

returns:
<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openmicroscopy.org/Schemas/OME/2016-06 http://www.openmicroscopy.org/Schemas/OME/2016-06/ome.xsd" Creator="aicsimageio 4.11.0">
  <Image ID="Image:0">
    <Pixels ID="Pixels:0:0" DimensionOrder="XYZCT" Type="uint16" SizeX="1676" SizeY="1694" SizeZ="1" SizeC="2" SizeT="1" PhysicalSizeX="0.18097556" PhysicalSizeY="0.18097556">
      <Channel ID="Channel:0:0" Name="AF546" SamplesPerPixel="1"/>
      <Channel ID="Channel:0:1" Name="Oblique" SamplesPerPixel="1"/>
      <TiffData IFD="0" PlaneCount="2"/>
    </Pixels>
  </Image>
</OME>
channels: ['Channel:0:0', 'Channel:0:1']
pixel sizes: PhysicalPixelSizes(Z=None, Y=1.0, X=1.0)

Exactly the same behavior is reproduced when the tifffile reader is passed explicitly:

img = BioImage(
    img_path,
    reader=bioio_tifffile.Reader
)

But the output is very different with the ome_tiff reader:

img = BioImage(
    img_path,
    reader=bioio_ome_tiff.Reader
)

print(img.metadata)
print(img.metadata.images)
print(f'channels: {img.channel_names}')
print(f'pixel sizes: {img.physical_pixel_sizes}')

returns:

images=[<1 field_type>] creator='aicsimageio 4.11.0'
[Image(
   id='Image:0',
   pixels={'channels': [{'id': 'Channel:0:0', 'name': 'AF546', 'samples_per_pixel': 1}, {'id': 'Channel:0:1', 'name': 'Oblique', 'samples_per_pixel': 1}], 'tiff_data_blocks': [{'plane_count': 2}], 'id': 'Pixels:0:0', 'dimension_order': <Pixels_DimensionOrder.XYZCT: 'XYZCT'>, 'type': <PixelType.UINT16: 'uint16'>, 'size_x': 1676, 'size_y': 1694, 'size_z': 1, 'size_c': 2, 'size_t': 1, 'physical_size_x': 0.18097556, 'physical_size_y': 0.18097556},
)]
channels: ['AF546', 'Oblique']
pixel sizes: PhysicalPixelSizes(Z=None, Y=0.18097556, X=0.18097556)

Environment

  • OS Version: Windows 11
  • bioio Version: 1.0.1
  • bioio-tifffile Version: 1.0.0
  • bioio-ome-tiff Version: 1.0.0

Admin: update pre-commit to `ruff`

From: #48 (comment)

Just as a general note, this is all fine but ruff does all of these in one except for black. My pre-commit configs are now: black, ruff, mypy

In short, ruff is a new-ish tool for linting, formatting, import sorting, etc. It pulls the standards from existing tools into a single tool and is much faster at processing a whole repo's worth of changes.

See here for an example of my current pre-commit config: https://github.com/evamaxfield/rs-graph/blob/main/.pre-commit-config.yaml

Ignoring the notebooks section and the mypy extended types installs, I generally think this is the minimum that pre-commit needs to be set up with, now that Ruff does so much of the work.

Add new ome-zarr-writer

Feature Description

We have a new ome-zarr-writer implementation that works much better for converting our large time series data. We'd like to make that a part of bioio now.

Use Case

We have lots of data to convert to ome-zarr. The current ome zarr writer is unstable and doesn't work well on large data.

Solution

Clean up the code from here:
https://github.com/allen-cell-animated/cellbrowser-tools/blob/nucmorph/cellbrowser_tools/ome_zarr_writer.py
and add it to this repo.
Add documentation as needed.

Question: How should we deprecate the old zarr writer that already exists in here?

Alternatives

Centralized guess dim order

Just curious. I think this may be the third place we have this code? (Base) TiffReader, this one, and somewhere else? Should we centralize this into a standard function called `guess_dims_from_shape` that all of these different Readers call?

Originally posted by @evamaxfield in #47 (comment)
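
A minimal sketch of what such a shared helper could look like. The heuristic shown (take the trailing dimensions of a default "TCZYX" order) and the default_order parameter are assumptions for illustration; the logic currently duplicated across the readers may differ.

```python
from typing import Sequence

# Assumed default order, matching the "TCZYX" order shown in the README.
DEFAULT_DIMENSION_ORDER = "TCZYX"


def guess_dims_from_shape(
    shape: Sequence[int], default_order: str = DEFAULT_DIMENSION_ORDER
) -> str:
    """Guess a dimension string by taking the trailing dims of the default order."""
    ndim = len(shape)
    if ndim == 0 or ndim > len(default_order):
        raise ValueError(f"Cannot guess dims for shape {tuple(shape)}")
    # A 2D array is treated as "YX", a 3D array as "ZYX", and so on.
    return default_order[-ndim:]
```

With a single function like this in a shared package, each Reader would call it instead of carrying its own copy.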

Unable to open czi line-scan kymographs - missing raw frame(t) pixel data

First, thanks for working on this critical package. I understand and appreciate how complex opening proprietary binary file formats can be. I followed the package from aicsimageio to its new home at bioio and am excited to get this working!

Describe the Bug

I would like to open 'line scan' czi files using bioio. When I do this, some image data is missing: the result does not include the correct number of frames(t). In both cases below, I get only one (1) frame, whereas I expect 1000 and 10000 in my two example files (see below).

Background. Line scans are where you perform, well, a line scan over and over, with something like 1000 or 10000 repeated scans of a line. You then build a 2D image from all the pixel data, where each line scan is one row in the image. The result is referred to as a kymograph or space/time image. For example, each line scan might be 1024 pixels and it is repeatedly acquired 1000 times. The kymograph built from all the raw pixel data then has a pixel shape of 1000 x 1024 (or vice versa).

Problem is, when I open with bioio, I am only getting one line-scan, e.g. one frame(t), rather than the expected 1000 or 10000.

Expected Behavior

I would expect all the raw pixel data to be available, and it is not. It seems to be missing frames(t), with 1 when I expect 1000 or 10000.

Reproduction

Here are two sample files

  • CO2.czi with 512 pixels per line and 10000 repeated line scans.
  • Image 14.czi with 1024 pixels per line and 1000 repeated line scans.

In this example, bioio got the pixels per line correct (e.g. 1024) but there is no raw pixel data corresponding to the repeated line scans. In all cases frames(T) is just 1. I expect some dimension to represent the repeated line scans, e.g. 1000.

import os
from bioio import BioImage
#from aicspylibczi import CziFile  # tried this too but same bug/behavior

path = 'path/to/my/example/file/Image 14.czi'
print(f'loading {os.path.split(path)[1]}')

img = BioImage(path)

print('img.data.shape:', img.data.shape)  # (1, 2, 1, 1, 1024)

print('=== xarray_data:')
print(img.xarray_data)  # <xarray.DataArray (T: 1, C: 2, Z: 1, Y: 1, X: 1024)>

print('dims:', img.dims)  # <Dimensions [T: 1, C: 2, Z: 1, Y: 1, X: 1024]>
print('dims.order:', img.dims.order)  # TCZYX
print('dims.X:', img.dims.X)  # 1024
print('dims.Y:', img.dims.Y)  # 1
print('dims.T:', img.dims.T)  # 1
print('img.shape:', img.shape)  # (1, 2, 1, 1, 1024)

# returns the X dimension pixel size as found in the metadata
print('physical_pixel_sizes.X:', img.physical_pixel_sizes.X) 

Results in this output

loading Image 14.czi
img.data.shape: (1, 2, 1, 1, 1024)
=== xarray_data:
<xarray.DataArray (T: 1, C: 2, Z: 1, Y: 1, X: 1024)>
array([[[[[ 0,  0,  0, ...,  0,  0,  0]]],
        [[[39, 66,  0, ...,  0,  0,  0]]]]], dtype=uint8)
Coordinates:
  * C        (C) <U6 'NDD R1' 'NDD R4'
  * Z        (Z) float64 0.0
  * Y        (Y) float64 0.0
  * X        (X) float64 0.0 0.06599 0.132 0.198 ... 67.31 67.37 67.44 67.5
Dimensions without coordinates: T
Attributes:
    unprocessed:  <Element 'ImageDocument' at 0x14fe8ae80>
dims: <Dimensions [T: 1, C: 2, Z: 1, Y: 1, X: 1024]>
dims.order: TCZYX
dims.X: 1024
dims.Y: 1
dims.T: 1
img.shape: (1, 2, 1, 1, 1024)

physical_pixel_sizes.X: 0.06598669529504321

Ideas

Idea 1) There are recent updates from Zeiss on a different/newer package called pylibCZIrw

I see bioio is using aicspylibczi which is a fork of elhuhdron/pylibczi.

I also see another version of pylibczi coming directly from Zeiss (no GitHub), but the source can be downloaded from PyPI as pylibCZIrw. This has very recent updates (Dec 4, 2023). I am not able to install it because I haven't been able to build the wheels on a macOS machine with an Apple silicon (M2) CPU (afaik). The email contact there is [email protected].

Idea 2) I can open these czi files using Fiji/Bio-Formats importer and get all the pixel data

Fiji environment is

Fiji: 2.14.0/1.54f (2023-07-07)
Bio-Formats Plugins for ImageJ 7.0.1 (2023-10-16)

File image 14.czi
Each line (image) is (1024, 1)
Channels (c): 2
Slices(z): 1
Frames(t): 1000
physical pixel size: 0.066

Manually saving as tif and opening with tifffile yields a good shape of (1000, 1, 1024)

File CO2.czi

Each line (image) is (512,1)
Channels (c): 2
Slices(z): 1
Frames(t): 10000
physical pixel size: 0.0175789

Manually saving as tif and opening with tifffile yields a good shape of (10000, 1, 512)

Environment

macOS 13.3.1 (a)
Python 3.11.5
bioio 1.0.1.dev2+gf594aa5
bioio-base 1.0.1.dev1+g1003511
bioio-czi 1.0.1.dev0+gcb2ed27.d20231212

p.s. I am actually using the bioio branch `fix/resolution-level` from a recent issue and pull request.

Migration document from aicsimageio

Feature Description

Create a migration document that shows a recipe for switching over from aicsimageio for people who are ready to do so.

Use Case

We wish to encourage users to switch to the new system. Such a document would also allow us to easily see where the difficult parts of this migration are.

Solution

1. Install separate reader packages.
2. Update import statements and replace AICSImage with BioImage.
3. Profit! 

Reader-specific documentation can be hard to discover

Feature Description

The bioio repository should point to/link to/reference/copy documentation for the readers.

Use Case

The bioio-ome-zarr reader has the following documentation in its README.md:

If using an s3:// path to access a public S3 bucket, the BioImage constructor must be given a dictionary with anon: True in the fs_kwargs argument.

This may be relevant to many readers, but consistency and coordination across all readers is challenging due to the many repositories.

Solution

  1. Option 1: bioio's README could link to bioio-ome-zarr's README
  2. Option 2: the text from the bioio-ome-zarr example above could be copied into bioio's README
  3. Option 3: Centralize all documentation by updating all readers' READMEs to say "Documentation for this repository lives at https://github.com/bioio-devs/bioio/" and adding a section for each reader to the central bioio README

Improve error message on UnsupportedFileFormatError

Describe the Bug

UnsupportedFileFormatError not useful

im = bioio.BioImage('some.nd2')
File ~/miniforge3/envs/chunglab/lib/python3.11/site-packages/bioio/bio_image.py:171, in BioImage.determine_reader(image, fs_kwargs, **kwargs)
    168 # If we haven't hit anything yet, we likely don't support this file / object
    169 # with the current plugins installed
    170 image_type = str(type(image))
--> 171 raise biob.exceptions.UnsupportedFileFormatError(
    172     "BioImage",
    173     image_type,
    174     msg_extra=(
    175         "You may need to install an extra format dependency. "
    176         "See bioio README for list of some known plugins."
    177     ),
    178 )

UnsupportedFileFormatError: BioImage does not support the image: '<class 'str'>'. 
You may need to install an extra format dependency. 
See bioio README for list of some known plugins.

Expected Behavior

  1. The type is shown incorrectly: image_type = str(type(image)) is very likely to simply show '<class 'str'>', which is unlikely to help the user. It looks like this is fixed in main.
  2. The bioio README doesn't have a list of known plugins.
  3. It would be nice to have an in-library awareness of known plugins. That is, if you're going to maintain a list in a README somewhere, you might as well put it in the code too, so the error can suggest a specific plugin to install.
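
A minimal sketch of the in-library suggestion from point 3. The KNOWN_PLUGINS table is a hand-copied subset of the Plug-in Registry in the README, and the function name suggest_install_message is hypothetical; nothing here is part of bioio's API.

```python
from pathlib import PurePosixPath
from typing import Dict, List

# Subset of the README's Plug-in Registry, keyed by the final file suffix.
KNOWN_PLUGINS: Dict[str, List[str]] = {
    ".czi": ["bioio-czi"],
    ".dv": ["bioio-dv"],
    ".r3d": ["bioio-dv"],
    ".lif": ["bioio-lif"],
    ".nd2": ["bioio-nd2"],
    ".tiff": ["bioio-ome-tiff", "bioio-tifffile"],
    ".tif": ["bioio-tifffile"],
    ".zarr": ["bioio-ome-zarr"],
    ".sldy": ["bioio-sldy"],
}


def suggest_install_message(path: str) -> str:
    """Build an error hint that names the path and any plugin known to handle it."""
    suffix = PurePosixPath(path).suffix.lower()
    plugins = KNOWN_PLUGINS.get(suffix, [])
    if not plugins:
        return f"No known plugin for '{path}'."
    return f"No installed plugin could read '{path}'. Try: pip install {' '.join(plugins)}"
```

Including the path in the message also addresses point 1, since the user sees which file failed rather than only its Python type.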

Reproduction

Environment

  • OS Version: [e.g. macOS 11.3.1]
  • bioio Version: [e.g. 0.5.0]

Modify Plug-in Search Capabilities

Feature Description

plugins.py is responsible for searching for plug-ins to use with this repository. Instead of letting any plug-in found be accepted, the search should reject any that don't meet the minimum bioio-types specified.

Use Case

Currently any plug-in matching the pattern criteria will be found and supplied, which is fine for now. However, as this repository evolves, the contract between this repository and the readers (plug-ins) will likely evolve too, potentially leading to conflicts. We have an independent base that both this repository and the plug-ins depend on, bioio-types, so using bioio-types to determine compatibility should alleviate this.
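
A minimal sketch of such a compatibility check, assuming that "minimum bioio-types specified" means a minimum version string. The function names are hypothetical, and the comparison deliberately ignores pre-release and dev ordering; a real implementation would more likely rely on a dedicated version-parsing library.

```python
import re
from typing import Tuple


def release_tuple(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric release segment, e.g. '1.0.1.dev2' -> (1, 0, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Unparseable version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if the installed release is at least the minimum release."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.0' and '1.0.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b
```

The plug-in search could then skip any entry point whose declared base version fails meets_minimum, instead of accepting everything that matches the pattern.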

Plugin Reader Metadata Does Not Standardize Supported Suffixes

Describe the Bug

Found while working on understanding #55 (#54): Some reader metadata files include leading . while others do not. This results in our plugin cache having duplicate entries for what is the same suffix.

For example:

No other readers do however.

This causes problems during plugin parsing and organization on the bioio side because we build our plugin cache from the suffixes. If bioio-tifffile and bioio-ome-tiff list tiff and .tiff respectively in their reader_metadata files, that will create two entries rather than one.

Expected Behavior

I don't think we can always trust that the plugin authors will follow a standard regarding some arbitrary suffix information. Instead we can do some minor clean up during plugin cache creation time to either always remove or always add a leading period.
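
A minimal sketch of that clean-up, choosing the "always add a leading period" option. The function names and the plugin-to-suffixes input shape are assumptions for illustration, and the lowercasing is an extra normalization not mentioned in the issue.

```python
from typing import Dict, Iterable, List


def normalize_suffix(suffix: str) -> str:
    """Return the suffix lowercased with exactly one leading period."""
    return "." + suffix.strip().lower().lstrip(".")


def build_suffix_cache(plugin_suffixes: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Group plugin names under normalized suffixes so 'tiff' and '.tiff' collide."""
    cache: Dict[str, List[str]] = {}
    for plugin, suffixes in plugin_suffixes.items():
        for suffix in suffixes:
            cache.setdefault(normalize_suffix(suffix), []).append(plugin)
    return cache
```

Because the normalization happens at cache creation time, plugin authors do not need to agree on a convention in their reader_metadata files.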

Reproduction

Add print statements checking for leading "." during plugin cache creation (my debug prints), or, print the plugin cache after creation and look for duplicate suffixes.

Clarify that the equivalent of aicsimageio.types.PhysicalPixelSizes is bioio_base.types.PhysicalPixelSizes

Feature Description

Please make it clearer in the documentation that the equivalent of aicsimageio.types.PhysicalPixelSizes is bioio_base.types.PhysicalPixelSizes.

Use Case

I was having trouble finding the BioIO equivalent of aicsimageio.types.PhysicalPixelSizes in the documentation and had to delve into the code to find that it was under bioio_base.types. I use it to reassign some missing metadata to an image when saving it with OmeTiffWriter.
I was unable to find it in the AICSImageIO migration guide either (also, the Reader Installation Instructions link on the migration page says "Not Found" and does not lead anywhere).

Solution

Add section for bioio_base to the API Reference.

Alternatives

Describe in the AICSImageIO migration guide that "aicsimageio.types.PhysicalPixelSizes is now bioio_base.types.PhysicalPixelSizes".

bioio-tifffile takes precedence over bioio-ome-tiff for ome-tiffs

Describe the Bug

bioio-tifffile takes precedence over bioio-ome-tiff even for ome.tiff files.

Expected Behavior

OME-TIFFs should be correctly picked up by bioio-ome-tiff

Reproduction

I am in napari_bioio but it's the same test data:

from bioio import BioImage

img = BioImage("napari_bioio/tests/resources/variance-cfe.ome.tiff")
img

img returns as <BioImage [plugin: bioio-tifffile installed at 2024-07-18 16:09:20.787038, Image-is-in-Memory: False]>

Environment

  • OS Version: [e.g. macOS 11.3.1]
  • bioio Version: [e.g. 0.5.0]
❯ pip freeze | grep bioio

bioio==1.0.3
bioio-base==1.0.4
bioio-imageio==1.0.0
bioio-ome-tiff==1.0.0
bioio-ome-zarr==1.0.1
bioio-tiff-glob==1.0.0
bioio-tifffile==1.0.0

I already determined that the issue is that the plugin list isn't sorted with the longest matching suffix first (i.e. longer suffixes are more specific, and we should try them first when plugins handle multiple file types).
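
A minimal sketch of the ordering described above. The function name rank_plugins and the plugin-to-suffixes mapping are hypothetical and only illustrate the "longest matching suffix first" rule; they are not taken from bioio's plugins.py.

```python
from typing import Dict, List, Sequence, Tuple


def rank_plugins(path: str, plugin_suffixes: Dict[str, Sequence[str]]) -> List[str]:
    """Order candidate plugins by the length of their longest matching suffix."""
    name = path.lower()
    matches: List[Tuple[int, str]] = []
    for plugin, suffixes in plugin_suffixes.items():
        lengths = [len(s) for s in suffixes if name.endswith(s.lower())]
        if lengths:
            matches.append((max(lengths), plugin))
    # Longer (more specific) suffixes first; ties keep their original order.
    matches.sort(key=lambda match: match[0], reverse=True)
    return [plugin for _, plugin in matches]
```

For a file ending in .ome.tiff, a plugin that declares .ome.tiff is then tried before one that only declares .tiff.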

Bioio plugin dump unable to report plugins

Describe the Bug

When using bioio.plugins.dump_plugins(), the list of plugins cannot be reported due to a KeyError: 'author'. I believe this issue is also affecting the determine_plugin method.

Reproduction

Create a Python 3.10 virtual env and pip install bioio bioio-ome-tiff. You need to install one of the plugins to see this bug, but any of them seems to reproduce it.

import bioio

bioio.plugins.dump_plugins()

Output

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/allen/aics/microscopy/brian_whitney/repos/bioio/test.py in line 3
      1 import bioio
      2 from bioio_ome_tiff import Reader
----> 3 bioio.plugins.dump_plugins()

File /allen/aics/microscopy/brian_whitney/repos/bioio/bioio/plugins.py:255, in dump_plugins()
    253 dist = ep.dist
    254 if dist is not None:
--> 255     print(f"  Author  : {dist.metadata['author']}")
    256     print(f"  Version : {dist.version}")
    257     print(f"  License : {dist.metadata['license']}")

File ~/.pyenv/versions/bioio-import-bug/lib/python3.10/site-packages/importlib_metadata/_adapters.py:54, in Message.__getitem__(self, item)
     52 res = super().__getitem__(item)
     53 if res is None:
---> 54     raise KeyError(item)
     55 return res

KeyError: 'author'
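
One possible way to avoid the KeyError is to look up optional metadata fields with a fallback instead of indexing them directly. This is a sketch only: describe_distribution is a hypothetical helper, it assumes the metadata object supports a dict-style .get(), and the fallback to an "author-email" field is a guess at where the author information may live instead.

```python
from typing import Mapping


def describe_distribution(metadata: Mapping[str, str], version: str) -> str:
    """Format plugin details without failing when optional fields are absent."""
    author = metadata.get("author") or metadata.get("author-email") or "unknown"
    license_name = metadata.get("license") or "unknown"
    return "\n".join(
        [
            f"  Author  : {author}",
            f"  Version : {version}",
            f"  License : {license_name}",
        ]
    )
```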

Environment

Python 3.10
Installs: pip install bioio bioio-ome-tiff

Features / Bugs / Eva Notes

Wanted to give the new libraries a whirl and figured I would make a couple of notes as I go along. I will try to resolve these issues / features this weekend or something as many of them are "optional" / maybe nice for dev and debugging experience.


  1. BIOIO_BASE_DIST_NAME needs to be updated from bioio-types to bioio-base

  2. It would be nice to have access to what plugin read the file somehow. Currently on image read, the __repr__ returns: <BioImage [Reader: Reader, Image-is-in-Memory: False]> but it might be better to return <BioImage [plugin: {plugin_name}, Image-is-in-Memory: False]>, esp. because most plugins will generally all use the same class name of Reader instead of say OmeTiffReader, better might be <BioImage [reader_path: {python_module_path_of_reader}, Image-is-in-Memory: False]>

  3. plugin_cache should be made into a set, dict, or deduped list:

In [2]: from bioio.plugins import get_plugins

In [3]: get_plugins()
Out[3]: 
[PluginEntry(entrypoint=EntryPoint(name='bioio-tifffile', value='bioio_tifffile', group='bioio.readers'), metadata=<class 'bioio_tifffile.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347),
 PluginEntry(entrypoint=EntryPoint(name='bioio-ome-tiff', value='bioio_ome_tiff', group='bioio.readers'), metadata=<class 'bioio_ome_tiff.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347),
 PluginEntry(entrypoint=EntryPoint(name='bioio-ome-zarr', value='bioio_ome_zarr', group='bioio.readers'), metadata=<class 'bioio_ome_zarr.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347),
 PluginEntry(entrypoint=EntryPoint(name='bioio-lif', value='bioio_lif', group='bioio.readers'), metadata=<class 'bioio_lif.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347)]

In [4]: get_plugins()
Out[4]: 
[PluginEntry(entrypoint=EntryPoint(name='bioio-tifffile', value='bioio_tifffile', group='bioio.readers'), metadata=<class 'bioio_tifffile.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347),
 PluginEntry(entrypoint=EntryPoint(name='bioio-ome-tiff', value='bioio_ome_tiff', group='bioio.readers'), metadata=<class 'bioio_ome_tiff.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347),
 PluginEntry(entrypoint=EntryPoint(name='bioio-ome-zarr', value='bioio_ome_zarr', group='bioio.readers'), metadata=<class 'bioio_ome_zarr.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347),
 PluginEntry(entrypoint=EntryPoint(name='bioio-lif', value='bioio_lif', group='bioio.readers'), metadata=<class 'bioio_lif.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347),
 PluginEntry(entrypoint=EntryPoint(name='bioio-tifffile', value='bioio_tifffile', group='bioio.readers'), metadata=<class 'bioio_tifffile.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347),
 PluginEntry(entrypoint=EntryPoint(name='bioio-ome-tiff', value='bioio_ome_tiff', group='bioio.readers'), metadata=<class 'bioio_ome_tiff.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347),
 PluginEntry(entrypoint=EntryPoint(name='bioio-ome-zarr', value='bioio_ome_zarr', group='bioio.readers'), metadata=<class 'bioio_ome_zarr.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347),
 PluginEntry(entrypoint=EntryPoint(name='bioio-lif', value='bioio_lif', group='bioio.readers'), metadata=<class 'bioio_lif.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347)]
  4. get_plugins should be called at the top of determine_reader? Currently files cannot be read until get_plugins is called manually (ignore debugging print statements):
In [1]: from bioio import BioImage, plugins

In [2]: img = BioImage("../aicsimageio/aicsimageio/tests/resources/3d-cell-viewer.ome.tiff")
---------------------------------------------------------------------------
UnsupportedFileFormatError                Traceback (most recent call last)
Cell In[2], line 1
----> 1 img = BioImage("../aicsimageio/aicsimageio/tests/resources/3d-cell-viewer.ome.tiff")

File ~/active/cell/bioio/bioio/bio_image.py:196, in BioImage.__init__(self, image, reader, reconstruct_mosaic, fs_kwargs, **kwargs)
    186 def __init__(
    187     self,
    188     image: biob.types.ImageLike,
   (...)
    192     **kwargs: Any,
    193 ):
    194     if reader is None:
    195         # Determine reader class and create dask delayed array
--> 196         ReaderClass = BioImage.determine_reader(
    197             image, fs_kwargs=fs_kwargs, **kwargs
    198         )
    199     else:
    200         # Init reader
    201         ReaderClass = reader

File ~/active/cell/bioio/bioio/bio_image.py:177, in BioImage.determine_reader(image, fs_kwargs, **kwargs)
    174 # If we haven't hit anything yet, we likely don't support this file / object
    175 # with the current plugins installed
    176 image_type = str(type(image))
--> 177 raise biob.exceptions.UnsupportedFileFormatError(
    178     "BioImage",
    179     image_type,
    180     msg_extra=(
    181         "You may need to install an extra format dependency. "
    182         "See bioio README for list of some known plugins."
    183     ),
    184 )

UnsupportedFileFormatError: BioImage does not support the image: '<class 'str'>'. You may need to install an extra format dependency. See bioio README for list of some known plugins.

In [3]: plugins.get_plugins()
Out[3]: 
[PluginEntry(entrypoint=EntryPoint(name='bioio-tifffile', value='bioio_tifffile', group='bioio.readers'), metadata=<class 'bioio_tifffile.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347),
 PluginEntry(entrypoint=EntryPoint(name='bioio-ome-tiff', value='bioio_ome_tiff', group='bioio.readers'), metadata=<class 'bioio_ome_tiff.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347),
 PluginEntry(entrypoint=EntryPoint(name='bioio-ome-zarr', value='bioio_ome_zarr', group='bioio.readers'), metadata=<class 'bioio_ome_zarr.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347),
 PluginEntry(entrypoint=EntryPoint(name='bioio-lif', value='bioio_lif', group='bioio.readers'), metadata=<class 'bioio_lif.reader_metadata.ReaderMetadata'>, timestamp=1698949015.0319347)]

In [4]: img = BioImage("../aicsimageio/aicsimageio/tests/resources/3d-cell-viewer.ome.tiff")

In [5]: img
Out[5]: <BioImage [Reader: Reader, Image-is-in-Memory: False]>
  5. I think the error that is thrown in determine_reader should include the image path (and parameter type) rather than just the parameter type. For example: UnsupportedFileFormatError: BioImage does not support the image: '<class 'str'>'. You may need to install an extra format dependency. See bioio README for list of some known plugins. is a confusing error because uhhh "what do you mean you don't take a str path? Do you want a pathlib.Path path?"

  6. Related to 2 and 4, I think all of the readers calling their image reading class Reader is causing namespace / object override issues? I have bioio-lif installed locally and can read LIF files directly using its reader, but reading fails from bioio directly. I think this is especially true because I installed bioio-tifffile later than bioio-ome-tiff and even though I am trying to read an OME-TIFF, it is using the tifffile reader. I also wonder if this is because each of the plugins has this line in their base level __init__.py: from .reader import Reader and that is causing the namespace override, so whichever plugin is installed most recently is the only plugin that has a "valid" Reader object. All the others are just pointers to the newly installed Reader?

In [1]: import bioio_lif

In [2]: bioio_lif.Reader("../aicsimageio/aicsimageio/tests/resources/tiled.lif")
Out[2]: <Reader [Image-is-in-Memory: False]>

In [3]: from bioio import BioImage, plugins

In [4]: plugins.get_plugins()
Out[4]: 
[PluginEntry(entrypoint=EntryPoint(name='bioio-tifffile', value='bioio_tifffile', group='bioio.readers'), metadata=<class 'bioio_tifffile.reader_metadata.ReaderMetadata'>, timestamp=1698950076.8994923),
 PluginEntry(entrypoint=EntryPoint(name='bioio-ome-tiff', value='bioio_ome_tiff', group='bioio.readers'), metadata=<class 'bioio_ome_tiff.reader_metadata.ReaderMetadata'>, timestamp=1698950076.8994923),
 PluginEntry(entrypoint=EntryPoint(name='bioio-ome-zarr', value='bioio_ome_zarr', group='bioio.readers'), metadata=<class 'bioio_ome_zarr.reader_metadata.ReaderMetadata'>, timestamp=1698950076.8994923),
 PluginEntry(entrypoint=EntryPoint(name='bioio-lif', value='bioio_lif', group='bioio.readers'), metadata=<class 'bioio_lif.reader_metadata.ReaderMetadata'>, timestamp=1698950076.8994923)]

In [5]: BioImage("../aicsimageio/aicsimageio/tests/resources/tiled.lif")
---------------------------------------------------------------------------
UnsupportedFileFormatError                Traceback (most recent call last)
Cell In[5], line 1
----> 1 BioImage("../aicsimageio/aicsimageio/tests/resources/tiled.lif")

File ~/active/cell/bioio/bioio/bio_image.py:191, in BioImage.__init__(self, image, reader, reconstruct_mosaic, fs_kwargs, **kwargs)
    181 def __init__(
    182     self,
    183     image: biob.types.ImageLike,
   (...)
    187     **kwargs: Any,
    188 ):
    189     if reader is None:
    190         # Determine reader class and create dask delayed array
--> 191         ReaderClass = BioImage.determine_reader(
    192             image, fs_kwargs=fs_kwargs, **kwargs
    193         )
    194     else:
    195         # Init reader
    196         ReaderClass = reader

File ~/active/cell/bioio/bioio/bio_image.py:172, in BioImage.determine_reader(image, fs_kwargs, **kwargs)
    170 image_value = str(image)
    171 image_type = str(type(image))
--> 172 raise biob.exceptions.UnsupportedFileFormatError(
    173     "BioImage",
    174     f"{image_value} ({image_type})",
    175     msg_extra=(
    176         "You may need to install an extra format dependency. "
    177         "See bioio README for list of some known plugins."
    178     ),
    179 )

UnsupportedFileFormatError: BioImage does not support the image: '../aicsimageio/aicsimageio/tests/resources/tiled.lif (<class 'str'>)'. You may need to install an extra format dependency. See bioio README for list of some known plugins.

In [6]: img = BioImage("../aicsimageio/aicsimageio/tests/resources/3d-cell-viewer.ome.tiff")

In [7]: type(img.metadata)
Out[7]: str

Add writers

Feature Description

Add all the writers currently supported in aicsimageio

Consider a public plugin registry

Feature Description

Idea: If we encourage community development of plugins, it would be nice to have a registry of plugins.

Use Case

How will users find plugins for file formats they are interested in?

Solution

At a minimum, we could add a link to the list of bioio-devs repos in the documentation (or even a Python method that lists them as strings!).
The bioio documentation could also suggest ways for users to find plugins they are interested in.
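
A minimal sketch of the "Python method that lists them as strings" idea follows. It is not part of bioio: the function names and the KNOWN_PLUGINS tuple are hypothetical, the plug-in names are copied from the README's Plug-in Registry table (so the list is not exhaustive), and the "bioio.readers" entry point group is taken from the get_plugins() output shown in another issue. Note that entry points only reveal plug-ins that are already installed, so a static list is still needed for discovery.

```python
from importlib.metadata import entry_points

# Names copied from the README's Plug-in Registry table; not exhaustive.
KNOWN_PLUGINS = (
    "bioio-czi",
    "bioio-dv",
    "bioio-imageio",
    "bioio-lif",
    "bioio-nd2",
    "bioio-ome-tiff",
)


def installed_reader_plugins(group: str = "bioio.readers") -> list:
    """Return the sorted entry point names registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry points
        selected = eps.select(group=group)
    else:
        # Python 3.9: plain dict keyed by group name
        selected = eps.get(group, [])
    return sorted({ep.name for ep in selected})


def plugin_status() -> dict:
    """Map each known plug-in name to whether it is installed here."""
    installed = set(installed_reader_plugins())
    return {name: name in installed for name in KNOWN_PLUGINS}


if __name__ == "__main__":
    for name, present in plugin_status().items():
        print(f"{name}: {'installed' if present else 'not installed'}")
```

A user could call plugin_status() to see which known plug-ins are missing and pip install the one matching their file format.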

Alternatives

A registry of all plugins (e.g. searching PyPI for bioio_*). napari is an example of a Python project with a large infrastructure for managing community plugin contributions (probably bigger than what we need).
Or a registry of "blessed" plugins?

remove zarr<2.18.0 constraint

Feature Description

Tests were failing on #48 because of more recent zarr versions, so I had to pin to pre-2.18. Ideally this should not be a constraint.

Use Case

We want to be able to take advantage of all the fixes and improvements that come with new zarr releases!

Solution

Relax the pin to zarr<3 and eventually support v3 too.

writer compression parameter

Feature Description

It may benefit many users to be able to set compression parameters for both zarr and tiff writing.

Use Case

See AllenCellModeling/aicsimageio#553

Solution

Add compression args to writer classes. I guess these can be specific to the supported compression types of the underlying libraries. Consider just forwarding kwargs through.
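
One possible reading of "forwarding kwargs through" is sketched below. Everything in it is hypothetical: save_with_backend, fake_tiff_backend, and the compression keyword are illustrative names, not the API of bioio or of any underlying library. The stand-in backend only records what it receives, so the sketch runs without an imaging library installed.

```python
from typing import Any, Callable, Dict


def fake_tiff_backend(path: str, data: Any, **kwargs: Any) -> Dict[str, Any]:
    """Stand-in for an underlying library's write function (hypothetical)."""
    # A real backend would encode `data`; this one just reports its options.
    return {"path": path, "options": dict(kwargs)}


def save_with_backend(
    data: Any,
    path: str,
    backend: Callable[..., Dict[str, Any]] = fake_tiff_backend,
    **backend_kwargs: Any,
) -> Dict[str, Any]:
    """Hypothetical writer entry point.

    Options the writer does not interpret itself (for example compression
    settings) are passed to the backend unchanged.
    """
    return backend(path, data, **backend_kwargs)


result = save_with_backend([[0, 1], [2, 3]], "out.tiff", compression="zlib")
print(result["options"])
```

The trade-off of this design is that the writer stays small and automatically exposes whatever the backend supports, but valid option names then differ between the tiff and zarr writers.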

reader chunk_dims are too coarse-grained for dask

Feature Description

Allow more specificity in requested chunk_dims for get_image_dask_data (_read_delayed).

Use Case

Dask best practices for chunk sizing: https://docs.dask.org/en/latest/delayed-best-practices.html
Currently bioio only allows you to choose a dimension AXIS for chunking and always chunks with the whole range of that dimension.
When reading a large file with something like t=500, c=2, z=150, y=1000, x=2000, (each xy slice is about 4MB) we don't really have the option to chunk by just a few, or even half, of the z slices.

Solution

Not sure about the API to use here, but possibly pass in a chunk_size with actual numeric values.
So the user would probably do something like this:

from bioio import BioImage

im = BioImage(path)
dims = im.dims
# chunk_size is the proposed parameter (it does not exist yet); sizes must be integers
im.get_image_dask_data(chunk_size=[1, 1, 1, dims.Y // 2, dims.X // 2])  # (Defaults to something sensible?)
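
To make the chunk sizes in the use case concrete, here is a small arithmetic sketch. It assumes 2 bytes per pixel (for example uint16), which is an assumption chosen because it matches the "about 4MB" figure for a 1000 x 2000 slice; the helper names are hypothetical and nothing here calls bioio or dask.

```python
from math import prod

# Image shape from the use case above.
SHAPE = {"T": 500, "C": 2, "Z": 150, "Y": 1000, "X": 2000}
BYTES_PER_PIXEL = 2  # assumption: 16-bit pixels


def chunk_nbytes(chunk: dict) -> int:
    """Size in bytes of one chunk with the given per-dimension lengths."""
    return prod(chunk.values()) * BYTES_PER_PIXEL


def n_chunks(shape: dict, chunk: dict) -> int:
    """Number of chunks needed to cover `shape` (ceiling division per axis)."""
    return prod(-(-shape[d] // chunk[d]) for d in shape)


# Chunking along Z today means taking the whole Z range in every chunk.
whole_z = {"T": 1, "C": 1, "Z": 150, "Y": 1000, "X": 2000}
# With numeric chunk sizes, a chunk could hold just 25 Z slices.
partial_z = {"T": 1, "C": 1, "Z": 25, "Y": 1000, "X": 2000}

for label, chunk in (("whole Z", whole_z), ("25 Z slices", partial_z)):
    mb = chunk_nbytes(chunk) / 1e6
    print(f"{label}: {mb:.0f} MB per chunk, {n_chunks(SHAPE, chunk)} chunks")
```

Under these assumptions a whole-Z chunk is 600 MB, while a 25-slice chunk is 100 MB, which is the kind of intermediate choice the current axis-only option cannot express.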

timeseries writer tests break in actual use

Describe the Bug

The first and third tests of test_timeseries_writer, which are commented out here, fail when uncommented because the frames output from the imageio reader have inconsistent shapes, so the data array cannot be assembled.

Expected Behavior

The tests run as expected and the DefaultReader is able to read an image created with the parameters of those tests.

Reproduction

This is reproducible in aicsimageio version 4.11.0 if you add reader.dask_data.compute() or reader.data to the same test, which effectively does what the test in bioio is doing (stitches multi-frame images together and stacks them). The resulting error in both cases is the same:
ValueError: could not broadcast input array from shape (100,100,4) into shape (1,100,100)
It appears that all frames have the same shape in this case except the first one.
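
A quick way to confirm which frames are inconsistent before stacking is sketched below. The helper name is hypothetical, and the example shapes are only illustrative: the two shapes come from the error message, but which one belongs to the first frame is an assumption.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def odd_frames(shapes: Sequence[Tuple[int, ...]]) -> List[int]:
    """Return indices of frames whose shape differs from the most common one."""
    if not shapes:
        return []
    common_shape, _ = Counter(shapes).most_common(1)[0]
    return [i for i, shape in enumerate(shapes) if shape != common_shape]


# Illustrative input: the first frame differs from the rest (assumed ordering).
shapes = [(100, 100, 4), (1, 100, 100), (1, 100, 100), (1, 100, 100)]
print(odd_frames(shapes))  # → [0]
```

With real data the input could be built as [frame.shape for frame in frames], where frames is whatever list of arrays the reader produces.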

Environment

  • OS Version: macOS 12.6.6
  • bioio Version: developer branch feature/add-writers @ commit hash ______

Plan to keep up with aicsimageio maintenance updates

Feature Description

During this interim period before bioio is ready for release, there could be updates in aicsimageio that get things out of sync. How will we keep things in sync until we are ready to release?
See #9 (review)

I am not proposing a specific strategy but we should at least consider how to deal with this.

HTTP URL for ZARR fails to be validated by `determine_plugin` for bioio-ome-zarr

Description: see title

Code to reproduce

from bioio import BioImage
path = "https://allencell.s3.amazonaws.com/aics/nuc_morph_data/data_for_analysis/baseline_colonies/20200323_09_small/raw.ome.zarr"
image = BioImage(path)  # FileNotFoundError
print(image.get_image_dask_data())

Workaround

If the OME ZARR reader is specified explicitly, determine_plugin is bypassed and the image can be loaded:

bioio.BioImage(path, reader=bioio_ome_zarr.Reader)

Environment

I'm testing against bioio-ome-zarr PR 17 which improves support for reading from S3.
