ome / ome-zarr-py

Implementation of next-generation file format (NGFF) specifications for storing bioimaging data in the cloud.

Home Page: https://pypi.org/project/ome-zarr

License: Other

Languages: Python 100.00%
Topics: zarr, ome, ngff, ome-zarr

ome-zarr-py's Introduction


ome-zarr-py

Tools for multi-resolution images stored in Zarr filesets, according to the OME NGFF spec.

See Documentation for usage information.
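For orientation, a minimal reading sketch (the URL is the sample dataset referenced elsewhere on this page; treat the exact calls as an assumption and check the documentation):

from ome_zarr.io import parse_url
from ome_zarr.reader import Reader

# Open a remote OME-Zarr fileset and list the shapes of its pyramid levels.
reader = Reader(parse_url("https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/6001240.zarr/"))
nodes = list(reader())        # one Node per image / labels group found
pyramid = nodes[0].data       # list of dask arrays, full resolution first
print([level.shape for level in pyramid])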

Documentation

Documentation is built automatically with readthedocs.

It can be built locally with:

$ pip install -r docs/requirements.txt
$ sphinx-build -b html docs/source/ docs/build/html

Tests

Tests can be run locally via tox with:

$ pip install tox
$ tox

To enable pre-commit code validation:

$ pip install pre-commit
$ pre-commit install

Release process

This repository uses bump2version to manage version numbers. To tag a release run:

$ bumpversion release

This will remove the .dev0 suffix from the current version, commit, and tag the release.

To switch back to a development version run:

$ bumpversion --no-tag [major|minor|patch]

specifying major, minor or patch depending on whether the development branch will be a major, minor or patch release. This will also add the .dev0 suffix.

Remember to git push all commits and tags.

License

Distributed under the terms of the BSD license, "ome-zarr-py" is free and open source software.

ome-zarr-py's People

Contributors

aeisenbarth, andy-sweet, camilolaiton, carreau, claesenm, constantinpape, dimitripapadopoulos, dstansby, elyall, giovp, glyg, jburel, jni, joshmoore, k-dominik, lucamarconato, manics, nhatnm52, nhpatterson, perlman, pre-commit-ci[bot], satra, sbesson, seankmartin, tcompa, tlambert03, toloudis, will-moore, yarikoptic


ome-zarr-py's Issues

Don't hardcode channel axis, or any other metadata really, if array is not OME?

Hardcoding the dimensions to TCZYX and channel_axis=1 makes sense when the array can be ascertained to be an OME array, but it seems pretty bold to do it on any random array, of any shape, too. Example:

import numpy as np
import zarr

fout = zarr.open('demo-zarr-3d.zarr', mode='w', shape=(5, 10, 10), dtype=np.float32) 
fout[:] = np.random.random((5, 10, 10)) 

If I then do:

napari demo-zarr-3d.zarr

in the command line, having installed ome-zarr, I get:

[screenshot: napari-zarr]

Missing wells cause reading failures

https://forum.image.sc/t/working-with-ome-zarr-stored-on-prem-object-storage/54567/4

$ ome_zarr info https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/plates/422.zarr
Traceback (most recent call last):
  File "/home/djme/miniconda3/envs/zarr2/bin/ome_zarr", line 8, in <module>
    sys.exit(main())
  File "/home/djme/miniconda3/envs/zarr2/lib/python3.9/site-packages/ome_zarr/cli.py", line 165, in main
    ns.func(ns)
  File "/home/djme/miniconda3/envs/zarr2/lib/python3.9/site-packages/ome_zarr/cli.py", line 29, in info
    list(zarr_info(args.path))
  File "/home/djme/miniconda3/envs/zarr2/lib/python3.9/site-packages/ome_zarr/utils.py", line 28, in info
    for node in reader():
  File "/home/djme/miniconda3/envs/zarr2/lib/python3.9/site-packages/ome_zarr/reader.py", line 604, in __call__
    node = Node(self.zarr, self)
  File "/home/djme/miniconda3/envs/zarr2/lib/python3.9/site-packages/ome_zarr/reader.py", line 58, in __init__
    self.specs.append(Plate(self))
  File "/home/djme/miniconda3/envs/zarr2/lib/python3.9/site-packages/ome_zarr/reader.py", line 433, in __init__
    self.get_pyramid_lazy(node)
  File "/home/djme/miniconda3/envs/zarr2/lib/python3.9/site-packages/ome_zarr/reader.py", line 467, in get_pyramid_lazy
    self.numpy_type = self.get_numpy_type(image_node)
  File "/home/djme/miniconda3/envs/zarr2/lib/python3.9/site-packages/ome_zarr/reader.py", line 513, in get_numpy_type
    return image_node.data[0].dtype
IndexError: list index out of range

cc: @will-moore

Deprecate chunks keyword in write_image and write_image_multiscales

See conversation at #161 (comment)

The storage_options keyword introduced in 0.3.0 should support setting the chunks to the Zarr dataset. This option also gives the extra flexibility of being able to set different chunk sizes for each resolution. The explicit chunks keyword could be deprecated in an upcoming 0.x.0 release of the ome-zarr library.
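For illustration, a hedged sketch of the storage_options route (output path and chunk shape are made up):

import numpy as np
import zarr
from ome_zarr.io import parse_url
from ome_zarr.writer import write_image

data = np.zeros((1, 2, 8, 1024, 1024), dtype="uint16")
store = parse_url("chunked.ome.zarr", mode="w").store   # hypothetical output path
write_image(
    data,
    zarr.group(store),
    axes="tczyx",
    # A single dict applies to every resolution level; a list of dicts
    # (one per level) allows per-resolution chunking, which the plain
    # chunks keyword cannot express.
    storage_options=dict(chunks=(1, 1, 1, 256, 256)),
)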

IndexError('list index out of range') for SPW

The SPW images exported via omero zarr export work in napari:

$ napari https://minio-dev.openmicroscopy.org/idr/idr0001-graml-sysgro/pr59_nested/2551.zarr/A/5/0

But not the plate itself (visible in vizarr with nested PR):
https://deploy-preview-85--vizarr.netlify.app/?source=https://minio-dev.openmicroscopy.org/idr/idr0001-graml-sysgro/pr59_nested/2551.zarr

Trying to open a nested plate with this PR (same error with the 'nested' plate above, or an older v0.1 plate):

$ napari https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/plates/422.zarr
10:51:46 ERROR PluginError: Error in plugin 'ome_zarr', hook 'napari_get_reader'
  Cause was: IndexError('list index out of range')
    in file: /Users/wmoore/Desktop/ZARR/ome-zarr-py/ome_zarr/reader.py
    at line: 513
     author: The Open Microscopy Team
    package: ome-zarr
        url: https://github.com/ome/ome-zarr-py
    version: 0.0.18.dev0

This also fails using the current master branch (c3f641d) but with a different error:

$ napari https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/plates/422.zarr
10:50:25 ERROR PluginError: Error in plugin 'ome_zarr', hook 'napari_get_reader'
  Cause was: ArrayNotFoundError("array not found at path %r' ''")
    in file: /opt/anaconda3/envs/napari/lib/python3.9/site-packages/zarr/core.py
    at line: 186
     author: The Open Microscopy Team
    package: ome-zarr
        url: https://github.com/ome/ome-zarr-py
    version: 0.0.18.dev0

And in fact it's not working for any of the older versions of ome-zarr-py that I've tried, so I don't know when it last worked (or what version of napari etc).

Originally posted by @will-moore in #75 (comment)

Odd error on bad input

Passing the top-level container from bioformats2raw to ome_zarr download leads to the following error:

(z) ~/opt/ome_zarr_test_suite $ ./scripts/ome_zarr_downloads data/64x64-fake-v0.2
Traceback (most recent call last):
  File "/usr/local/anaconda3/envs/z/bin/ome_zarr", line 33, in <module>
    sys.exit(load_entry_point('ome-zarr', 'console_scripts', 'ome_zarr')())
  File "/opt/ome-zarr-py/ome_zarr/cli.py", line 165, in main
    ns.func(ns)
  File "/opt/ome-zarr-py/ome_zarr/cli.py", line 35, in download
    zarr_download(args.path, args.output)
  File "/opt/ome-zarr-py/ome_zarr/utils.py", line 61, in download
    common = strip_common_prefix(paths)
  File "/opt/ome-zarr-py/ome_zarr/utils.py", line 113, in strip_common_prefix
    min_length = min([len(x) for x in parts])
ValueError: min() arg is an empty sequence

Installation issue with ome-zarr-py

Running:

pip install ome-zarr

Produces build errors:

c-blosc/internal-complibs/zstd-1.4.1/decompress/zstd_decompress.c:92:21: error: use of undeclared identifier 'ZSTD_FRAMEHEADERSIZE_PREFIX'
ZSTD_FRAMEHEADERSIZE_PREFIX - ZSTD_FRAMEIDSIZE :
^
c-blosc/internal-complibs/zstd-1.4.1/decompress/zstd_decompress.c:93:21: error: use of undeclared identifier 'ZSTD_FRAMEHEADERSIZE_PREFIX'
ZSTD_FRAMEHEADERSIZE_PREFIX;
^
c-blosc/internal-complibs/zstd-1.4.1/decompress/zstd_decompress.c:94:24: error: use of undeclared identifier 'ZSTD_FRAMEHEADERSIZE_PREFIX'
ZSTD_STATIC_ASSERT(ZSTD_FRAMEHEADERSIZE_PREFIX >= ZSTD_FRAMEIDSIZE);
^
c-blosc/internal-complibs/zstd-1.4.1/decompress/zstd_decompress.c:379:23: error: use of undeclared identifier 'ZSTD_FRAMEHEADERSIZE_PREFIX'
while (srcSize >= ZSTD_FRAMEHEADERSIZE_PREFIX) {
^
c-blosc/internal-complibs/zstd-1.4.1/decompress/zstd_decompress.c:631:28: error: use of undeclared identifier 'ZSTD_FRAMEHEADERSIZE_MIN'
remainingSrcSize < ZSTD_FRAMEHEADERSIZE_MIN+ZSTD_blockHeaderSize,

Environment:
macOS 10.15.6 (19G73)
Python 3.7.7

add compressor to write_image

I couldn't see an exposed interface for compressing the chunks. Is this intentional? If not, I will send a PR to alter write_image to take a compressor keyword and pass it on to Zarr array creation.
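For reference, a hedged sketch of routing a numcodecs compressor through to Zarr array creation, here via storage_options (the exact keyword routing is an assumption; a dedicated compressor keyword is what the issue proposes):

import numpy as np
import zarr
from numcodecs import Blosc
from ome_zarr.io import parse_url
from ome_zarr.writer import write_image

compressor = Blosc(cname="zstd", clevel=5, shuffle=Blosc.BITSHUFFLE)
data = np.zeros((1, 1, 1, 512, 512), dtype="uint16")
store = parse_url("compressed.ome.zarr", mode="w").store  # hypothetical path
# Hypothetical: forward the compressor down to zarr array creation.
write_image(data, zarr.group(store), axes="tczyx",
            storage_options={"compressor": compressor})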

Example invocations not working due to SSL issues (on some systems)

Frustratingly, none of the example invocations seemed to work for me.

E.g. ome_zarr info https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/6001240.zarr/ just gave me the following diagnostics:

WARNING:ome_zarr:unreachable: https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/6001240.zarr/.zarray
WARNING:ome_zarr:unreachable: https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/6001240.zarr/.zgroup
not an ome-zarr: https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/6001240.zarr/

Digging further, "unreachable" was apparently due to the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 376, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 996, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 352, in connect
    self.sock = ssl_wrap_socket(
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 370, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/usr/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/usr/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: DH_KEY_TOO_SMALL] dh key too small (_ssl.c:1108)

And this seems to be connected to Debian (in my case Ubuntu), where the default security level for TLS connections was apparently raised from level 1 to level 2.

One sub-optimal workaround seems to be manually overriding the security level.
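For example, a standard-library sketch that lowers the security level for a single request (a workaround, not a recommendation; the fetched key is one of the files the CLI probes):

import ssl
import urllib.request

# Accept the server's small DH parameters for this one connection only.
ctx = ssl.create_default_context()
ctx.set_ciphers("DEFAULT@SECLEVEL=1")
url = "https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/6001240.zarr/.zgroup"
with urllib.request.urlopen(url, context=ctx) as resp:
    print(resp.status, resp.read(100))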

Beyond documenting this so others who run into the same issue might have to do less digging ...
it might be nice to forward the underlying error to the user (I found "unreachable" somewhat misleading in this case).

Assuming the issue partly lies with the server where these images are hosted ...
could someone here notify them of the issue, or does someone know who to notify?

Add class annotations and/or other metadata properties to labels

Currently the labels spec supports the declaration of a label-value and its associated color.

Commonly, label values have other associated information including the most obvious, the class name. napari also supports display of label properties, so this would be a nice additional feature for the reader plugin.

I think the critical requirements for these properties should be:

  • Supporting an easy mapping between a given property and the label-value/s it is associated with
  • Enforcing as few rules as possible on what kinds of properties can be accepted
  • Supporting an arbitrary number of properties

There are three ways I can see the spec supporting these additional properties:

  1. An arbitrary number of lists of max length n for a label image containing n label values, each list corresponding to a property. The index in the list corresponds to the integer label-value, e.g.
    "image-label": {
        "version": "0.1",
        "colors": [
            {
                "label-value": 1,
                "rgba": [
                    255,
                    100,
                    100,
                    255
                ]
            },
            {
                "label-value": 2,
                "rgba": [
                    0,
                    40,
                    200,
                    255
                ]
            },
            {
                "label-value": 3,
                "rgba": [
                    148,
                    50,
                    165,
                    255
                ]
            }
        ],
        "properties": [
            {
                "class": [
                    "Urban",
                    "Water",
                    "Agriculture"
                ],
                "area_m2":
                [
                    "400",
                    "1532",
                    "590"
                ]
            }
        ]
    }

I think this is the least explicit, and less intuitive than the next approaches.

  2. Declare another group similar to colors, where each label-value has its own associated properties:
{
    "multiscales": [
        {
            "datasets": [
                {
                    "path": "0"
                },
                {
                    "path": "1"
                },
                {
                    "path": "2"
                },
                {
                    "path": "3"
                }
            ],
            "version": "0.1"
        }
    ],
    "image-label": {
        "version": "0.1",
        "colors": [
               ...
        ],
        "properties": [
            {
                "label-value": 1,
                "class": "Urban",
                "area_m2": "400"

            },
            {
                "label-value": 2,
                "class": "Water",
                "area_m2": "1532"
            },
            {
                "label-value": 3,
                "class": "Agriculture",
                "area_m2": "590"

            }
        ]
    }
}

This is explicit, but has the disadvantage of duplicating the label-value definitions.

  3. Make color another property, e.g.
        "properties": [
            {
                "label-value": 1,
                "rgba": [
                    255,
                    100,
                    100,
                    255
                ],
                "class": "Urban",
                "area_m2": "400"

            },
            {
                "label-value": 2,
                "rgba": [
                    148,
                    50,
                    165,
                    255
                ],
                "class": "Water",
                "area_m2": "1532"
            },
            {
                "label-value": 3,
                "rgba": [
                    148,
                    50,
                    165,
                    255
                ],
                "class": "Agriculture",
                "area_m2": "590"

            }
        ]

This doesn't duplicate label-values, and has the benefit of keeping all properties associated with a particular label-value in one spot.

On the implementation side, I think the differences in parsing the properties are negligible.
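For example, a parsing sketch for option 3 (the metadata path is assumed):

import json

# Load the label metadata and index the properties by label-value.
with open("labels/0/.zattrs") as f:            # assumed location of the metadata
    image_label = json.load(f)["image-label"]

props = {
    entry["label-value"]: {k: v for k, v in entry.items() if k != "label-value"}
    for entry in image_label.get("properties", [])
}
print(props[1]["class"])   # -> "Urban"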

I'd love to hear what other people think are appropriate ways to represent the properties in the label metadata, or what they think the best option is.

Cannot query the dataset example

Dear,

I cannot query the test dataset described in the documentation:

ome_zarr info https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/6001240.zarr/
ERROR:ome_zarr.cli:not a zarr: None

parse_url in write mode does not truncate

Passing mode="w" to parse_url does not truncate existing files as it should.
Minimal example:

import numpy as np
import ome_zarr.io
import ome_zarr.writer
import zarr

file_path = "my.ome.zarr"
loc = ome_zarr.io.parse_url(file_path, mode="w")
group = zarr.group(loc.store)
shape = (1, 1, 1, 256, 256)
ome_zarr.writer.write_image(np.random.rand(*shape), group)

Fails with ContainsArrayError: path '0' contains an array when run the second time.

ome-zarr-py on conda-forge

Would you guys have anything against putting this package on conda-forge? I can handle the process of bringing it up there, and there would not be any increased maintenance effort required from your side. Conda-forge will watch for new releases on pypi...

write_image assumes particular axisorder

Heyo, I just tried to naively write an ndarray to an ome.zarr file, but didn't manage. My array was something like numpy.zeros((100, 200, 1), dtype='uint8') where I considered the axes to be yxc.

Of course, when calling write_image I specified the axes argument. The scaler, however, assumes a certain axis order. This leads to weird errors that are not super obvious.

I'd say this should be documented, or circumvented (e.g. by making Scaler aware of axes, temporarily adding the appropriate axes, etc.); see the sketch below.
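In the meantime, a sketch of the temporary-axes workaround (the TCZYX assumption mirrors what the default Scaler appears to expect; the output path is made up):

import numpy as np
import zarr
from ome_zarr.io import parse_url
from ome_zarr.writer import write_image

data_yxc = np.zeros((100, 200, 1), dtype="uint8")   # axes: y, x, c
# Reorder to c, y, x, then insert singleton t and z axes -> t, c, z, y, x
data_tczyx = data_yxc.transpose(2, 0, 1)[np.newaxis, :, np.newaxis]

store = parse_url("workaround.ome.zarr", mode="w").store  # hypothetical path
write_image(data_tczyx, zarr.group(store), axes="tczyx")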

Data repo sanity checks

@jni pointed out that the multiscale information for 4007801 was missing. Only path "0" was listed. That has now been corrected, but one sanity check would be to detect whether all arrays that are present are listed in some multiscale metadata.
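A rough sketch of such a check (assuming a local copy of the multiscales group):

import zarr

group = zarr.open_group("4007801.zarr", mode="r")   # assumed local copy
# Paths declared in the multiscales metadata...
listed = {
    d["path"]
    for ms in group.attrs.get("multiscales", [])
    for d in ms.get("datasets", [])
}
# ...versus arrays actually present in the group.
present = {name for name, _ in group.arrays()}
missing = present - listed
if missing:
    print(f"arrays not referenced by any multiscale: {sorted(missing)}")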

Trouble registering napari plugin

Excited to see this! Perhaps I'm missing something, but is there a step required to register the reader with napari? The current example doesn't work for me:

~/demos/napari-ome (py3.8) $ ipython
# Python 3.8.2 | packaged by conda-forge | (default, Apr 24 2020, 07:56:27) 
# Type 'copyright', 'credits' or 'license' for more information
# IPython 7.13.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import napari                                                                                                                                          

In [2]: import ome_zarr                                                                                                                                        

In [3]: %gui qt                                                                                                                                                

In [4]: viewer = napari.Viewer()                                                                                                                               

In [5]: viewer.open('https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/6001240.zarr/')                                                                                
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-5-32f855cb8c65> in <module>
----> 1 viewer.open('https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/6001240.zarr/')

AttributeError: 'Viewer' object has no attribute 'open'

Validate data

See ome/ome_zarr_test_suite#11

Currently ome-zarr-py doesn't validate data, e.g. checking that essential metadata is present or that data has the expected dimensions.

Maybe add this functionality to the info command by default (perhaps with a --no-validate option)?

ome_zarr info: add options to introspect nested Zarr groups

Taking the output of bioformats2raw 0.3.0, which is a collection of OME-Zarr multiscale images, as an example:

bioformats2raw "test&series=2.fake" "test.zarr"
(zarr) [sbesson@pilot-zarr1-dev ~]$ ome_zarr info test.zarr/
(zarr) [sbesson@pilot-zarr1-dev ~]$

returns nothing, but the leaves can be introspected:

(zarr) [sbesson@pilot-zarr1-dev ~]$ ome_zarr info test.zarr/0/
/home/sbesson/test.zarr/0 [zgroup]
 - metadata
   - Multiscales
 - data
   - (1, 1, 1, 512, 512)
   - (1, 1, 1, 256, 256)
(zarr) [sbesson@pilot-zarr1-dev ~]$ ome_zarr info test.zarr/1/
/home/sbesson/test.zarr/1 [zgroup]
 - metadata
   - Multiscales
 - data
   - (1, 1, 1, 512, 512)
   - (1, 1, 1, 256, 256)

We might want to introduce some flags allowing a scan of a Zarr group without any OME-specific metadata. While the use case above is trivial, this brings the typical scalability issues associated with scanning a nested layout and/or loading metadata from many leaves.
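A sketch of what such a scan might look like (hypothetical behaviour, not an existing flag):

import zarr

def scan(path: str) -> None:
    """Recursively report sub-groups that carry multiscales metadata."""
    group = zarr.open_group(path, mode="r")
    for name, child in group.groups():
        if "multiscales" in child.attrs:
            print(f"{path}/{name} [multiscale image]")
        scan(f"{path}/{name}")

scan("test.zarr")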

/cc @dominikl @pwalczysko

Refactor PlateLabels

#62 introduced the PlateLabels class which requires some special casing in order to provide 2 nodes from the same URL. One idea for working around this was to use a #labels suffix or similar to differentiate.

see: BioSchemas/specifications#475 however for potential RDF issues with #suffixes.

Provide upgrade tools

With v0.2 released, ome_zarr (and likely other libraries) should provide a way for users to upgrade their datasets to the nested layout.

[Security] Workflow precommit.yml is using vulnerable action pre-commit/action

The workflow precommit.yml references the action pre-commit/action at v2.0.0. However, this reference is missing the commit 80db042ff08cdddbbbfb6f89c06a6bfc4dddf0b7, which may contain a fix for a vulnerability.
The vulnerability fix missing from this action version could be related to:
(1) a CVE fix
(2) an upgrade of a vulnerable dependency
(3) a fix to a secret leak, among others.
Please consider updating the reference to the action.

Improving signal-to-noise with snoopycrimecop

I like following this repo to keep track of issues that may be coming up and of use cases users present. However, it seems a bot (snoopycrimecop) was recently added here, which is generating a fair number of issues (and associated notifications). I am wondering if there is a way to adjust how the bot behaves to allow for easier tracking of user issues. As one option, would it be possible to have the bot comment in only one dedicated issue? Or is there some other way it might be configured that would have a similar result?

.zattrs and .zgroup should be downloaded first, not last

The download code leaves downloading of the .zattrs and .zgroup files for last. However, because partially downloaded zarrs are valid thanks to the fact that zarr is a sparse format, they should be downloaded first so that the user is left with a valid zarr even if the download is interrupted.

Reference code here:

for path, node in sorted(zip(paths, nodes)):
    target_path = output_path / Path(*path)
    resolutions: List[da.core.Array] = []
    datasets: List[str] = []
    for spec in node.specs:
        if isinstance(spec, Multiscales):
            datasets = spec.datasets
            resolutions = node.data
    if datasets and resolutions:
        pbar = ProgressBar()
        for dataset, data in reversed(list(zip(datasets, resolutions))):
            LOGGER.info(f"resolution {dataset}...")
            with pbar:
                data.to_zarr(str(target_path / dataset))
    else:
        # Assume a group that needs metadata, like labels
        zarr.group(str(target_path))
    with (target_path / ".zgroup").open("w") as f:
        f.write(json.dumps(node.zarr.zgroup))
    with (target_path / ".zattrs").open("w") as f:
        metadata: JSONDict = {}
        node.write_metadata(metadata)
        f.write(json.dumps(metadata))
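A minimal reordering sketch of the proposed fix, reusing the variables from the loop above: persist the metadata before fetching any arrays, so an interrupted download still leaves a readable, sparse Zarr.

# Hypothetical fix: write .zgroup/.zattrs first ...
with (target_path / ".zgroup").open("w") as f:
    f.write(json.dumps(node.zarr.zgroup))
with (target_path / ".zattrs").open("w") as f:
    metadata: JSONDict = {}
    node.write_metadata(metadata)
    f.write(json.dumps(metadata))
# ... then download the resolution arrays exactly as before.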

parse_url fails to parse OME data with Windows file paths

Hi, I have generated some local Zarr stores with OME formatting on Windows, but ome_zarr.parse_url returns these as a remote store...

The problem is that urlparse separates out the Windows drive letter as result.scheme.

from urllib.parse import urlparse
from ome_zarr import LocalZarr, RemoteZarr

def parse_url(path):
    result = urlparse(path)
    if result.scheme in ("", "file"):
        # Strips 'file://' if necessary
        return LocalZarr(result.path)
    else:
        return RemoteZarr(path)

path = "C:/biomic/devdata/test_zarr100.zarr"
zarr_store = parse_url(path)
type(zarr_store)


Out[91]: ome_zarr.RemoteZarr

It seems for Windows compatibility it may be best to check if the directory exists before trying to parse as a URL... here is what I added to make it work:

from urllib.parse import urlparse
from ome_zarr import LocalZarr, RemoteZarr
from pathlib import Path

def parse_url(path):
    try:
        if Path(path).is_dir():
            return LocalZarr(path)
    except OSError:
        pass  # invalid path on this platform; fall back to URL parsing below
    result = urlparse(path)
    if result.scheme in ("", "file"):
        # Strips 'file://' if necessary
        return LocalZarr(result.path)
    else:
        return RemoteZarr(path)

path = "C:/biomic/devdata/test_zarr100.zarr"
zarr_store = parse_url(path)
type(zarr_store)

Out[92]: ome_zarr.LocalZarr

Happy to open a pull request.

Thanks!

Allow use of Zarr stores other than FSStore

We've been working on a remote Zarr workflow that would require the use of a HTTPFileSystem. However, the current implementation of ome-zarr-py has the FSStore kinda baked into it. I've thought a little bit about how to accommodate alternative fsspec implementations, but everything I could think of would require more serious refactoring than I wanted to start without getting some input from others first. To avoid breaking anything important, I had a couple questions.

  • Is there a reason that storage creation is the responsibility of the Format classes? I would've expected Format to take a store as an argument and tell you the format version based on that, but instead it seems like you need to assume a format in order to create a store, which you then check the format of (see the snippet below):

    loader = fmt
    if loader is None:
        loader = CurrentFormat()
    self.__store = loader.init_store(self.__path, mode)
    self.__init_metadata()
    detected = detect_format(self.__metadata)
    if detected != fmt:
        LOGGER.warning(f"version mismatch: detected:{detected}, requested:{fmt}")
        self.__fmt = detected
        self.__store = detected.init_store(self.__path, mode)
        self.__init_metadata()
  • Is there a problem with passing a store to ZarrLocation instead of a path? We could extract the path as necessary, but then ZarrLocation would be able to accept any fsspec compatible storage without any extra code in ome-zarr-py.

As is probably implied by my questions, I think the code should be refactored such that ZarrLocation takes a store as a constructor argument and the format detects the version of that store rather than the current workflow. This will change the constructor signature for ZarrLocation though, which seems like a problem. Should I make a very similar parallel class (e.g. ZarrStoreLocation), or perform the refactoring? Or is there some important advantage of the design which I've overlooked? I appreciate any input on how to move forward here. Thanks!
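For concreteness, a sketch of the proposed direction (the ZarrLocation(store) signature is hypothetical; the URL is made up):

import fsspec

# Any fsspec protocol (http, s3, local, ...) yields a mapping zarr can use
# as a store, without ome-zarr-py hardcoding FSStore creation.
store = fsspec.get_mapper("https://example.com/data.zarr")
# Proposed: the location wraps the store and the format is detected from it.
# location = ZarrLocation(store)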

Bloated dependencies and installation

I think that the dependencies are a bit bloated in this package.
In particular:

  • opencv dependency: this should be avoided if there is also scikit-image in the dependencies; for me this makes the package unusable in many deployments because opencv clashes with other dependencies. Afaik opencv is only used for nearest neighbor downsampling, which can also be done with scikit-image.
  • vispy dependency: why? there is no visualisation involved here. Probably this is a leftover from the napari cookiecutter.
  • napari: same as vispy

Also, in the installation the package installs a napari plugin, which should be removed in favor of napari-ome-zarr.

Pipeline to generate ome-zarr format

I know the format is experimental, but are there available pipelines to generate Zarr data that adheres to the OME metadata spec? I've tried bioformats2raw, which seems related, but I don't see the corresponding keys in the .zattrs file.

Many thanks in advance! Really like this project.
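For what it's worth, ome-zarr-py itself can write spec-adherent data; a minimal sketch (the output path is made up):

import numpy as np
import zarr
from ome_zarr.io import parse_url
from ome_zarr.writer import write_image

data = np.random.randint(0, 256, (1, 1, 32, 256, 256), dtype="uint8")
store = parse_url("example.ome.zarr", mode="w").store  # hypothetical path
# write_image builds the pyramid and writes the multiscales keys to .zattrs.
write_image(data, zarr.group(store), axes="tczyx")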
