
hdmf-zarr's People

Contributors

alejoe91, bendichter, mavaylon1, oruebel, rly


hdmf-zarr's Issues

Create release on Conda

A Conda release is not a must, but it would be nice to have and would be a good learning experience.

[Feature]: Parallel Write Support for HDMF-Zarr

What would you like to see added to HDMF-ZARR?

Parallel Write Support for HDMF-Zarr

Allow NWB files written using the Zarr backend to leverage multiple threads or CPUs to enhance speed of operation.
Objectives

  • Zarr is built to support efficient Python parallelization strategies, both multi-processing and multi-threading
  • HDMF-Zarr currently handles all write operations (including buffering and slicing) without exposing the necessary controls to enable these strategies
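
For context, plain Zarr already supports concurrent writes as long as each worker targets a chunk-aligned region. A minimal sketch, independent of hdmf-zarr and using only the public zarr API (all names and sizes are illustrative):

from concurrent.futures import ThreadPoolExecutor

import numpy as np
import zarr

# Pre-allocate the array; each worker then fills one chunk-aligned slab,
# so no synchronization between workers is required.
z = zarr.open("example.zarr", mode="w", shape=(1000, 1000),
              chunks=(100, 1000), dtype="f8")

def write_slab(i):
    # Rows [i*100, (i+1)*100) correspond to exactly one chunk.
    z[i * 100:(i + 1) * 100, :] = np.random.rand(100, 1000)

with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(write_slab, range(10)))

The feature request is essentially about exposing this kind of control through the io.write() stack.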

Approach and Plan
  • Identify the best injection point for parallelization parameters in the io.write() stack of HDMF-Zarr
Progress and Next Steps
TODO
Background and References

Is your feature request related to a problem?

No response

What solution would you like?

Identify the best injection point for parallelization parameters in the io.write() stack of HDMF-Zarr (as controlled via the NWBZarrIO)

This essentially revolves around the line https://github.com/hdmf-dev/hdmf/blob/2f9ec567ebe1df9fccb05f139d2f669661e50018/src/hdmf/backends/hdf5/h5_utils.py#L61 from the main repo (which might be what is used to delegate the command here as well?).

Do you have any interest in helping implement the feature?

Yes.


[Feature] Explore adding support for more Zarr storage backends

Explore adding support for other relevant Zarr backend stores to ZarrIO. See https://zarr.readthedocs.io/en/stable/api/storage.html for a list of possibly relevant stores. A few relevant tasks related to this issue are:

  • Add new data stores to ZarrIO

    • Add support for the SQLite store in ZarrIO
    • Add support for a read-only ZipStore in ZarrIO. The ZipStore in Zarr is not mutable, i.e., writes to datasets must be aligned with chunks, and attributes must be added all at once, since files cannot be updated in the zip archive once created. Because of this, it will be challenging to support write with ZipStore in the current implementation. Users would instead need to write with the DirectoryStore and zip the folder afterwards (see the sketch after this list).
    • Evaluate adding support for other Zarr stores (e.g., database stores). Resolution of links may need to be handled differently for different stores.
  • Add tests and tutorials

    • Add unit tests for NWBZarrIO to test writing with multiple different storage backends
    • Add tutorial for using different data backends
  • ZarrIO updates

    • Update handling of references on export when using file-based Zarr stores to make sure links are created correctly
    • Update handling of references on read to ensure references to the new stores are resolved correctly on read

This issue is also related to #62, which added support for different variants of the DirectoryStore (e.g., TempStore and NestedDirectoryStore).
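
A minimal sketch of the write-then-zip workaround mentioned above, using only the standard library and the public zarr API (paths are illustrative):

import shutil

import zarr

# Write with a DirectoryStore as usual ("example.zarr" is the resulting
# directory), then zip the folder once the write is complete.
shutil.make_archive("example_zarr", "zip", "example.zarr")

# Read the data back through a read-only ZipStore.
store = zarr.storage.ZipStore("example_zarr.zip", mode="r")
root = zarr.open(store, mode="r")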

Setup CodeCov

  • Set up the CodeCov pipeline for the repo
  • Require minimum coverage for patches as part of PRs
  • Add a CodeCov badge to the README.md

Pipelines failing due to codecov

CI pipelines are failing with:

ERROR: Could not find a version that satisfies the requirement codecov==2.1.12 (from versions: 2.1.13)

PyNWB warnings in plot_convert_nwb_hdf5.py

The tutorial plot_convert_nwb_hdf5.py currently raises the following warnings.

/Users/oruebel/Devel/nwb/pynwb/src/pynwb/core.py:47: UserWarning: OpticalSeries 'StimulusPresentation_encoding': The number of frame indices in 'starting_frame' should have the same length as 'external_file'.
  warn(error_msg)
/Users/oruebel/Devel/nwb/pynwb/src/pynwb/core.py:47: UserWarning: OpticalSeries 'StimulusPresentation_encoding': Either external_file or data must be specified (not None), but not both.
  warn(error_msg)

These warnings were added in the latest version of PyNWB, and they appear to stem from an issue with the data on DANDI itself rather than from the tutorial or HDMF-Zarr. Changing the dataset that is being used should address this issue.

Setup readthedocs

  • Set up the readthedocs build for the dev branch and the stable release
  • Add docs badge to the README

Add support for ruff and the other modern tooling that is now in HDMF

To keep in line with HDMF:
The Python Packaging Authority is gradually phasing out setup.py in favor of pyproject.toml. We make the same change here.

This involves removing versioneer as a dependency due to challenges to get it to play nicely with pyproject.toml. Using setuptools_scm for setting the package version appears to be an adequate replacement.

We will also now use the popular black and ruff tools to impose a strict, mostly uncompromising style on the code base. ruff replaces flake8 and isort; it is significantly faster, sorts imports, and performs additional checks. Running these tools on the code base involves changing basically every Python file...

Finally, to help automate the usage of black, ruff, and codespell, I recommend that developers install and use pre-commit, which runs these tools as well as several other helpful utility checks to clean up the code and identify issues prior to every commit.

Fix support for external links on export

Exporting of HDF5 files with external links is not yet fully implemented/tested. tests/unit/test_io_zarr.py defines several test cases for this scenario that are not yet passing and that need to be addressed to complete support for external links on export, e.g.:

  • test_soft_link_dataset
  • test_external_link_group
  • test_external_link_dataset
  • test_external_link_link
  • test_attr_reference
  • test_append_data
  • test_append_external_link_data
  • test_append_external_link_copy_data
  • test_export_dset_refs
  • test_export_cpd_dset_refs
  • hdmf.backends.hdf5.h5tools.HDF5IO uses the export_source argument on export. Need to check whether we may need to use it here as well to address this issue.

Arrays possibly being transposed when converting NWB files from HDF5 to ZARR

The tutorial for converting NWB data from HDF5 to Zarr currently shows the following warnings (see also https://hdmf-zarr.readthedocs.io/en/latest/tutorials/plot_convert_nwb.html#read-the-zarr-file-back-in):

[Screenshot: warning output shown when reading the converted Zarr file]

In particular, the warning beginning "Length of data does not match length of timestamps. Your data may be transposed. Time should be on " should be looked at, as it appears that (some) arrays may for some reason be transposed in the conversion.

Update tox.ini to use test_gallery.py and fix gallery-python-3.7 tests

  1. Update tox.ini to use test_gallery.py to be in line with HDMF.
  2. Currently, both linux-gallery-python3.7-minimum and windows-gallery-python3.7-minimum pass locally when running "python test.py --example" but fail during the GitHub checks. I've also tested a version of test_gallery.py in a branch by running "python test_gallery.py"; however, this returns an error regarding missing files (refer to the attached image).
    [Screenshot: missing-file error from running test_gallery.py]

[Bug]: Test are failing with latest HDMF

What happened?

The latest HDMF adds the abstract HDMFIO.can_read method. Several tests use a "dummy" OtherIO class that is missing this method. As a result, those tests fail because the OtherIO class can no longer be instantiated.

TypeError: Can't instantiate abstract class OtherIO with abstract method can_read
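
A minimal sketch of the fix, assuming the dummy class subclasses HDMFIO directly; the trivial method bodies are illustrative:

from hdmf.backends.io import HDMFIO

class OtherIO(HDMFIO):
    """Dummy IO class used only for testing, not for real I/O."""

    @staticmethod
    def can_read(path):
        # Implement the new abstract method so the class can be
        # instantiated again; a dummy backend never claims a file.
        return False

    def read_builder(self):
        pass

    def write_builder(self, builder, **kwargs):
        pass

    def open(self):
        pass

    def close(self):
        pass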

Steps to Reproduce

See e.g., https://github.com/hdmf-dev/hdmf/actions/runs/5504859898/jobs/10031532667?pr=890

Traceback

See e.g., https://github.com/hdmf-dev/hdmf/actions/runs/5504859898/jobs/10031532667?pr=890

Operating System

Linux

Python Executable

Conda

Python Version

3.9

Package Versions

No response


Add PyNWB tests

NeurodataWithoutBorders/pynwb#1018 updates the PyNWB test harness to add ZarrIO to the roundtrip tests, which in turn runs all HDF5 roundtrip tests defined in PyNWB for Zarr as well. This requires changing the test harness in PyNWB; instead, it would be useful to be able to “inject” new I/O backends into the PyNWB test harness so that we can specify those tests here, rather than implementing this in PyNWB and making PyNWB dependent on hdmf-zarr.

Note: The following changes from NeurodataWithoutBorders/pynwb#1018 have already been ported:

  • docs/notebooks/zarr_file_conversion_test.ipynb from the PyNWB PR has been ported to docs/gallery/plot_convert_nwb.py here
  • The changes in src/pynwb/__init__.py from the PyNWB PR have been added in src/hdmf_zarr/nwb.py here
    However, the changes to the tests have not been ported (at least not fully), and the test harness of PyNWB has undergone some refactoring in the meantime, so we'll need to check how best to implement these tests.

Dimension warning for ElectricalSeries in NWBZarrIO Tutorial

The tutorial for Creating NWB files using NWBZarrIO currently raises the following warnings:

/home/docs/checkouts/readthedocs.org/user_builds/hdmf-zarr/envs/dev/lib/python3.7/site-packages/pynwb/ecephys.py:93: UserWarning: The second dimension of data does not match the length of electrodes. Your data may be transposed.
  warnings.warn("The second dimension of data does not match the length of electrodes. Your data may be "
/home/docs/checkouts/readthedocs.org/user_builds/hdmf-zarr/envs/dev/lib/python3.7/site-packages/pynwb/ecephys.py:93: UserWarning: The second dimension of data does not match the length of electrodes. Your data may be transposed.
  warnings.warn("The second dimension of data does not match the length of electrodes. Your data may be "

It appears that those warnings are due to errors in the initialization of the test data in the tutorial itself, rather than a bug in the library; a sketch of the likely fix follows.
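
For reference, pynwb emits this warning when the second dimension of data does not equal the number of electrodes referenced by the ElectricalSeries. A minimal sketch of correctly shaped test data (sizes are illustrative):

import numpy as np

# Time on the first axis, channels on the second: an ElectricalSeries
# referencing 4 electrodes should carry data of shape (n_samples, 4).
n_samples, n_electrodes = 1000, 4
data = np.random.rand(n_samples, n_electrodes)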

Add support for storing region references

Region references are not yet fully implemented in ZarrIO. Implementing region references will require updating:

  1. ZarrReference to add a region key to support storing the selection for the region (a possible encoding is sketched after this list),
  2. ZarrIO.__get_ref to support passing in the region definition to be added to the ZarrReference,
  3. ZarrIO.write_dataset, which already partially implements the required logic for creating region references by checking for hdmf.build.RegionBuilder inputs but will likely need updates as well,
  4. ZarrIO.__read_dataset to support reading region references, which may also require updates to ZarrIO.__parse_ref and ZarrIO.__resolve_ref,
  5. and possibly other parts of ZarrIO.
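
As an illustration of item 1, a possible encoding of the region inside a ZarrReference-style mapping; the layout of the region value is hypothetical, not a settled design:

# Zarr attributes must be JSON-serializable, so the selection is encoded
# as plain lists rather than slice objects (hypothetical encoding).
region_ref = {
    "source": ".",                   # file containing the target dataset
    "path": "/acquisition/ts/data",  # path of the dataset within that file
    "region": [[0, 100], [0, 3]],    # per-axis [start, stop] selection
}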

Support lazy read of object references

Currently, object references are always loaded and resolved on read. To avoid potentially reading and resolving large numbers of references on read, it would be ideal if references could be resolved lazily.

See also:

if has_reference:
    try:
        # TODO Should implement a lazy way to evaluate references for Zarr
        data = deepcopy(data[:])
        self.__parse_ref(kwargs['maxshape'], obj_refs, reg_refs, data)
    except ValueError as e:
        raise ValueError(str(e) + " zarr-name=" + str(zarr_obj.name) + " name=" + str(name))
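
One possible direction, sketched purely as an illustration (resolve_one_ref is a hypothetical helper, not an existing ZarrIO method): wrap each stored reference in a small proxy that defers resolution until first access.

class LazyReference:
    """Resolve a stored Zarr reference only when first accessed."""

    def __init__(self, io, zarr_ref):
        self._io = io          # the ZarrIO instance that read the file
        self._ref = zarr_ref   # the raw reference as stored on disk
        self._target = None    # cached result of the resolution

    def get(self):
        if self._target is None:
            # resolve_one_ref is hypothetical; today resolution happens
            # eagerly during read.
            self._target = self._io.resolve_one_ref(self._ref)
        return self._target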

[Bug]: DeployRelease

What happened?

I made a release that had a bug due to using Python 3.10 tox tests instead of 3.11. We merged a fix for that. I did a manual release but noticed that the PyPI password secret was being sliced at a special character. I am assuming the workflow failed because I had done a manual release earlier and the version already existed on PyPI. As a result, this bug report is a point of reference for the next release in case it fails. If it passes, this bug will be removed.

Steps to Reproduce

Run the workflow for deploy_release

Traceback

Successfully installed Pygments-2.15.1 SecretStorage-3.3.3 bleach-6.0.0 certifi-2023.7.22 cffi-1.15.1 charset-normalizer-3.2.0 cryptography-41.0.2 docutils-0.20.1 idna-3.4 importlib-metadata-6.8.0 jaraco.classes-3.3.0 jeepney-0.8.0 keyring-24.2.0 markdown-it-py-3.0.0 mdurl-0.1.2 more-itertools-9.1.0 pkginfo-1.9.6 pycparser-2.21 readme-renderer-40.0 requests-2.31.0 requests-toolbelt-1.0.0 rfc3986-2.0.0 rich-13.4.2 six-1.16.0 twine-4.0.2 urllib3-2.0.4 webencodings-0.5.1 zipp-3.16.2
hdmf_zarr-0.3.0-py3-none-any.whl
hdmf_zarr-0.3.0.tar.gz
/home/runner/work/_temp/3aa02331-9c9c-4b3b-ae0b-61969c85efb9.sh: line 4: M7je3: command not found
usage: twine upload [-h] [-r REPOSITORY] [--repository-url REPOSITORY_URL]
                    [-s] [--sign-with SIGN_WITH] [-i IDENTITY] [-u USERNAME]
                    [-p PASSWORD] [--non-interactive] [-c COMMENT]
                    [--config-file CONFIG_FILE] [--skip-existing]
                    [--cert path] [--client-cert path] [--verbose]
                    [--disable-progress-bar]
                    dist [dist ...]
twine upload: error: the following arguments are required: dist

Operating System

Linux

Python Executable

Python

Python Version

3.11

Package Versions

No response


Zarr links should be relative to root

Hi guys,

I was able to successfully export sorting info + waveforms and the electrode table to NWB-Zarr (using neuroconv).

I performed the conversion remotely and then downloaded the resulting files. When I try to read the file locally, I get a bad link error:

ValueError                                Traceback (most recent call last)
Cell In [5], line 4
      1 nwbfile_path = "/home/alessio/Documents/data/debug/ecephys_632269_2022-10-13_15-41-42_zarr.nwb"
      3 io = NWBZarrIO(nwbfile_path, "r")
----> 4 nwbfile = io.read()

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf/utils.py:645, in docval.<locals>.dec.<locals>.func_call(*args, **kwargs)
    643 def func_call(*args, **kwargs):
    644     pargs = _check_args(args, kwargs)
--> 645     return func(args[0], **pargs)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf/backends/io.py:38, in HDMFIO.read(self, **kwargs)
     35 @docval(returns='the Container object that was read in', rtype=Container)
     36 def read(self, **kwargs):
     37     """Read a container from the IO source."""
---> 38     f_builder = self.read_builder()
     39     if all(len(v) == 0 for v in f_builder.values()):
     40         # TODO also check that the keys are appropriate. print a better error message
     41         raise UnsupportedOperation('Cannot build data. There are no values.')

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf/utils.py:645, in docval.<locals>.dec.<locals>.func_call(*args, **kwargs)
    643 def func_call(*args, **kwargs):
    644     pargs = _check_args(args, kwargs)
--> 645     return func(args[0], **pargs)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf_zarr/backend.py:937, in ZarrIO.read_builder(self)
    935 @docval(returns='a GroupBuilder representing the NWB Dataset', rtype='GroupBuilder')
    936 def read_builder(self):
--> 937     f_builder = self.__read_group(self.__file, ROOT_NAME)
    938     return f_builder

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf_zarr/backend.py:973, in ZarrIO.__read_group(self, zarr_obj, name)
    971 # read sub groups
    972 for sub_name, sub_group in zarr_obj.groups():
--> 973     sub_builder = self.__read_group(sub_group, sub_name)
    974     ret.set_group(sub_builder)
    976 # read sub datasets

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf_zarr/backend.py:973, in ZarrIO.__read_group(self, zarr_obj, name)
    971 # read sub groups
    972 for sub_name, sub_group in zarr_obj.groups():
--> 973     sub_builder = self.__read_group(sub_group, sub_name)
    974     ret.set_group(sub_builder)
    976 # read sub datasets

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf_zarr/backend.py:973, in ZarrIO.__read_group(self, zarr_obj, name)
    971 # read sub groups
    972 for sub_name, sub_group in zarr_obj.groups():
--> 973     sub_builder = self.__read_group(sub_group, sub_name)
    974     ret.set_group(sub_builder)
    976 # read sub datasets

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf_zarr/backend.py:982, in ZarrIO.__read_group(self, zarr_obj, name)
    979     ret.set_dataset(sub_builder)
    981 # read the links
--> 982 self.__read_links(zarr_obj=zarr_obj, parent=ret)
    984 self._written_builders.set_written(ret)  # record that the builder has been written
    985 self.__set_built(zarr_obj, ret)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf_zarr/backend.py:1008, in ZarrIO.__read_links(self, zarr_obj, parent)
   1006     l_path = os.path.join(link['source'], link['path'].lstrip("/"))
   1007 if not os.path.exists(l_path):
-> 1008     raise ValueError("Found bad link %s in %s to %s" % (link_name, self.__path, l_path))
   1010 target_name = str(os.path.basename(l_path))
   1011 target_zarr_obj = zarr.open(l_path, mode='r')

ValueError: Found bad link device in /home/alessio/Documents/data/debug/ecephys_632269_2022-10-13_15-41-42_zarr.nwb to results/ecephys_632269_2022-10-13_15-41-42_zarr.nwb/general/devices/Device

After debugging, the l_path is indeed the path on my remote machine. I think saving links relative to the zarr root should fix it.
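
A minimal sketch of the proposed behavior; the function names are illustrative:

import os

def make_link_relative(link_source, zarr_root):
    # On write: store the link source relative to the Zarr root instead
    # of as an absolute path, so the file stays valid after a move.
    return os.path.relpath(link_source, start=zarr_root)

def resolve_link(link_source, zarr_root):
    # On read: resolve the stored relative source against the current
    # location of the Zarr root.
    return os.path.normpath(os.path.join(zarr_root, link_source))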

[Feature]: Remove support for python 3.7

What would you like to see added to HDMF-ZARR?

Remove all python 3.7 options and requirements.

Is your feature request related to a problem?

No response

What solution would you like?

Remove all python 3.7 options and requirements.

Do you have any interest in helping implement the feature?

Yes.


[Bug]: Could not find already-built Builder for DynamicTable 'electrodes' in BuildManager

What happened?

Attempted a basic NWB file write using the Zarr backend and hit a snag; unsure how to proceed.

Steps to Reproduce

from pynwb.testing.mock.file import mock_NWBFile
from pynwb.testing.mock.ecephys import mock_ElectricalSeries
from hdmf_zarr import NWBZarrIO

nwbfile = mock_NWBFile()
nwbfile.add_acquisition(mock_ElectricalSeries())

with NWBZarrIO(path="/home/jovyan/Downloads/test_zarr.nwb", mode="w") as io:
    io.write(nwbfile)

Traceback

/opt/conda/lib/python3.10/site-packages/hdmf_zarr/backend.py:92: UserWarning: The ZarrIO backend is experimental. It is under active development. The ZarrIO backend may change any time and backward compatibility is not guaranteed.
  warnings.warn(warn_msg)
---------------------------------------------------------------------------
ReferenceTargetNotBuiltError              Traceback (most recent call last)
Cell In[9], line 2
      1 with NWBZarrIO(path="/home/jovyan/Downloads/test_zarr.nwb", mode="w") as io:
----> 2     io.write(nwbfile)

File /opt/conda/lib/python3.10/site-packages/hdmf/utils.py:645, in docval.<locals>.dec.<locals>.func_call(*args, **kwargs)
    643 def func_call(*args, **kwargs):
    644     pargs = _check_args(args, kwargs)
--> 645     return func(args[0], **pargs)

File /opt/conda/lib/python3.10/site-packages/hdmf_zarr/backend.py:160, in ZarrIO.write(self, **kwargs)
    158 """Overwrite the write method to add support for caching the specification"""
    159 cache_spec = popargs('cache_spec', kwargs)
--> 160 super(ZarrIO, self).write(**kwargs)
    161 if cache_spec:
    162     self.__cache_spec()

File /opt/conda/lib/python3.10/site-packages/hdmf/utils.py:645, in docval.<locals>.dec.<locals>.func_call(*args, **kwargs)
    643 def func_call(*args, **kwargs):
    644     pargs = _check_args(args, kwargs)
--> 645     return func(args[0], **pargs)

File /opt/conda/lib/python3.10/site-packages/hdmf/backends/io.py:56, in HDMFIO.write(self, **kwargs)
     54 """Write a container to the IO source."""
     55 container = popargs('container', kwargs)
---> 56 f_builder = self.__manager.build(container, source=self.__source, root=True)
     57 self.write_builder(f_builder, **kwargs)

File /opt/conda/lib/python3.10/site-packages/hdmf/utils.py:645, in docval.<locals>.dec.<locals>.func_call(*args, **kwargs)
    643 def func_call(*args, **kwargs):
    644     pargs = _check_args(args, kwargs)
--> 645     return func(args[0], **pargs)

File /opt/conda/lib/python3.10/site-packages/hdmf/build/manager.py:188, in BuildManager.build(self, **kwargs)
    184     self.logger.debug("Using prebuilt %s '%s' for %s '%s'"
    185                       % (result.__class__.__name__, result.name,
    186                          container.__class__.__name__, container.name))
    187 if root:  # create reference builders only after building all other builders
--> 188     self.__add_refs()
    189     self.__active_builders.clear()  # reset active builders now that build process has completed
    190 return result

File /opt/conda/lib/python3.10/site-packages/hdmf/build/manager.py:233, in BuildManager.__add_refs(self)
    230 call = self.__ref_queue.popleft()
    231 self.logger.debug("Adding ReferenceBuilder with call id %d from queue (length %d)"
    232                   % (id(call), len(self.__ref_queue)))
--> 233 call()

File /opt/conda/lib/python3.10/site-packages/hdmf/build/objectmapper.py:952, in ObjectMapper.__set_attr_to_ref.<locals>._filler()
    948 def _filler():
    949     self.logger.debug("Setting reference attribute on %s '%s' attribute '%s' to %s"
    950                       % (builder.__class__.__name__, builder.name, spec.name,
    951                          attr_value.__class__.__name__))
--> 952     target_builder = self.__get_target_builder(attr_value, build_manager, builder)
    953     ref_attr_value = ReferenceBuilder(target_builder)
    954     builder.set_attribute(spec.name, ref_attr_value)

File /opt/conda/lib/python3.10/site-packages/hdmf/build/objectmapper.py:895, in ObjectMapper.__get_target_builder(self, container, build_manager, builder)
    893 target_builder = build_manager.get_builder(container)
    894 if target_builder is None:
--> 895     raise ReferenceTargetNotBuiltError(builder, container)
    896 return target_builder

ReferenceTargetNotBuiltError: electrodes (root/acquisition/ElectricalSeries/electrodes): Could not find already-built Builder for DynamicTable 'electrodes' in BuildManager

Operating System

Linux

Python Executable

Conda

Python Version

3.11

Package Versions

DANDI Hub basic kernel on 6/15/2023 with only hdmf-zarr installed manually


Remove test.py

Testing using test.py is deprecated. Tests should be run using either pytest or python test_gallery.py. Let's remove test.py to reduce confusion.

Add test to ensure links keep working after files are moved

In #44 and #46 we changed references to use relative paths to improve the portability of files. During debugging, we verified that links continue to function when file paths change by changing the current working directory. For future testing, we should add a test where we generate a file with references, move the file to a different path, and then open the file with different relative and absolute paths (and from different Python working directories) to make sure links continue to function when files are moved. See the example below from the tutorial, which can be turned into a unit test (a sketch of the missing move step follows the example):

###############################################################################
# Test opening the file
# ---------------------
with NWBZarrIO(path=path, mode="r") as io:
    infile = io.read()

###############################################################################
# Test opening with the absolute path instead
# -------------------------------------------
with NWBZarrIO(path=absolute_path, mode="r") as io:
    infile = io.read()

###############################################################################
# Test changing the current directory
# ------------------------------------
import os
os.chdir(os.path.abspath(os.path.join(os.getcwd(), "../")))
with NWBZarrIO(path=absolute_path, mode="r") as io:
    infile = io.read()
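
The missing piece relative to this issue is the actual move of the file; a sketch of the additional step (paths are illustrative):

import shutil

# Move the file to a different location and re-open it from there to
# verify that the stored links still resolve.
new_path = os.path.join("..", "moved_test_zarr.nwb")
shutil.move(absolute_path, new_path)
with NWBZarrIO(path=new_path, mode="r") as io:
    infile = io.read()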

Save object id's as part of links and references

  • To enhance the portability of links and references, it would be nice to store the object_id of the target in addition to the relative path when a link/reference points to an external file. This will be useful for error checking and can also help resolve links in case paths are not valid.
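
A hypothetical sketch of such a link entry; the source and path keys mirror what is already stored, while object_id is the proposed addition:

# Hypothetical link entry with the target container's UUID stored next
# to the relative path for error checking and link recovery.
link = {
    "source": "../other_file.nwb",
    "path": "/general/devices/Device",
    "object_id": "0f3a1b9c-0000-0000-0000-000000000000",  # illustrative UUID
}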

[Documentation]: favicon

What would you like changed or added to the documentation and why?

Add favicon to docs

Do you have any interest in helping write or edit the documentation?

No.


Rename Mixin classes in test_io_convert

Rename the Mixin classes used to implement conversion tests from TestCaseConvertMixin to MixinTestCaseConvert to avoid issues with pytest picking up the abstract mixin classes as actual tests.

[Bug]: Min req tests failing on `import zarr`

What happened?

The nightly macos-python3.7-minimum tests have been failing for 2 days. See stacktrace.

Steps to Reproduce

See https://github.com/hdmf-dev/hdmf-zarr/actions/runs/5301947163/jobs/9596558206

Traceback

==================================== ERRORS ====================================
________________ ERROR collecting tests/unit/test_io_convert.py ________________
ImportError while importing test module '/Users/runner/work/hdmf-zarr/hdmf-zarr/tests/unit/test_io_convert.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../hostedtoolcache/Python/3.7.17/x64/lib/python3.7/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/unit/test_io_convert.py:40: in <module>
    from hdmf_zarr.backend import (ZarrIO,
src/hdmf_zarr/__init__.py:1: in <module>
    from .backend import ZarrIO
src/hdmf_zarr/backend.py:12: in <module>
    import zarr
.tox/py37-minimum/lib/python3.7/site-packages/zarr/__init__.py:2: in <module>
    from zarr.codecs import *
.tox/py37-minimum/lib/python3.7/site-packages/zarr/codecs.py:2: in <module>
    from numcodecs import *
.tox/py37-minimum/lib/python3.7/site-packages/numcodecs/__init__.py:32: in <module>
    from numcodecs.bz2 import BZ2
.tox/py37-minimum/lib/python3.7/site-packages/numcodecs/bz2.py:1: in <module>
    import bz2 as _bz2
../../../hostedtoolcache/Python/3.7.17/x64/lib/python3.7/bz2.py:19: in <module>
    from _bz2 import BZ2Compressor, BZ2Decompressor
E   ModuleNotFoundError: No module named '_bz2'
__________________ ERROR collecting tests/unit/test_zarrio.py __________________
ImportError while importing test module '/Users/runner/work/hdmf-zarr/hdmf-zarr/tests/unit/test_zarrio.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../hostedtoolcache/Python/3.7.17/x64/lib/python3.7/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/unit/test_zarrio.py:12: in <module>
    from tests.unit.base_tests_zarrio import (BaseTestZarrWriter,
tests/unit/base_tests_zarrio.py:13: in <module>
    import zarr
.tox/py37-minimum/lib/python3.7/site-packages/zarr/__init__.py:2: in <module>
    from zarr.codecs import *
.tox/py37-minimum/lib/python3.7/site-packages/zarr/codecs.py:2: in <module>
    from numcodecs import *
.tox/py37-minimum/lib/python3.7/site-packages/numcodecs/__init__.py:32: in <module>
    from numcodecs.bz2 import BZ2
.tox/py37-minimum/lib/python3.7/site-packages/numcodecs/bz2.py:1: in <module>
    import bz2 as _bz2
../../../hostedtoolcache/Python/3.7.17/x64/lib/python3.7/bz2.py:19: in <module>
    from _bz2 import BZ2Compressor, BZ2Decompressor
E   ModuleNotFoundError: No module named '_bz2'

Operating System

Linux

Python Executable

Conda

Python Version

3.7

Package Versions

No response

Code of Conduct

conda-linux-python3.7-minimum test failing

py37-minimum create: /home/runner/work/hdmf-zarr/hdmf-zarr/.tox/py37-minimum
ERROR: invocation failed (exit code 1), logfile: /home/runner/work/hdmf-zarr/hdmf-zarr/.tox/py37-minimum/log/py37-minimum-0.log
================================== log start ===================================
AttributeError: 'dict' object has no attribute 'select'

=================================== log end ====================================
ERROR: InvocationError for command /usr/share/miniconda/envs/true/bin/python3.7 -m virtualenv --download --python /usr/share/miniconda/envs/true/bin/python3.7 py37-minimum (exited with code 1)
___________________________________ summary ____________________________________
ERROR:   py37-minimum: InvocationError for command /usr/share/miniconda/envs/true/bin/python3.7 -m virtualenv --download --python /usr/share/miniconda/envs/true/bin/python3.7 py37-minimum (exited with code 1)

Setup branch protections

Set up GitHub branch protections for the dev branch (similar to the setup in HDMF) to: i) require PRs and prevent direct commits to the dev branch, and ii) require that PRs pass the main CI checks (see #10).

Add support for dtype and shape on ZarrDataIO

In HDMF, hdmf-dev/hdmf#747 added the ability to set up datasets on write ahead of time without having the actual data. This was accomplished by adding optional shape and dtype parameters on DataIO. ZarrDataIO currently does not support these parameters; see:

# NOTE: dtype and shape of the DataIO base class are not yet supported by ZarrDataIO.
# These parameters are used to create empty data to allocate the data but
# leave the I/O to fill the data to the user.
super(ZarrDataIO, self).__init__(data=data,
                                 dtype=None,
                                 shape=None)

To match functionality with H5DataIO, it would be useful to add support for shape and dtype to ZarrDataIO and update the ZarrIO backend to support creation of empty datasets from ZarrDataIO objects that only have the shape and dtype specified but contain no actual data. The changes needed in ZarrDataIO should be fairly minimal (i.e., essentially just updating the docval and handling in __init__). The main changes required should be in ZarrIO, to check for the case when ZarrDataIO.data is empty and support creation of empty Zarr datasets. A sketch of the desired usage follows.
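
A sketch of the desired usage once implemented, mirroring what H5DataIO already supports (parameter names assumed from the DataIO base class):

import numpy as np
from hdmf_zarr.utils import ZarrDataIO

# Allocate an empty (1000 x 32) float32 dataset at write time; the data
# itself would be filled in later, outside of io.write().
empty_data = ZarrDataIO(data=None, shape=(1000, 32), dtype=np.float32)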

Add roundtrip test for Zarr with DataChunkIterator

#72 fixed an issue where the zarr_dtype attribute was not set on write when a DataChunkIterator is being used (which in turn caused an error on read). We should add roundtrip tests using DataChunkIterator for write to cover this case; a sketch of the write side follows.
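
A sketch of the write side of such a test; the container/IO wiring is omitted and the generator is illustrative:

import numpy as np
from hdmf.data_utils import DataChunkIterator

# Wrap a generator in a DataChunkIterator; writing a dataset backed by
# this iterator through ZarrIO is the case that previously left the
# zarr_dtype attribute unset.
dci = DataChunkIterator(
    data=(np.arange(100, dtype="float64") for _ in range(10)),
)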
