
Comments (7)

payno commented on July 2, 2024

This does look like an issue with the locking option, i.e. the locking parameter.
If I do

with HDF5File(filename=file_path, mode="r", locking=True) as h5f_silx:
    data = h5f_silx[data_path][()]

then this works, whereas

with HDF5File(filename=file_path, mode="r", locking=False) as h5f_silx:
    data = h5f_silx[data_path][()]

fails.
But I don't see why I would need file locking in this particular case.

from silx.

payno commented on July 2, 2024

If I deactivate locking directly in h5py, I get the same behavior:

with h5py.File(file_path, mode="r", locking=False) as h5f:
    data = h5f[data_path][()]

so it looks like a misalignment of the flags between silx and h5py

payno commented on July 2, 2024

So, to summarize, if locking is a boolean:

with h5py.File(file_path, mode="r", locking=locking) as h5f:

and

with HDF5File(file_path, mode="r", locking=locking) as h5f:

will give the same result, but

if locking is **None** (the default value), the two give different results.

payno commented on July 2, 2024

Maybe self._LOCKING_MGR.set_locking(locking) is missing an extra check, because at this point locking can be None.
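
One way to picture such a check is sketched below. LockingManager is a hypothetical stand-in written for illustration, not the silx class, and the real set_locking may behave differently. The assumption here is that None means "no explicit request", so the call leaves the current setting untouched instead of overwriting it.

```python
class LockingManager:
    """Hypothetical stand-in for a locking manager (not the silx implementation)."""

    # String values accepted by h5py.File for its 'locking' argument
    _VALID_STRINGS = ("true", "false", "best-effort")

    def __init__(self):
        # None: no explicit locking choice has been recorded yet
        self.locking = None

    def set_locking(self, locking):
        """Record an explicit locking choice and return the stored value."""
        if locking is None:
            # Nothing explicit was requested: keep whatever is already stored
            return self.locking
        if not isinstance(locking, bool) and locking not in self._VALID_STRINGS:
            raise ValueError(f"Unsupported locking value: {locking!r}")
        self.locking = locking
        return self.locking
```

With this shape, a caller passing None never changes the stored value, so the default could be left to h5py and HDF5.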

payno commented on July 2, 2024

So avoiding redefining the 'locking' value before passing it to h5py.File seems to make it work.
But I don't know whether this fits the design.

--- a/src/silx/io/h5py_utils.py
+++ b/src/silx/io/h5py_utils.py
@@ -388,7 +388,7 @@ class File(h5py.File):
             )
             if locking is None:
                 locking = enable_file_locking
-        locking = _hdf5_file_locking(
+        _hdf5_file_locking(
             mode=mode, locking=locking, swmr=swmr, libver=libver
         )
         if self._LOCKING_MGR is None:

to test:

from silx.io.utils import get_data
from silx.io.url import DataUrl
from silx.io.h5py_utils import File as HDF5File
import h5py
import numpy

file_path = "output/output.nx"
data_path = "/entry0000/instrument/detector/data"

def data_is_valid(dataset: numpy.ndarray):
	# Check the array passed in rather than the global 'data'
	return dataset[0:10].max() > 0 and dataset[10:20].max() > 0 and dataset[20:30].max() > 0

for locking in (None, "best-effort", True, False):
	print("++++ test locking", locking, " ++++")
	# default h5py.File (works)
	with h5py.File(file_path, mode="r", locking=locking) as h5f:
		data = h5f[data_path][()]

	print(f"default h5py works: {data_is_valid(data)}")

	print("****************************")
	# default HDF5File (fails)
	with HDF5File(filename=file_path, mode="r", locking=locking) as h5f_silx:
		data = h5f_silx[data_path][()]

	print(f"silx default HDF5File works: {data_is_valid(data)}")

payno commented on July 2, 2024

OK, so in the end this is HDF5 itself, which doesn't properly handle a VirtualSource referenced as ./file and expects . instead... but only when file locking is set to False.

to reproduce:

import numpy
import h5py

print("** Write test datasets")
data = numpy.linspace(1, 10, 100, dtype=numpy.float32).reshape((10, 10))


with h5py.File("test.h5", mode="w") as h5f:
    h5f["/data"] = data
    vsource_1 = h5py.VirtualSource(".", "/data", shape=(10, 10))
    layout = h5py.VirtualLayout(shape=(10, 10), dtype="f4")
    layout[:] = vsource_1
    h5f.create_virtual_dataset("/vds_ok", layout, fillvalue=-5)

    vsource_1 = h5py.VirtualSource("./test.h5", "/data", shape=(10, 10))
    layout = h5py.VirtualLayout(shape=(10, 10), dtype="f4")
    layout[:] = vsource_1
    h5f.create_virtual_dataset("/vds_failed", layout, fillvalue=-5)


print("** Read with locking")
with h5py.File("test.h5", mode="r", locking=True) as h5f:
    vds_data = h5f["/vds_ok"][()]
    assert numpy.array_equal(data, vds_data)
    vds_data = h5f["/vds_failed"][()]
    assert numpy.array_equal(data, vds_data)


print("** Read without locking")
with h5py.File("test.h5", mode="r", locking=False) as h5f:
    print("  - VDS OK")
    vds_data = h5f["/vds_ok"][()]
    assert numpy.array_equal(data, vds_data)
    print("  - vds_failed")
    vds_data = h5f["/vds_failed"][()]
    assert numpy.array_equal(data, vds_data)
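
A possible workaround on the writing side, as a sketch only: use "." whenever the virtual source is the same file that holds the virtual dataset. virtual_source_filename is a hypothetical helper, and it assumes both paths are given relative to the same directory. It relies only on the observation above that the "." form reads correctly with locking disabled.

```python
import os


def virtual_source_filename(source_file: str, container_file: str) -> str:
    """Return "." if source_file is the container file itself, else source_file.

    Hypothetical helper: both paths are assumed to be relative to the same
    directory (or absolute).
    """
    same_file = os.path.realpath(source_file) == os.path.realpath(container_file)
    return "." if same_file else source_file
```

The result would then be passed as the first argument of h5py.VirtualSource, for example h5py.VirtualSource(virtual_source_filename("./test.h5", "test.h5"), "/data", shape=(10, 10)).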

Thanks @t20100. I will try to open an issue at the HDF5 level once the rush of the tomotools release is over.

payno commented on July 2, 2024

ping @t20100: reported here: HDFGroup/hdf5#4080
