
#viewsetups >100 (pybdv, closed)

constantinpape avatar constantinpape commented on May 28, 2024
#viewsetups >100

from pybdv.

Comments (10)

constantinpape avatar constantinpape commented on May 28, 2024

I have updated this:

  • elf is fully optional now and you can convert to n5 as long as you have z5py in your env (conda install -c conda-forge z5py)
  • I have disabled the check for number of setup ids for n5.


constantinpape avatar constantinpape commented on May 28, 2024

> the HDF5 structure does not support more than 2 digit ViewSetups.
> Would N5 support it? I can see the setupN directory structure. Is there a limitation to the digits of N?

Yes, the n5 structure supports an arbitrary number of setups. To make it work in pybdv I would need to change this check so it is only triggered for hdf5:
https://github.com/constantinpape/pybdv/blob/master/pybdv/converter.py#L90-L92
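A minimal sketch of why the limit only matters for HDF5, assuming the usual BigDataViewer key layouts (`t00000/s00/0/cells` for HDF5, `setup0/timepoint0/s0` for N5). The helper names are hypothetical and are not pybdv's API; the actual check is in the converter.py lines linked above.

```python
def h5_key(setup_id, timepoint=0, scale=0):
    """Dataset key in the BDV HDF5 layout; the setup id is a two-digit field."""
    if not 0 <= setup_id < 100:
        raise ValueError(f"HDF5 layout supports setup ids 0-99, got {setup_id}")
    return f"t{timepoint:05d}/s{setup_id:02d}/{scale}/cells"


def n5_key(setup_id, timepoint=0, scale=0):
    """Dataset key in the BDV N5 layout; the setup id has no fixed width."""
    return f"setup{setup_id}/timepoint{timepoint}/s{scale}"


def dataset_key(setup_id, is_h5, timepoint=0, scale=0):
    # Only the HDF5 branch enforces the setup-id limit (hypothetical helper).
    if is_h5:
        return h5_key(setup_id, timepoint, scale)
    return n5_key(setup_id, timepoint, scale)
```

With this layout, setup 150 maps to `setup150/timepoint0/s0` in N5, while the HDF5 branch rejects it.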

> BTW: Where can I find the elf package? I cannot get pybdv's n5 conversion to work.

You can find it here; but maybe I shouldn't depend on it in pybdv. I will check if it's easy to replace later.
https://github.com/constantinpape/elf


martinschorb avatar martinschorb commented on May 28, 2024

OK, cool,

then I will just use n5 if there are more than 100 ViewSetups.

elf is a bit of an ambiguous package name. I did some research and just could not find the right one... Maybe just rename it. And yes, that obvious place I did not check...


martinschorb avatar martinschorb commented on May 28, 2024

Hi,

this seems to work.

The conversion to n5 with the default chunk size (that is 64, correct?) takes ages. I guess this is because the group share filers are not optimized for dealing with so many small files...?

Is there any mechanism in n5 (or a similar data format) that would overcome this?


martinschorb avatar martinschorb commented on May 28, 2024

I just found that even reading the chunks seems very slow from the group shares as compared to h5.
Is there some hybrid format, or would you just increase the chunk size?


constantinpape avatar constantinpape commented on May 28, 2024

> I just found that even reading the chunks seems very slow from the group shares as compared to h5.
> Is there some hybrid format, or would you just increase the chunk size?

Normally h5 and n5 should be more or less the same speed; could you maybe post the h5 and n5 file where you have observed this, the exact environment you have used and the access pattern?


martinschorb avatar martinschorb commented on May 28, 2024

can you see /g/emcf/schorb/data/BDV/montages/LLP_001/bdv_LLP ?

Both files contain the same data. N5 took 5x as long to create using the current master commit.

When loading in BDV, h5 appears instantaneously while N5 takes >20 s until reaching the stage where bdv-playground considers the data loaded and performs the centering and auto-contrast. This is in a VM, so IO to the group share should be comparable.


martinschorb avatar martinschorb commented on May 28, 2024

That's with the default chunk size (64,64,64). It gets a bit better when setting the chunks to something like (1,512,512) instead.


constantinpape avatar constantinpape commented on May 28, 2024

Ok, I had a look at the data. Indeed I also see quite a big difference in the loading speed.
However, this data is 2d, so (1, 64, 64) chunks are tiny! I would definitely go with (1, 512, 512).
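To make the size difference concrete, here is a back-of-the-envelope sketch. The image shape (1, 8192, 8192) and the one-byte voxel size are made-up values for illustration, not the dimensions of the data discussed here. It assumes chunks are clipped to the data extent, and that each N5 chunk is stored as its own file, so the chunk count is roughly the file count per scale level.

```python
from math import ceil, prod


def chunk_stats(shape, chunks, bytes_per_voxel=1):
    """Return (effective chunk shape, number of chunks, uncompressed bytes per chunk)."""
    # A chunk cannot hold more than the data extent along an axis,
    # so (64, 64, 64) on a single-slice image effectively stores (1, 64, 64).
    effective = tuple(min(s, c) for s, c in zip(shape, chunks))
    n_chunks = prod(ceil(s / c) for s, c in zip(shape, effective))
    return effective, n_chunks, prod(effective) * bytes_per_voxel


shape = (1, 8192, 8192)  # hypothetical 2D image, 1 byte per voxel
for chunks in [(64, 64, 64), (1, 512, 512)]:
    effective, n_chunks, nbytes = chunk_stats(shape, chunks)
    print(chunks, "->", effective, n_chunks, "chunks of", nbytes, "bytes")
```

For this hypothetical shape, the default setting yields 16384 chunks of 4 KiB each, while (1, 512, 512) yields 256 chunks of 256 KiB each.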

In my experience, the overhead of reading the individual chunks from the file system is not too large; however, at some point it becomes problematic.

I once measured it on the Janelia distributed file system, and there it wasn't a big problem if chunks were around 64**3 in size; we should measure this at EMBL as well to determine a good minimal size.
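One rough way to run such a measurement, sketched with the standard library only: write the same number of bytes as many small files and as fewer large files, then time reading them back. The function name, the sizes, and the use of a temporary directory are placeholders. For a real test, point `root` at the file system of interest, and keep in mind that OS page caching can make freshly written files read unrealistically fast.

```python
import os
import tempfile
import time


def time_chunk_reads(root, n_files, file_size):
    """Write n_files files of file_size bytes under root, then time reading them all."""
    payload = os.urandom(file_size)
    paths = []
    for i in range(n_files):
        path = os.path.join(root, f"chunk_{i}")
        with open(path, "wb") as f:
            f.write(payload)
        paths.append(path)

    start = time.perf_counter()
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            total += len(f.read())
    return total, time.perf_counter() - start


if __name__ == "__main__":
    # Same total volume (1 MiB) split two ways: 64*64-byte vs 512*512-byte files.
    for n_files, size in [(256, 64 * 64), (4, 512 * 512)]:
        with tempfile.TemporaryDirectory() as root:
            total, seconds = time_chunk_reads(root, n_files, size)
            print(f"{n_files} files x {size} B: {total} B in {seconds:.4f} s")
```

This only times raw file reads; it does not include decompression or any format-specific metadata access.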

In any case, for 2d data I would always go with at least (1, 512, 512) chunks.


constantinpape avatar constantinpape commented on May 28, 2024

I closed this, feel free to reopen if this is still relevant.

