Comments (6)

meggart commented on August 15, 2024

The issue was finally resolved by merging #59. Starting from your example:

z = rand(100,100,1000)
z[3,5,:]
z[5,9,:]
z[10,6,:]

you can now write this as:

z[[CartesianIndex((3,5)), CartesianIndex((5,9)),CartesianIndex((10,6))]]

and DiskArrays will make sure every affected chunk is accessed only once, so you get optimal read performance from remote sources. Alternatively, you can do:

mask = falses(100,100)
mask[3,5] = true
mask[5,9] = true
mask[10,6] = true
z[mask,:]

which will in the end access the same machinery as the example mentioned above.
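
As a rough illustration of how the two forms relate, here is a minimal sketch on a plain in-memory `Array` (not a DiskArray, so it says nothing about chunk access). It assumes the index vector is followed by a trailing `:` for the third dimension. The names `idx`, `by_index`, `by_mask` and `mask_order` are made up for this example. One thing to watch for in Base Julia: the mask form returns the selected points in column-major order of the mask, which is not necessarily the order in which the indices were listed.

```julia
# In-memory stand-in for the on-disk array; only Base indexing is used here.
z = rand(100, 100, 1000)

idx = [CartesianIndex(3, 5), CartesianIndex(5, 9), CartesianIndex(10, 6)]
by_index = z[idx, :]        # 3×1000, rows follow the order of idx

mask = falses(100, 100)
mask[idx] .= true
by_mask = z[mask, :]        # 3×1000, rows follow column-major order of mask

# Order in which the mask visits the selected points:
# (3,5), (10,6), (5,9)
mask_order = findall(mask)

# Same data, different row order.
by_index[[1, 3, 2], :] == by_mask   # true
```

If the order of the selected points matters downstream, the explicit index vector is the easier form to reason about.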

from diskarrays.jl.

meggart commented on August 15, 2024

Hi @alex-s-gardner, I would really like to dig into this right now, but due to private issues and some project meetings this week I have to postpone it until next week. If you don't hear back from me, feel free to ping me as a reminder.


alex-s-gardner commented on August 15, 2024

Digging a bit more, I've come up with the following:

mask = falses(size(z))
mask[[CartesianIndex((1,5)),CartesianIndex((5,9)),CartesianIndex((10,6))],:] .= true
z[mask]

Is this the most efficient approach?


alex-s-gardner commented on August 15, 2024

In my specific case, indexing with a logical array is extremely inefficient, which leaves me without a practical solution:

mask = falses(size(foo["var"]))
mask[1, 1, :] .= true 
foo["var"][mask]

takes 30 seconds to read, whereas:

foo["var"][1,1,:]

takes 0.5 seconds to read.


alex-s-gardner commented on August 15, 2024

Just a note that the above should have been written as:
z[[CartesianIndex((3,5)), CartesianIndex((5,9)),CartesianIndex((10,6))], :]


alex-s-gardner commented on August 15, 2024

@meggart I noticed that the dimensions get moved around in unintuitive ways:

dc = Zarr.zopen(path2zarr)

size(dc["v"])
(834, 834, 84396)

r = [1 4 6 1]
c = [7 8 9 3]

cartind = CartesianIndex.(r,c)

size(dc["v"][cartind, :])
(1, 84396, 4)

cartind = CartesianIndex.(r',c')

size(dc["v"][cartind, :])
(4, 84396, 1)

This differs from the behavior of regular (non-DiskArrays) arrays:

z = rand(100,100,1000)

cartind = CartesianIndex.(r,c)

size(z[cartind,:])
(1, 4, 1000)

cartind = CartesianIndex.(r',c')
size(z[cartind,:])
(4, 1, 1000)
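
For reference, a small sketch of where the extra singleton dimension comes from, using only a plain in-memory `Array`. `r` and `c` are written with spaces, so they are 1×4 matrices, and `CartesianIndex.(r, c)` is a 1×4 matrix as well. Flattening it with `vec` (the names `flat` and `sel` are just for illustration) gives a vector index and a 2-D result in Base Julia. Whether DiskArrays returns the same layout for the vector form is an assumption I have not checked against the Zarr array above.

```julia
z = rand(100, 100, 1000)

r = [1 4 6 1]   # 1×4 Matrix, because the elements are space-separated
c = [7 8 9 3]

cartind = CartesianIndex.(r, c)   # 1×4 Matrix{CartesianIndex{2}}
size(z[cartind, :])               # (1, 4, 1000)

flat = vec(cartind)               # 4-element Vector{CartesianIndex{2}}
sel = z[flat, :]
size(sel)                         # (4, 1000)

# Row k of sel is the full series at (r[k], c[k]).
sel[2, :] == z[4, 8, :]           # true
```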

