GithubHelp home page GithubHelp logo

Comments (9)

Sacha0 avatar Sacha0 commented on May 26, 2024 3

It might be worth simultaneously executing on the exported-API nomenclature shift from nonzeros to stored that various parties have been discussing for some time; i.e., rather than a new nonzeroinds export, export storedinds and simultaneously introduce and export storedvals. Best!

from sparsearrays.jl.

ndinsmore avatar ndinsmore commented on May 26, 2024 1

It seems to me that Julia has been trying to move towards semantic correctness. While it is semantically correct to ask for the rowvals(vec::SparseColumnView) , it is not correct to ask for the rowvals(vec::SparseVector) as it stands today where a vector is the analog of a "row matrix" or "row vector".

If you want an orientation agnostic method lets try to name it so.

from sparsearrays.jl.

ndinsmore avatar ndinsmore commented on May 26, 2024

If this is done I would advocate it should be in a slightly more descriptive manner. Such that there is also something for SparseMatrixCSC and potentially per JuliaLang/julia#31329 SparseMatrixCSR.

An interface that got directly at the column pointers would be great as well.

from sparsearrays.jl.

ndinsmore avatar ndinsmore commented on May 26, 2024

@Sacha0 for matrices I think that we should also make sure that the methods are representative of the stored structures. So rowvals I think makes sense but nzrange is a little ambiguous.

For vectors what about indvals since that is more in line with the matrix notation.

from sparsearrays.jl.

chethega avatar chethega commented on May 26, 2024

@Sacha0 So you are saying new functions storedinds and storedvals that are only defined for sparse vectors, sparse column views (and potentially transposes of sparse vectors / sparse column views)? And nonzeroinds gets retained for the time, until eventually nonzeros for sparse vectors and nonzeroinds get deprecated?

from sparsearrays.jl.

KlausC avatar KlausC commented on May 26, 2024

IMHO for SparseMatrixCSC we should stick to the exported interface nonzeros - rowvals - nzrange and adapt the interface for SparseVector accordingly.

That means: replace SparseArrays.nonzeroinds by rowvals and formally introduce nzrange(V, col) = 1:length(V) where V<:SparseVector.

That would allow to handle Union{SparseVector,SparseMatrixCSC} uniformly.
The unexported storedinds, storedvals, ... which were introduced to achieve this uniformity, would not be needed anymore.

from sparsearrays.jl.

Sacha0 avatar Sacha0 commented on May 26, 2024

I can't find the relevant discussions immediately --- if someone has the bandwidth to do the necessary issue tracker spelunking, that would be great! --- so in brief I'll reiterate the motivations for moving towards storedinds / storedvals:

  1. Some time ago we moved away from eagerly stripping numerical zeros from the structure of SparseMatrixCSCs, and in conjunction began moving away from the nonzeros terminology / towards the now-more-correct stored.

  2. The SparseVector and SparseMatrixCSC data structures are sufficiently similar that it's possible to build an abstraction layer (abstracting away their field name differences, and otherwise hiding e.g. the absence of column pointers in SparseVector) that enables unification of implementation of a nice set of operations applicable to both SparseVectors and SparseMatrixCSCs. Unifying field names between these data structures insofar as possible facilitates and simplifies realization of that abstraction layer.

  3. Supporting SparseMatrixCSR at some point would be great. Minimizing the amount of implementation duplication between SparseVector, SparseMatrixCSC, and SparseMatrixCSR in so doing would be great. One way to facilitate that implementation deduplication is to extend the preceding abstraction layer to include the hypothetical SparseMatrixCSR, and unifying the field names of those three types insofar as possible would facilitate and simplify the realization of that abstraction layer.

    (More concretely, the data structure underlying CSC is fundamentally the same as that for CSR. So one can construct a type for that data structure, e.g. SparseMatrixCSX, with a parameter indicating whether the X is C or R, and make SparseMatrixCSC and SparseMatrixCSR corresponding type aliases. One could generalize the thing still further and implement SparseVector on top of it too.)

  4. Going a step further, it would be great to extend that abstraction layer to Adjoint, Transpose, and other wrapped types of sparse matrices/vectors, such that... you see where this is going :) ... generic implementations can be written for operations common to all of those types. And so it would be great if the nomenclature could make sense for all these things.

Hence storeinds and storedvals, and potentially pointers or something similar. IIRC, we didn't have enough deprecation cycles and developer bandwidth to make this happen prior to 1.0, but the plan was to use dot-overloading to introduce aliases for the present field names, deprecate the present field names, and then in a future release cycle swap the aliases for the field names. Best!

from sparsearrays.jl.

KlausC avatar KlausC commented on May 26, 2024

@Sacha0 this vision of a general, concise, and consistent sparse object has my full support. I proposed the (as I know now) legacy interface nonzeros - rowvals - nzrange because that are currently (v1.2) the only exported functions.

I also find it necessary to make a difference between the field names (which should not be used by the user at all), and the accessor methods which could be named storedvals - storedinds - indrange or similar. These may be consistently extended to SparseMatrixCSX, SparseVector, and more or less straight forward to SubArray{,Sparse...}. The last one is subject of a PR I am currently busy with.
I use indrange instead of pointer_range to keep the analogy with nzrange.

There should typically be no need to access the fields directly. All sparse or wrapped sparse types supporting this interface could make use of the standard column-wise (for CSR row-wise) processing loop:
(col, row, and the constant 2 should be replaced by generic names for CSR)

for col in axes(A, 2), j in indrange(A, col)
    row = storedinds(A, k)
    val = storedvals(A, k)
   ... do the job for (col, row, val) ...
end

from sparsearrays.jl.

ndinsmore avatar ndinsmore commented on May 26, 2024

@Sacha0 I couldn't agree with this more:

(More concretely, the data structure underlying CSC is fundamentally the same as that for CSR. So one can construct a type for that data structure, e.g. SparseMatrixCSX, with a parameter indicating whether the X is C or R, and make SparseMatrixCSC and SparseMatrixCSR corresponding type aliases. One could generalize the thing still further and implement SparseVector on top of it too.)

To take it further, beyond using the same data structure for matrix and vectors (you are only saving 3 Int), it strikes me that without much difficulty you could also use the same data structure for uncompressed sparse where you leave room in a given"column" to insert more rows.

from sparsearrays.jl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.