Comments (29)
I would be ok with forbidding creation of sparse matrices where zero(T)
is not defined. Would be good to hear other thoughts.
from sparsearrays.jl.
I wonder whether Base should maybe define
zero(::String) == ""
, which would at least make that specific case work.
This should absolutely not be defined. The empty string is the multiplicative unit of the string semigroup which is why we have one(String) == ""
. There is no additive identity because we do not use +
for concatenation. If we did, then "foo" + "bar"
should produce the regular expression r"foo|bar"
. String and regular expressions form a semiring under the operations of *
for concatenation and +
for alternation. So the correct value for zero(String)
if any, should be a regular expression that matches nothing, so we could have zero(::Type{String}) = r"(?!)"
.
from sparsearrays.jl.
It looks like we must either commit a "breaking change" or provoke a "type piracy" by the user, if we want to extend the element types of Sparse...
beyond Number
s.
Defending my proposal of sparse_unstored
etc.: it seems to be a crime of second degree, if we pirate these new functions compared to iszero
and zero
:-) .
Thinking more about adding type parameters, and why it disgusts me adding two functions as replacements of iszero
and zero
, I believe, these two functions are jointly representing one single concept (of storage). Maybe a question of economy and extensibility to encode this concept in a single parameter. With other words, can't we pass a single trait name to encompass what we want. That could look like so:
struct SparseMatrixCSC{Tv,Ti,NullityConcept} ... end
SparseMatrixCSC{Tv,Ti}(...) = SparseMatrixCSC{Tv,Ti,DefaultNullityTrait}(...)
iszero(x, ::Type{<:DefaultNullityTrait}) = iszero(x)
zero(T::Type, ::Type{<:DefaultNullityTrait}) = zero(T)
abstract type DefaultNullityTrait end
That would allow the use with different concepts to define
abstract type StringNullityTrait end
Base.iszero(x::AbstractString, ::Type{<:StringNullityTrait}) = isempty(x)
Base.zero(::Type{T}, ::Type{<:StringNullityTrait}) where T<:AbstractString = T("")
and inject this behavior into the sparse matrix using constructor SparseMatrixCSC{Tv,Ti,StringNullityTrait}(...)
from sparsearrays.jl.
I agree with that, but the problem remains on that we cannot change type for the current SparseArray
.
What I think is that this allows to solve most of the problems without breaking anything.
Also allows to be extensible, for example, someone could define a DefaultArray
with a general default value (no strictly zero). This would allow to have something like what @StefanKarpinski proposed in some places before (cannot find the link rn), but the ideas was that some operation over matrices (including for pure function, map
) are closed if we allow a general default value.
Again, this would be more elegant (and general) if we didn't make the distintion of the zero
case, but right now that is the accepted and stable interface for stdlib SparseArray
.
from sparsearrays.jl.
can you provide an MWE?
from sparsearrays.jl.
Sorry for the pollution, I just noticed that although I had dev
ed the main
branch from here, it would still use the sysimage version. Downloading Julia nightly, it works as expected.
from sparsearrays.jl.
Happy to accept a PR and see this work.
from sparsearrays.jl.
Tried, with mixed results ;-) Doing so will forbid creating sparse vectors or matrices of types T
with no defined zero(T)
for instance in SparseVector{Tv,Ti}(s::AbstractVector{Tv})
as it needs to determine if the values are non-zero. That would be fine I think, except that in line 1246 of stdlib/SparseArrays/test/sparse.jl
there is this test
A = sparse(["a", "b"])
@test_throws MethodError findmin(A, dims=1)
That now fail in the creation of the vector. I don't know what to think, is SparseVector{String,Ti}
something that should be supported? All remaining tests pass (both for vector and matrices).
All in all I think that the best would be to be able to specify the non-stored value for sparse arrays.
from sparsearrays.jl.
See also JuliaLang/julia#19561, JuliaLang/julia#11408.
from sparsearrays.jl.
See also JuliaLang/julia#19561, JuliaLang/julia#11408.
Thanks @stevengj. I was unaware of JuliaLang/julia#11408. If there is no consensus regarding the drop of support for types T without zero(T), then maybe it is better close this (and the PR)?
from sparsearrays.jl.
I'm in favor of restricting SparseMatrixCSC
to elements with a zero
but regardless of that discussion, I think JuliaLang/julia#30580 improves that the situation for StaticArrays
, right?
from sparsearrays.jl.
I'm in favor of restricting
SparseMatrixCSC
to elements with azero
but regardless of that discussion, I think JuliaLang/julia#30580 improves that the situation forStaticArrays
, right?
Yes, but I admit that I found the example after I ran into the "bug" in the code. The current code uses != 0
in several places, which is of course wrong for non-numerical types that define zero
; but at least it doesn't give an error for non-numerical types that do not define zero
(although !=0
is always true there, so for instance the sparse-dense-sparse round-trip gives a matrix full of stored zeros). Using iszero
would mean to really drop support for those types (e.g. String
) even for simple storage (which I don't know if people really use, but at least it appears in some tests).
from sparsearrays.jl.
D4M sometimes uses sparse arrays to store non-numeric data.
I found this issue after being tripped up by the fact that broadcast non-numeric data didn't work, and was a little surprised that you couldn't broadcast an Any[1, 2]
with a sparse matrix either, though I see now why that's the case. Regardless of the conclusions here, it would be nice if that error message were improved to be a bit more specific. Thanks to everyone for taking the time to thoroughly map out the territory — I learned a lot reading through some of the issues just now.
I wonder whether Base should maybe define zero(::String) == ""
, which would at least make that specific case work.
from sparsearrays.jl.
I wonder whether Base should maybe define
zero(::String) == ""
, which would at least make that specific case work.
I get your point, but at least from the mathematical point of view it seems an awkward stretch IMHO: ""
is the identity element of the multiplicative (non-abelian) semigroup (*,String)
, whereas zero(T)
should be the identity of the additive group (+,T)
. I think all this awkwardness could be avoided if one could specify the "non-stored" element of sparse arrays. On numerical data, linear operations could be fast-tracked if the non-stored element iszero
so as to retain the current performance (and a sensible default could mimic current or desired behavior). I'm not saying all this is easy to implement though of course, but it seems like a fun project.
from sparsearrays.jl.
Right, yeah, I realized later that I was confused. Returning regular expressions seems mathematically correct but not very nice for performance (or type-stability).
from sparsearrays.jl.
I ran into the following error today when trying to use spare arrays with non-numeric data types. I could create a SparseVector{Symbol} but couldn't use it as intended
julia> v = SparseArrays.sparsevec([1;22], [:a, :b])
22-element SparseVector{Symbol,Int64} with 2 stored entries:
[1 ] = :a
[22] = :b
julia> findfirst(isequal(:b), v)
ERROR: MethodError: no method matching zero(::Type{Symbol})
If these types do not really support working with non-numeric values, then I would also suggest to forbid their creation
from sparsearrays.jl.
As far as I know, the SparseVector
and SparseMatrixCSC
types have minimal requirements, which are support for iszero
and zero
. Both are essential for the intended functionalty of the structures - and make not much sense for Symbol
or String
.
Neverthless it is not desirable restrict the element types to (for example) Number
s, because there are other senseful types, which have both iszero
and zero
.
from sparsearrays.jl.
I wonder if we could not simply add two new parameters SparseMatrixCSC{Tv,Ti,Z,IZ}
, and make the constructors make them default to zero
and iszero
respectively. This would make using SparseMatrixCSC
as storage for non-numerical types possible, on pair with Matrix
.
from sparsearrays.jl.
I would prefer not to add functions as type parameters.
Alternatively introduce new functions like sparse_unstored
and is_sparse_unstored
, which default to zero
and iszero
for Number
, and which can be redefined for other types.
from sparsearrays.jl.
Is there a particular reason why functions as type parameters are to be avoided? (honest question)
I don't like the sparse_unstored
and is_sparse_unstored
solution, because then user-defining it e.g. for String
would be type piracy...
from sparsearrays.jl.
Is there a particular reason why functions as type parameters are to be avoided? (honest question)
The reason that made me feel uncomfortable with this was, that it introduced a breaking change into the type system.
would be type piracy...
Not, if we defined default implementation for String
and other relevant types.
from sparsearrays.jl.
Not, if we defined default implementation for
String
and other relevant types.
For every type? This doesn't feel right to me, as in a sense would be making String
aware of sparse structures. And the problem would still stand for types defined in packages, no?
from sparsearrays.jl.
Hi! I was redirected to this issue form JuliaLang/julia#42536 since I'm using SparseArrays
with Num
from Symbolics.jl and all the (!)== 0
messes with the use since is returns an Expresion
instead of a Bool
.
Is there any consensus as to whether the iszero
and zero
interface for a type is required to work with SparseArrays?
from sparsearrays.jl.
An alternative solution, with other benefits is here: https://github.com/JuliaLang/julia/issues/41036
This looks like it may be related to proposals involving isnullable
, although I can't find them at the moment.
As far as I can tell https://github.com/JuliaLang/julia/issues/41036 involves no breakage, inefficiency, or pirating. In fact it clarifies the role of the unstored element.
from sparsearrays.jl.
An alternative solution, with other benefits is here: JuliaLang/julia#41036 This looks like it may be related to proposals involving
isnullable
, although I can't find them at the moment.As far as I can tell JuliaLang/julia#41036 involves no breakage, inefficiency, or pirating. In fact it clarifies the role of the unstored element.
This is similar I think to the sparse_unstored
proposal above in this thread by @KlausC and the nullable
solution by @Suavesito-Olimpiada in https://discourse.julialang.org/t/is-there-a-defined-minimal-interface-for-a-type-to-work-with-sparsearrays/69602 if I'm not mistaken.
from sparsearrays.jl.
I think that although the core idea of sparse arrays for linear algebra and storage is the same, they are quite different on the semantics.
Under the actual implementation, for linear algebra you want to have a additive "zero" element under a ring. But for storage sparse array, all you want to have is a "nullable" (unstored) element with no other properties attached to it. Under this circumstances the sparse array idea for linear algebra is a subset of the more general idea of a storage sparse array.
One solution that come to my mind that allow the current implementation to continue working is to define a AbstractStorageSparseArrays
type where you can define a Unstored
and IsUnstored
type parameters which are functions that define the "default" unstored element and the check if an element is unstored.
Then base AbstractSparseArrrays
on this new type and set this parameters as default to zero
and iszero
on this.
The basic idea is to have something as the following
abstract type AbstractStorageSparseArray{Tv, Ti, N, Unstored, IsUnstored} <: AbstractArray{Tv, N} end
abstract type AbstractSparseArray{Tv, Ti, N} <: AbstractStorageSparseArray{Tv, Ti, N, zero, iszero} end
const AbstractStorageSparseVector{Tv, Ti, Unstored, IsUnstored} = AbstractStorageSparseArray{Tv, Ti, 1, Unstored, IsUnstored}
const AbstractSparseVector{Tv, Ti} = AbstractSparseArray{Tv, Ti, 1}
struct SparseVector{Tv, Ti<:Integer} <: AbstractSparseVector{Tv, Ti}
# fields...
end
And base the basic interface for sparse arrays on the new AbstractStorageSparseArray
. The use of Unstored
and IsUnstored
as type parameters allows to define in the most convenient way what elements are not stored and the unstored value, and the definition of AbstractSparseArray
does not change the current type.
This also avoids type piracy, allowing for example to do StorageSparseArray{String, Int, isempty, emptystring}
(defining emptystring() = ""
in the corresponding module).
Maybe I'm missing things, I've tried to go to read all the places where this came up, but it is really hard to find them all and remember all the points made.
from sparsearrays.jl.
I agree with almost everything (I proposed something similar some comments above), except that I think that having a generic unstored would be useful also for algebraic elements (e.g. Float64
), so I would not separate between "storage" and "linear algebra" arrays. The distinction will come automatically because some algebraic operations will fail if the corresponding operations on elements are not defined...
from sparsearrays.jl.
this has been fixed
julia> nnz(sparse(Matrix(s)))
0
from sparsearrays.jl.
Still not working with Symbolics. I think it should not be closed yet.
from sparsearrays.jl.
Related Issues (20)
- Factorize a sparse symmetric indefinite matrix
- Regression for `mul!` from 1.9 to 1.10 HOT 11
- Using `@view` leads to 100x performance loss HOT 7
- Loads of warnings about method redefinitions HOT 2
- Wrapper generator needs updating to pick SuiteSparse headers from new location HOT 1
- Single precision support in CHOLMOD HOT 3
- Cholesky F.PtL \ Av where Av is a view does not work HOT 1
- Base.stack is underperforming for SparseArrays HOT 3
- Regression in invalidations caused by SparseArrays HOT 3
- Elementwise multiplication by a view of a dense matrix gives a dense matrix
- `findmin(A; dims=1)` is much slower than manually looping over. HOT 1
- Sparse array of string types HOT 17
- Memory Mapped SparseArrays HOT 3
- Extra allocations when using generalized `mul!` operation
- Attempting to run sparse `qr` produces StackOverflow when run on a sparse matrix of `ForwardDiff.Dual`. HOT 6
- Inconsistent addition between sparse and dense HOT 1
- `ldiv` of `LUFactorization` can throw `SingluarException` HOT 1
- Thread-safe dropstored! HOT 1
- Merge SparseMatricesCSR.jl in HOT 2
- Support zero-based indices HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from sparsearrays.jl.