juliasparse / sparsearrays.jl

SparseArrays.jl is a Julia stdlib

Home Page: https://sparsearrays.juliasparse.org/

License: Other


sparsearrays.jl's Introduction

SparseArrays


This package ships as part of the Julia stdlib.

SparseArrays.jl provides functionality for working with sparse arrays in Julia.

Using development versions of this package

To use a newer version of this package, you need to build Julia from scratch. The build process is the same as any other build except that you need to change the commit used in stdlib/SparseArrays.version.

It's also possible to load a development version of the package using the trick described in the section "Using the development version of Pkg.jl" in the Pkg.jl repo, but the capabilities are limited: all other packages will still depend on the stdlib version of the package and will not work with the modified one.

The main environment may become inconsistent, so you might need to run Pkg.instantiate() and/or Pkg.resolve() in the main or project environments if Julia complains about Serialization.jl missing from this package's dependencies.

For older Julia versions (1.8 and before), SuiteSparse.jl needs to be bumped too.

Updating SuiteSparse

In order to upgrade SparseArrays.jl to use a new release of SuiteSparse, the following steps are necessary:

  1. Update SuiteSparse in Yggdrasil
  2. Update the SuiteSparse wrappers in SparseArrays.jl/gen and generate the new wrappers
  3. Run BumpStdlibs to update the SparseArrays.jl version in julia master
  4. Update the relevant stdlibs in Julia to pull in the new releases


sparsearrays.jl's Issues

Drop StridedArray restriction for some sparse mul!s?

Currently, mul!(C, A, B, α, β) for the case where A is a sparse matrix is defined as

https://github.com/JuliaLang/julia/blob/2d4f4d26a0c5a74717d826dc7e1e62f7650a2e9d/stdlib/SparseArrays/src/linalg.jl#L34-L52

This method is not used when B is Symmetric, sparse, etc. JuliaLang/julia#33214 (comment)

Would it be crazy to remove the restriction B::Union{StridedVector,AdjOrTransStridedMatrix}? As indexing on B does not appear in the innermost loop, I wonder if it is not such a bad default. Of course, it depends on how dense A is (for col = 1:size(A, 2) could be considered "innermost" if A is super sparse). However, it's not immediately apparent that there is a case where the generic fallback can be much better than the code assuming that A is sparse.

sparse matrix * sparse vector is slow when the vector is relatively dense

Hi,

The current implementation seems to produce a dense result vector which is then converted to sparse before returning. I ran into memory scaling issues. Perhaps something along the lines of my_mul below would be worth having?

A = sprand(100000000,1000,.00001)
x = sprand(1000,  0.3)

@time y1=A*x
@time y2=my_mul(A,x)

@assert y1.nzval==y2.nzval
@assert y1.nzind==y2.nzind

 0.511000 seconds (11 allocations: 778.183 MB, 16.05% gc time)
 0.073359 seconds (59 allocations: 21.794 MB)

function my_mul(A::SparseMatrixCSC{TvA,TiA}, x::AbstractSparseVector{TvX,TiX}) where {TvA,TiA,TvX,TiX}
    m, n = size(A)
    length(x) == n || throw(DimensionMismatch())
    Tv = promote_type(TvA, TvX)
    Ti = promote_type(TiA, TiX)

    xnzind = x.nzind
    xnzval = x.nzval
    Acolptr = A.colptr
    Arowval = A.rowval
    Anzval = A.nzval
    ynzind = Ti[]
    ynzval = Tv[]

    m == 0 && return sparsevec(ynzind, ynzval, m)

    @inbounds for i = 1:length(xnzind)
        v = xnzval[i]
        if v != zero(v)
            j = xnzind[i]
            for r = Acolptr[j]:(Acolptr[j+1]-1)
                push!(ynzind, Arowval[r])
                push!(ynzval, Anzval[r] * v)
            end
        end
    end

    # sparsevec sums duplicate indices, so contributions to the same row accumulate
    return sparsevec(ynzind, ynzval, m)
end

Export SparseArrays.nonzeroinds

I believe SparseArrays.nonzeroinds should be documented and exported, since it is the equivalent of rowvals for sparse vectors.

Cannot solve Ax = B for sparse matrices A, B

A\B works if A is a sparse matrix and B is a sparse vector. A\B does not work if B is a sparse matrix.

Example:

using SparseArrays, LinearAlgebra
A = sparse(I,100,100)
B = sparse(I,100,100)
C = sprandn(100,0.5)
A\B # will fail 
A\C # will work. 

Sometimes I can get around this by dividing column by column (as sketched below), but it's nevertheless a strange limitation to have.

Broadcast operations across multiple dimensions materialize zeros in sparse matrices

Broadcasting division where the second argument has multiple dimensions causes the zeros in the matrix to be materialized

julia> using SparseArrays

julia> m = sparse([0 2; 1 0])
2×2 SparseMatrixCSC{Int64, Int64} with 2 stored entries:
 ⋅  2
 1  ⋅

julia> m ./ 2
2×2 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
  ⋅   1.0
 0.5   ⋅

julia> m ./ [1 2]
2×2 SparseMatrixCSC{Float64, Int64} with 4 stored entries:
 0.0  1.0
 1.0  0.0

julia> m ./ [1, 2]
2×2 SparseMatrixCSC{Float64, Int64} with 4 stored entries:
 0.0  2.0
 0.5  0.0

But this doesn't happen with multiplication or addition:

julia> m .+ [0 0]
2×2 SparseMatrixCSC{Int64, Int64} with 2 stored entries:
 ⋅  2
 1  ⋅

julia> m .* [2 2]
2×2 SparseMatrixCSC{Int64, Int64} with 2 stored entries:
 ⋅  4
 2  ⋅

julia> m .* [2. 2]
2×2 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
  ⋅   4.0
 2.0   ⋅

Not sure it's related, but another weird case was identified by @mcabbott on Slack, e.g.

julia> m .* begin inv.([1 2]) end
2×2 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
  ⋅   1.0
 1.0   ⋅

julia> m .* inv.([1 2])
2×2 SparseMatrixCSC{Float64, Int64} with 4 stored entries:
 0.0  1.0
 1.0  0.0

I did some searching around and couldn't tell whether it's related to JuliaLang/julia#36551, but it doesn't seem to matter if dividing by a sparse array, e.g.

julia> m ./ sparse([1 2])
2×2 SparseMatrixCSC{Float64, Int64} with 4 stored entries:
 0.0  1.0
 1.0  0.0

This was checked on the 1.5 release, the release-1.6 branch from a couple of days ago, and a few-hours-old master.

Performance issue with (sparse) backsolve after LU factorization (turn off iterative refinement by default)

I noticed there was a performance issue with the backsolve that I thought I should share here.

Coming from MATLAB, I thought the best way to show this performance issue was to benchmark the two on the same machine with the same data, so here it is. I perform a simple backslash (M \ x), a factorization (factorize(M)), and two backsolves (Mf \ x and Mf \ [x 2x 3x 4x]) using a somewhat-random matrix M and vector x. First, I use MATLAB (v2017b) to perform this test and save M and x in a .mat file. Then I load these in Julia (v1.1.0) using the MAT.jl package and perform the same test again. (I ran ] up on my packages before this test.)

Here are the results (the code I used is at the end of this post):

  1. MATLAB performance:

    >> tic; Mf = linfactor(M); toc;
    Elapsed time is 45.817367 seconds.
    >> tic; M \ x; toc;
    Elapsed time is 34.809706 seconds.
    >> tic; linfactor(Mf, x); toc;
    Elapsed time is 0.352151 seconds.
    >> tic; linfactor(Mf, [x 2*x 3*x 4*x]); toc;
    Elapsed time is 1.288229 seconds.
  2. Julia performance:

    julia> @btime Mf = factorize(M) ;
      32.151 s (66 allocations: 8.71 GiB)
    
    julia> @btime M \ x ;
      33.191 s (74 allocations: 8.71 GiB)
    
    julia> @btime Mf \ x ;
      1.132 s (8 allocations: 1.22 MiB)
    
    julia> @btime Mf \ [x 2x 3x 4x] ;
      4.523 s (31 allocations: 5.95 MiB)

As you can see, the backsolves in Julia take roughly 3–4 times as long as the MATLAB equivalents, while the factorization takes roughly the same time. Any idea what the issue is there? (Note: using the in-place ldiv! does not change those times.)

(I know I am comparing apples to oranges, and that this could be an artefact of the random M and x but I seem to see this happen every time and I think there is too significant a difference in performance to be neglected. Please tell me if I am wrong 🙂)


The code I used:

  1. MATLAB code:

    n = 20000;
    M = sprand(n, n, 20/n) + speye(n);
    x = rand(n, 1);
    tic; Mf = linfactor(M); toc;
    tic; M \ x; toc;
    tic; linfactor(Mf, x); toc;
    tic; linfactor(Mf, [x 2*x 3*x 4*x]); toc;
    save test_matrix.mat M x
  2. Julia code

    using LinearAlgebra, SparseArrays, SuiteSparse, BenchmarkTools, MAT
    M = matread("test_matrix.mat")["M"] ;
    x = matread("test_matrix.mat")["x"] ;
    Mf = factorize(M) ;
    @btime Mf = factorize(M) ;
    @btime M \ x ;
    @btime Mf \ x ;
    @btime Mf \ [x 2x 3x 4x] ;

broadcast(/, ::SparseMatrixCSC, ::AbstractArray) stores unnecessary zeros

If A is a sparse matrix and B is a sparse or dense vector of non-zeros with appropriate size, then
C = A ./ B and C = A ./ B' return sparse matrices with nnz(C) == prod(size(C)); if A has a good number of structural zeros (few stored values), C stores many zeros and the fields C.rowval and C.nzval are extremely oversized.

That is not the case for A .* B, which has the expected sparsity structure of A.
The attempt A .* (1.0 ./ B) fails in the same way, though, presumably because the two operations fuse into a single broadcast.

A working workaround has to look like B2 = 1.0 ./ B; A .* B2.

sprand with function argument is confusing

sprand accepts a function argument rfn that is used to generate the nonzero values of a sparse random matrix or vector. I find the calling convention with rfn inconsistent and confusing.

julia> sprand(0,0,1.0,rand)
0×0 SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> sprand(1,1,1.0,rand)
1×1 SparseMatrixCSC{Float64,Int64} with 1 stored entry:
  [1, 1]  =  0.686873

julia> sprand(1,1,1.0,i->fill(1.0,i))
1×1 SparseMatrixCSC{Float64,Int64} with 1 stored entry:
  [1, 1]  =  1.0

So far so good. What about generating Float32 values? The function accepts a type parameter:

julia> sprand(0,0,1.0,rand,Float32)
0×0 SparseMatrixCSC{Float32,Int64} with 0 stored entries

julia> sprand(1,1,1.0,rand,Float32)
1×1 SparseMatrixCSC{Float64,Int64} with 1 stored entry:
  [1, 1]  =  0.411008

julia> sprand(1,1,1.0,i->fill(1.0,i),Float32)
1×1 SparseMatrixCSC{Float64,Int64} with 1 stored entry:
  [1, 1]  =  1.0

So the type parameter is ignored for non-0×0 matrices (rendering the method effectively type-unstable, by the way). This seems like a bug, and it is definitely ugly.

Moreover, sprand also accepts an RNG as the first argument:

using Random; r=Random.MersenneTwister(1);

julia> sprand(r,1,1,1.0,rand,Float32)
1×1 SparseMatrixCSC{Float64,Int64} with 1 stored entry:
  [1, 1]  =  0.644883

julia> sprand(r,1,1,1.0,i->fill(1.0,i),Float32)
ERROR: MethodError: no method matching (::getfield(Main, Symbol("##9#10")))(::MersenneTwister, ::Int64)
Closest candidates are:
  #9(::Any) at REPL[36]:1
Stacktrace:
 [1] sprand(::MersenneTwister, ::Int64, ::Int64, ::Float64, ::getfield(Main, Symbol("##9#10")), ::Type{Float32}) at /home/ab/src/julia/build1.x/usr/share/julia/stdlib/v1.2/SparseArrays/src/sparsematrix.jl:1448
 [2] top-level scope at none:0

Here we learn that the signature of the passed function has to change, now accepting r as well. I find this overly convoluted (I pass a function that gets passed an argument that I also pass: if I supply the function, I can pass it myself) and a bit inconsistent (the signature of the required rfn depends on the other arguments).

Another inconsistency: without rfn, the type parameter can also be specified, but it goes as the first argument instead of the last.

Of course, most of this could be clarified in the documentation (which is currently a bit terse):

  sprand([rng],[type],m,[n],p::AbstractFloat,[rfn])

  Create a random length m sparse vector or m by n sparse matrix, in which the probability of any element being nonzero is independently
  given by p (and hence the mean density of nonzeros is also exactly p). Nonzero values are sampled from the distribution specified by
  rfn and have the type type. The uniform distribution is used in case rfn is not specified. The optional rng argument specifies a
  random number generator, see Random Numbers.

  Examples
  ≡≡≡≡≡≡≡≡≡≡

  julia> sprand(Bool, 2, 2, 0.5)
  2×2 SparseMatrixCSC{Bool,Int64} with 2 stored entries:
    [1, 1]  =  true
    [2, 1]  =  true
  
  julia> sprand(Float64, 3, 0.75)
  3-element SparseVector{Float64,Int64} with 1 stored entry:
    [3]  =  0.298614

But I really think that the calling convention could be improved very easily. I propose to change the behaviour in the following way: whenever a function is passed, it always accepts exactly one argument (the number of values to generate) and no type can be specified. This way, callers can supply whatever function they want (using a random generator or not), generating values of whatever type they want (which becomes the Tv type of the matrix); see the sketch below. A patch for this is very simple, and I can submit it if there is interest. I suppose it will have to wait until 2.x to be merged, as it would be breaking.

Incomplete promotion rules for sparse matrices

promote_type(typeof(sprand(10,10,0.1)),typeof(sprand(ComplexF64,10,10,0.1)))
SparseMatrixCSC{Tv,Int64} where Tv

vs in the dense case:

promote_type(typeof(rand(10,10)),typeof(rand(ComplexF64,10,10)))
Array{Complex{Float64},2}

getindex is very slow for adjoints of sparse arrays

Calling getindex on an adjoint of a sparse array falls back (tested on 1.2 and on master built from source today, both on Linux) to a general AbstractArray method, making it unusable for non-scalar indexing with large sparse arrays. Consider the following example:

julia> versioninfo()
Julia Version 1.2.0
Commit c6da87ff4b (2019-08-20 00:03 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i9-9820X CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
julia> using SparseArrays
julia> n = 100000;m = 10000; nf = 25000;

julia> a = sprandn(n,m, 0.0005);

julia> bt = sprandn(m,n, 0.0005)';

julia> @time c = a[1:nf, :];
  0.091680 seconds (208.92 k allocations: 12.387 MiB)

julia> @time c = a[1:nf, :];
  0.007136 seconds (12 allocations: 1.976 MiB)

julia> @time c = bt[1:nf, :];
  9.522862 seconds (444.40 k allocations: 37.878 MiB, 0.55% gc time)

julia> @time c = bt[1:nf, :];
  9.280306 seconds (17 allocations: 15.308 MiB)

This was discussed last year on Discourse. That thread contains another example of the issue. A PR with a fix for sparse vectors (JuliaLang/julia#28654) was opened not long after that discussion but seems to have been abandoned. I don't have time to resurrect the PR at the moment but I thought it would be good to have an issue open to track this. Is fixing the issue on anyone's radar at the moment?

SparseArrays: export nonzeroinds for sparse vectors and sparse column views

As far as I know, we currently export access to the nonzero values of sparse vectors and sparse column views via SparseArrays.nonzeros. However, the corresponding SparseArrays.nonzeroinds is not exported, which makes these two methods of SparseArrays.nonzeros quite useless on their own.

I think we should just write docs and export it: users of sparse vectors / sparse column views likely rely on SparseArrays.nonzeroinds already, and would already be affected by any reorganization.

Sparse linear algebra and structural zeros

IEEE 754 has the unfortunate consequence that floating point numbers don't satisfy x*0 == 0. That x*0 == 0 holds is fundamental to the decoupling between the symbolic and numerical computations for sparse matrices which is arguably one of the most important optimizations for sparse matrix algorithms. Right now, Julia's sparse linear algebra code uses the sparsity pattern optimization extensively and is therefore not IEEE compliant, e.g.

julia> eye(2)*[NaN,1.0]
2-element Array{Float64,1}:
 NaN
 NaN

julia> speye(2)*[NaN,1.0]
2-element Array{Float64,1}:
 NaN
   1.0

Another example is

julia> (-eye(2))[1,2]
-0.0

julia> (-speye(2))[1,2]
0.0

Notice that IEEE compliance for the latter example (with the current sparse matrix implementation) would require storing 2*n^2 + n + 1 instead of 3n + 1 elements. This is also known as MemoryError.

The main question in this issue is to choose one of the two options below for sparse matrices (and probably all types with structural zeros)

  1. x*0 == 0 for all x
  2. x*0 != 0 for some x aka IEEE 754

where 0 is a structural zero. I think that 1. is the best choice, since I believe the symbolic/numerical decoupling is very important when working with sparse matrices, and I'm not sure whether 2. is desirable, given the computational and memory costs, or even possible to achieve at all. That is, it would be a lot of work to handle the NaN and Inf cases in all of our sparse code, and it is unclear who would put in the hours to do so; but what about sparse direct solvers? They optimize over the sparsity pattern without considering NaNs or Infs, and I'm not sure how we could handle that case. In theory, we could also choose 2. for some linear algebra functions and 1. for others, but I think that would be more confusing and error-prone.

However, we'd need to figure out the implications of going with 1. Should we consider throwing more often instead of producing NaNs and Infs during sparse calculations, e.g. on division by zero? What about sparse broadcast, which now (I believe) follows IEEE 754? Since broadcast allows all kinds of operators, things are much more complicated there. If sparse broadcast behaves differently from sparse linear algebra, we'd need to be careful when using broadcast for convenience in the linear algebra code (but I think that would be pretty simple to handle).

Finally, after discussing this issue in a couple of PRs, I'm pretty confident that we won't be able to reach a unanimous conclusion, so once we think we understand the consequences, we'll probably need a vote. Otherwise, we'll keep discussing this every time a PR touches one of the affected methods.

Rename fields of `SparseMatrixCSC`

Currently the 5 fields are named:

  • m
  • n
  • colptr
  • rowval
  • nzval

My complaints are:

  1. "row" and "col" are used in two of the fields, but not their respective sizes (m and n)
  2. rowval is confusing (what are "row values"?)
  3. colptr: isn't a pointer a C thing? we don't use it anywhere else in the language
  4. nzval: don't we now allow zeros among the stored elements, making "nonzero values" a misnomer?

I would propose

  • nrow
  • ncol
  • coloffsets
  • rowindices
  • values

missing qr(A)' \ b for sparse

If A is a sparse matrix, then we get

julia> qr(A)' \ c
ERROR: MethodError: no method matching adjoint(::SuiteSparse.SPQR.QRSparse{Float64,Int64})

It would be nice to have this implemented.

For example, when minimizing a least-squares-plus-affine objective, as discussed on Discourse, you need to solve systems with both A and A', so it would be nice to easily re-use the QR factorization.

Another missing method is qr(A'). A workaround is to use qr(oftype(A, A')), but that's a bit annoying.

Feature request - specialized arithmetic for views of sparse matrices

I posted this originally as a question here https://discourse.julialang.org/t/slow-arithmetic-on-views-of-sparse-matrices/3644 and got some useful feedback. It seems, though, that this functionality could be built into Julia proper.

Arithmetic on views of sparse matrices falls back to generic methods and is thus slow. Here is an example:

d = sprand(Bool,10000,10000, 0.01)
e = view(d, rand(1:10000,5000), rand(1:10000,9000))

using BenchmarkTools
@benchmark sum($d, 1)
BenchmarkTools.Trial:
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     421.354 μs (0.00% GC)
  median time:      450.828 μs (0.00% GC)
  mean time:        463.575 μs (0.00% GC)
  maximum time:     1.226 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1


@benchmark sum($e, 1)
BenchmarkTools.Trial:
  memory estimate:  585.06 KiB
  allocs estimate:  51
  --------------
  minimum time:     3.286 s (0.00% GC)
  median time:      3.312 s (0.00% GC)
  mean time:        3.312 s (0.00% GC)
  maximum time:     3.339 s (0.00% GC)
  --------------
  samples:          2
  evals/sample:     1

Specialized functions for these operations should fix this.

This appears to be related to (but not identical to?) JuliaLang/julia#13438

Sparse arrays constructed with `sparse!` are inconsistent

Latest nightly:

julia> versioninfo()
Julia Version 1.3.0-DEV.558
Commit 79a57931a5 (2019-07-18 23:22 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

The documentation for sparse! says that you may reuse the input arrays for output to save memory. To reproduce:

import SparseArrays: sparse!, dropzeros!

# Sparse data for [1 3; 2 4]
I = [1, 2, 1, 2]
J = [1, 1, 2, 2]
V = [1, 2, 3, 4]

# Minimal lengths of each array according to documentation of sparse!
csrrowptr = Vector{Int}(undef, 3)   # should be >= m+1
csrcolval = Vector{Int}(undef, 4)   # should be >= length
csrnzval = Vector{Int}(undef, 4)    # should be >= length
klasttouch = Vector{Int}(undef, 2)  # should be >= n

B = sparse!(I, J, V, 2, 2, +, klasttouch, csrrowptr, csrcolval, csrnzval, I, J, V)
dropzeros!(B)

Error:

ERROR: LoadError: ArgumentError: new length must be ≥ 0
Stacktrace:
 [1] resize! at ./array.jl:1020 [inlined]
 [2] fkeep!(::SparseArrays.SparseMatrixCSC{Int64,Int64}, ::getfield(SparseArrays, Symbol("##17#18")), ::Bool) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/SparseArrays/src/sparsematrix.jl:1330
...

Cause: The matrix returned by sparse! has a colptr array that is too long, with zeros at the end. A possible workaround is just to forcibly resize it:

resize!(B.colptr, 3)
dropzeros!(B)  # works

Make length(A.nzval)==nnz(A)

We currently allow length(A.nzval) > nnz(A) so that memory for future elements of the matrix can be preallocated; nnz(A) is instead A.colptr[n+1]-1. However, as pointed out in JuliaLang/julia#30435 (comment), this could simply be handled by sizehint!, and having two different mechanisms for ensuring extra allocation seems superfluous. Hence, it might be worth simplifying the sparse matrix code and enforcing nnz(A) == length(A.nzval); a sizehint! sketch follows.

Full array minus sparse vector gives sparse array as output

A full array minus a sparse vector gives a sparse array with every position filled as output. This seems inefficient; an example is below, with a workaround sketch after it.

julia> a = sparsevec([1,3,10], [.1, .3, .4])
10-element SparseVector{Float64,Int64} with 3 stored entries:
  [1 ]  =  0.1
  [3 ]  =  0.3
  [10]  =  0.4

julia> b = collect(1:10)
10-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10

julia> b - a
10-element SparseVector{Float64,Int64} with 10 stored entries:
  [1 ]  =  0.9
  [2 ]  =  2.0
  [3 ]  =  2.7
  [4 ]  =  4.0
  [5 ]  =  5.0
  [6 ]  =  6.0
  [7 ]  =  7.0
  [8 ]  =  8.0
  [9 ]  =  9.0
  [10]  =  9.6

Document tol argument in sparse QR

julia> using LinearAlgebra, SparseArrays, SuiteSparse, Random

julia> srand(123);

julia> A = sprandn(10,4,0.5)*sprandn(4, 5, 0.5);

julia> size(qr(A, tol = 1e-16).R)
(5, 5)

julia> size(qr(A, tol = 2e-16).R)
(4, 5)

Operations on SparseMatrices depend heavily on inference

In JuliaLang/julia#19518, broadcast for sparse matrices underwent a major revision, overall a move in the right direction, IMHO. But it also resulted in a regression (in my eyes) when type inference fails for the operation/element-type combination. This issue is meant to continue the discussion started in JuliaLang/julia#19518 (comment).

There are three aspects here that make the problem more acute than for ordinary Arrays.

  1. For sparse matrices, even basic operations like + and - just invoke broadcast internally. OTOH, for Arrays, these do some special tricks for the result type, which helps in the empty case. However, getting the type right in the empty case may not be important enough to warrant replicating such special-casing.
  2. However, broadcast for sparse matrices relies on inference to determine the output element type not only in the empty (all-dimensions-zero) case, but also whenever no entries need to be stored (everything zero), making this case much more likely to occur.
  3. Once inference fails for an all-zero (or truly empty) case, the element type becomes Any, which precludes all further operations, as zero(::Any) is undefined.

E.g.:

julia> x = spzeros(Real, 2, 2)
2×2 sparse matrix with 0 Real nonzero entries

julia> x + x
2×2 sparse matrix with 0 Any nonzero entries

julia> x + x + x
ERROR: MethodError: no method matching zero(::Type{Any})

Edit: Similarly also

julia> speye(Real, 3) * 1 * 1
ERROR: MethodError: no method matching zero(::Type{Any})

The immediate fix I have in mind (though I find the sparse broadcast machinery too intimidating at first glance to implement it properly) is the following: if the result of broadcast(op, a, b) would have element type Any (i.e. inference failed) and nzval is empty, use typeof(op(zero(eltype(a)), zero(eltype(b)))) as the element type instead (likewise for the one- and more-argument cases, of course). For the example above, this would give an element type of Int, which would be much more useful than Any. A sketch follows.

Cannot fill! or broadcast empty sparse arrays

julia> using SparseArrays

julia> 1 .+ spzeros(0, 1)
ERROR: ArgumentError: step cannot be zero
Stacktrace:
 [1] steprange_last(::Int64, ::Int64, ::Int64) at ./range.jl:123
 [2] Type at ./range.jl:113 [inlined]
 [3] Type at ./range.jl:165 [inlined]
 [4] _colon at ./range.jl:25 [inlined]
 [5] Colon at ./range.jl:23 [inlined]
 [6] _densestructure!(::SparseMatrixCSC{Float64,Int64}) at /home/mbauman/julia/usr/share/julia/stdlib/v0.7/SparseArrays/src/higherorderfns.jl:284
 [7] _map_notzeropres!(::getfield(SparseArrays.HigherOrderFns, Symbol("##3#4")){typeof(+),getfield(SparseArrays.HigherOrderFns, Symbol("##15#18")){Int64,getfield(SparseArrays.HigherOrderFns, Symbol("##19#22"))}}, ::Float64, ::SparseMatrixCSC{Float64,Int64}, ::SparseMatrixCSC{Float64,Int64}) at /home/mbauman/julia/usr/share/julia/stdlib/v0.7/SparseArrays/src/higherorderfns.jl:260
 [8] _noshapecheck_map(::getfield(SparseArrays.HigherOrderFns, Symbol("##3#4")){typeof(+),getfield(SparseArrays.HigherOrderFns, Symbol("##15#18")){Int64,getfield(SparseArrays.HigherOrderFns, Symbol("##19#22"))}}, ::SparseMatrixCSC{Float64,Int64}) at /home/mbauman/julia/usr/share/julia/stdlib/v0.7/SparseArrays/src/higherorderfns.jl:160
 [9] _shapecheckbc at /home/mbauman/julia/usr/share/julia/stdlib/v0.7/SparseArrays/src/higherorderfns.jl:977 [inlined]
 [10] _copy at /home/mbauman/julia/usr/share/julia/stdlib/v0.7/SparseArrays/src/higherorderfns.jl:968 [inlined]
 [11] _copy at /home/mbauman/julia/usr/share/julia/stdlib/v0.7/SparseArrays/src/higherorderfns.jl:973 [inlined]
 [12] copy at /home/mbauman/julia/usr/share/julia/stdlib/v0.7/SparseArrays/src/higherorderfns.jl:964 [inlined]
 [13] materialize(::Base.Broadcast.Broadcasted{SparseArrays.HigherOrderFns.SparseMatStyle,Nothing,typeof(+),Tuple{Int64,SparseMatrixCSC{Float64,Int64}}}) at ./broadcast.jl:716
 [14] top-level scope

julia> fill!(spzeros(0, 1), 1)
ERROR: ArgumentError: step cannot be zero
Stacktrace:
 [1] steprange_last(::Int64, ::Int64, ::Int64) at ./range.jl:123
 [2] Type at ./range.jl:113 [inlined]
 [3] Type at ./range.jl:165 [inlined]
 [4] _colon at ./range.jl:25 [inlined]
 [5] Colon at ./range.jl:23 [inlined]
 [6] _fillnonzero!(::SparseMatrixCSC{Float64,Int64}, ::Float64) at /home/mbauman/julia/usr/share/julia/stdlib/v0.7/SparseArrays/src/sparsevector.jl:1921
 [7] fill!(::SparseMatrixCSC{Float64,Int64}, ::Int64) at /home/mbauman/julia/usr/share/julia/stdlib/v0.7/SparseArrays/src/sparsevector.jl:1951
 [8] top-level scope

Bug when comparing and assigning SparseMatrixCSC

First bug

in sparsematrix.jl concerning

function ==(A1::AbstractSparseMatrixCSC, A2::AbstractSparseMatrixCSC)

please replace all valsX[jY] != 0 checks with !iszero(valsX[jY]). Indeed, when zero() is redefined, comparing against the literal 0 uses the wrong zero, so stored values are not correctly taken into account.

Here the fix:

function ==(A1::AbstractSparseMatrixCSC, A2::AbstractSparseMatrixCSC)
    size(A1) != size(A2) && return false
    vals1, vals2 = nonzeros(A1), nonzeros(A2)
    rows1, rows2 = rowvals(A1), rowvals(A2)
    m, n = size(A1)
    @inbounds for i = 1:n
        nz1,nz2 = nzrange(A1,i), nzrange(A2,i)
        j1,j2 = first(nz1), first(nz2)
        # step through the rows of both matrices at once:
        while j1 <= last(nz1) && j2 <= last(nz2)
            r1,r2 = rows1[j1], rows2[j2]
            if r1==r2
                vals1[j1]!=vals2[j2] && return false
                j1+=1
                j2+=1
            else
                if r1<r2
                    !iszero(vals1[j1]) && return false
                    j1+=1
                else
                    !iszero(vals2[j2]) && return false
                    j2+=1
                end
            end
        end
        # finish off any left-overs:
        for j = j1:last(nz1)
            !iszero(vals1[j]) && return false
        end
        for j = j2:last(nz2)
            !iszero(vals2[j]) && return false
        end
    end
    return true
end

Here are some checks (only run on Julia 1.0.3):

using LinearAlgebra, SparseArrays, Printf

# function ==(A1::AbstractSparseMatrixCSC, A2::AbstractSparseMatrixCSC)
function FIX(A1::SparseMatrixCSC, A2::SparseMatrixCSC)
           size(A1) != size(A2) && return false
           vals1, vals2 = nonzeros(A1), nonzeros(A2)
           rows1, rows2 = rowvals(A1), rowvals(A2)
           m, n = size(A1)

           @inbounds for i = 1:n
               nz1,nz2 = nzrange(A1,i), nzrange(A2,i)
               j1,j2 = first(nz1), first(nz2)

               # step through the rows of both matrices at once:
               while j1 <= last(nz1) && j2 <= last(nz2)
                   r1,r2 = rows1[j1], rows2[j2]
                   if r1==r2
                       vals1[j1]!=vals2[j2] && return false
                       j1+=1
                       j2+=1
                   else
                       if r1<r2
                           !iszero(vals1[j1]) && return false
                           j1+=1
                       else
                           !iszero(vals2[j2]) && return false
                           j2+=1
                       end
                   end
               end
               # finish off any left-overs:
               for j = j1:last(nz1)
                   !iszero(vals1[j]) && return false
               end
               for j = j2:last(nz2)
                   !iszero(vals2[j]) && return false
               end
           end
           return true
       end


# New algebra redefining zero() !!
struct MP{T} <: Number λ::T end
Base.zero(::Type{MP{T}}) where T = MP(typemin(T))
Base.zero(x::MP{T})      where T = zero(typeof(x))
Base.one(::Type{MP{T}})  where T = MP(zero(T))
Base.one(x::MP{T})       where T = one(typeof(x))
Base.promote_rule(::Type{MP{T}}, ::Type{U}) where {T, U} = MP{T}
Base.convert(::MP{T}, x) where T = MP(T(x))
MP(x::MP) = MP(x.λ)
MP(S::SparseMatrixCSC{T,U}; preserve=true) where {T, U} = convert(SparseMatrixCSC{MP{T},U}, S)
mpzeros(::Type{T}, m::Int64, n::Int64) where T = spzeros(MP{T}, m, n)


# New algebra with false zeros
A = MP(sparse([1, 2], [1, 2], [0.0, 0.0]))
B = mpzeros(Float64, 2,2)
# A contains two MP(0.0) and B is empty
A == B # returns true which is incorrect
FIX(A, B) # returns false which is expected

# New algebra with real zeros
mp0 = zero(MP{Float64})
AA = sparse([1, 2], [1, 2], [mp0, mp0])
BB = mpzeros(Float64, 2,2)
# AA contains two MP(-Inf) and BB is empty
AA == BB # returns false which is incorrect
FIX(AA, BB) # returns true which is expected

# Normal algebra
AAA = sparse([1, 2], [1, 2], [0.0, 0.0])
BBB = spzeros(Float64, 2,2)
AAA == BBB # returns true which is correct
FIX(AAA, BBB) # returns true which is expected

Second bug

Assignment (setindex!) into sparse matrices is also impacted (but I could not locate the offending function).

# Inserting a fake zero
B[1,1] = MP(0.0)
# B is still empty. This is an error: B should store MP(0.0), which is not equivalent to zero()

# Inserting a real zero
B[1,1] = mp0
# B is not empty. This is an error: mp0 is the zero()

Bug in map! for sparse matrices with aliasing

We've had aliasing safeguards for broadcast and generic map! for a while (see JuliaLang/julia#21693, JuliaLang/julia#25890 in particular). However, I believe this machinery was never ported to SparseArrays's map!:

julia> using SparseArrays

julia> s0 = sprand(10, 10, 0.1); s1 = sprand(10, 10, 0.1); sc = copy(s0);

julia> map!(+, s0, s0, s1);

julia> s0 == sc + s1
false

The last should be true, as in the dense case:

julia> s0 = Matrix(sprand(10, 10, 0.1)); s1 = Matrix(sprand(10, 10, 0.1)); sc = copy(s0);

julia> map!(+, s0, s0, s1);

julia> s0 == sc + s1
true

This behaviour is present on 1.5.1 and on master.

Broadcast seems to work OK, although I think it doesn't use the sparse-specific code paths, so it is much slower than map!:

julia> s0 = sprand(10,10,0.1); s1 = sprand(10,10,0.1); sc = copy(s0);

julia> s0 .= s0 .+ s1;

julia> s0 == sc + s1
true

`\` on Symmetric{Float32, SparseMatrixCSC{Float32, Int64}} fails

julia> using SparseArrays, LinearAlgebra

julia> a = sprand(Float64, 10, 10, 0.2);

julia> a = a'*a + I;

julia> Symmetric(a) \ rand(10);

julia> a = sprand(Float32, 10, 10, 0.2);

julia> a = a'*a + I;

julia> Symmetric(a) \ rand(Float32, 10);
ERROR: MethodError: no method matching lufact!(::SparseMatrixCSC{Float32,Int64}, ::Val{true})
Closest candidates are:
  lufact!(::Union{DenseArray{T<:Union{Complex{Float32}, Complex{Float64}, Float32, Float64},2}, Base.ReinterpretArray{T<:Union{Complex{Float32}, Complex{Float64}, Float32, Float64},2,S,A} where S, Base.ReshapedArray{T<:Union{Complex{Float32}, Complex{Float64}, Float32, Float64},2,A,MI} where MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N} where N} where A<:Union{SubArray{T,N,P,I,true} where I<:Tuple{Union{Base.Slice, UnitRange},Vararg{Any,N} where N} where P where N where T, DenseArray}, SubArray{T<:Union{Complex{Float32}, Complex{Float64}, Float32, Float64},2,A,I,L} where L} where I<:Tuple{Vararg{Union{Int64, AbstractRange{Int64}, Base.AbstractCartesianIndex},N} where N} where A<:Union{Base.ReshapedArray{T,N,A,MI} where MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N} where N} where A<:Union{SubArray{T,N,P,I,true} where I<:Tuple{Union{Base.Slice, UnitRange},Vararg{Any,N} where N} where P where N where T, DenseArray} where N where T, DenseArray}, ::Union{Val{true}, Val{false}}) where T<:Union{Complex{Float32}, Complex{Float64}, Float32, Float64} at linalg\lu.jl:16
  lufact!(::Union{Hermitian{T,S}, Symmetric{T,S}} where S where T, ::Union{Val{true}, Val{false}}) at linalg\lu.jl:23
  lufact!(::Union{DenseArray{T,2}, Base.ReinterpretArray{T,2,S,A} where S, Base.ReshapedArray{T,2,A,MI} where MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N} where N} where A<:Union{SubArray{T,N,P,I,true} where I<:Tuple{Union{Base.Slice, UnitRange},Vararg{Any,N} where N} where P where N where T, DenseArray}, SubArray{T,2,A,I,L} where L} where I<:Tuple{Vararg{Union{Int64, AbstractRange{Int64}, Base.AbstractCartesianIndex},N} where N} where A<:Union{Base.ReshapedArray{T,N,A,MI} where MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N} where N} where A<:Union{SubArray{T,N,P,I,true} where I<:Tuple{Union{Base.Slice, UnitRange},Vararg{Any,N} where N} where P where N where T, DenseArray} where N where T, DenseArray} where T, ::Union{Val{true}, Val{false}}) at linalg\lu.jl:64
  ...
Stacktrace:
 [1] lufact!(::Symmetric{Float32,SparseMatrixCSC{Float32,Int64}}, ::Val{true}) at .\linalg\lu.jl:24
 [2] lufact at .\linalg\lu.jl:115 [inlined] (repeats 2 times)
 [3] \(::Symmetric{Float32,SparseMatrixCSC{Float32,Int64}}, ::Array{Float32,1}) at .\linalg\generic.jl:882
 [4] top-level scope

julia> versioninfo()
Julia Version 0.7.0-DEV.3234
Commit 2cc82d29e1* (2018-01-02 11:44 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, skylake)

setindex! for SparseMatrixCSC errors in corner cases

It seems that the TODO mentioned in

https://github.com/JuliaLang/julia/blob/master/stdlib/SparseArrays/src/sparsematrix.jl#L2701

causes an error reported in https://stackoverflow.com/questions/55326817/mapping-a-function-to-an-array-of-sparse-matrices-in-julia.

This issue is similar to JuliaLang/julia#29034 but seems to be a separate case (not 100% sure, as I do not know this part of sources very well).

A MWE is something like:

julia> mapslices(x -> "a", sprand(3, 3, 0.5), dims=2)
ERROR: MethodError: no method matching zero(::Type{String})

SparseMatrixCSR should be a type alias for Transpose{SparseMatrixCSC}

I am surprised that this isn't already in SparseArrays, but should we define:
const SparseMatrixCSR{Tv, Ti <: Integer} = Transpose{Tv, SparseMatrixCSC{Tv, Ti}}

This plus a few simple additional methods would make it so Julia had pseudo built-in support for compressed sparse row (CSR) matrices; see the sketch below.

I am already using a definition like this, and I can't believe that I am the only one.

If people agree I can take a shot at the PR.

similar(SparseVector{Float64,Int}, 5) throws error

I would expect similar to work for creating empty sparse vectors:

julia> similar(SparseVector{Float64,Int}, 5)
ERROR: MethodError: Cannot `convert` an object of type Tuple{Int64} to an object of type SparseVector{Float64,Int64}
This may have arisen from a call to the constructor SparseVector{Float64,Int64}(...),
since type constructors fall back to convert methods.
Stacktrace:
 [1] SparseVector{Float64,Int64}(::Tuple{Int64}) at ./sysimg.jl:114
 [2] similar(::Type, ::Int64) at ./abstractarray.jl:567
 [3] top-level scope

julia> similar(Vector{Float64}, 5)
5-element Array{Float64,1}:
 2.242449217e-314 
 2.2405945655e-314
 2.240475286e-314 
 2.2860478697e-314
 2.286045435e-314 

bug indexing into SparseArrays using CartesianIndices (plural)

I posted this to Discourse, and got the response that this was a bug.

When indexing into sparse arrays using CartesianIndices (plural), the following Julia 1.4 code

using SparseArrays
A = sparse(randn(4,4))
A[CartesianIndices((1:2,1:2))]

gives ERROR: MethodError: no method matching isless(::CartesianIndex{2}, ::Int64)…

As far as I can tell, this was fixed for a single CartesianIndex in this PR and this PR around Sept 2019, but it doesn’t seem to extend to CartesianIndices.

Missing definition for (\)(::SuiteSparse.CHOLMOD.Factor{Float64}, StridedVector{<:Complex})

The following errors for Cholesky factorizations:

julia> using LinearAlgebra, SparseArrays
julia> A = cholesky(sparse(Matrix(I,10,10))); # sparse cholesky factorization with real elements
julia> z = rand(ComplexF64, 10); # complex RHS vector
julia> A\@views(z[:])
ERROR: InexactError: Float64(0.868342583034311 + 0.07242770884885896im)
Stacktrace:
 [1] Type at ./complex.jl:37 [inlined]
 [2] convert at ./number.jl:7 [inlined]
 [3] unsafe_store! at ./pointer.jl:118 [inlined]
 [4] SuiteSparse.CHOLMOD.Dense{Float64}(::SubArray{Complex{Float64},1,Array{Complex{Float64},1},Tuple{Base.Slice{Base.OneTo{Int64}}},true}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/SuiteSparse/src/cholmod.jl:802
 [5] \(::SuiteSparse.CHOLMOD.Factor{Float64}, ::SubArray{Complex{Float64},1,Array{Complex{Float64},1},Tuple{Base.Slice{Base.OneTo{Int64}}},true}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/SuiteSparse/src/cholmod.jl:1682
 [6] top-level scope at none:0

while the same operation works without the view:

julia> isapprox(A\z, z)
true

I've chased the issue down to a simple missing definition in SuiteSparse, where we have the following (in cholmod.jl):

(\)(L::Factor{T}, B::Vector{Complex{T}}) where {T<:Float64} = complex.(L\real(B), L\imag(B))
(\)(L::Factor{T}, B::Matrix{Complex{T}}) where {T<:Float64} = complex.(L\real(B), L\imag(B))
(\)(L::Factor{T}, b::StridedVector) where {T<:VTypes} = Vector(L\Dense{T}(b))
(\)(L::Factor{T}, B::StridedMatrix) where {T<:VTypes} = Matrix(L\Dense{T}(B))

So, the analogous definitions for complex StridedVector and StridedMatrix arguments are missing; a possible form is sketched below. I can submit the PR to Base myself, but I wanted to open an issue first in case there is a reason they aren't defined.

Regression with arithmetic operations on sparse matrices

Possibly related to JuliaLang/julia#21370:

On 0.5.1:

julia> A = sprand(10,10,0.2);

julia> @benchmark 3.0*A
BenchmarkTools.Trial:
  memory estimate:  672 bytes
  allocs estimate:  4
  --------------
  minimum time:     139.775 ns (0.00% GC)
  median time:      153.360 ns (0.00% GC)
  mean time:        174.615 ns (9.75% GC)
  maximum time:     1.132 μs (79.94% GC)
  --------------
  samples:          10000
  evals/sample:     870

julia> @benchmark 3.0+A
BenchmarkTools.Trial:
  memory estimate:  1.75 KiB
  allocs estimate:  2
  --------------
  minimum time:     237.948 ns (0.00% GC)
  median time:      267.440 ns (0.00% GC)
  mean time:        303.627 ns (8.86% GC)
  maximum time:     1.451 μs (66.51% GC)
  --------------
  samples:          10000
  evals/sample:     464

julia> @benchmark A/3.0
BenchmarkTools.Trial:
  memory estimate:  672 bytes
  allocs estimate:  4
  --------------
  minimum time:     187.120 ns (0.00% GC)
  median time:      197.874 ns (0.00% GC)
  mean time:        220.143 ns (8.00% GC)
  maximum time:     1.349 μs (81.50% GC)
  --------------
  samples:          10000
  evals/sample:     723

On 0.6

julia> A = sprand(10,10,0.2);

julia> @benchmark 3.0*A
BenchmarkTools.Trial:
  memory estimate:  720 bytes
  allocs estimate:  7
  --------------
  minimum time:     242.706 ns (0.00% GC)
  median time:      261.936 ns (0.00% GC)
  mean time:        296.619 ns (9.39% GC)
  maximum time:     3.182 μs (80.41% GC)
  --------------
  samples:          10000
  evals/sample:     469

julia> @benchmark 3.0+A
BenchmarkTools.Trial:
  memory estimate:  2.02 KiB
  allocs estimate:  7
  --------------
  minimum time:     618.483 ns (0.00% GC)
  median time:      659.125 ns (0.00% GC)
  mean time:        727.738 ns (7.16% GC)
  maximum time:     5.575 μs (81.81% GC)
  --------------
  samples:          10000
  evals/sample:     176

julia> @benchmark A/3.0
BenchmarkTools.Trial:
  memory estimate:  720 bytes
  allocs estimate:  7
  --------------
  minimum time:     281.785 ns (0.00% GC)
  median time:      303.701 ns (0.00% GC)
  mean time:        339.489 ns (8.10% GC)
  maximum time:     4.909 μs (86.33% GC)
  --------------
  samples:          10000
  evals/sample:     298
Julia Version 0.6.0-pre.beta.253
Commit 6e70552* (2017-04-22 13:14 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, sandybridge)

mismatch of cat and vcat/hcat with sparse arrays and structural zeros

using SparseArrays

hcat(spzeros(1, 0), sparsevec([1], [0]))
1×1 SparseMatrixCSC{Float64,Int64} with 1 stored entry:
  [1, 1]  =  0.0

cat(spzeros(1, 0), sparsevec([1], [0]), dims = 2)
1×1 SparseMatrixCSC{Float64,Int64} with 0 stored entries

Julia Version 1.2.0
Commit c6da87ff4b (2019-08-20 00:03 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Environment:
  JULIA_HOME = c:/Users/eengad/AppData/Local/Julia-1.1.0/bin

Map for sparse matrices

Please consider the code below, where Julia converts the sparse matrix into a dense one and then maps over it. Should we instead define map only over the non-zero elements of a sparse matrix? Otherwise the slowdown is an issue, and one cannot effectively apply λ-calculus in this case!

julia> test = sprand(10^3,10^3,.01)
1000x1000 sparse matrix with 10144 Float64 entries:
    [80  ,    1]  =  0.993039
    [128 ,    1]  =  0.117601
    [152 ,    1]  =  0.974119
    [259 ,    1]  =  0.0362442
    [289 ,    1]  =  0.621536
    [371 ,    1]  =  0.653076
    [631 ,    1]  =  0.131718
    
    [439 , 1000]  =  0.0621062
    [538 , 1000]  =  0.109039
    [613 , 1000]  =  0.212955
    [620 , 1000]  =  0.147798
    [640 , 1000]  =  0.479203
    [702 , 1000]  =  0.88309
    [884 , 1000]  =  0.780324
    [892 , 1000]  =  0.0164652

julia> f = v -> v+1
(anonymous function)

julia> @time map(f,test)
elapsed time: 14.162436633 seconds (89388704 bytes allocated)
1000x1000 sparse matrix with 1000000 Float64 entries:
    [1   ,    1]  =  1.0
    [2   ,    1]  =  1.0
    [3   ,    1]  =  1.0
    [4   ,    1]  =  1.0
    [5   ,    1]  =  1.0
    [6   ,    1]  =  1.0
    [7   ,    1]  =  1.0
    
    [993 , 1000]  =  1.0
    [994 , 1000]  =  1.0
    [995 , 1000]  =  1.0
    [996 , 1000]  =  1.0
    [997 , 1000]  =  1.0
    [998 , 1000]  =  1.0
    [999 , 1000]  =  1.0
    [1000, 1000]  =  1.0

julia> @time (I,J,V)=findnz(test); sparse(I,J,map(f,V))
elapsed time: 0.000144994 seconds (243784 bytes allocated)
1000x1000 sparse matrix with 10144 Float64 entries:
    [80  ,    1]  =  1.99304
    [128 ,    1]  =  1.1176
    [152 ,    1]  =  1.97412
    [259 ,    1]  =  1.03624
    [289 ,    1]  =  1.62154
    [371 ,    1]  =  1.65308
    [631 ,    1]  =  1.13172
    
    [439 , 1000]  =  1.06211
    [538 , 1000]  =  1.10904
    [613 , 1000]  =  1.21296
    [620 , 1000]  =  1.1478
    [640 , 1000]  =  1.4792
    [702 , 1000]  =  1.88309
    [884 , 1000]  =  1.78032
    [892 , 1000]  =  1.01647

Operator norm (p = 2) for sparse matrices is not implemented

Using opnorm on a sparse matrix (of type SparseMatrixCSC) gives this error:

ERROR: ArgumentError: 2-norm not yet implemented for sparse matrices. Try opnorm(Array(A)) or opnorm(A, p) where p=1 or Inf.

Here is a piece of code which produces that error:

using LinearAlgebra
using SparseArrays
A = SparseMatrixCSC([1.0 2.0 0.0; 0.0 1.0 0.0; 0.0 0.0 0.3])
opnorm(A)

I am interested in working on that issue, but I would need some guidance.

The proper way to solve this (in my opinion) is to implement svdvals for sparse matrices, so that opnorm can be modified to return the largest singular value of A using svdvals. If this is the way to go, which algorithm should be used to implement svdvals for sparse matrices?
There have been many discussions about implementing SVD for sparse matrices (e.g. JuliaLang/julia#6610) a long time ago...
There have been many discussions about implementing SVD for sparse matrices (e.g. JuliaLang/julia#6610) a long time ago...

If the proposed solution is not the desired way to resolve this, why can't we just replace the error by returning opnorm(Array(A)), as the error message itself suggests?

Deprecate field access of SparseMatrixCSC?

In JuliaLang/julia#33054 I proposed to expose AbstractSparseMatrixCSC interface so that it is possible for package developers to easily write SparseMatrixCSC-like custom matrices (without worrying about implementing all the complex functions including the broadcasting machineries). @ViralBShah was asking in JuliaLang/julia#32953 (comment) if it makes sense to deprecate field access. The rationale is that it pushes package authors to use AbstractSparseMatrixCSC interface so that subtypes of it other than SparseArrays.SparseMatrixCSC would work for their code. This is technically OK since those fields are not a part of the public API. (Edit: It may not be considered completely private. See: https://github.com/JuliaLang/julia/issues/33056#issuecomment-524583635)

As accessors like nonzeros have existed for more than 4 years, since Julia 0.4 (#8720), I'd assume that all existing and maintained code bases have already migrated to the accessor methods. If that's the case, this deprecation would not be very disruptive (although the effect would be small at the same time).

As a side note, I added getproperty definitions for SparseMatrixCSC and SparseVector to the test suite in JuliaLang/julia#32953 so that we can enforce a don't-use-fields rule in the future development of SparseArrays.jl itself. So, enforcing the rule inside SparseArrays.jl is not a strong enough argument for banning field access by users.

Sparse array index lowering

When implementing custom array types it's useful to independently resolve indices using to_indices, and subsequently index into an underlying array with them. Right now this works with Array, but fails when the array is sparse:

using SparseArrays
let x = sparse([1 2; 3 4])
    x[1, [true, false]] # Works
    
    I′ = to_indices(x, (1, [true, false]))
    x[I′...] # => getindex not defined for Base.LogicalIndex{Int64,Array{Bool,1}}
end

This also comes up when indexing into e.g. a custom array with user-defined index types, or with Not from InvertedIndices.jl (this originally came up in an issue there).

I think it has to do with the way that sparse arrays hook into the to_indices machinery.

Conversion from SparseMatrixCSC to Matrix uses the same `eltype`.

When converting a SparseMatrixCSC to a Matrix, the current implementation assumes that it can store the result in a matrix of the same eltype

https://github.com/JuliaLang/julia/blob/989de7903e492aa25f5a888c19b4b8e8a3f9f1f0/stdlib/SparseArrays/src/sparsematrix.jl#L404

However, this does not work if zero of the eltype returns an element of a different type.

MCWE:

struct A end
Base.zero(::Type{A}) = 0
using SparseArrays
S = sparse([1, 2], [1, 2], [A(), A()])
Matrix(S)

output:

ERROR: LoadError: MethodError: Cannot `convert` an object of type Int64 to an object of type A
Closest candidates are:
  convert(::Type{T}, ::T) where T at essentials.jl:123
Stacktrace:
 [1] fill!(::Array{A,2}, ::Int64) at ./array.jl:220
 [2] zeros at ./array.jl:403 [inlined]
 [3] zeros at ./array.jl:400 [inlined]
 [4] Array{T,2} where T(::SparseMatrixCSC{A,Int64}) at /home/blegat/git/julia/usr/share/julia/stdlib/v0.7/SparseArrays/src/sparsematrix.jl:404
...

The error is due to the call zeros(A, 2, 2).

This example is a bit artificial, but the problem also occurs in JuMP with sparse matrices of VariableRef, for which the zero type is AffExpr. A possible repair is sketched below.

Functionality issues with SparseMatrixCSC{T} when Missing<:T

I have a use case for sparse matrices where most values are 0 but some need to be indicated as missing. It seems like some very basic functionality of SparseMatrixCSC{T} is subtly broken/confusing when Missing <: T, and the semantics do not appear to correctly distinguish between unstored values being 0 or missing.

Examples on v1.3 of trying to create a 2x2 sparse matrix [missing 0; 0 0]:

Example 1: sparse(Vector, Vector, scalar)

I think this one is simply a missing method issue, assuming anyone would ever want to create a SparseMatrixCSC{Missing,Int64} (see Example 2).

julia> sparse([1], [1], 1, 2, 2) #create [1 0; 0 0]
2×2 SparseMatrixCSC{Int64,Int64} with 1 stored entry:
  [1, 1]  =  1

julia> sparse([1], [1], missing, 2, 2) #create [missing 0; 0 0]
ERROR: MethodError: no method matching sparse(::Array{Int64,1}, ::Array{Int64,1}, ::Missing, ::Int64, ::Int64)
...

Example 2: sparse(Vector, Vector, Vector)

This hits a confusing corner case of the literal list constructor and zero(::Missing), creating a result where zero cannot be distinguished from missing.

Ref: JuliaLang/julia#28854 JuliaLang/julia#31303

julia> A = sparse([1], [1], [missing], 2, 2) #resulting type treats unstored elements as missing, not zero; arguably confusing behavior due to zero(Missing) === missing
2×2 SparseMatrixCSC{Missing,Int64} with 1 stored entry:
  [1, 1]  =  missing

julia> A[1,2] # extracts zero of Missing, which is missing
missing

julia> B = sparse([1], [1], Union{Missing,Int}[missing], 2, 2) #works "as expected", but user has to know to supply the type
2×2 SparseMatrixCSC{Union{Missing, Int64},Int64} with 1 stored entry:
  [1, 1]  =  missing

julia> B[1,2]
0

Example 3: setting a previously unstored value to missing

julia> C = sparse([],[],Union{Missing,Int}[], 2, 2)
2×2 SparseMatrixCSC{Union{Missing, Int64},Int64} with 0 stored entries

julia> C[1,1] = missing
ERROR: TypeError: non-boolean (Missing) used in boolean context
Stacktrace:
 [1] _setindex_scalar!(::SparseMatrixCSC{Union{Missing, Int64},Int64}, ::Missing, ::Int64, ::Int64) at /Users/sabae/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.3/SparseArrays/src/sparsematrix.jl:2461
 [2] setindex!(::SparseMatrixCSC{Union{Missing, Int64},Int64}, ::Missing, ::Int64, ::Int64) at /Users/sabae/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.3/SparseArrays/src/sparsematrix.jl:2442
 [3] top-level scope at REPL[95]:1

Remove the need for SuiteSparse_wrapper.c

I think we can use either Clang.jl or a simple compile time C script to generate all the values in SuiteSparse_wrapper.c and build a .jl file of constants.

Can address JuliaLang/julia#20985 too.

I think we can also then try and bring back simultaneous support for 32 and 64-bit suitesparse.

sum(sparse) -> dense?

julia> A = sparse([rand() < 0.01 ? 1 : 0 for _ in 1:50, _ in 1:50])
50×50 SparseMatrixCSC{Int64,Int64} with 23 stored entries:
[...]

julia> sum(A, dims=1)
1×50 Array{Int64,2}:
 0  1  0  0  0  0  0  0  1  0  1  0  0  0  0  1  0  1  0  0  1  1  0  0  0  2  3  0  0  0  0  0  0  0  0  0  0  0  1  1  1  0  0  1  1  0  2  1  0  3

Sometimes you have a very sparse matrix and want to sum a slice of it. The slice may potentially have very many columns or rows that are entirely zero, in which case it makes a lot of sense to preserve sparsity in the output.

In the non-slice case, where every row and column usually contains at least one nonzero value, it makes sense to keep the result dense.

It would be great if preserving sparsity in sum and other similar reductions were possible via a keyword argument, e.g. sparse=true.

cc @simonbyrne, who suggested the keyword idea on Slack a few months ago.

Common abstractions for nonstructural indicies between Dense and Sparse Matrices

indices is our abstraction for getting the indices of a matrix, so one can iterate over them.
It works great for dense matrices.

However, for sparse matrices it is almost never what you want.
You instead only want to iterate over the indices of the nonstructural elements
(because there are fewer of them).
That is done using rowvals and nzrange.

In a dense matrix, one can say that all the elements are nonstructural.
So in a circumstance where you only want to iterate through the nonstructural elements,
you want to iterate through all the elements of a dense matrix.
More generally, with a matrix of unknown type, you want to iterate through all the elements.

I thus propose that we should have an abstraction for getting the nonstructural indices,
which falls back to getting all the indices,
to make it easier to write code that works efficiently on sparse matrices, and also works (as fast as is possible) on other types.

Something like

const SparseMatrix = AbstractSparseArray{<:Any,<:Any,2}
colinds(A::AbstractMatrix)  = indices(A,2)
colinds(A::SparseMatrix) = rowvals(A)

rowinds(A::AbstractMatrix, col::Integer) = indices(A,1)
rowinds(A::SparseMatrix, col::Integer) = nzrange(A, col)

(bikeshed on names pending)

There is also a similar relationship between nonzeros (sparse) and vec (dense/fallback).

I was discussing this on Slack with @mbauman and @StefanKarpinski the other day, and wanted to put it on GitHub before it was lost to the ages.

SparseArrays: calling sparsevec on a sparse matrix is not a sparse vector

Julia does not detect reshaped sparse arrays as sparse. This has a weird consequence when trying to create a sparse vector from a sparse matrix; a workaround sketch follows the example.

julia> D = sparse(Diagonal([1,1,1,1]))
4×4 SparseMatrixCSC{Int64,Int64} with 4 stored entries:
  [1, 1]  =  1
  [2, 2]  =  1
  [3, 3]  =  1
  [4, 4]  =  1

julia> v = sparsevec(D)
16-element reshape(::SparseMatrixCSC{Int64,Int64}, 16) with eltype Int64:
 1
 0
 0
 0
 0
 1
 0
 0
 0
 0
 1
 0
 0
 0
 0
 1

julia> issparse(v)
false

julia> findnz(v)
ERROR: MethodError: no method matching findnz(::Base.ReshapedArray{Int64,1,SparseMatrixCSC{Int64,Int64},Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}})
Closest candidates are:
  findnz(::SparseArrays.AbstractSparseMatrixCSC{Tv,Ti}) where {Tv, Ti} at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\SparseArrays\src\sparsematrix.jl:1453
  findnz(::SparseVector{Tv,Ti}) where {Tv, Ti} at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\SparseArrays\src\sparsevector.jl:735
Stacktrace:
 [1] top-level scope at REPL[116]:1

ldiv!(Y, A::SuiteSparse.CHOLMOD.Factor, B) errors

The reason seems to be that the 3-arg ldiv! calls its 2-arg version, which is missing for sparse Cholesky factorizations; a workaround sketch follows the MWE.

MWE:

using LinearAlgebra, SparseArrays
A = sprand(100, 100, 0.01) + spdiagm(0 => ones(100))
x = rand(100)
y = similar(x)
ldiv!(y, factorize(A'A), x)

throws

ERROR: MethodError: no method matching ldiv!(::SuiteSparse.CHOLMOD.Factor{Float64}, ::Array{Float64,1})

Asking for the employed method gives

julia> @which ldiv!(y, factorize(A'A), x)
ldiv!(Y::Union{AbstractArray{T,1}, AbstractArray{T,2}} where T, A::Factorization, B::Union{AbstractArray{T,1}, AbstractArray{T,2}} where T) in LinearAlgebra at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.0/LinearAlgebra/src/factorization.jl:100

sparse arrays with algebraic, non-numerical data

using SparseArrays, StaticArrays

julia> T=SMatrix{2,2,Float64};

julia> s=sparse(1:10,1:10,fill(zero(T), 10));

julia> nnz(s)
10

julia> nnz(sparse(Matrix(s)))
100

This is just an example; there are other methods that don't work quite as they should (e.g. setindex!). The problem seems to be that the code for sparse matrices is riddled with != 0 checks instead of !iszero checks. This seems related to JuliaLang/julia#19561, but in that case the problem was types without zero(T), for which sparse does not make real sense anyway. Maybe it would be reasonable to require the value type <: Number for SparseMatrixCSC? Are there people using it for non-number types? Otherwise, one could fix all the != 0 instances and see if things work; a sketch of the pattern follows.

fast inplace broadcasting multiplication of SparseMatrixCSC and a Vector

I would like to know if this would be a good PR for the SparseArrays stdlib.
I can't work out what this operation is called.
It does A .* b in place for A::SparseMatrixCSC and b::AbstractVector.

It seems loosely relevant to JuliaLang/julia#26561 (in that I was thinking about that problem when I wrote it).

In-place operations avoid allocating memory and are therefore faster.

using SparseArrays

function sparse_column_vecmul!(A::SparseMatrixCSC{T}, x::AbstractVector) where T
    size(A, 1) == length(x) || throw(DimensionMismatch())
    rows = rowvals(A)
    vals = nonzeros(A)

    # scale every stored entry by the value for its row
    for k in eachindex(vals)
        @inbounds vals[k] *= x[rows[k]]
    end

    # IEEE semantics: 0 * Inf and 0 * NaN are NaN, so rows scaled by a
    # non-finite value must have their structural zeros materialized as NaN
    for i in findall(!isfinite, x), j in 1:size(A, 2)
        # stored entries were already scaled above (they may be Inf or NaN);
        # iszero only matches the structural zeros still to be filled in
        @inbounds iszero(A[i, j]) && (A[i, j] = T(NaN))
    end

    return A
end

Benchmarks

using BenchmarkTools
A = sprand(100,10,0.1)
x = rand(100)
  • @btime A.*x; 7.920 μs (17 allocations: 22.58 KiB)
  • @btime sparse_column_vecmul!(A, x) 1.044 μs (4 allocations: 2.47 KiB)

Not a perfectly fair comparison, as A was being mutated, but I doubt that changed the timing.

An over-7x speedup is not to be sneered at, given how big sparse matrices can become.

Broadcasting type constructor over sparse array results in Any sparse array

julia> versioninfo()
Julia Version 1.4.0
Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

julia> using SparseArrays

julia> Float64.(sprand(4,5,0.4) .> 0.5)
4×5 SparseMatrixCSC{Any,Int64} with 4 stored entries:
  [1, 1]  =  1.0
  [4, 3]  =  1.0
  [2, 5]  =  1.0
  [4, 5]  =  1.0

I was expecting SparseMatrixCSC{Float64,Int64}, obviously. The same thing happens in the Linux version too.
