Comments (11)
👍 for @abraunst's comments; optimal methods for hypersparse arrays often differ from those for sparse arrays. As an alternative to the keyword idea, we could extend sum! to accept a destination array and dispatch on the destination type. Best!
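A minimal sketch of what dispatch-on-destination could look like. `rowsum!` is a hypothetical name for illustration, not an existing SparseArrays function:

```julia
using SparseArrays

# Hypothetical sketch of dispatching a row-sum kernel on the destination type.
# `rowsum!` is an illustrative name, not part of SparseArrays.
function rowsum!(dest::Vector, A::SparseMatrixCSC)
    fill!(dest, zero(eltype(dest)))
    rows, vals = rowvals(A), nonzeros(A)
    for j in 1:size(A, 2), k in nzrange(A, j)
        dest[rows[k]] += vals[k]
    end
    return dest
end

function rowsum!(dest::SparseVector, A::SparseMatrixCSC)
    # Accumulate only the touched rows, then scatter into the sparse result;
    # cheap when nnz(A) is far smaller than size(A, 1).
    acc = Dict{Int,eltype(dest)}()
    rows, vals = rowvals(A), nonzeros(A)
    for j in 1:size(A, 2), k in nzrange(A, j)
        acc[rows[k]] = get(acc, rows[k], zero(eltype(dest))) + vals[k]
    end
    for (i, v) in sort!(collect(acc); by = first)
        dest[i] = v
    end
    return dest
end
```

The dense method touches every destination entry; the sparse method does work proportional to the number of distinct rows hit, which is the hypersparse win.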
from sparsearrays.jl.
Just to simplify the example, your slice looks very similar to s = sprand(10^7, 150, 10^-7, i -> rand(1:10, i)). It is a bit extreme, with only ~150 nonzeros in 1.5*10^9 entries. The fastest way to sum I found in this case was:
```julia
function sum2(x::SparseMatrixCSC)
    o = spzeros(eltype(x), size(x, 1))
    @inbounds for i in 1:nnz(x)
        o[x.rowval[i]] += x.nzval[i]
    end
    o
end
```
I think the problem with this kind of approach is that it can fail badly when the density is a bit higher. For example, replacing the second 10^-7 with 10^-4 or so, which still gives a really sparse matrix (most entries in the output vector are still 0), sum(s, dims=2) is about 40 times faster. The difference becomes much larger for higher densities, of course. I think it's really hard to optimize for all possible scenarios.
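The agreement between the two approaches (and the crossover, if you time them) is easy to probe at smaller sizes. A self-contained sketch, with the row count shrunk from 10^7 so it runs in a moment:

```julia
using SparseArrays

# sum2 from the comment above, repeated here so this snippet runs standalone.
function sum2(x::SparseMatrixCSC)
    o = spzeros(eltype(x), size(x, 1))
    @inbounds for i in 1:nnz(x)
        o[x.rowval[i]] += x.nzval[i]
    end
    o
end

# At density 1e-4 the accumulator is no longer tiny, and the repeated sparse
# insertions in sum2 start to hurt; timing both shows sum(s, dims=2) winning.
s = sprand(10^5, 150, 10^-4)   # shrunk from 10^7 rows for speed
@assert Vector(sum2(s)) ≈ vec(sum(s, dims=2))
```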
In general, the assumption for a SparseMatrix{R,C}SC S is that the number of non-zeros is roughly O(size(S,1) + size(S,2)). Since the results of sum(S, dims=1) and sum(S, dims=2) are both of that size, the reasonable choice seems to be to return a dense vector in both cases. For CSC storage it makes no sense to return a sparse vector when summing along the columns, for example. For summing along the rows, the result might happen to be sparse if all the row values fall in a small subset of rows, but that's not something that should generally be expected, any more than one would expect the row sums of a dense matrix to happen to have lots of zero values. In general, if size(S,1) ≈ size(S,2) then we would expect both row and column sums to be dense, so producing a dense vector for these operations seems sane to me.
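A quick empirical check of that expectation, under the stated assumption that nnz is on the order of size(S,1) + size(S,2) with randomly placed entries:

```julia
using SparseArrays

# With nnz ≈ 2 * size(S,1) and random positions, most rows receive at least
# one nonzero, so the row-sum vector is mostly dense.
S = sprand(1000, 1000, 2 / 1000)       # nnz ≈ 2000 across 1000 rows
r = vec(sum(S, dims=2))
frac = count(!iszero, r) / length(r)   # expected ≈ 1 - exp(-2) ≈ 0.86
```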
Yes, this is good to do: make the sum of a SparseMatrixCSC along either dimension a SparseVector.
I think the right thing to do is to always have a sparse output, and the user can explicitly convert it to dense if required.
Note that:
- the output 1xN SparseMatrixCSC would not be really sparse (i.e. it still requires O(N) storage even if it is completely zero);
- empty columns in the input MxN SparseMatrixCSC still take space (because of the colptr vector) and need to be iterated over.

So summing in the dims=1 dimension would probably be slower with a sparse output. The dims=2 could maybe benefit some, but not sure how this "really really sparse" case is relevant. Note also that both A*ones(N) and ones(1,N)*A give dense arrays (i.e. alternative ways of achieving the same results).
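Those matvec equivalences are easy to verify; a small sanity check (not from the original thread):

```julia
using SparseArrays

# Both matvec forms reproduce the dimension sums and return dense arrays.
A = sprand(50, 40, 0.1)
@assert A * ones(40) ≈ vec(sum(A, dims=2))   # row sums, dims=2
@assert ones(1, 50) * A ≈ sum(A, dims=1)     # column sums, dims=1
```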
> The dims=2 could maybe benefit some, but not sure how this "really really sparse" case is relevant.

If it helps to illustrate my concrete use case with some size numbers: I worked with a square matrix of 11 million x 11 million census blocks, sliced out 100 or so columns, and the sum was returning a dense vector of length 11 million. I used findnz to find the nonzero elements in my columns of interest and then did a group sum by the row index, which was faster than using the built-in sum.
> > The dims=2 could maybe benefit some, but not sure how this "really really sparse" case is relevant.
>
> If it helps to illustrate my concrete use case with some size numbers: I worked with a square matrix of 11 million x 11 million census blocks, sliced out 100 or so columns, and the sum was returning a dense vector of length 11 million. I used findnz to find the nonzero elements in my columns of interest and then did a group sum by the row index, which was faster than using the built-in sum.

Could you clarify the example? What do you mean by "columns of interest"? How many nonzeros in the output column vector? What were the respective times (i.e. how much faster)?
Sorry, here's a quick example where a hacky sparse sum using Dicts outperforms the naive sum. Perhaps I'm missing something, and this isn't a rigorous benchmark by any means, but I think it replicates the essence of the situation I found myself in, where I was slicing a big square matrix and summing some of the rows or columns together.
```julia
using SparseArrays, SplitApplyCombine

function test()
    s = let
        I = Int[]; J = Int[]; V = Int[]
        sz = 10_000_000
        for i in 1:sz
            j = rand(1:sz)
            v = rand(1:10)
            push!(I, i)
            push!(J, j)
            push!(V, v)
        end
        sparse(I, J, V)
    end
    # my actual use had noncontiguous indexes in the second dimension
    slice = s[:, 100_000:100_150]
    @show typeof(slice)
    @time a = sum(slice, dims=2)
    @time b = let
        nt = ((i=i, j=j, v=v) for (i, j, v) in zip(findnz(slice)...))
        groupsum(x -> x.i, x -> x.v, nt)
    end
    a, b
end

a, b = test();
```
Timings on my laptop (2012 MacBook Pro) after compilation are

```
0.465381 seconds (10 allocations: 152.588 MiB, 82.09% gc time)
0.028346 seconds (29.54 k allocations: 1.358 MiB)
```
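For comparison, the group-sum step can be reproduced with a plain Dict, dropping the SplitApplyCombine dependency. A sketch of the same idea, not the code used above:

```julia
using SparseArrays

# Group-sum the nonzeros of a sparse slice by row index using a Dict,
# equivalent in spirit to the groupsum call above.
function rowgroupsum(A::SparseMatrixCSC)
    out = Dict{Int,eltype(A)}()
    I, J, V = findnz(A)
    for (i, v) in zip(I, V)
        out[i] = get(out, i, zero(eltype(A))) + v
    end
    return out
end
```

The Dict maps row index to row sum, so rows with no nonzeros simply never appear, which is what makes this cheap in the hypersparse case.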
@Wimmerer do you think this would be a good feature to add?
I imagine this SparseVector will very often be effectively dense, but that's the only correct route.