
Comments (13)

rayegun avatar rayegun commented on June 25, 2024

The only really feasible way to do this is to set up a second, entirely disjoint instance of LBT. That would have to be an entirely separate package, and probably a JLL, which seems pretty crazy to me compared to just emitting an @info telling the user to use MKL on x86.

Why exactly would we want to give users better performance on just UMFPACK when they could have it for every BLAS call already with LBT?

from sparsearrays.jl.

ChrisRackauckas avatar ChrisRackauckas commented on June 25, 2024

Is the answer for LinearSolve.jl to just start doing using MKL?


rayegun avatar rayegun commented on June 25, 2024

If it's available for that platform, yeah. I really do think we're shooting ourselves in the foot trying to pull out specific MKL functions when the user probably wants all of it. I'm not aware of any functionality where OpenBLAS beats MKL, are you?

Whether that should be done in LinearSolve or whether we should explicitly tell the user to do it themselves, I don't know.

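For reference, the LBT backend swap being discussed is a one-line change from the user's side. A minimal sketch, assuming MKL.jl is installed and the machine is x86:

```julia
# Sketch: swapping the libblastrampoline (LBT) backend from OpenBLAS to MKL.
# Assumes the MKL.jl package is installed; only applies on x86 platforms.
using LinearAlgebra

BLAS.get_config()   # initially shows LBT forwarding to libopenblas

using MKL           # loading MKL.jl repoints every LBT forward to MKL

BLAS.get_config()   # now shows libmkl_rt as the loaded backend
```

Because the swap happens through LBT's forwarding table, every downstream consumer of BLAS/LAPACK (including UMFPACK's dense kernels) picks up MKL without recompilation.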

ViralBShah avatar ViralBShah commented on June 25, 2024

At least for this issue, it would be good to have a simple example that shows where UMFPACK + OpenBLAS is slow. I assume you are using one OpenBLAS thread.


ChrisRackauckas avatar ChrisRackauckas commented on June 25, 2024

The point is that the threading performance is what's bad, so it scales poorly. There is currently no situation where UMFPACK benchmarks as the best method (https://docs.sciml.ai/SciMLBenchmarksOutput/dev/LinearSolve/SparsePDE/), and the benchmarks show very poor performance for OpenBLAS (https://docs.sciml.ai/SciMLBenchmarksOutput/dev/LinearSolve/LUFactorization/).

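A minimal way to reproduce this kind of comparison locally, sketched with LinearSolve.jl's solver types (`UMFPACKFactorization` depends on dense BLAS kernels, `KLUFactorization` does not, so only the former is sensitive to the OpenBLAS-vs-MKL choice):

```julia
# Sketch: timing UMFPACK against KLU on a random sparse system.
# The matrix here is a stand-in, not the PDE systems from the SciML benchmarks.
using LinearSolve, SparseArrays, LinearAlgebra, Random

Random.seed!(1)
n = 10_000
A = sprandn(n, n, 1e-3) + 10I   # diagonally shifted to keep it well-conditioned
b = randn(n)
prob = LinearProblem(A, b)

@time solve(prob, UMFPACKFactorization())  # BLAS-bound: swap backends and re-time
@time solve(prob, KLUFactorization())      # no dense BLAS in the factorization
```

Running this once under OpenBLAS and once after `using MKL` isolates how much of UMFPACK's time is spent in the dense BLAS calls.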

ViralBShah avatar ViralBShah commented on June 25, 2024

My question is: does it get better with OpenBLAS threading turned off?

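Testing that question is straightforward from the REPL, since LBT exposes the backend's thread count through `LinearAlgebra.BLAS`:

```julia
# Sketch: disabling BLAS threading to check whether OpenBLAS's
# thread scaling is the source of the slowdown.
using LinearAlgebra

BLAS.get_num_threads()   # current OpenBLAS thread count
BLAS.set_num_threads(1)  # force single-threaded BLAS

# ...rerun the UMFPACK benchmark here and compare against the threaded timings...
```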

rayegun avatar rayegun commented on June 25, 2024

I'm not sure if UMFPACK is multithreaded @ViralBShah. I think it relies on threaded BLAS for performance.


ViralBShah avatar ViralBShah commented on June 25, 2024

Here's Tim's answer on the topic: DrTimothyAldenDavis/SuiteSparse#432 (comment)

Also, a parallel UMFPACK is coming (unrelated to this), and will also have the same performance issue. People who care about this sort of thing should just use MKL in their startup.jl. We can do an @info for now in Base.

The only realistic solution is SuiteSparseMKL if you want it to all work automatically and by default. Even then, AMD has its own BLAS, and all this will fail on Mac ARM. Of course, I feel that the kind of person who wants faster LinearSolve performance, also generally wants faster overall BLAS performance, and it may be best to ask people to pick a better BLAS by default. Eventually we should have a global preference to set the BLAS, rather than doing it through startup.jl.

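The "use MKL in their startup.jl" suggestion needs a platform guard, since MKL is unavailable on Apple ARM. A hedged sketch of what such a `~/.julia/config/startup.jl` entry might look like (the `try`/`catch` is a defensive pattern, not anything MKL.jl requires):

```julia
# Sketch for ~/.julia/config/startup.jl: opt into MKL only on x86 CPUs,
# since MKL does not cover Mac ARM (and AMD has its own BLAS story).
if Sys.ARCH in (:x86_64, :i686)
    try
        @eval using MKL
    catch err
        @warn "MKL.jl not installed; staying on the default BLAS backend" err
    end
end
```

A global preference for the BLAS backend, as suggested above, would replace exactly this kind of per-user boilerplate.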

ChrisRackauckas avatar ChrisRackauckas commented on June 25, 2024

Well, I did the LinearSolve.jl change and 🤷 I'm glad we got some information back. I got about 50 happy responses of people saying PDE stuff is magically faster, but 2 bug reports that effectively bring it all down (SciML/LinearSolve.jl#427) and will force a revert. Pretty sad. So this issue has at least been shown to be unsolvable without changes to the binaries, and to cause major (100x) slowdowns in many real-world use cases. And it's pretty clear that the coming parallel UMFPACK will be super good but will probably still be slower than KLU if you use OpenBLAS; it's that bad 🤷. So we really do need a way to build with MKL without LBT, since we cannot use the global trigger in general.

The other path would be to actually fix MKL.jl to cover the whole interface (i.e. fix JuliaLinearAlgebra/MKL.jl#138) and then have LinearAlgebra.jl do this. LinearAlgebra.jl should just check the CPU and load the "right" BLAS.

I'm not aware of any functionality where OpenBLAS beats MKL are you?

Matrix multiplications on AMD Epyc chips. That came up in two responses. Everyone else saw either no change or a speedup.

The only realistic solution is SuiteSparseMKL if you want it to all work automatically and by default.

I think that is the case.

Also, a parallel UMFPACK is coming (unrelated to this), and will also have the same performance issue. People who care about this sort of thing should just use MKL in their startup.jl. We can do an @info for now in Base.

We should throw a warning if anyone does A\b with a sparse matrix while OpenBLAS is loaded. Tim Davis is pretty explicit that, in his eyes, this is effectively incorrect behavior, and it would be good to tell users that they are missing a ton of performance in a well-known and fixable way. In the meantime, since we are side-stepping LinearAlgebra anyway, we will at least add the warnings to LinearSolve.jl directly.

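A sketch of what that warning could look like, using LBT's introspection API. This is a hypothetical helper, not actual SparseArrays or LinearSolve code, and it assumes the backend is identifiable from the loaded library name:

```julia
# Sketch: warn once if LBT is still forwarding to OpenBLAS before a
# sparse factorization. Hypothetical helper, not library code.
using LinearAlgebra

function warn_if_openblas()
    cfg = BLAS.get_config()
    if any(lib -> occursin("openblas", lowercase(lib.libname)), cfg.loaded_libs)
        @warn "Sparse LU is much faster with MKL on x86; consider `using MKL`." maxlog = 1
    end
end
```

Gating the `@warn` with `maxlog = 1` keeps it from spamming tight solve loops.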

ViralBShah avatar ViralBShah commented on June 25, 2024

I think those issues are addressable and we should go about fixing them. This is probably the first large scale push for MKL and it is not surprising it uncovered more issues.

Incomplete interface issues should be filed in libblastrampoline. We had picked every single thing we could find (but naturally there might be a few missed), and we even allow for BLAS extensions.

Packages that are not linked against LBT probably should be recompiled. However, if they call openblas directly, it really shouldn't be an issue. The SLICOT one is a bit puzzling, but I'll try to look into that.


ChrisRackauckas avatar ChrisRackauckas commented on June 25, 2024

Indeed, this was just the first large attempt to break the shackles of OpenBLAS. We will lick our wounds, fix up a few things, and come back with improved packages in a bit. I personally want LinearAlgebra to start shipping MKL on appropriate CPUs within the next year.


ViralBShah avatar ViralBShah commented on June 25, 2024

Part of the fix might be this: JuliaLinearAlgebra/MKL.jl#140


ViralBShah avatar ViralBShah commented on June 25, 2024

The OpenBLAS perf issue is resolved and has also been backported. For the SuiteSparse that ships as a stdlib, we won't be able to do this.

If you really want it, maybe there can be an external package, but I have no idea how reliably that will work.

