Comments (13)
Huge +1 to this. Would be amazing to not have to drop back to numpy/CPU for these sorts of things.

Something like np.linalg.norm for vectors, and the Frobenius norm for matrices, should be very easy to do. That's also a good place to start, just to get the packaging set up.

We would love to have these operations available directly in MLX. It's not our top top priority, but it's something we intend to add in the future, or even better, accept contributions for.
If you are interested in contributing, here are some thoughts:
- To the extent that we can avoid writing these from scratch, that is good.
- For the CPU we can use LAPACK and/or Accelerate, depending on what's available in each. A good starting point would be to wrap an op from one of those just for the CPU (and throw for the GPU); see the sketch after this list.
- On the GPU there are also some pre-written kernels we can use from MPS, for example [Cholesky](https://developer.apple.com/documentation/metalperformanceshaders/mpsmatrixdecompositioncholesky?language=objc). You can see an example of how to wrap MPS matmul; the others could be done similarly.
- For ops not supported by MPS, we'd need kernels, which is a bigger project, but a fun one for those up for a challenge!
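To make that shape concrete, here is a minimal, hedged sketch (the op name and the Device enum are illustrative stand-ins, not MLX's actual primitive interface) of wrapping Accelerate's cblas_snrm2 for a vector 2-norm on the CPU while throwing on the GPU:

```cpp
#include <Accelerate/Accelerate.h> // compile with -framework Accelerate
#include <stdexcept>
#include <vector>

// Stand-in for MLX's device/stream abstraction; illustrative only.
enum class Device { cpu, gpu };

// Hypothetical op: vector 2-norm. The CPU path wraps Accelerate's CBLAS;
// the GPU path throws until a Metal kernel exists.
float vector_norm(const std::vector<float>& x, Device device) {
  if (device == Device::gpu) {
    throw std::runtime_error("[linalg.norm] GPU evaluation not implemented.");
  }
  // cblas_snrm2 computes sqrt(sum(x_i^2)) over N elements with stride incX.
  return cblas_snrm2(static_cast<int>(x.size()), x.data(), /*incX=*/1);
}
```
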
Matrix factorizations aren't easily parallelizable on the GPU. Would QR and SVD only have CPU implementations for now? @awni

You can look at how mlx.core.random works. We could do something similar for mlx.core.linalg: basically it's a nested namespace on the C++ side (mlx::core::random), and then we make it a submodule in the pybind11 bindings. Then you can do:

    import mlx.core as mx
    mx.linalg.< >
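As a rough illustration of that layout (the init_linalg hook and the stub op are hypothetical, not MLX's actual binding code), the pybind11 side could look something like:

```cpp
#include <pybind11/pybind11.h>
namespace py = pybind11;

namespace mlx::core::linalg {
// Illustrative placeholder living in the nested C++ namespace.
float frobenius_norm_stub() { return 0.0f; }
} // namespace mlx::core::linalg

// Hypothetical hook called from the top-level module init, mirroring the
// way a submodule like mx.random would be registered.
void init_linalg(py::module_& parent) {
  auto m = parent.def_submodule("linalg", "Linear algebra operations.");
  m.def("norm", &mlx::core::linalg::frobenius_norm_stub);
}
```
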
Any thoughts on implementing at least vector/matrix norm methods, such as torch.linalg.vector_norm?

Note to self: almost all LAPACK routines are column-major.
@awni, would transposing an mlx array before sending it to LAPACK routines work here, or is there an alternative way?

> Matrix factorizations aren't easily parallelizable on the GPU. Would QR and SVD only have CPU implementations for now? @awni

SVD support would be great.

The CPU versions of these are pretty doable. See the QR factorization as an example: https://github.com/ml-explore/mlx/blob/main/mlx/backend/common/qrf.cpp
GPU support is more involved, as I don't think there are many open source Metal implementations.

Hi! I am quite interested in working on this but am not really sure how to start. Would someone be able to push me in the right direction? I would even be open to a short meeting if required. I work from an M2 Max. Thank you :)

Any thoughts on wrapping these linalg-specific functions into a separate module on the Python frontend?

No, I wouldn't deal with that using a transpose. You can usually call the routine with the right arguments and avoid a transpose. For example, a row-major [M, N] matrix is the same as a col-major [N, M] matrix in terms of its memory layout.
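To make the trick concrete, here is a hedged sketch (assuming Accelerate's classic __CLPK LAPACK interface; the helper name is hypothetical) that solves A x = b for a square row-major A without any transpose, by factoring the buffer as A^T and then asking sgetrs to solve the transposed system:

```cpp
#include <Accelerate/Accelerate.h> // compile with -framework Accelerate
#include <vector>

// Solves A x = b in place (b becomes x) for a row-major n-by-n matrix A.
// Interpreted col-major, the same buffer holds A^T, so no data is moved.
void solve_row_major(std::vector<float>& A, std::vector<float>& b, int n) {
  std::vector<__CLPK_integer> ipiv(n);
  __CLPK_integer N = n, nrhs = 1, info = 0;
  // Factor the col-major view: P * L * U = A^T.
  sgetrf_(&N, &N, A.data(), &N, ipiv.data(), &info);
  // Solve (A^T)^T x = A x = b; real code should also check info != 0.
  char trans = 'T';
  sgetrs_(&trans, &N, &nrhs, A.data(), &N, ipiv.data(), b.data(), &N, &info);
}
```
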
Hi @awni, may I ask if there are any learning resources for Apple Metal and the Accelerate framework? I want to contribute to the linalg module, but I do not know where to start. For instance, if I want to build mx.linalg.eig, how can I use LAPACK from the Apple Accelerate framework?
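Not an authoritative answer, but one possible starting point is a direct LAPACK call. Below is a hedged sketch (the helper name is hypothetical) of using sgeev from Accelerate to get eigenvalues; since sgeev is col-major it actually sees the row-major buffer as A^T, but A and A^T share the same eigenvalues, so no transpose is needed for this case:

```cpp
#include <Accelerate/Accelerate.h> // compile with -framework Accelerate
#include <vector>

// Fills wr/wi with the real and imaginary parts of A's eigenvalues.
// A is taken by value because sgeev overwrites its input.
void eigenvalues(std::vector<float> A, int n,
                 std::vector<float>& wr, std::vector<float>& wi) {
  wr.assign(n, 0.0f);
  wi.assign(n, 0.0f);
  char jobvl = 'N', jobvr = 'N'; // eigenvalues only, no eigenvectors
  __CLPK_integer N = n, ldvl = 1, ldvr = 1, lwork = 4 * n, info = 0;
  std::vector<float> work(lwork);
  sgeev_(&jobvl, &jobvr, &N, A.data(), &N, wr.data(), wi.data(),
         /*vl=*/nullptr, &ldvl, /*vr=*/nullptr, &ldvr,
         work.data(), &lwork, &info); // real code should check info == 0
}
```
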