Comments (9)
I completely agree that we should be able to use as much swap memory as we please.
Same here, using an 8GB model. I haven't tried your solution of raising the 1.5 value in the allocation code yet; I will play with it and post an update.
```
python txt2image.py "A photo of an astronaut riding a horse on Mars." --n_images 1 --n_rows 2
diffusion_pytorch_model.safetensors: 100%|█| 3.46G/3.46G [09:10<00:00, 6.29MB/s]
text_encoder/config.json: 100%|█████████████████| 613/613 [00:00<00:00, 906kB/s]
model.safetensors: 100%|███████████████████| 1.36G/1.36G [04:41<00:00, 4.83MB/s]
vae/config.json: 100%|██████████████████████████| 553/553 [00:00<00:00, 947kB/s]
diffusion_pytorch_model.safetensors: 100%|███| 335M/335M [00:57<00:00, 5.81MB/s]
tokenizer/vocab.json: 100%|████████████████| 1.06M/1.06M [00:00<00:00, 1.18MB/s]
tokenizer/merges.txt: 100%|███████████████████| 525k/525k [00:00<00:00, 882kB/s]
100%|███████████████████████████████████████████| 50/50 [05:19<00:00, 6.39s/it]
0%| | 0/1 [00:00<?, ?it/s]libc++abi: terminating due to uncaught exception of type std::runtime_error: [malloc_or_wait] Unable to allocate 134217728 bytes.
Abort trap: 6
UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
```
from mlx.
libc++abi: terminating due to uncaught exception of type std::runtime_error:
[malloc_or_wait] Unable to allocate 100237312 bytes.
I'm running into this error on my M3 Max (36GB of RAM) when trying to run lora.py from mlx-examples to fine-tune the Mistral-7B model.
100237312 bytes doesn't seem like much, so I'm not sure why it's failing.
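For context, both failing allocations in this thread really are small in absolute terms. A tiny helper (hypothetical, just for illustration) to convert the byte counts from the tracebacks:

```python
def human(n_bytes):
    """Format a byte count in binary units (MiB / GiB)."""
    return f"{n_bytes / 2**20:.1f} MiB ({n_bytes / 2**30:.2f} GiB)"

# The two failing allocations reported above:
print(human(100237312))   # ~95.6 MiB (the lora.py failure)
print(human(134217728))   # 128.0 MiB, i.e. 2**27 (the txt2image failure)
```

So the failure is not the size of any one buffer here; total memory pressure is what pushes the allocator over the edge.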
See ml-explore/mlx-examples#70 for some ideas around how to reduce Lora memory consumption until we have quantization.
There is a maximum size you can allocate into a single buffer (which is a machine-specific property). I think it is less than 9.8 GB for you.
But either way, the fact that you are trying to put 9GB into a single buffer is not a good sign. What are you running to get that? Is it from training or generation?
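Recent MLX versions can report this limit directly; the sketch below assumes `mx.metal.device_info()` and its `max_buffer_length` key exist in your MLX build (check your version), with the unit conversion kept in a plain helper so it runs anywhere:

```python
def max_buffer_gib(info):
    """Max single-buffer size in GiB from a device_info-style dict."""
    return info["max_buffer_length"] / 2**30

# On Apple silicon you would query the real value with:
#   import mlx.core as mx
#   print(max_buffer_gib(mx.metal.device_info()))
# Illustration with a made-up 9 GiB limit:
print(max_buffer_gib({"max_buffer_length": 9 * 2**30}))  # 9.0
```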
It is a 16GB Air M1, do you happen to know a ballpark of the limit? Or does it depend dynamically on other processes?
I was running the mlx_lm.utils load and generate functions with Phi-3-128k-mlx and a ~6k context (when I run it again it says 12.2GB needed). Is it limited to 8GB of VRAM? With PyTorch I am able to run 14GB models without much of a speed loss (with around 4-5GB of swap, off the top of my head).
> It is a 16GB Air M1, do you happen to know a ballpark of the limit?
I don't know, but you could try running this until it breaks:
```python
import mlx.core as mx

mx.metal.set_cache_limit(0)  # disable the buffer cache so each allocation is fresh
for i in range(100):
    print(f"{i} GB")
    a = mx.zeros((2**30, i), mx.bool_)  # i GiB of bools (1 byte each)
    mx.eval(a)  # force the allocation to actually happen
    del a
```
I'm going to close this issue as I'm not sure why it's still open. Feel free to file a new issue if you are still having issues with memory allocation.
```
air@MacBook-Air-van-Air test-repo % /opt/homebrew/bin/python3.10 /Users/air/Repositories/test-repo/test4.py
0 GB
1 GB
2 GB
3 GB
4 GB
5 GB
6 GB
7 GB
8 GB
9 GB
libc++abi: terminating due to uncaught exception of type std::runtime_error: [malloc_or_wait] Unable to allocate 9663676416 bytes.
zsh: abort      /opt/homebrew/bin/python3.10
```
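As a sanity check on the output above: the probe printed "9 GB" and then aborted, so the last buffer that succeeded was 8 GiB and the failing request is exactly 9 GiB:

```python
failed = 9663676416           # byte count from the abort message
assert failed == 9 * 2**30    # exactly 9 GiB

# On a 16 GiB machine this is just over half of physical memory:
print(failed / (16 * 2**30))  # 0.5625
```

That puts this machine's single-buffer ceiling somewhere between 8 and 9 GiB, consistent with Metal capping a process's GPU working set well below total unified memory.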
Just an FYI, no need for me to open a new issue, thank you.