Comments (3)
If you have enough RAM, you can use the CPU to do the conversion without CUDA.
It helps to create a swap file first so that your system doesn't run out of memory:
$ sudo fallocate -l 32G /swapfile
$ sudo chown 0:0 /swapfile
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
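Once swapon succeeds, you can confirm the swap space is active before starting the conversion (a quick sanity check, not part of the original thread; neither command needs root):

```shell
# List active swap devices/files; /swapfile should appear here.
swapon --show
# The "Swap:" row should now report roughly 32G total.
free -h
```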
Then remove or comment out .cuda() from this line of the script.
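The edit itself is mechanical: dropping the trailing .cuda() keeps the model on the CPU instead of moving it to the GPU. As a sketch, the rewrite could even be scripted (strip_cuda_calls is a hypothetical helper for illustration, not part of the repo):

```python
import re

def strip_cuda_calls(source: str) -> str:
    """Delete every literal `.cuda()` call from the given source text.

    Hypothetical helper: the actual fix in the thread is a one-line
    manual edit of convert-to-torch.py.
    """
    return re.sub(r"\.cuda\(\)", "", source)

line = "model = AutoModelForCausalLM.from_pretrained(path).cuda()"
print(strip_cuda_calls(line))
# model = AutoModelForCausalLM.from_pretrained(path)
```

After the edit the script loads and saves the model entirely in system RAM, which is why the swap file above matters for a 20B-parameter model.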
$ python convert-to-torch.py models/GPT-NeoX-20B-Skein
Loading GPT-NeoX-20B-Skein...
Loading checkpoint shards: 100%|██████████| 23/23 [00:39<00:00, 1.70s/it]
Model loaded.
Saving to torch-dumps/GPT-NeoX-20B-Skein.pt
$ ls torch-dumps/
GPT-NeoX-20B-Skein.pt place-your-pt-models-here.txt
from text-generation-webui.
I have never managed to do it either. Maybe @81300 can help.
Thanks, it worked!
I just created a convert-to-torch-cpu.py with the .cuda() call deleted, and it dumped the model using RAM only.