After some really tedious debugging and tackling various hidden problems, I managed to compile the whole module.
This is the end result:
https://github.com/DeXtmL/bitsandbytes-win-prebuilt
The binaries are compiled against CUDAToolkit 11.6 and Visual Studio 2022.
I am able to run inference with results nearly identical to the "normal" fp16 version, so consider this a confirmation that it works; no rigorous testing was conducted, though. @TimDettmers
Finally, the "cuda_setup" part of the source code is entirely incompatible with Windows; there are loads of hardcoded routines, so I used a quick makeshift patch instead of fixing it properly. That's also why I'm not posting my changes or opening a PR for now. If you are eager to test:
in cuda_setup/main.py:
make evaluate_cuda_setup() always return "libbitsandbytes_cuda116.dll"
in ./cextension.py:
change ct.cdll.LoadLibrary(binary_path) to ct.cdll.LoadLibrary(str(binary_path))
That should do the trick.
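For anyone who wants to see it in one place, the makeshift patch amounts to something like this (a sketch only; the real evaluate_cuda_setup() does much more probing, and the 5-tuple return shape is taken from later comments in this thread):

```python
# Makeshift Windows patch sketch for cuda_setup/main.py: skip all the
# Linux-specific library probing and always name the prebuilt Windows DLL.
def evaluate_cuda_setup():
    # The original probes .so paths via LD_LIBRARY_PATH etc., which do not
    # exist on Windows; hardcode the DLL name instead.
    return "libbitsandbytes_cuda116.dll", None, None, None, None
```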
Hopefully this can help someone in Windows territory; let's hope official Windows support comes soon.
from bitsandbytes.
If anyone else is still searching for a Windows solution and doesn't want to lose a few hours to the same issue, just use this repo:
https://github.com/jllllll/bitsandbytes-windows-webui
The README even includes a pip install command and (as of 0.41.1) installs the newest version of bitsandbytes; the only difference is that it's compatible with Windows. It includes a .dll instead of a .so, and cuda_setup\main.py works for us.
To use this with facebook-research/LLaMA-7b within text-generation-webui on windows 11:
- git pull oobabooga/text-generation-webui
- follow the installation instructions for conda
- download HuggingFace converted model weights for LLaMA, or convert them yourself from the original weights. They leaked on torrent and even appear on the official facebook llama repo as an unapproved PR.
- copy the llama-7b folder (or whatever size you want to run) into text-generation-webui\models. The folder should contain config.json, generation_config.json, pytorch_model.bin.index.json, special_tokens_map.json, tokenizer.model, tokenizer_config.json, as well as all 33 pytorch_model-000xx-of-00033.bin files
- put libbitsandbytes_cuda116.dll in C:\Users\xxx\miniconda3\envs\textgen\lib\site-packages\bitsandbytes\
- edit \bitsandbytes\cuda_setup\main.py:
search for:
if not torch.cuda.is_available(): return 'libsbitsandbytes_cpu.so', None, None, None, None
replace with:
if torch.cuda.is_available(): return 'libbitsandbytes_cuda116.dll', None, None, None, None
search for this (it occurs twice):
self.lib = ct.cdll.LoadLibrary(binary_path)
replace with:
self.lib = ct.cdll.LoadLibrary(str(binary_path))
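The str() wrapper matters because bitsandbytes hands ctypes a pathlib path on Windows, which older ctypes/Python versions reject (the "argument of type 'WindowsPath' is not iterable" error seen later in this thread). A minimal illustration (the path is made up to mirror the install location above):

```python
from pathlib import PureWindowsPath

# Hypothetical DLL location, mirroring the install path in this thread.
binary_path = PureWindowsPath(
    r"C:\Users\xxx\miniconda3\envs\textgen\lib\site-packages\bitsandbytes"
) / "libbitsandbytes_cuda116.dll"

# ct.cdll.LoadLibrary(binary_path) can fail on older Pythons when handed a
# Path object; converting to str first is always safe:
print(str(binary_path))
```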
After some really tedious debugging and tackling various hidden problems, I managed to compile the whole module...

Where do you put the pre-built file to activate adam?
You put them in site-packages\bitsandbytes
Why can we still not use this on Windows? It's 2023.
Quick final follow-up: it built fine in Debug with the above; I actually don't quite know how. For Release mode I did have to build the pthread library (just another mkdir build; cmake ..; open the solution; build all in Release mode), then slightly modify the cmake file. (Probably could have just done cmake .. ; cmake --build . -j4 --config Release to build pthread.)
CMakeLists.txt
I don't know why it worked in Debug mode at all, because I had link_libraries wrong (cmake doesn't take the -l prefix), and for Release mode I had to fix that and include pthreadVC3.lib. Here's the final version from me: CMakeLists.txt for the csrc folder
CMakeLists.txt
**EDIT
I got it to load up, but it says it compiled without GPU support so i'm still working on it.
**EDIT2
Still working on it. Added "add_compile_definitions(BUILD_CUDA)", then checked the resulting VC files and it does enable the BUILD_CUDA define; in pythonInterface.cpp Visual Studio says BUILD_CUDA is defined, and I can see where cadam32bit_g32 is generated via that macro, but I'm not quite sure why, when it loads the .dll, lib.cadam32bit_g32 throws an attribute error.
**EDIT3
I finally got it to work. It took a couple of hours (long compile times) but I finally got one that exports all symbols. The trick was putting the thing in a different cmake file, ffs. The final two files:
root/CMakeLists.txt
CMakeLists.txt
root/csrc/CMakeLists.txt
CMakeLists.txt
mkdir build; cd build; cmake ..; cmake --build ./ -j4 --config Release. The .dll is put into build/csrc/Release/bitsandbytes.dll
Just for fun, here it is running GPT-J-6B on an RTX 3080 on Windows 11 with CUDA 11.3:
import torch
import transformers
from transformers.models.gptj import GPTJForCausalLM

access_token = "hf_"

device = 'cuda' if torch.cuda.is_available() else 'cpu'
tokenizer = transformers.AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B", use_auth_token=access_token)
# load_in_8bit routes the linear layers through bitsandbytes;
# device_map='auto' already places the model, so no .to(device) is needed
gpt = GPTJForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", use_auth_token=access_token, device_map='auto', load_in_8bit=True, low_cpu_mem_usage=True)
prompt = tokenizer("A cat sat on a mat", return_tensors='pt')
prompt = {key: value.to(device) for key, value in prompt.items()}
out = gpt.generate(**prompt, min_length=128, max_length=128, do_sample=True)
tokenizer.decode(out[0])
>>> tokenizer.decode(out[0])
"A cat sat on a mat, staring at me, his back legs tucked under him, tail swerving in quick little circles.\n\nI squatted next to him and leaned against the cold wooden wall. I'd come down here to feed the cat, but I'd been too tired and cold and my stomach still ached and my hands and feet were numb from spending the night in a tree. Besides, this was not my house, not my town, not my time. The cat stared at me, his green eyes the only proof he knew I was the intruder he was protecting.\n\nI wished I'd brought a blanket"
If anyone else is still searching for a Windows solution and doesn't want to lose a few hours to the same issue, just use this repo: https://github.com/jllllll/bitsandbytes-windows-webui ...
wow very nice
"compile it yourself" can be a blocker indeed, especially as this is lib for Python and compilation errors one may get will be from C++. I could find time to make a PR with whatever was done till now and polish it where necessary, though not sure if someone is doing it already.
From my not-so-great memory it's something like:
To build the bitsandbytes project for Windows, you will need two programs: cmake and nvcc. You can use a build environment such as Visual Studio and Miniconda.
Open the command line interface (CLI) for your build environment. (Start menu/Visual Studio/one of those consoles.)
Activate your chosen environment (Miniconda) and install the necessary packages. (cuda-nvcc, iirc? Probably a CUDA environment like https://pytorch.org/get-started/locally/)
Place cmake files in the right location.
Build pthreads (if necessary) using cmake (The same commands as below).
Edit1
( download https://github.com/GerHobbelt/pthread-win32 and extract the whole thing there so it's /project_root/dependencies/pthread-win32-main/pthread.h )
END EDIT1
Run the following commands: (Note -j4 means use 4 cores to build. If you don't have 4 cores, or you have a lot more, change that number.)
(Assuming the C:\ drive; if on another drive, change the letter on the first and second lines appropriately.)
C:
cd C:\PROJECT_ROOT\ ( or cd C:\PROJECT_ROOT\dependencies\pthread-win32-main )
mkdir build
cd build
cmake ..
cmake --build ./ -j4 --config Release
The resulting dll file will be in build/csrc/Release/bitsandbytes.dll.
Edit2
When it errors about unistd.h or getopt.h, open that file and comment out the #include. A more proper way would be to detect _MSC_VER and, if found, just not include unistd.h. (I wouldn't test against WIN32 because it can be true in MinGW, WSL, etc. environments where unistd.h is still required; _MSC_VER indicates the Microsoft Visual Studio compiler.)
EG:
#ifndef _MSC_VER
#include <unistd.h>
#endif
END EDIT2
Is it possible to run bitsandbytes with 2060RTX 6GB on Windows 10?
I don't see why it wouldn't run on a 2060; just be aware it doesn't eliminate VRAM requirements, only reduces them. You still wouldn't be able to run ChatGPT, for example, with its 800GB+ 32-bit-precision VRAM requirement (if you had access to that). Any model that takes <24GB of VRAM in 32-bit, or <12GB of VRAM in 16-bit mode, should be able to fit in 6GB at 8-bit.
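The arithmetic behind those numbers is just parameter count times bytes per parameter. A back-of-the-envelope sketch (an illustrative helper, not part of bitsandbytes; it counts weights only and ignores activations, KV cache, and optimizer state):

```python
def weight_vram_gb(params_billion: float, bits: int) -> float:
    """Rough lower-bound estimate of weight memory in GiB."""
    bytes_total = params_billion * 1e9 * (bits / 8)
    return bytes_total / 2**30

# A ~6B-parameter model works out to roughly 22.4 GiB at fp32, 11.2 GiB at
# fp16, and 5.6 GiB at int8, which is why 8-bit loading can squeeze onto a
# 6 GB card where fp16 cannot.
```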
After some really tedious debugging and tackling various hidden problems, I managed to compile the whole module...
This solution is still valid and the linked binaries work with CUDA 11.7 as well (at least for Adam 8-bit and on Win 10). But the location of ct.cdll.LoadLibrary changed: it can now be found in ./cuda_setup/main.py and no longer in ./cextension.py. Just replace both occurrences of self.lib = ct.cdll.LoadLibrary(binary_path) with self.lib = ct.cdll.LoadLibrary(str(binary_path)).
I've got this compiling under CUDA 11.7 with CMake if y'all are interested. I DID NOT RUN ANY TESTS yet, it is too late in the day
Prototype CMAKE file, it is missing functionality of the makefile. It is usable to target a single config, and does not bring in /dependencies/cub.
https://github.com/acpopescu/bitsandbytes/tree/cmake_windows Still WIP.
To deploy, copy build/Release/*.* to ./bitsandbytes/
For reference and diff - #229
Did you look at #127 by any chance?
@km19809 - Ok, it's my CMakeLists.txt. My initial revision had a bug where it was using the initially set architecture, 52; that's why it was working for you. I am missing the additional architectures in the latest release that the Makefile has.
https://github.com/acpopescu/bitsandbytes/releases/tag/v0.37.2-win.1 should have your architecture when using nocublaslt now.
@acpopescu Now it works well. Thank you a lot!
@FurkanGozukara we can use it on Windows as many people do (me included, https://github.com/stoperro/bitsandbytes_windows) with this community effort. It's just less convenient, as support is not yet merged into the official branch and you need to compile it yourself.
Note that even Microsoft doesn't care about Windows support of their own AI tools (microsoft/DeepSpeed#2427), so having some support here when the authors don't necessarily have a Windows machine is heartening.
I know that branch very well, but it's on a much older commit.
I am making tutorials for regular people; "compile it yourself" is not an option for them.
I hope Windows support gets added.
bitsandbytes did not support Windows before, but my method can make it work on Windows. (yuhuang)
1. Open the folder J:\StableDiffusion\sdwebui, click the folder's address bar and enter CMD; or WIN+R, CMD, Enter, then cd /d J:\StableDiffusion\sdwebui
2. J:\StableDiffusion\sdwebui\py310\python.exe -m pip uninstall bitsandbytes
3. J:\StableDiffusion\sdwebui\py310\python.exe -m pip uninstall bitsandbytes-windows
4. J:\StableDiffusion\sdwebui\py310\python.exe -m pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl
Replace J:\StableDiffusion\sdwebui\py310 with your SD venv directory (the folder containing python.exe).
I am able to compile the csrc part on Windows but fail to link (Visual Studio)
FYI, this is the error message. I'm no expert on cuda, so I'm not quite sure where it goes wrong.
To successfully compile on Windows MSVC, some parts need to be patched as follows:
- rename pythonInterface.c to pythonInterface.cpp, or Visual Studio will try using a C compiler for it.
- add one missing template instantiation like this: (in SIMD.h)
- get unistd.h and getopt.h for windows
- get pthread for windows
finally, this is just a build test, so I'm not using the Makefile.
That's all for now.
Tested on CUDA toolkit 11.6, windows 11
Thank you. This might be the key.
After some really tedious debugging and tackling various hidden problems, I managed to compile the whole module...
could you provide a makefile for this?
After some really tedious debugging and tackling various hidden problems, I managed to compile the whole module...
Where do you put the pre-built file to activate adam?
After some really tedious debugging and tackling various hidden problems, I managed to compile the whole module...
An easy way to always return libbitsandbytes_cuda116.dll would be to insert
if torch.cuda.is_available(): return 'libbitsandbytes_cuda116.dll', None, None, None, None
above line 119:
if not torch.cuda.is_available(): return 'libsbitsandbytes_cpu.so', None, None, None, None
Appreciate you doing this work, helped unblock me in a big way. I hope bitsandbytes supports Windows directly sooner rather than later, but this is a great stopgap.
@PinPointPing Thanks a lot, it just worked like a charm
Same issue for me; if someone can compile CUDA 11.8 binaries for me I can test them :)
Using StableDiffusion-WebUI + Dreambooth, would love to give Adam a spin!
btw, under Windows the environment variables are CUDA_PATH and CUDA_PATH_V11_8
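A setup script can probe those variables in order (a sketch only; find_cuda_home is a made-up helper, not a bitsandbytes API):

```python
import os

def find_cuda_home(env=os.environ):
    """Return the CUDA toolkit root from Windows-style env vars, if set."""
    # Prefer the unversioned variable; fall back to the versioned one that
    # the CUDA 11.8 installer writes.
    for key in ("CUDA_PATH", "CUDA_PATH_V11_8"):
        if key in env:
            return env[key]
    return None
```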
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
For effortless bug reporting copy-paste your error into this form: https://docs.google.com/forms/d/e/1FAIpQLScPB8emS3Thkp66nvqwmjTEgxp8Y9ufuWTzFyr9kJ5AoI47dQ/viewform?usp=sf_link
================================================================================
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
CUDA SETUP: Loading binary G:\Visions of Chaos\MachineLearning\Text To Image\stable-diffusion-webui\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
Exception importing 8bit adam: argument of type 'WindowsPath' is not iterable
Scheduler Loaded
Allocated: 2.3GB
Reserved: 2.4GB
After following some advice and making the edits in this thread, I got Adam to run on my CUDA 11.8 setup. It does indeed work for me :)
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
For effortless bug reporting copy-paste your error into this form: https://docs.google.com/forms/d/e/1FAIpQLScPB8emS3Thkp66nvqwmjTEgxp8Y9ufuWTzFyr9kJ5AoI47dQ/viewform?usp=sf_link
================================================================================
CUDA SETUP: Loading binary G:\Visions of Chaos\MachineLearning\Text To Image\stable-diffusion-webui\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll...
Scheduler Loaded
Allocated: 0.3GB
Reserved: 0.4GB
would you mind sharing your cuda 11.8 binary
would you mind sharing your cuda 11.8 binary
I'm using the CUDA 11.6 libs from https://github.com/DeXtmL/bitsandbytes-win-prebuilt and CUDA 11.8 on Windows 10.
@TimDettmers any updates on this ? 😄
I just got this building with cmake.
First thing I did was make a directory called dependencies , then download https://github.com/GerHobbelt/pthread-win32 and extract the whole thing there so it's /project_root/dependencies/pthread-win32-main/pthread.h
(I lied: I didn't really do that first, I did it last, but I suggest anyone following this do it first to avoid the error.)
I did apply the patch above by @DeXtmL (Does not compile without it, missing vec_t error).
As for unistd.h/getopt, I literally just commented out the #include <unistd.h> in the file that gave the error, and it now fully compiles, so I'm not sure anything relies on unistd.h (at least on Windows with VC2019). We'll see when I go to test the .dll. (It didn't give any errors about getopt.)
CMakeLists.txt
CMakeLists.txt **** (New files in next reply) ****
The first CMakeLists.txt is in the root, the second in csrc.
I do:
cd /project_root
mkdir build
cd build
cmake ..
Then you can open the .sln, right-click bitsandbytes, hit build, and it goes to town.
I imagine for Linux the same cmake file would actually work just fine. The only thing I did specially for Windows was add the include path for pthreads; adding an include path to a folder that doesn't exist probably won't hurt, right?
GL
Quick final follow up...
You are a hero, thank you so much for posting this!
Quick final follow up...
Any chance you could hang some instructions somewhere to help others replicate the process? I'm trying to follow along but have run into some issues. A step-by-step would be awesome.
Is it possible to run bitsandbytes with 2060RTX 6GB on Windows 10?
Thanks for the answer.
I think I got confused about bitsandbytes. I was thinking it only works on Linux, and that repositories such as kohya-ss/sd-scripts or bmaltais/kohya_ss won't work on Windows because of the lack of compatibility, and that this was the reason I was not able to run them. But this is not true, as I managed to make them work on my machine eventually.
I've tried all of the above and I'm still getting an error
... bitsandbytes\libbitsandbytes_cuda116.dll...
[WinError 193] %1 is not a valid Win32 application
Any ideas? I thought the whole point of this .dll was that it is a windows version?
Thank you.
TIL (1) - if you right click/save a filename from GitHub, while it appears you are saving the target file, you're not. The file that appears in the target folder has the same name but is gibberish.
TIL (2) - I am not very good at IT.
All working fine now.
What changes are needed to run it on a Windows CPU-only machine? Trying to run Llama-Alpaca-LoRa but getting issues with bitsandbytes:
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: WARNING! libcuda.so not found! Do you have a CUDA driver installed? If you are on a cluster, make sure you are on a CUDA machine!
CUDA SETUP: Loading binary C:\Users\AMahmood\Downloads\Llama-Alpaca-LoRa\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
[WinError 193] %1 is not a valid Win32 application
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: WARNING! libcuda.so not found! Do you have a CUDA driver installed? If you are on a cluster, make sure you are on a CUDA machine!
CUDA SETUP: Loading binary C:\Users\AMahmood\Downloads\Llama-Alpaca-LoRa\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
[WinError 193] %1 is not a valid Win32 application
.\Llama-Alpaca-LoRa\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {WindowsPath('/usr/local/cuda/lib64')}
warn(msg)
.\Llama-Alpaca-LoRa\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py:136: UserWarning: WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
warn(msg)
.\Llama-Alpaca-LoRa\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py:136: UserWarning: WARNING: No GPU detected! Check your CUDA paths. Proceeding to load CPU-only library...
warn(msg)
.\Llama-Alpaca-LoRa\venv\lib\site-packages\bitsandbytes\cextension.py:31: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "
.\Llama-Alpaca-LoRa\venv\Lib\site-packages\bitsandbytes\cextension.py:31: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "
Process finished with exit code 0
I've got this compiling under CUDA 11.7 with CMake if y'all are interested. I DID NOT RUN ANY TESTS yet, it is too late in the day...
Automated tests look promising on 11.7, at least on GPU :) #229 (comment)
python -m build was able to build a wheel file, and that worked with 11.7. CPU is NOT TESTED below:
bitsandbytes-0.37.2-py3-none-any.whl.zip
@centerionware no, I couldn't find that referenced anywhere. I was looking directly at the issues and it did not show up in google search.
@acpopescu Hello, I used your branch to compile cuda117 with NO_CUBLASLT. (My GPU is too old 😢)
However, the resulting DLL does not have "_noblaslt" in its name. On Linux, we can make a shared object with "_noblaslt".
I am not familiar with cmake, so I cannot figure out why. How can I solve this problem?
I am currently using the resulting DLL by renaming it.
Hey, yeah, I need to change the cmake to add the _noblaslt for that option.
Happy to hear that the resulting DLL works after renaming; I did not test it for noblaslt, btw.
@km19809 the build script should be fixed, renamed properly and copied in the right location
Compiled a release here, against 11.6 and 11.7 - feature level 11x, with and without cublaslt:
https://github.com/acpopescu/bitsandbytes/releases/tag/v0.37.2-win.0
@acpopescu
Thank you so much! I'll try it.
Hmm... The current wheel throws symbol not found.
CUDA SETUP: CUDA runtime path found: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin\cudart64_110.dll
CUDA SETUP: Highest compute capability among GPUs detected: 6.1
CUDA SETUP: Detected CUDA version 117
D:\stable-diffusion-webui\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py:141: UserWarning: WARNING: Compute capability < 7.5 detected! Only slow 8-bit matmul is supported for your GPU!
warn(msg)
CUDA SETUP: Loading binary D:\stable-diffusion-webui\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda117_nocublaslt.dll...
Error named symbol not found at line 508 in file F:\dev\AI\BLOOM\bitsandbytes\csrc\ops.cu
However, I have no F: drive at all... Although I've searched all the files in the repo for "F:" or "BLOOM", they did not appear. (Presumably that F:\ path is the build machine's source path, baked into the binary at compile time.) I cannot understand why this happens. I should compile it again myself.
AttributeError: function 'clion32bit_g32' not found
What can I do? I am crying...
@acpopescu I did an update to your fork with newer bitsandbytes version plus applied one fix for qlora on Windows, maybe you would wish to apply same changes to your branch - https://github.com/stoperro/bitsandbytes_windows/tree/cmake_windows .
@stoperro I'll take a look when I can (hopefully soon) :)
I am able to compile the csrc part on Windows but fail to link (Visual Studio)...
Thanks for your work. By referencing your idea, I was able to build it on Windows 11, CUDA 12.1, VS 2022 Community!
I have forked it with my procedure and code changes: https://github.com/ShanGor/bitsandbytes-windows
Anyone who wants to run it on Windows with insufficient GPU memory can try it. (Windows has a Shared GPU Memory mechanism that can utilize half of your RAM to supplement GPU memory.)
Error Message: ModuleNotFoundError: No module named 'transformers_modules.'
I encountered the following error message while running your bitsandbytes code in a Windows environment:
bin D:\sj\project\python\test\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda117.dll
CUDA SETUP: CUDA runtime path found: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin\cudart64_110.dll
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary D:\sj\project\python\test\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda117.dll...
D:\sj\project\python\test\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py:156: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {WindowsPath('D:/yy/寰\ue1bb俊web寮�鍙戣�呭伐鍏穃dll'), WindowsPath('C:/Users/aoguai/anaconda3/Library/usr/bin')}
warn(msg)
Traceback (most recent call last):
File "D:\sj\project\python\test\core\deeplearning\test.py", line 13, in <module>
tokenizer = AutoTokenizer.from_pretrained(
File "D:\sj\project\python\test\venv\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 676, in from_pretrained
tokenizer_class = get_class_from_dynamic_module(class_ref, pretrained_model_name_or_path, **kwargs)
File "D:\sj\project\python\test\venv\lib\site-packages\transformers\dynamic_module_utils.py", line 443, in get_class_from_dynamic_module
return get_class_in_module(class_name, final_module.replace(".py", ""))
File "D:\sj\project\python\test\venv\lib\site-packages\transformers\dynamic_module_utils.py", line 164, in get_class_in_module
module = importlib.import_module(module_path)
File "D:\yy\Python\Python38\lib\importlib\__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'transformers_modules.'
I have already installed transformers 4.30.2. The issue occurred when I tried to use the AutoTokenizer.from_pretrained() function from transformers while loading a bitsandbytes-quantized model. Specifically, the error message indicates that it cannot find the module named transformers_modules.
I have attempted to check my environment settings, including CUDA configuration and path setup. However, these attempts did not resolve the issue.
I would appreciate your assistance in resolving this problem. Thank you for developing this library, and I appreciate your help!
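For readers hitting the same trace: an empty name after `transformers_modules.` usually means the model path handed to `from_pretrained()` collapsed into an empty dynamic-module name, and a trailing path separator on a local model folder is a common trigger for models loaded with `trust_remote_code=True`. A minimal sketch of the workaround (the helper name and paths are hypothetical, not part of transformers):

```python
def normalize_model_path(path: str) -> str:
    """Strip trailing separators from a local model path.

    A path like "D:/models/my-model/" can make transformers derive the
    dynamic module name "transformers_modules." with nothing after the
    dot, raising exactly this ModuleNotFoundError.
    """
    return path.rstrip("/\\")


# Usage sketch (assumes transformers is installed and the folder exists):
# tokenizer = AutoTokenizer.from_pretrained(
#     normalize_model_path("D:/models/my-model/"), trust_remote_code=True)
```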
from bitsandbytes.
The bitsandbytes library is currently only supported on Linux distributions. Windows is not supported at the moment.
https://github.com/acpopescu/bitsandbytes/tree/cmake_windows
Spent half the day on this, only to find it doesn't support Windows.
here is the answer
https://github.com/ShanGor/bitsandbytes-windows
from bitsandbytes.
@FurkanGozukara we can use it on Windows, as many people do (myself included, via https://github.com/stoperro/bitsandbytes_windows), thanks to this community effort. It's just less convenient, as the support is not yet merged into the official branch and you need to compile it yourself.
Note that even Microsoft doesn't care about Windows support for their own AI tools (microsoft/DeepSpeed#2427), so having some support here, when the authors don't necessarily have a Windows machine, is heartening.
from bitsandbytes.
If anyone else is still searching for a Windows solution and doesn't want to lose a few hours to the same issue, just use this repo: https://github.com/jllllll/bitsandbytes-windows-webui
The README even includes a `pip install` command and (as of `0.41.1`) installs the newest version of bitsandbytes... the only difference is that it's compatible with Windows. It includes `.dll` files instead of `.so`, and `cuda_setup\main.py` works for us.
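A quick way to confirm you got the Windows build rather than the Linux one is to list the native binaries the installed package ships; the helper below is my sketch, not part of bitsandbytes. On a working Windows install you would expect `.dll` names such as `libbitsandbytes_cuda117.dll` instead of `.so` files.

```python
import importlib.util
from pathlib import Path

def bundled_binaries(pkg: str = "bitsandbytes") -> list:
    """Return the libbitsandbytes_* binary names shipped with `pkg`,
    or an empty list if the package is not installed."""
    spec = importlib.util.find_spec(pkg)
    if spec is None or not spec.submodule_search_locations:
        return []  # not installed, or not a package directory
    root = Path(next(iter(spec.submodule_search_locations)))
    return sorted(p.name for p in root.glob("libbitsandbytes*"))
```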
Legend, it worked for me! I was one of those looking for hours xDD
from bitsandbytes.
Excellent! On Anaconda the same thing applies: it has to be in the root of the venv.
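To locate that root programmatically (a sketch; `sys.prefix` points at the active venv or conda environment, which is where the thread's fix places the compiled DLL):

```python
import sys
from pathlib import Path

def env_root() -> Path:
    """Root of the active virtual/conda environment."""
    return Path(sys.prefix)
```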
from bitsandbytes.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
from bitsandbytes.