
Comments (7)

matthewdouglas commented on June 12, 2024

@Jebati Can you please try these steps from the output?

CUDA SETUP: Problem: The main issue seems to be that the main CUDA runtime library was not detected.
CUDA SETUP: Solution 1: To solve the issue the libcudart.so location needs to be added to the LD_LIBRARY_PATH variable
CUDA SETUP: Solution 1a): Find the cuda runtime library via: find / -name libcudart.so 2>/dev/null
CUDA SETUP: Solution 1b): Once the library is found add it to the LD_LIBRARY_PATH: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:FOUND_PATH_FROM_1a

I would suspect you might find libcudart.so in /home/1000/.local/lib/python3.10/site-packages/nvidia/cuda_runtime/lib. Additionally you might find libcublas.so and libcublasLt.so in /home/1000/.local/lib/python3.10/site-packages/nvidia/cublas/lib, and libcusparse.so in /home/1000/.local/lib/python3.10/site-packages/nvidia/cusparse/lib. These paths may need to be added to LD_LIBRARY_PATH in order for everything to work correctly.
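A minimal sketch of that lookup, assuming the CUDA libraries came from pip wheels laid out as `nvidia/<package>/lib` under the user site-packages directory, as in the paths above. The helper names `find_cuda_lib_dirs` and `extend_ld_library_path` are made up for illustration and are not part of bitsandbytes.

```python
import os
import site
from pathlib import Path

# Library name prefixes mentioned in the comment above.
WANTED = ("libcudart.so", "libcublas.so", "libcublasLt.so", "libcusparse.so")


def find_cuda_lib_dirs(site_packages):
    """Return sorted directories under <site_packages>/nvidia/*/lib that hold a wanted library."""
    found = set()
    for lib_dir in Path(site_packages).glob("nvidia/*/lib"):
        if not lib_dir.is_dir():
            continue
        for entry in lib_dir.iterdir():
            # Wheels often ship versioned files such as libcudart.so.12,
            # so match on the prefix rather than the exact name.
            if entry.name.startswith(WANTED):
                found.add(str(lib_dir))
                break
    return sorted(found)


def extend_ld_library_path(current, extra_dirs):
    """Append extra_dirs to a colon-separated path string, skipping duplicates."""
    parts = [p for p in current.split(":") if p]
    for d in extra_dirs:
        if d not in parts:
            parts.append(d)
    return ":".join(parts)


if __name__ == "__main__":
    dirs = find_cuda_lib_dirs(site.getusersitepackages())
    new_value = extend_ld_library_path(os.environ.get("LD_LIBRARY_PATH", ""), dirs)
    # Print a line that can be pasted into the shell before starting Python.
    print(f"export LD_LIBRARY_PATH={new_value}")
```

The script only prints the `export` line. The variable has to be set in the shell (or the container environment) before the Python process starts, because the dynamic loader reads it at process startup.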

Note for myself: relates to #1126

from bitsandbytes.

Titus-von-Koeller commented on June 12, 2024

Yes, we're aware of this. We'll start supporting the latest CUDA version with the last release. The Docker image wasn't out yet the last time we checked, so it hasn't been straightforward to support in our CI setup so far.

Please compile from source for now, then everything should work perfectly fine.

CC @matthewdouglas


Titus-von-Koeller commented on June 12, 2024

@matthewdouglas was just explaining to me that the key line is

kohya-ss-gui  | CUDA SETUP: PyTorch settings found: CUDA_VERSION=121, Highest Compute Capability: 8.9.

and I agree with his assertions:

I wouldn't pay too much attention to the CUDA version shown in nvidia-smi output unless it's really old. The CUDA version there is just the maximum CUDA version the driver supports; it won't always match the CUDA toolkit that is installed or the one PyTorch was built with.

That means it will try to load libbitsandbytes_cuda121.so
The rest is just noise saying the CUDA libraries aren't anywhere on LD_LIBRARY_PATH and couldn't be found on the system at all (but in reality they are there; PyTorch shipped with them).
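As a rough illustration of how the `CUDA_VERSION=121` in the log line relates to the file name `libbitsandbytes_cuda121.so`, here is a small sketch. It only mirrors the naming seen in this thread; the function `bnb_cuda_library_name` is hypothetical, and the actual lookup inside bitsandbytes may work differently.

```python
def bnb_cuda_library_name(cuda_version):
    """Map a CUDA version such as "12.1" or "121" to the binary name quoted above.

    Hypothetical helper: it reproduces the naming pattern from the log output,
    not the library's own loading logic.
    """
    # Keep only the digits, so "12.1" and "121" give the same result.
    digits = "".join(ch for ch in str(cuda_version) if ch.isdigit())
    if not digits:
        raise ValueError(f"no CUDA version found in {cuda_version!r}")
    return f"libbitsandbytes_cuda{digits}.so"
```

With PyTorch installed, the input would be the version PyTorch was built with (`torch.version.cuda`, for example `"12.1"`), not the version reported by nvidia-smi.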

I think these comments of @matthewdouglas give valuable context to understand what's going on. Seems to me that he is spot on. Thanks for the valuable input!

Please follow the instructions outlined by him and report back to us.


matthewdouglas commented on June 12, 2024

Hi @dsidorenkoSU,
It looks like you're trying to build with support for Kepler GPUs, which was removed in CUDA 12. When configuring CMake, set -DCOMPUTE_CAPABILITY=75 to target just your T4.
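A small sketch of how the flag value follows from the GPU's compute capability (7.5 for the T4, 3.5 for the Kepler target that nvcc rejected). The helper `compute_capability_flag` is hypothetical, and the minimum of 5.0 is an assumption that Maxwell is the oldest architecture nvcc in CUDA 12 still accepts.

```python
# Assumption: CUDA 12 dropped Kepler (3.x), so 5.0 is treated as the oldest usable target.
MIN_CUDA12_CAPABILITY = (5, 0)


def compute_capability_flag(capability):
    """Turn a capability such as "7.5" into the CMake flag -DCOMPUTE_CAPABILITY=75."""
    major_text, minor_text = str(capability).split(".")
    major, minor = int(major_text), int(minor_text)
    if (major, minor) < MIN_CUDA12_CAPABILITY:
        # Mirrors the "Unsupported gpu architecture 'compute_35'" failure.
        raise ValueError(f"compute capability {capability} is not supported by CUDA 12")
    return f"-DCOMPUTE_CAPABILITY={major}{minor}"
```

For the T4 this yields `-DCOMPUTE_CAPABILITY=75`, which is passed when configuring CMake so that only that architecture is built.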


dsidorenkoSU commented on June 12, 2024

@matthewdouglas This works. I appreciate your help.


Jebati commented on June 12, 2024

It seems to have worked!

Thanks!


dsidorenkoSU commented on June 12, 2024

I am getting this error while compiling with CUDA 12.4:

(base) daemon4d_us@instance-20240401-032345:~/bitsandbytes$ make
[ 14%] Building CXX object CMakeFiles/bitsandbytes.dir/csrc/common.cpp.o
[ 28%] Building CXX object CMakeFiles/bitsandbytes.dir/csrc/cpu_ops.cpp.o
[ 42%] Building CXX object CMakeFiles/bitsandbytes.dir/csrc/pythonInterface.cpp.o
[ 57%] Building CUDA object CMakeFiles/bitsandbytes.dir/csrc/ops.cu.o
nvcc fatal   : Unsupported gpu architecture 'compute_35'
make[2]: *** [CMakeFiles/bitsandbytes.dir/build.make:118: CMakeFiles/bitsandbytes.dir/csrc/ops.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/bitsandbytes.dir/all] Error 2
make: *** [Makefile:91: all] Error 2
(base) daemon4d_us@instance-20240401-032345:~/bitsandbytes$

Here is my GPU info:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   61C    P0             30W /   70W |       0MiB /  15360MiB |      8%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

