The bug As stated in the title as soon as I start the machine lear

immich-machine-learning throws an "Exception in ASGI application" error on starting any kind of machine learning job (immich, 6 comments, closed)

AryanVerma1024 commented on July 24, 2024 3

immich-machine-learning throws an "Exception in ASGI application" error on starting any kind of machine learning job

from immich.

Comments (6)

cliffwoolley commented on July 24, 2024 2

Please see NVIDIA/nvidia-container-toolkit#520 .

jasonbrimblecombe commented on July 24, 2024 1

There's a version update for Docker Desktop to 4.31.0 that has resolved this issue. It contains the updated NVIDIA Container Toolkit 1.15.0.
https://docs.docker.com/desktop/release-notes/#4310
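For anyone comparing their own install against that release, a minimal sketch of the version check follows. The helper names `parse_version` and `has_fix` are hypothetical, and the example strings are only an assumption about how Docker Desktop reports its version; the one fact taken from the comment above is the 4.31.0 threshold.

```python
import re

# Release named in the comment above as containing the fix.
FIXED_DOCKER_DESKTOP = (4, 31, 0)


def parse_version(text):
    """Return the first x.y.z version found in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def has_fix(platform_string):
    """True if the reported Docker Desktop version is at least 4.31.0."""
    version = parse_version(platform_string)
    return version is not None and version >= FIXED_DOCKER_DESKTOP


if __name__ == "__main__":
    # Example strings only; the exact format Docker Desktop prints is an assumption.
    print(has_fix("Docker Desktop 4.30.0"))  # False
    print(has_fix("Docker Desktop 4.31.0"))  # True
```

Comparing tuples of integers rather than raw strings avoids the usual trap where "4.9.0" sorts after "4.31.0".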

AryanVerma1024 commented on July 24, 2024

I tried creating new instances on Arch and Windows. Arch worked without any problems, but Windows still had the same error, even with a new instance.

bo0tzz commented on July 24, 2024

@mertalev I've seen a few cases of this CUDA failure 500: named symbol not found now. Is it an issue on our end, or just misconfiguration?

AryanVerma1024 commented on July 24, 2024

Here's another thing that happened:

[05/24/24 09:56:22] INFO     Loading clip model 'ViT-B-32__openai' to memory    
*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 500: named symbol not found ; GPU=-1980571051 ; hostname=822e5a2b482e ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=245 ; expr=cudaSetDevice(info_.device_id); 

 when using ['CUDAExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
****************************************
[05/24/24 09:56:23] INFO     Loading clip model 'ViT-B-32__openai' to memory    
[05/24/24 09:56:25] ERROR    Worker (pid:17) was sent SIGKILL! Perhaps out of   
                             memory?                                            
[05/24/24 09:56:25] INFO     Booting worker with pid: 524                       
[05/24/24 09:56:30] INFO     Started server process [524]                       
[05/24/24 09:56:30] INFO     Waiting for application startup.                   
[05/24/24 09:56:30] INFO     Created in-memory cache with unloading after 300s  
                             of inactivity.                                     
[05/24/24 09:56:30] INFO     Initialized request thread pool with 8 threads.    
[05/24/24 09:56:30] INFO     Application startup complete.
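The useful part of that log is buried in one very long line. As a rough sketch (the function name `parse_cuda_failure` is hypothetical, and the pattern is based only on the single line shown above, so other ONNX Runtime messages may not match), the failure code, message, and the trailing `key=value` fields can be pulled out like this:

```python
import re


def parse_cuda_failure(log_text):
    """Extract the code, message and key=value fields from a
    'CUDA failure N: message ; key=value ; ...' log line.

    Returns a dict, or None if no such line is present.
    """
    match = re.search(r"CUDA failure (\d+): (.*?) ;(.*)", log_text)
    if match is None:
        return None
    code, message, rest = match.groups()
    fields = {}
    # The remaining fields are separated by ' ; ' in the line shown above.
    for part in rest.split(" ; "):
        key, sep, value = part.strip().partition("=")
        if sep:
            fields[key] = value
    return {"code": int(code), "message": message.strip(), **fields}


if __name__ == "__main__":
    line = (
        "CUDA failure 500: named symbol not found ; GPU=-1980571051 ; "
        "hostname=822e5a2b482e ; line=245 ; expr=cudaSetDevice(info_.device_id); "
    )
    print(parse_cuda_failure(line))
```

Here the `expr` field shows that the failing call was `cudaSetDevice`, that is, the error occurs when the device is selected rather than during model inference.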

AryanVerma1024 commented on July 24, 2024

Well, it turns out nothing is able to access the GPU inside WSL for some reason.

I have tried these two checks. First:

docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

Output

Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
        -fullscreen       (run n-body simulation in fullscreen mode)
        -fp64             (use double precision floating point values for simulation)
        -hostmem          (stores simulation data in host memory)
        -benchmark        (run benchmark to measure performance)
        -numbodies=<N>    (number of bodies (>= 1) to run in simulation)
        -device=<d>       (where d=0,1,2.... for the CUDA device to use)
        -numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
        -compare          (compares simulation results running once on the default GPU and once on the CPU)
        -cpu              (run n-body simulation on the CPU)
        -tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Error: only 0 Devices available, 1 requested.  Exiting.

Second, running tf.config.list_physical_devices() inside this container:

docker run --rm -it -p 8888:8888 --gpus all tensorflow/tensorflow:latest-gpu-jupyter

Output

2024-05-24 13:08:45.278336: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-05-24 13:08:45.570992: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-05-24 13:09:00.572513: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:282] failed call to cuInit: CUDA_ERROR_NOT_FOUND: named symbol not found
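Both checks fail at the CUDA driver level, so the same condition can probably be reproduced without a large image. The sketch below calls the driver API directly through `ctypes`; `cuInit` and `cuDeviceGetCount` are real CUDA driver functions, and 500 is the driver's `CUDA_ERROR_NOT_FOUND` code shown in the TensorFlow output. The function name `probe_cuda_driver` is hypothetical, and it is an assumption that `libcuda.so.1` is on the loader path in the environment being tested.

```python
import ctypes

CUDA_SUCCESS = 0
CUDA_ERROR_NOT_FOUND = 500  # printed as "named symbol not found"


def probe_cuda_driver(library_name="libcuda.so.1"):
    """Try to load the CUDA driver library and count visible devices.

    Returns (status, device_count), where status is a short string.
    """
    try:
        libcuda = ctypes.CDLL(library_name)
    except OSError:
        # The driver library is not visible at all in this environment.
        return ("driver library not found", 0)

    # cuInit(0) must succeed before any other driver call.
    result = libcuda.cuInit(0)
    if result != CUDA_SUCCESS:
        return (f"cuInit failed with code {result}", 0)

    count = ctypes.c_int(0)
    result = libcuda.cuDeviceGetCount(ctypes.byref(count))
    if result != CUDA_SUCCESS:
        return (f"cuDeviceGetCount failed with code {result}", 0)
    return ("ok", count.value)


if __name__ == "__main__":
    print(probe_cuda_driver())
```

If this reports a `cuInit` failure with code 500 inside the container, it would point to the same driver-level problem seen in the two outputs above rather than to anything specific to TensorFlow or ONNX Runtime.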

Closing this issue, as it is not related to Immich, but I would appreciate any help in solving it.
