
immich-machine-learning throws an "Exception in ASGI application" error on starting any kind of machine learning job (immich, CLOSED)

AryanVerma1024 commented on July 3, 2024
immich-machine-learning throws an "Exception in ASGI application" error when starting any kind of machine learning job.

from immich.

Comments (6)

cliffwoolley commented on July 3, 2024

Please see NVIDIA/nvidia-container-toolkit#520.


jasonbrimblecombe commented on July 3, 2024

There's a version update for Docker Desktop to 4.31.0 that has resolved this issue. It contains the updated NVIDIA Container Toolkit 1.15.0.
https://docs.docker.com/desktop/release-notes/#4310
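
To tell quickly whether an installed Docker Desktop is at or past the release mentioned here, a version string can be compared against 4.31.0. This is only a sketch: the helper names `parse_version` and `has_fix` are hypothetical, and the sample strings are made up for illustration, so paste in whatever version text your own installation reports.

```python
import re

# Minimum Docker Desktop release named in the comment above.
MIN_DOCKER_DESKTOP = (4, 31, 0)


def parse_version(text):
    """Return the first dotted X.Y.Z version found in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def has_fix(version_text, minimum=MIN_DOCKER_DESKTOP):
    """True when the version in version_text is at least the minimum release."""
    version = parse_version(version_text)
    return version is not None and version >= minimum


if __name__ == "__main__":
    # Hypothetical sample strings, not captured from a real installation.
    for sample in ("Docker Desktop 4.30.0", "Docker Desktop 4.31.0"):
        print(sample, "->", "has the update" if has_fix(sample) else "needs updating")
```

Comparing integer tuples rather than raw strings avoids mis-ordering releases such as 4.9.0 and 4.31.0.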


AryanVerma1024 commented on July 3, 2024

I tried creating new instances on Arch and Windows. Arch worked without any problems, but Windows still had the same error, even with a new instance.


bo0tzz commented on July 3, 2024

@mertalev I've seen a few cases of this CUDA failure 500: named symbol not found now. Is it an issue on our end, or just misconfiguration?


AryanVerma1024 commented on July 3, 2024

Here's another thing that happened:

[05/24/24 09:56:22] INFO     Loading clip model 'ViT-B-32__openai' to memory    
*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 500: named symbol not found ; GPU=-1980571051 ; hostname=822e5a2b482e ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=245 ; expr=cudaSetDevice(info_.device_id); 

 when using ['CUDAExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
****************************************
[05/24/24 09:56:23] INFO     Loading clip model 'ViT-B-32__openai' to memory    
[05/24/24 09:56:25] ERROR    Worker (pid:17) was sent SIGKILL! Perhaps out of   
                             memory?                                            
[05/24/24 09:56:25] INFO     Booting worker with pid: 524                       
[05/24/24 09:56:30] INFO     Started server process [524]                       
[05/24/24 09:56:30] INFO     Waiting for application startup.                   
[05/24/24 09:56:30] INFO     Created in-memory cache with unloading after 300s  
                             of inactivity.                                     
[05/24/24 09:56:30] INFO     Initialized request thread pool with 8 threads.    
[05/24/24 09:56:30] INFO     Application startup complete.
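
Since this failure has shown up in several reports, it can help to pull the key fields out of the long EP Error line. The sketch below is not part of immich or onnxruntime; `parse_cuda_failure` is a hypothetical helper, and the regular expression assumes the line has the same `CUDA failure <code>: <message> ; GPU=... ; ... ; expr=...;` layout as the log above.

```python
import re

# Assumed layout, taken from the log line above:
# "CUDA failure <code>: <message> ; GPU=<n> ; ... ; expr=<call>;"
CUDA_FAILURE = re.compile(
    r"CUDA failure (?P<code>\d+): (?P<message>[^;]+?)\s*;"
    r"\s*GPU=(?P<gpu>-?\d+)\s*;"
    r".*?expr=(?P<expr>[^;]+);",
    re.DOTALL,
)


def parse_cuda_failure(log_text):
    """Extract the error code, message, GPU ordinal and failing call, or return None."""
    match = CUDA_FAILURE.search(log_text)
    if match is None:
        return None
    return {
        "code": int(match["code"]),
        "message": match["message"].strip(),
        "gpu": int(match["gpu"]),
        "expr": match["expr"].strip(),
    }


if __name__ == "__main__":
    sample = (
        "CUDA failure 500: named symbol not found ; GPU=-1980571051 ; "
        "hostname=822e5a2b482e ; file=/onnxruntime_src/onnxruntime/core/"
        "providers/cuda/cuda_execution_provider.cc ; line=245 ; "
        "expr=cudaSetDevice(info_.device_id); "
    )
    print(parse_cuda_failure(sample))
```

For the line above this yields code 500, the message "named symbol not found", and the failing call `cudaSetDevice(info_.device_id)`. The reported GPU value is negative, which is not a valid device ordinal, so the failure appears to happen before any device was actually selected.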


AryanVerma1024 commented on July 3, 2024

Well, it turns out nothing is able to access the GPU inside WSL for some reason.

I have tried these two:

docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
Output
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
        -fullscreen       (run n-body simulation in fullscreen mode)
        -fp64             (use double precision floating point values for simulation)
        -hostmem          (stores simulation data in host memory)
        -benchmark        (run benchmark to measure performance)
        -numbodies=<N>    (number of bodies (>= 1) to run in simulation)
        -device=<d>       (where d=0,1,2.... for the CUDA device to use)
        -numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
        -compare          (compares simulation results running once on the default GPU and once on the CPU)
        -cpu              (run n-body simulation on the CPU)
        -tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Error: only 0 Devices available, 1 requested.  Exiting.

and running tf.config.list_physical_devices() inside this container:

docker run --rm -it -p 8888:8888 --gpus all tensorflow/tensorflow:latest-gpu-jupyter
Output
2024-05-24 13:08:45.278336: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-05-24 13:08:45.570992: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-05-24 13:09:00.572513: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:282] failed call to cuInit: CUDA_ERROR_NOT_FOUND: named symbol not found
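
The same `cuInit` call can be tried without TensorFlow or onnxruntime, which helps separate a driver or container problem from an application problem. The following is a minimal sketch using only the Python standard library. It assumes a Linux environment (WSL or a container) where the driver library is named `libcuda.so.1`, and it treats the presence of `/dev/dxg` as a hint that WSL GPU passthrough is exposed; both are assumptions about the setup, and `probe_gpu` is a hypothetical helper name.

```python
import ctypes
import os
import shutil


def probe_gpu():
    """Collect a few low-level hints about GPU visibility without any ML framework."""
    report = {
        # Assumption: WSL 2 exposes GPU passthrough through this device node.
        "dev_dxg": os.path.exists("/dev/dxg"),
        # Path to nvidia-smi if it is on PATH, otherwise None.
        "nvidia_smi": shutil.which("nvidia-smi"),
        "libcuda_loaded": False,
        # Raw CUresult from cuInit(0); stays None if the library cannot be loaded.
        "cuInit": None,
    }
    try:
        libcuda = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return report
    report["libcuda_loaded"] = True
    # CUresult cuInit(unsigned int Flags); 0 means CUDA_SUCCESS.
    libcuda.cuInit.argtypes = [ctypes.c_uint]
    libcuda.cuInit.restype = ctypes.c_int
    report["cuInit"] = libcuda.cuInit(0)
    return report


if __name__ == "__main__":
    for key, value in probe_gpu().items():
        print(f"{key}: {value}")
```

A `cuInit` value of 0 means the driver initialized. A value of 500 corresponds to CUDA_ERROR_NOT_FOUND, the same "named symbol not found" error shown in the output above, which would point at the driver or container toolkit layer rather than at the application.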

Closing this issue, as it is not related to immich, but I would appreciate any help in solving it.

