
Comments (20)

kenji-miyake commented on July 24, 2024

Sent a PR: osrf/rocker#182

If it's not accepted, I'll add the following block to the Dockerfile.

## Set env for nvidia-container-runtime
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility

from autoware.

VRichardJP commented on July 24, 2024

Yes indeed, my latest is older than galactic-latest:

ghcr.io/autowarefoundation/autoware-universe   galactic-latest   ea6b0e03f964   5 weeks ago   5.74GB
ghcr.io/autowarefoundation/autoware-universe   latest            35694ec40e8f   8 weeks ago   16.2GB

kenji-miyake commented on July 24, 2024

@angry-crab humble-latest doesn't contain CUDA now. Please use humble-latest-cuda instead.
https://github.com/autowarefoundation/autoware/pkgs/container/autoware-universe/26944787?tag=humble-latest-cuda

yukke42 commented on July 24, 2024

@kenji-miyake
I'm sorry for the late reply, and thank you for identifying the cause of this issue!
I will close this issue after creating a PR to add your suggestion to the documentation.

LevieSun commented on July 24, 2024

I get the same problem

kenji-miyake commented on July 24, 2024

I think this will be resolved after we change the base image to 11.7.0-devel-ubuntu22.04.

But to do that, the network installation packages of cuDNN and TensorRT should be released.
https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/

I'm waiting for that.
Or if there is another better approach, I can try it out.

LevieSun commented on July 24, 2024

I think this will be resolved after we change the base image to 11.7.0-devel-ubuntu22.04.

But to do that, the network installation packages of cuDNN and TensorRT should be released. https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/

I'm waiting for that. Or if there is another better approach, I can try it out.

Do you have any other method to solve this problem?

kenji-miyake commented on July 24, 2024

Regarding nvidia-smi, I guess it's fixed by installing nvidia-utils.

$ apt-file search $(which nvidia-smi)
nvidia-340: /usr/bin/nvidia-smi
nvidia-utils-390: /usr/bin/nvidia-smi
nvidia-utils-418-server: /usr/bin/nvidia-smi
nvidia-utils-435: /usr/bin/nvidia-smi
nvidia-utils-440: /usr/bin/nvidia-smi
nvidia-utils-450-server: /usr/bin/nvidia-smi
nvidia-utils-460-server: /usr/bin/nvidia-smi
nvidia-utils-470: /usr/bin/nvidia-smi
nvidia-utils-470-server: /usr/bin/nvidia-smi
nvidia-utils-510: /usr/bin/nvidia-smi
nvidia-utils-510-server: /usr/bin/nvidia-smi
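
As a quick check before installing anything, it may help to see which NVIDIA tools are actually visible inside the container. This is only a minimal sketch: `report_tools` is a hypothetical helper name. As far as I know, the NVIDIA container runtime normally mounts nvidia-smi from the host when the `utility` capability is enabled, so a missing binary may point to the capability settings rather than to a missing package.

```shell
# report_tools NAME...
# Prints "NAME: <path>" for each tool found on PATH, and "NAME: not found" otherwise.
report_tools() {
  for tool in "$@"; do
    if found=$(command -v "$tool"); then
      echo "$tool: $found"
    else
      echo "$tool: not found"
    fi
  done
}

# Run inside the container: nvidia-smi comes from the driver, nvcc from the CUDA toolkit.
report_tools nvidia-smi nvcc
```

If nvidia-smi is reported as not found even though the container was started with GPU support, the value of NVIDIA_DRIVER_CAPABILITIES is worth checking first.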

Regarding cudaErrorInsufficientDriver, I'm sorry but I'm not sure. 🥺
I need to investigate the issue to catch up.

may012345 commented on July 24, 2024

Regarding nvidia-smi, I guess it's fixed by installing nvidia-utils.

$ apt-file search $(which nvidia-smi)
nvidia-340: /usr/bin/nvidia-smi
nvidia-utils-390: /usr/bin/nvidia-smi
nvidia-utils-418-server: /usr/bin/nvidia-smi
nvidia-utils-435: /usr/bin/nvidia-smi
nvidia-utils-440: /usr/bin/nvidia-smi
nvidia-utils-450-server: /usr/bin/nvidia-smi
nvidia-utils-460-server: /usr/bin/nvidia-smi
nvidia-utils-470: /usr/bin/nvidia-smi
nvidia-utils-470-server: /usr/bin/nvidia-smi
nvidia-utils-510: /usr/bin/nvidia-smi
nvidia-utils-510-server: /usr/bin/nvidia-smi

Regarding cudaErrorInsufficientDriver, I'm sorry but I'm not sure. 🥺 I need to investigate the issue to catch up.

I also encountered a similar problem. When I run "colcon build --symlink-install --cmake-args -DCMAKE_BUILD_TYPE=Release" in a container of the "ghcr.io/autowarefoundation/autoware-universe:latest" image, I get the following errors:

CUDA_TOOLKIT_ROOT_DIR not found or specified
CUDA NOT FOUND
TensorRT is NOT Available
CUDNN is NOT Available

It seems the container cannot find CUDA. How do I fix this problem? Do I need to install Autoware from source instead of using Docker?
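
One way to tell an image problem from a build configuration problem is to check whether the CUDA toolkit is present in the container at all. A minimal sketch, assuming the toolkit would be installed under the conventional /usr/local/cuda prefix; `find_cuda_root` is a hypothetical helper, not part of Autoware or CUDA.

```shell
# find_cuda_root DIR...
# Prints the first DIR that contains an executable bin/nvcc; fails if none does.
find_cuda_root() {
  for dir in "$@"; do
    if [ -x "$dir/bin/nvcc" ]; then
      echo "$dir"
      return 0
    fi
  done
  return 1
}

# Assumption: /usr/local/cuda and /usr/local/cuda-<version> are the usual install prefixes.
find_cuda_root /usr/local/cuda /usr/local/cuda-* \
  || echo "no CUDA toolkit found; the image may be a non-CUDA variant"
```

If nothing is found, no CMake option will help, and a CUDA-enabled image tag is needed instead.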

kenji-miyake commented on July 24, 2024

@may012345 Judging from @yukke42-san's comment, I guess you can run Autoware using plain Docker instead of rocker.

yukke42 commented on July 24, 2024

Yes. I could run Autoware with a docker run command and an image built locally following the official tutorial.

docker run --rm -it --gpus all -e DISPLAY -e TERM -e QT_X11_NO_MITSHM=1 -v /tmp/.X11-unix:/tmp/.X11-unix -v /etc/localtime:/etc/localtime:ro ghcr.io/autowarefoundation/autoware-universe:humble-latest

may012345 commented on July 24, 2024

@may012345 Seeing @yukke42 -san's comment, I guess you can run Autoware using pure Docker, not rocker.

I ran "rocker --nvidia --x11 --user --volume $HOME/autoware -- ghcr.io/autowarefoundation/autoware-universe:latest" and hit the problem.

angry-crab commented on July 24, 2024

I've also hit the same problem. I'm using rocker --nvidia --x11 --user --volume $HOME/autoware -- ghcr.io/autowarefoundation/autoware-universe:latest.

VRichardJP commented on July 24, 2024

I have the problem with

rocker --nvidia --x11 --user --volume $HOME/autoware -- ghcr.io/autowarefoundation/autoware-universe:galactic-latest

but not with

rocker --nvidia --x11 --user --volume $HOME/autoware -- ghcr.io/autowarefoundation/autoware-universe:latest

kenji-miyake commented on July 24, 2024

but not with rocker --nvidia --x11 --user --volume $HOME/autoware -- ghcr.io/autowarefoundation/autoware-universe:latest

@VRichardJP Could you tell me the hashes of your images, like this? I guess one of them is old (and based on the CUDA image).

$ docker images | grep autoware-universe | grep " latest "
ghcr.io/autowarefoundation/autoware-universe   latest   5fdadcabfe9c   5 weeks ago   5.73GB
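
Note that grep " latest " depends on the column padding and can match other columns as well. A sketch of a stricter filter that compares the TAG column exactly; `image_ids_for_tag` is a hypothetical helper, and it assumes the default `docker images` column order (REPOSITORY, TAG, IMAGE ID).

```shell
# image_ids_for_tag TAG
# Reads `docker images`-style rows (REPOSITORY TAG IMAGE_ID ...) on stdin and
# prints "REPOSITORY IMAGE_ID" for every row whose TAG column equals TAG exactly.
image_ids_for_tag() {
  awk -v tag="$1" '$2 == tag { print $1, $3 }'
}

# Usage (requires Docker):
#   docker images ghcr.io/autowarefoundation/autoware-universe | image_ids_for_tag latest
```

Because the comparison is exact, galactic-latest and humble-latest-cuda are not reported when asking for latest.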

angry-crab commented on July 24, 2024

docker run --rm -it --gpus all -e DISPLAY -e TERM -e QT_X11_NO_MITSHM=1 -v /tmp/.X11-unix:/tmp/.X11-unix -v /etc/localtime:/etc/localtime:ro -v $HOME/adehome:/home/adehome ghcr.io/autowarefoundation/autoware-universe:humble-latest

Using docker instead of rocker does not solve the problem. CUDA, TensorRT, and cuDNN still cannot be found.

kenji-miyake commented on July 24, 2024

I found that the problem is caused by the NVIDIA_DRIVER_CAPABILITIES environment variable.

To confirm that, run either of the following commands.

rocker --nvidia --x11 --user --env NVIDIA_DRIVER_CAPABILITIES="" -- ghcr.io/autowarefoundation/autoware-universe:humble-latest-cuda

# Or
rocker --nvidia --x11 --user --env NVIDIA_DRIVER_CAPABILITIES=compute,utility,graphics -- ghcr.io/autowarefoundation/autoware-universe:humble-latest-cuda

I'll investigate this behavior in more detail and fix it.
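
For reference, NVIDIA_DRIVER_CAPABILITIES is a comma-separated list (or `all`). According to the NVIDIA Container Toolkit documentation, an empty or unset value falls back to the default `utility,compute`, which would explain why the empty value above works. CUDA needs `compute` and nvidia-smi needs `utility`. A minimal sketch of that rule, runnable inside the container; `has_capability` is a hypothetical helper, not part of any NVIDIA tool.

```shell
# has_capability CAPS NAME
# Succeeds if NAME appears in the comma-separated list CAPS, or if CAPS is "all".
# Assumption: an empty CAPS is treated as the runtime default "compute,utility".
has_capability() {
  caps="${1:-compute,utility}"
  case ",$caps," in
    ,all,) return 0 ;;
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report whether CUDA (compute) and nvidia-smi (utility) should be usable here.
for cap in compute utility; do
  if has_capability "${NVIDIA_DRIVER_CAPABILITIES:-}" "$cap"; then
    echo "$cap: enabled"
  else
    echo "$cap: missing"
  fi
done
```

A value that lists only `graphics`, for example, would report both as missing, which matches the symptoms in this thread.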

kenji-miyake commented on July 24, 2024

@yukke42 My PR to rocker is merged. Do you think we can close this issue?
We can set NVIDIA_DRIVER_CAPABILITIES manually until the new rocker is released.

yukke42 commented on July 24, 2024

@kenji-miyake I'm sorry for the late reply, and thank you for identifying the cause of this issue! I will close this issue after creating a PR to add your suggestion to the documentation.

I will do that after the related issue is closed.

kenji-miyake commented on July 24, 2024

Resolved by #2732
