
Comments (3)

LaurentMazare commented on September 27, 2024

With the first of your two solutions, you probably want to try setting LD_LIBRARY_PATH to the directory that contains this libtorch_cuda.so. This is not important when compiling, but it matters when running the compiled binary.
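
For instance, something like this (a sketch with illustrative paths and a hypothetical binary name):

$ cargo build                                                      # LD_LIBRARY_PATH is not needed at compile time
$ export LD_LIBRARY_PATH="/path/to/libtorch/lib:$LD_LIBRARY_PATH"
$ ./target/debug/my-binary                                         # needed here, when the dynamic loader resolves libtorch_cuda.so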


v-espitalier commented on September 27, 2024

Thanks for your quick reply.

I investigated further following your advice, and it works now, thanks.

For the record, here are the steps I followed. The pytorch.org .zip solution worked, but the conda one did not.

Working solution: libtorch from pytorch.org (libtorch 2.2.0 + CUDA 11.8)

Download libtorch from pytorch.org, unzip it to your favorite location, and note the path to the 'libtorch' folder (your home directory, for instance):
$ wget https://download.pytorch.org/libtorch/cu118/libtorch-cxx11-abi-shared-with-deps-2.2.0%2Bcu118.zip
$ unzip libtorch-cxx11-abi-shared-with-deps-2.2.0%2Bcu118.zip -d /home/user

In the following lines, replace '/home/user/libtorch' with the actual location you unzipped to.
If you unzipped to your home directory, you should see, for instance, these folders and files:

/home/user/libtorch/lib
/home/user/libtorch/include
/home/user/libtorch/lib/libtorch.so
/home/user/libtorch/lib/libtorch_cuda.so

After unzipping, export env variables, compile, and test:

$ export LIBTORCH="/home/user/libtorch"
$ export LIBTORCH_LIB="$LIBTORCH"
$ export LIBTORCH_INCLUDE="$LIBTORCH"
$ export LIBTORCH_BYPASS_VERSION_CHECK=1    # We installed cuda 11.8 whereas the tch lib is expecting 11.7
$ export LD_LIBRARY_PATH="$LIBTORCH/lib"

Checks:

$ find "$LIBTORCH" | grep libtorch.so
$ find "$LIBTORCH" | grep libtorch_cuda.so
$ find "$LD_LIBRARY_PATH" | grep libtorch_cuda.so

Compilation and test:

$ cargo build
$ target/debug/tch-test
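
If the binary fails to start with a missing shared library error, ldd can show where the dynamic loader resolves the torch libraries (a quick check, assuming the paths above):

$ ldd target/debug/tch-test | grep -i torch    # each libtorch*.so entry should resolve to a file under $LIBTORCH/lib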

The program runs. To make this work permanently, as usual, add the 5 export lines at the end of your .bashrc.

If you update the version of libtorch, you may need to recompile your tch Rust programs. Otherwise, when you launch them, the dynamically loaded libraries may not match the versions used at build time, and you may get weird error messages (missing file or similar) at run time (this happened to me).
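
The simplest way to force a clean rebuild against the new libtorch (slower, but it avoids stale build artifacts):

$ cargo clean
$ cargo build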

Other test (not working): libtorch from conda (libtorch 2.2.0 + CUDA 11.8)

It may be out of scope at this point, but I also tried going through conda, without success. I don't need a solution, as the .zip approach works fine.

Still, I am providing the tests I ran, in case anyone wants to dig further. I would still be interested in a conda-based setup; otherwise I will wait for the next version of pytorch-gpu.

So, where can one find in conda the files required (among others) by tch-rs, namely libtorch.so and libtorch_cuda.so, in version 2.2.0?

I tested 3 candidate packages: pytorch-gpu, pytorch-cuda, and simply pytorch.

  • 'pytorch-gpu' is currently built only up to 2.1.2 on conda-forge; not the right version for tch -> next package!
    $ conda search pytorch-gpu -c pytorch -c conda-forge
    pytorch-gpu 2.1.2 cuda120py38h6767744_301 conda-forge
    $ conda search pytorch-gpu -c pytorch
    pytorch-gpu 1.3.1 0 pkgs/main

  • Surprisingly, 'pytorch-cuda' does not seem to provide libtorch_cuda.so (I guess I did something wrong, but where? See below.) -> next!

$ micromamba create -n pytorch220_cuda117
$ micromamba activate pytorch220_cuda117
$ micromamba install pytorch==2.2.0 pytorch-cuda=11.7 torchvision -c pytorch -c nvidia -c anaconda

$ find /home/user/micromamba/envs/pytorch220_cuda117/lib/ | grep libtorch.so$
gives:
/home/user/micromamba/envs/pytorch220_cuda117/lib/python3.11/site-packages/torch/lib/libtorch.so
but
$ find /home/user/micromamba/envs/pytorch220_cuda117/lib/ | grep libtorch_cuda.so$
returns nothing (!!)

  • Third package tested: the 'pytorch' package, with a specific CUDA-enabled build:

$ conda search pytorch -c pytorch | grep 2.2.0 | grep cuda
py3.10_cuda11.8_cudnn8.7.0_0
py3.10_cuda12.1_cudnn8.9.2_0
$ micromamba search pytorch -c pytorch | grep 2.2.0 | grep cuda
py3.9_cuda12.1_cudnn8.9.2_0
(The two commands unexpectedly give different outputs, but at this point I am ready to believe anything.)

Checked python version:
$ python --version
Python 3.10.9

Then installed the package proposed by conda, using micromamba:

$ micromamba create -n pytorch220_cuda118
$ micromamba activate pytorch220_cuda118
$ micromamba install pytorch=2.2.0=py3.10_cuda11.8_cudnn8.7.0_0 torchvision -c pytorch -c nvidia -c anaconda

(All 3 channel specifications are required.)

In that case, the libtorch path is:
/home/user/micromamba/envs/pytorch220_cuda118/lib/python3.10/site-packages/torch

I exported the 5 environment variables, following the working solution above.
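
(Concretely, with the path above, that means:)

$ export LIBTORCH="/home/user/micromamba/envs/pytorch220_cuda118/lib/python3.10/site-packages/torch"
$ export LIBTORCH_LIB="$LIBTORCH"
$ export LIBTORCH_INCLUDE="$LIBTORCH"
$ export LIBTORCH_BYPASS_VERSION_CHECK=1
$ export LD_LIBRARY_PATH="$LIBTORCH/lib"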

But then, when I run cargo build, I get the following error:

$ cargo build
   Compiling torch-sys v0.15.0
   Compiling tch v0.15.0
   Compiling tch-test v0.1.0 (/home/user/Bureau/Bureau2/workarea/test_tch/tch-test)
error: linking with `cc` failed: exit status: 1
  |
  = note: LC_ALL="C" PATH="/home/user/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin:/home/user/micromamba/env...
...
...(very long lines or line*)
...
.../user/micromamba/envs/pytorch220_cuda118/lib/python3.10/site-packages/torch/include/ATen/core/function_schema_inl.h:337: undefined reference to `c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
          collect2: error: ld returned 1 exit status
          
  = note: some `extern` functions couldn't be found; some native libraries may need to be installed or have their path specified
  = note: use the `-l` flag to specify native libraries to link
  = note: use the `cargo:rustc-link-lib` directive to specify the native libraries to link with Cargo (see https://doc.rust-lang.org/cargo/reference/build-scripts.html#rustc-link-lib)

error: could not compile `tch-test` (bin "tch-test") due to 1 previous error
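
(A hedged guess, which I have not verified: undefined references to c10 symbols taking std::__cxx11::basic_string arguments often point to a C++ ABI mismatch; the official conda/pip PyTorch binaries have historically been built with _GLIBCXX_USE_CXX11_ABI=0, while the cxx11-abi libtorch zip used in the working solution uses the new ABI. A rough way to check which ABI a given library exposes, using the env path above:)

$ nm -D /home/user/micromamba/envs/pytorch220_cuda118/lib/python3.10/site-packages/torch/lib/libc10.so | grep -c __cxx11    # a count near 0 suggests the old (pre-C++11) ABI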

Anyway, I will stick to the .zip solution until pytorch-gpu on conda-forge publishes a package with torch version 2.2.0.


v-espitalier commented on September 27, 2024

Dear Laurent,
Just a note to thank you again for your quick help. I managed to complete my (holiday) project involving tch-rs:
https://github.com/v-espitalier/Raytracer-tch-rs

Best Regards,
Vincent.

