scala_torch's Introduction

scala-torch

JVM/Scala wrappers for LibTorch.

State of this project

This project is mature enough to be used regularly in production code. The API exposed is fairly clean and tries to follow PyTorch syntax as much as possible. The API is a mix of hand-written wrappings and a wrapper around most of Declarations.yaml.

That said, some internal documentation is not quite ready for public consumption yet, though there is enough documentation that people who are already familiar with Scala and LibTorch can probably figure out what's going on. Code generation is accomplished through a combination of Swig and a quick-and-dirty Python script, bindgen.py, that reads in Declarations.yaml, which provides a language-independent API for a large part of LibTorch. Declarations.yaml is deprecated, so in the future we hope to replace bindgen.py with the forthcoming torchgen tool provided by PyTorch.

One major annoyance with Scala in particular is that you cannot define multiple overloads of a method that take default arguments. Currently, bindgen.py uses the defaults from only the first overload found in Declarations.yaml. In some cases, clever use of Scala's implicit conversions can hide these headaches, but currently, you occasionally have to write out the defaults where you would not have to in Python. One potential future option is to give overloads different names, but we elected not to do that (yet).
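To make the overload restriction concrete, here is a minimal sketch in plain Scala. It does not use Scala-Torch; the `scale` method and the `OverloadDefaults` object are hypothetical names chosen only for illustration.

```scala
object OverloadDefaults {
  // The compiler rejects a pair like this, because both overloads
  // declare a default argument:
  //   def scale(xs: Seq[Double], factor: Double = 2.0): Seq[Double] = ...
  //   def scale(xs: Seq[Double], factors: Seq[Double] = Seq(2.0)): Seq[Double] = ...

  // Only one overload may carry defaults...
  def scale(xs: Seq[Double], factor: Double = 2.0): Seq[Double] =
    xs.map(_ * factor)

  // ...so every argument of the other overload must be written out by the caller.
  def scale(xs: Seq[Double], factors: Seq[Double]): Seq[Double] =
    xs.zip(factors).map { case (x, f) => x * f }
}
```

Calling `OverloadDefaults.scale(Seq(1.0, 2.0))` picks the first overload and uses its default, while the second overload always needs its `factors` argument spelled out. This is the situation a generated binding ends up in when only the first overload keeps its defaults.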

We have not yet published JARs for this project. These are coming soon.

Short tour

Scala-torch exposes an API that tries to mirror PyTorch as much as Scala syntax allows. For example, taking some snippets from this tutorial:

PyTorch:

import torch

data = [[1, 2],[3, 4]]
x_data = torch.tensor(data)

Scala-Torch:

import com.microsoft.scalatorch.torch
import com.microsoft.scalatorch.torch.syntax._

torch.ReferenceManager.forBlock { implicit rm =>
 val data = $($(1, 2), $(3, 4))
 val x_data = torch.tensor(data)
}

PyTorch:

tensor = torch.ones(4, 4)
print(f"First row: {tensor[0]}")
print(f"First column: {tensor[:, 0]}")
print(f"Last column: {tensor[..., -1]}")
tensor[:,1] = 0
print(tensor)

Scala-Torch:

val tensor = torch.ones($(4, 4))
println(s"First row: ${tensor(0)}")
println(s"First column: ${tensor(::, 0)}")
println(s"Last column: ${tensor(---, -1)}")
tensor(::, 1) = 0
println(tensor)

See this file for a complete translation of the PyTorch tutorial into Scala-Torch.

Memory management

One big difference between Scala-Torch and PyTorch is in memory management. Because Python and LibTorch both use reference counting, memory management is fairly transparent to users. However, since the JVM uses garbage collection and finalizers are not guaranteed to run, it is not easy to make memory management transparent to the user. Scala-Torch elects to make memory management something the user must control by providing ReferenceManagers that define the lifetime of any LibTorch-allocated object that is added to it. All Scala-Torch methods that allocate objects from LibTorch take an implicit ReferenceManager, so it is the responsibility of the caller to make sure there is a ReferenceManager in implicit scope (or passed explicitly) and that that ReferenceManager will be close()ed when appropriate. See documentation and uses of ReferenceManager for more examples.
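The scoping pattern can be illustrated with a toy model in plain Scala. This is only a sketch of the idea, not Scala-Torch's implementation: `ToyManager`, `ToyTensor`, and `Demo` are hypothetical stand-ins, and only the `forBlock { implicit rm => ... }` call shape is taken from the short tour.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a manager that owns natively allocated objects.
final class ToyManager extends AutoCloseable {
  private val owned = ArrayBuffer.empty[AutoCloseable]

  // Tie the lifetime of `a` to this manager and hand it back to the caller.
  def register[A <: AutoCloseable](a: A): A = { owned += a; a }

  // Release everything in reverse order of allocation.
  override def close(): Unit = {
    owned.reverseIterator.foreach(_.close())
    owned.clear()
  }
}

object ToyManager {
  // Run `body` with a fresh manager and close it even if `body` throws.
  def forBlock[T](body: ToyManager => T): T = {
    val rm = new ToyManager
    try body(rm) finally rm.close()
  }
}

// Hypothetical native-backed object; `freed` stands in for releasing native memory.
final class ToyTensor(val size: Int) extends AutoCloseable {
  var freed = false
  override def close(): Unit = freed = true
}

object ToyTensor {
  // Allocation requires a manager in implicit scope, as described above.
  def ones(size: Int)(implicit rm: ToyManager): ToyTensor =
    rm.register(new ToyTensor(size))
}

object Demo {
  def run(): (Int, Boolean) = {
    var escaped: ToyTensor = null
    val size = ToyManager.forBlock { implicit rm =>
      val t = ToyTensor.ones(4)
      escaped = t // a reference that outlives the block
      t.size
    }
    (size, escaped.freed) // the tensor was released when the block ended
  }
}
```

In this model, a plain value such as `size` can safely leave the block, while any object registered with the manager is released once the block ends and should not be used afterwards.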

Handling of native dependencies

PyTorch provides pre-built binaries for the native code backing it here. We make use of the pre-built dynamic libraries by packaging them up in a jar, much like TensorFlow Scala. Downstream projects have two options for handling the native dependencies: they can either

  1. Declare a dependency on the packaged native dependencies wrapped up with a jar using
val osClassifier = System.getProperty("os.name").toLowerCase match {
  case os if os.contains("mac") || os.contains("darwin") => "darwin"
  case os if os.contains("linux")                        => "linux"
  case os if os.contains("windows")                      => "windows"
  case os                                                => throw new sbt.MessageOnlyException(s"The OS $os is not a supported platform.")
}
libraryDependencies += ("com.microsoft.scalatorch" % "libtorch-jar" % "1.10.0").classifier(osClassifier + "_cpu")
  2. Ensure that the libtorch dependencies are installed in the OS-dependent way, for example, in /usr/lib or in LD_LIBRARY_PATH on Linux, or in PATH on Windows. Note that on recent versions of MacOS, System Integrity Protection resets LD_LIBRARY_PATH and DYLD_LIBRARY_PATH when launching processes, so it is very hard to use that approach on MacOS.

The native binaries for the JNI bindings for all three supported OSes are published in scala-torch-swig.jar, so there is no need for OS-specific treatment of those libraries.

Approach 1 is convenient because sbt will handle the libtorch native dependency for you, and users won't need to install libtorch or set any environment variables. This is the ideal approach for local development.

There are several downsides of approach 1:

  • it may unnecessarily duplicate installation of libtorch if, for example, pytorch is already installed
  • jars for GPU builds of libtorch are not provided, so approach 2 is the only option if GPU support is required
  • care must be taken when publishing any library that depends on Scala-Torch not to publish the dependency on the libtorch-jar, since that would force consumers of that library to depend on whatever OS-specific version of the jar was used at build time. See the use of pomPostProcess in build.sbt for how we handle that. Another option is for downstream libraries to exclude the libtorch-jar using something like
libraryDependencies += ("com.microsoft" % "scala-torch" % "0.1.0").exclude("com.microsoft.scalatorch", "libtorch-jar")

Approach 2 is the better option for CI, remote jobs, production, etc.

Local Development (MacOS)

You will need to have SWIG installed, which you can install using brew install swig.

git submodule update --init --recursive
cd pytorch
python3 -m tools.codegen.gen -s aten/src/ATen -d torch/share/ATen
cd ..
# Set pytorchVersion to the LibTorch release you want before running this line.
curl https://download.pytorch.org/libtorch/cpu/libtorch-macos-${pytorchVersion}.zip -o libtorch.zip
unzip libtorch.zip
rm -f libtorch.zip
conda env create --name scala-torch --file environment.yml
conda activate scala-torch
export TORCH_DIR=$PWD/libtorch
# This links the JNI shared library against the absolute paths in the libtorch dir instead of
# using an rpath.
export LINK_TO_BUILD_LIB=true
sbt test

A similar setup should work for Linux and Windows.

Troubleshooting

If you are using Clang 11.0.3, you may run into an error when compiling the SobolEngineOps file. This is most likely due to an issue with the compiler, and it has already been reported here. A temporary workaround is to install another version of Clang (e.g., by executing brew install llvm). Another option is to downgrade Xcode to a version < 11.4.

Upgrading the LibTorch version

To upgrade the underlying version of LibTorch:

  • cd pytorch; git checkout <commit> with the <commit> of the desired release version, best found here.
  • Rerun the steps under Local Development.
  • Change TORCH_VERSION in run_tests.yml.
  • Address compilation errors when running sbt compile. Changes to bindgen.py may be necessary.

Contributors

Thanks to the following contributors to this project:

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

scala_torch's People

Contributors

microsoft-github-operations[bot], adampauls

scala_torch's Issues

How to build on Windows

Hi, since there isn't a build guide for Windows, I tried to figure out how to do it myself, but I am stuck at this error:

[info] Preprocessing...
[error] C:\Users\filip\Desktop\Workspaces\IdeaProjects\scala_torch\swig\src\main\swig\torch_data.i(83) : Error: Unable to find 'c10\core\DeviceType.h'
[error] C:\Users\filip\Desktop\Workspaces\IdeaProjects\scala_torch\swig\src\main\swig\torch_data.i(84) : Error: Unable to find 'c10\core\Device.h'
[error] stack trace is suppressed; run 'last swig / Swig / generate' for the full output
[error] (swig / Swig / generate) aborting generation for C:\Users\filip\Desktop\Workspaces\IdeaProjects\scala_torch\swig\src\main\swig\torch_swig.i because swig was unhappy
[error] Total time: 12 s, completed 31 gen 2023, 11:34:26

System info:

Steps to reproduce:

  1. clone the scala torch repo
  2. unzip libtorch wherever you like, add libtorch/lib to env PATH, add libtorch/share/cmake/Torch to env Torch_DIR
  3. download swig-win 4.1.1 and add it to PATH
  4. install python 3.9 and create an alias for command python3 by duplicating python.exe to python3.exe (ugly but faster than modifying the scripts)
  5. install gcc, g++, make and cmake using cygwin and set the following envs:
     • CMAKE_C_COMPILER=C:\cygwin64\bin\gcc.exe
     • CMAKE_CXX_COMPILER=C:\cygwin64\bin\g++.exe
     • CMAKE_MAKE_PROGRAM=C:\cygwin64\bin\make.exe
  6. download cuda and cudnn and add cuda directory path to env CUDA_HOME
  7. run sbt compile

I also tried using libtorch for CPUs but got the same error.

Do you know how to solve it?

How to build on M1 Mac? (or is it possible?)

using libtorch 1.10.2 (https://download.pytorch.org/libtorch/cpu/libtorch-macos-1.10.2.zip) on arm64 darwin

[error] ld: symbol(s) not found for architecture arm64
[error] clang-11: error: linker command failed with exit code 1 (use -v to see invocation)

using libtorch 1.13.1 (https://download.pytorch.org/libtorch/cpu/libtorch-macos-1.13.1.zip) on arm64 darwin

ATen/CUDAGeneratorImpl.h' file not found

hindsight

  • libtorch 1.13.1, which I tried to use, and 1.10.2, which is used in CI, have different include paths.
  • with libtorch 1.10.2, linker still raises error on arm64 arch.

environment

system_profiler SPSoftwareDataType SPHardwareDataType 
Software:

    System Software Overview:

      System Version: macOS 12.5 (21G72)
      Kernel Version: Darwin 21.6.0
      ...
    Hardware:

      Hardware Overview:
      Model Name: MacBook Pro
      Model Identifier: MacBookPro18,2
      Chip: Apple M1 Max
SWIG Version 4.0.2

Compiled with clang++ [aarch64-apple-darwin22.1.0]

Configured options: +pcre
python --version
Python 3.7.15
import platform
print(platform.platform())
Darwin-21.6.0-arm64-arm-64bit
java --version
openjdk 17.0.3 2022-04-19 LTS
OpenJDK Runtime Environment Zulu17.34+19-CA (build 17.0.3+7-LTS)
OpenJDK 64-Bit Server VM Zulu17.34+19-CA (build 17.0.3+7-LTS, mixed mode, sharing)

reproduction

Here is what I have done so far:

  1. prepare swig4, Python with required packages (pyyaml, typing-extensions, setuptools, etc.), Java (Azul Systems, Inc. Java 17.0.3), bazel
  2. run git clone git@github.com:microsoft/scala_torch.git --recursive
  3. download libtorch from https://download.pytorch.org/libtorch/cpu/libtorch-macos-1.13.1.zip and export TORCH_DIR path
  4. run bazel build generated_cpp at pytorch dir to get Declarations.yaml for bindgen.py
  5. run python3 -m tools.codegen.gen -s aten/src/ATen -d torch/share/ATen
  6. put Declarations.yaml from scala_torch/pytorch/bazel-bin/aten/src/ATen/Declarations.yaml to pytorch/torch/share/ATen/Declarations.yaml
  7. run sbt compile

and I got

[info] Building library with native build tool CMake
[info] Using CMake version 3.22.3
[info] -- Static Pytorch: OFF
[info] -- torch dir?: /path/to/libtorch
[info] -- final torch dir: /path/to/libtorch
[info] -- CMAKE_PREFIX_PATH: 
[info] -- CMAKE_MODULE_PATH: /path/to/libtorch/../cmake/Modules;/usr/local/cmake/Modules/share/cmake-3.14
[info] -- JNI include directories: /path/to/zulu17.34.19-ca-jdk-17.0.3/include;/path/to/zulu17.34.19-ca-jdk-17.0.3/include;/path/to/zulu17.34.19-ca-jdk-17.0.3/include
[info] -- Torch include directories: /path/to/libtorch/include;/path/to/libtorch/include/torch/csrc/api/include
[info] -- torch libs : torch;torch_library;/path/to/libtorch/lib/libc10.dylib;/path/to/libtorch/lib/libkineto.a
[info] -- Configuring done
[info] -- Generating done
[info] -- Build files have been written to: /path/to/scala_torch/swig/target/native/arm64-darwin/build
[info] [ 50%] Building CXX object CMakeFiles/torch_swig0.dir/path/to/scala_torch/swig/target/src_managed/native/torch_swig.cxx.o
[error] /path/to/scala_torch/swig/target/src_managed/native/torch_swig.cxx:706:10: fatal error: 'ATen/CUDAGeneratorImpl.h' file not found
[error] #include <ATen/CUDAGeneratorImpl.h>
[error]          ^~~~~~~~~~~~~~~~~~~~~~~~~~
[error] 1 error generated.
[error] make[2]: *** [CMakeFiles/torch_swig0.dir/build.make:76: CMakeFiles/torch_swig0.dir/path/to/scala_torch/swig/target/src_managed/native/torch_swig.cxx.o] Error 1
[error] make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/torch_swig0.dir/all] Error 2
[error] make: *** [Makefile:136: all] Error 2
