∇grad: tensor processing library + automatic differentiation engine

NOTE: This library is at an early stage of development, so expect the main branch to be broken from time to time. The API is also subject to frequent breaking changes.

Tensors, also known as multidimensional arrays, are ubiquitous data structures in many computing fields, such as machine learning and scientific computing. They can represent high-dimensional data, including but not limited to images, audio, gradient vectors, matrices of any kind, or the inputs of a mathematical function arranged as a vector.

Automatic differentiation is a technique to automatically perform efficient and analytically precise partial differentiation on a given mathematical expression.

∇grad (nablagrad) is yet another tensor processing library for C++ that incorporates an automatic differentiation engine supporting both forward and reverse-mode autodiff on tensors.

Examples can be found in the examples directory.


Overview

Automatic differentiation has two different modes of operation known as forward accumulation mode (or tangent linear mode) and reverse accumulation mode (or cotangent linear mode).

nablagrad makes use of dual numbers, together with operator overloading, to perform partial differentiation in forward-mode. In reverse-mode, a gradient tape (also known as a Wengert list) is used to represent an abstract computation graph that is built progressively while the given mathematical expression is evaluated. Once the evaluation completes, the graph is traversed backwards, computing and propagating the corresponding partial derivatives using the chain rule.
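To make the dual-number mechanism concrete, the following self-contained sketch (an illustration of the idea only, independent of nablagrad's actual nabla::Dual implementation) shows how a value and its derivative are stored together and propagated through overloaded operators:

#include <cmath>
#include <iostream>

// Toy dual number: a value together with its derivative (tangent).
struct MyDual {
    double value;
    double tangent;
};

// Product rule: (ab)' = a'b + ab'
MyDual operator*(MyDual a, MyDual b) {
    return { a.value * b.value, a.tangent * b.value + a.value * b.tangent };
}

// Chain rule for sin: (sin a)' = cos(a) * a'
MyDual sin(MyDual a) {
    return { std::sin(a.value), std::cos(a.value) * a.tangent };
}

int main() {
    MyDual x{2.0, 1.0};    // seeding the tangent with 1.0 selects x as the differentiation variable
    MyDual y = sin(x * x); // y.value = sin(x^2), y.tangent = 2x * cos(x^2)
    std::cout << y.tangent << std::endl; // ≈ -2.6146 at x = 2
}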

For a complete, more detailed explanation of how these modes work see Baydin et al. (2018).


Some quick usage examples

The full implementation of the following examples can be found in the examples directory. Note that the examples use template functions so that the nablagrad computations with nabla::Dual and nabla::Tensor can be compared against a finite-difference computation, as sketched below.
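As an illustration, a central finite-difference gradient of the plain double instantiation of such a template function could be computed along the following lines (a hypothetical helper; the code shipped in the examples directory may be organized differently):

#include <cstddef>
#include <vector>

// Central finite-difference approximation of the gradient of f at x,
// used only as a reference to check the autodiff results.
template<typename F>
std::vector<double> finite_diff_grad(F f, std::vector<double> x, double h = 1e-6) {
    std::vector<double> grad(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        std::vector<double> xp = x, xm = x;
        xp[i] += h;
        xm[i] -= h;
        grad[i] = (f(xp) - f(xm)) / (2.0 * h);
    }
    return grad;
}

// e.g. finite_diff_grad(f<double>, {2.0, 5.0}) can then be compared against
// the nabla::grad(f<nabla::Tensor>) result shown in the first example below.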

Before running these examples, see Installation and usage. Examples can be compiled with make examples.

Reverse-mode gradient computation

Let $f(x_0, x_1) = \ln(x_0) + x_0x_1 - \sin(x_1)$. We implement $f(x_0, x_1)$ as a template function so that both the value of the function and its gradient can be computed.

template<typename T> T f(const std::vector<T>& x) {
    return log(x.at(0)) + x.at(0) * x.at(1) - sin(x.at(1));
}

Using the nabla::Tensor structure and the nabla::grad() function, the gradient $\nabla f(x_0, x_1)$ can easily be computed and evaluated at some vector $x$ as

std::vector<double> x = {2.0, 5.0};

auto grad_f = nabla::grad(f<nabla::Tensor>);
std::cout << "∇f(x) = " << grad_f(x) << std::endl;

Running this example as ./examples/reverse_mode_gradient, we obtain

∇f(x) = [5.5, 1.7163378145367738092]
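For reference, the analytic gradient is $\nabla f(x_0, x_1) = \left[\frac{1}{x_0} + x_1,\ x_0 - \cos(x_1)\right]$, which at $x = (2, 5)$ evaluates to $[0.5 + 5,\ 2 - \cos(5)] \approx [5.5, 1.71634]$, in agreement with the output above.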

Although less efficient for functions with many inputs (see Baydin et al. (2018)), the gradient can also be computed in forward-mode by using nabla::Dual and nabla::grad_forward() instead of nabla::Tensor and nabla::grad().

Forward-mode partial differentiation

Let $\mathbf{x} = [x_0, x_1, \ldots, x_n]^\intercal$ be a vector. The $\ell_2$-norm of $\mathbf{x}$ is defined as $\lVert\mathbf{x}\rVert_2 \triangleq \sqrt{\langle\mathbf{x}, \mathbf{x}\rangle}$. Partial differentiation (for instance, $\partial\lVert\mathbf{x}\rVert_2/\partial x_0$) can be performed by declaring all elements of $\mathbf{x}$ (which are the variables of the $\ell_2$-norm function) as nabla::Dual and seeding the adjoint of the variable with respect to which we differentiate with $1.0$.

The $\ell_2$-norm can be implemented as

template<typename T> T l2norm(const std::vector<T>& x) {
    T t{0};
    for (T xi : x) t += xi * xi;
    return sqrt(t);
}

Then partial differentiation is performed as follows.

nabla::Dual x0{-4.3, 1.0}; // adjoint seeded with 1.0: differentiate with respect to x0
nabla::Dual x1{6.2};
nabla::Dual x2{5.8};
std::vector<nabla::Dual> x = {x0, x1, x2};

nabla::Dual dl2norm_x0 = l2norm<nabla::Dual>(x);
std::cout << "∂l2norm/∂x0 = " << dl2norm_x0.get_adjoint() << std::endl;

Running this example as ./examples/forward_mode_partial_diff, we obtain

∂l2norm/∂x0 = -0.451831
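This agrees with the analytic value $\partial\lVert\mathbf{x}\rVert_2/\partial x_0 = x_0/\lVert\mathbf{x}\rVert_2 = -4.3/\sqrt{90.57} \approx -0.4518$.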

The nabla::grad_forward() function can be used to compute the gradient of a function in forward-mode.
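Assuming it mirrors the nabla::grad() interface from the reverse-mode example above, its usage could look like the following sketch (hypothetical; see the examples directory for the exact API):

std::vector<double> x = {-4.3, 6.2, 5.8};

// hypothetical usage, assuming grad_forward() takes the nabla::Dual
// instantiation of the function and returns a callable gradient
auto grad_l2norm = nabla::grad_forward(l2norm<nabla::Dual>);
std::cout << "∇l2norm(x) = " << grad_l2norm(x) << std::endl;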


Installation and usage

nablagrad can be installed by running

git clone https://github.com/rixsilverith/nablagrad.git && cd nablagrad
make install # may need to be run with root privileges (sudo)

Note: As of now, only Linux is properly supported. For other operating systems, manual installation is required.

Once installed, nablagrad must be linked to your project as a static library using your compiler of choice (for instance, with the -lnablagrad flag in g++). Then the library can be included as #include <nablagrad/nabla.h>.
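For instance, a single-file project could be built as follows (a minimal sketch, assuming the header and libnablagrad.a were installed to paths your compiler already searches):

// main.cpp
#include <nablagrad/nabla.h>
#include <iostream>

int main() {
    nabla::Dual x{3.0, 1.0}; // adjoint seeded with 1.0: differentiate with respect to x
    nabla::Dual y = x * x;
    std::cout << y.get_adjoint() << std::endl; // d(x^2)/dx at x = 3, i.e. 6
}

Compile and link with, for example, g++ -std=c++17 main.cpp -lnablagrad -o main.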

Compiling from source

nablagrad can be compiled from source by running make build. A libnablagrad.a static library file will be generated inside the build directory.

Requirements

A compiler supporting at least C++17 is required to compile nablagrad from source.


License

nablagrad is licensed under the MIT License. A copy of the license is distributed along with the code; see LICENSE for more information.


References

  • Baydin, A. G.; Pearlmutter, B. A.; Radul, A. A.; Siskind, J. M. (2018). Automatic differentiation in machine learning: a survey. https://arxiv.org/abs/1502.05767
  • Wengert, R. E. (1964). A simple automatic derivative evaluation program. Communications of the ACM, 7 (8): 463–464.
  • Automatic differentiation. Wikipedia. https://en.wikipedia.org/wiki/Automatic_differentiation
