facebookresearch / theseus

A library for differentiable nonlinear optimization

License: MIT License

Languages: Python 93.35%, C++ 3.32%, Cuda 2.37%, Shell 0.85%, C 0.10%

Topics: differentiable-optimization, robotics, embodied-ai, nonlinear-least-squares, pytorch, deep-learning, computer-vision, gauss-newton, levenberg-marquardt, implicit-differentiation

theseus's Introduction


A library for differentiable nonlinear optimization

Paper | Blog | Webpage | Tutorials | Docs

Theseus is an efficient application-agnostic library for building custom nonlinear optimization layers in PyTorch to support constructing various problems in robotics and vision as end-to-end differentiable architectures.

Differentiable nonlinear optimization provides a general scheme to encode inductive priors, as the objective function can be partly parameterized by neural models and partly with expert domain-specific differentiable models. The ability to compute gradients end-to-end is retained by differentiating through the optimizer which allows neural models to train on the final task loss, while also taking advantage of priors captured by the optimizer.
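
As a rough illustration of this setup (the notation below is ours, not taken from the paper): a neural model with parameters $\phi$ shapes part of the objective, an inner nonlinear least-squares solve produces the optimal variables, and the outer task loss is backpropagated through that solve:

$$
\theta^*(\phi) = \arg\min_{\theta} \sum_i \left\| r_i(\theta; \phi) \right\|^2,
\qquad
\min_{\phi} \; L_{\text{task}}\!\left(\theta^*(\phi)\right).
$$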


Current Features

Application agnostic interface

Our implementation provides an easy-to-use interface to build custom optimization layers and plug them into any neural architecture; see the Docs linked above for the differentiable features currently available.

Efficiency based design

We support several features that improve computation times and memory consumption.

Getting Started

Prerequisites

  • We strongly recommend installing Theseus in a venv or conda environment with Python 3.8-3.10.
  • Theseus requires a torch installation. To install for your particular CPU/CUDA configuration, follow the instructions on the PyTorch website.
  • For GPU support, Theseus requires nvcc to compile custom CUDA operations. Make sure its version (check with nvcc --version) matches the CUDA version used to compile PyTorch; a quick check is sketched after this list. If not, install it and ensure its location is on your system's $PATH variable.
  • Theseus also requires suitesparse, which you can install via:
    • sudo apt-get install libsuitesparse-dev (Ubuntu).
    • conda install -c conda-forge suitesparse (Mac).
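
A quick sanity check for the nvcc bullet above (a rough sketch, not part of Theseus; it assumes a CUDA build of torch and that nvcc is on your $PATH):

import re
import subprocess

import torch

# CUDA version PyTorch was compiled with (None for CPU-only builds).
print("torch built with CUDA:", torch.version.cuda)

# CUDA version of the locally installed nvcc.
out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
print("local nvcc release:", re.search(r"release (\d+\.\d+)", out).group(1))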

Installing

  • pypi

    pip install theseus-ai

    We currently provide wheels with our CUDA extensions compiled using CUDA 11.6 and Python 3.10. For other CUDA versions, consider installing from source or using our build script.

    Note that pypi installation doesn't include our experimental Theseus Labs. For this, please install from source.

  • From source

    The simplest way to install Theseus from source is by running the following (see further below to also include BaSpaCho):

    git clone https://github.com/facebookresearch/theseus.git && cd theseus
    pip install -e .

    If you are interested in contributing to Theseus, instead install

    pip install -e ".[dev]"
    pre-commit install

    and follow the more detailed instructions in CONTRIBUTING.

  • Installing BaSpaCho extensions from source

    By default, installing from source doesn't include our BaSpaCho sparse solver extension. For this, follow these steps:

    1. Compile BaSpaCho from source following instructions here. We recommend using flags -DBLA_STATIC=ON -DBUILD_SHARED_LIBS=OFF.

    2. Run

      git clone https://github.com/facebookresearch/theseus.git && cd theseus
      BASPACHO_ROOT_DIR=<path/to/root/baspacho/dir> pip install -e .

      where the BaSpaCho root dir must have the binaries in the subdirectory build.

Running unit tests (requires dev installation)

python -m pytest tests

By default, unit tests include tests for our CUDA extensions. If you installed without CUDA support, you can add the option -m "not cudaext" to skip them. Additionally, the tests for the BaSpaCho sparse solver are automatically skipped when its extlib is not compiled.

Examples

Simple example. This example fits the curve $y = v e^x$ to a dataset of $N$ observations $(x,y) \sim D$. The problem is modeled as an Objective with a single CostFunction that computes the residual $y - v e^x$. The Objective and the GaussNewton optimizer are encapsulated into a TheseusLayer. With Adam and an MSE loss, $x$ is learned by differentiating through the TheseusLayer.

import torch
import theseus as th

x_true, y_true, v_true = read_data() # shapes (1, N), (1, N), (1, 1)
x = th.Variable(torch.randn_like(x_true), name="x")
y = th.Variable(y_true, name="y")
v = th.Vector(1, name="v") # a manifold subclass of Variable for optim_vars

def error_fn(optim_vars, aux_vars): # returns y - v * exp(x)
    x, y = aux_vars
    return y.tensor - optim_vars[0].tensor * torch.exp(x.tensor)

objective = th.Objective()
cost_function = th.AutoDiffCostFunction(
    [v], error_fn, y_true.shape[1], aux_vars=[x, y],
    cost_weight=th.ScaleCostWeight(1.0))
objective.add(cost_function)
layer = th.TheseusLayer(th.GaussNewton(objective, max_iterations=10))

phi = torch.nn.Parameter(x_true + 0.1 * torch.ones_like(x_true))
outer_optimizer = torch.optim.Adam([phi], lr=0.001)
for epoch in range(10):
    solution, info = layer.forward(
        input_tensors={"x": phi.clone(), "v": torch.ones(1, 1)},
        optimizer_kwargs={"backward_mode": "implicit"})
    outer_loss = torch.nn.functional.mse_loss(solution["v"], v_true)
    outer_optimizer.zero_grad()  # reset outer gradients before backprop through the layer
    outer_loss.backward()
    outer_optimizer.step()
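
A small follow-up check (a sketch reusing the names defined in the example above; read_data and its ground-truth data are assumed as before) to see how close the learned phi is to the true x:

with torch.no_grad():
    # After the outer loop, phi should approximate x_true.
    print("mean abs error in x:", (phi - x_true).abs().mean().item())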

See tutorials, and robotics and vision examples to learn about the API and usage.

Citing Theseus

If you use Theseus in your work, please cite the paper with the BibTeX below.

@article{pineda2022theseus,
  title   = {{Theseus: A Library for Differentiable Nonlinear Optimization}},
  author  = {Luis Pineda and Taosha Fan and Maurizio Monge and Shobha Venkataraman and Paloma Sodhi and Ricky TQ Chen and Joseph Ortiz and Daniel DeTone and Austin Wang and Stuart Anderson and Jing Dong and Brandon Amos and Mustafa Mukadam},
  journal = {Advances in Neural Information Processing Systems},
  year    = {2022}
}

License

Theseus is MIT licensed. See the LICENSE for details.

Additional Information

Theseus is made possible by the following contributors:

Made with contrib.rocks.

theseus's People

Contributors

aiddun, bamos, brentyi, christopher6488, cpaxton, ddetone, dishank-b, exhaustin, fantaosha, gralerfics, hesam-vayu, jeffin07, jingyuqian, joeaortiz, luisenp, maurimo, mhmukadam, neilpandya, psodhi, rmurai0610, rtqichen, thomasweng15, vshobha, yipuzhao


theseus's Issues

Various minor improvements

  • Pass dtype to ScaleCostWeight when constructed from float.
  • Typo in SE3.
  • Add between and adjoint aliases to global th scope.
  • Explicitly disallow non Manifold optimization variables added to Objective.
  • Test old bug in Variable.update(batch_ignore_mask=True) with Lie groups.
  • Consider adding LieGroupTensor wrapper automatically (if needed) when calling TheseusLayer.forward (see).
  • Rename pytest.mark.cuda as pytest.mark.cudaext.
  • Make LieGroup.dof() a static method.
  • Expose NonlinearOptimizerInfo at th level.
  • Change info.best_iter initialization to a non-ambiguous value (e.g., -1; currently 0).
  • Ensure consistency in the order of args in our cost functions (e.g., optim_vars, aux_vars, cost_weight).

Sparse solver class and autograd consolidation

  • From discussion here: common code between autograd functions of different sparse solvers can be consolidated.
  • Add an intermediate sparse solver class, see comment here and make implementation consistent with dense solver.

Refactor SE2 and SE3 to use rotation-then-translation format

Sophus uses rotation-then-translation everywhere; gtsam is the same except for SE2, where it is flipped.

We should consider refactoring SE2 to match, unless there are consequences (and a reason why gtsam uses the flipped convention for it). Aside from the function signature, the data order would also likely change (x, y, c, s -> c, s, x, y).

Originally posted by @mhmukadam in #68 (comment)

Prevent passing aux/opt vars to optim kwargs in theseus layer forward

From discussion:

layer.forward(input_data={}, optimizer_kwargs={"verbose": True, "track_best_solution": True, "damping": 0.01, ..., etc })
After refactoring this way, I don't think we need to check whether the keys passed in the dictionary are valid or not. In fact, we should probably rename it to optimizer_options={} so it's obvious what this dict is supposed to do.

error and errorSquaredNorm can optionally take in variable data

🚀 Feature

API improvement in Objective: error and errorSquaredNorm can optionally take in var_data which, if passed, would call update internally. Document the behavior that passing var_data updates the objective.

Motivation

Facilitates usage for cases where only error needs to be queried (w/o running optimization or even updating the variables).

Pitch

The following ways to use this API would then be available:

  1. Get error on current internal values: call error without passing any var_data.
  2. Get error on new values w/ update to the objective: call error and pass var_data.
  3. Get error on new values w/o update to the objective: call error, pass var_data, and pass True for the optional flag that skips updating the objective.
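
A rough sketch of what the proposed signature could look like (names and defaults here are hypothetical, illustrating the proposal rather than the current API):

def error(self, var_data=None, skip_update=False):
    """Return the error vector.

    If var_data is given, the objective is updated from it before evaluating,
    unless skip_update=True, in which case the values are used only for this
    evaluation and the objective is left untouched.
    """
    ...

# errorSquaredNorm would take the same arguments.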

Moving CholmodSolveFunction and SparseStructure to separate submodule

CholmodSolveFunction and SparseStructure don't depend on any of theseus' core components (including those within th.optimizer). Since they are objects of broader interest, applicable to sparse matrices outside of theseus optimization problems, we should move them to a different location. Maybe theseus.util or theseus.linalg?

Updating just `aux_vars` isn't sufficient to re-solve with some data changed

🐛 Bug

Using the quadratic fit example, I thought it would be reasonable to update the data in just aux_vars and re-solve, but it seems like there's a dependence on the global data_x.

Steps to Reproduce

I included an MWE below; the middle output is incorrect and results from only changing aux_inputs:

optimal a:  tensor([[1.0076]], grad_fn=<AddBackward0>)
== Only changing aux_vars["x"] (this should not be the same solution)
optimal a:  tensor([[1.0076]], grad_fn=<AddBackward0>)
== Globally updating data_x (this is the correct solution)
optimal a:  tensor([[0.0524]], grad_fn=<AddBackward0>)

Expected behavior

I was pretty confused at first when my code wasn't working and didn't realize it was because of this. We should make updating aux_inputs sufficient to re-solve the problem, or, if this is challenging, we should consider 1) raising a warning/adding a check when aux_inputs doesn't match, or 2) removing the duplicated passing of aux_inputs when it doesn't do anything.

Code

#!/usr/bin/env python3

import torch
import theseus as th
import theseus.optimizer.nonlinear as thnl

import numpy as np
import numdifftools as nd

def generate_data(num_points=10, a=1., b=0.5, noise_factor=0.01):
    data_x = torch.rand((1, num_points))
    noise = torch.randn((1, num_points)) * noise_factor
    data_y = a * data_x.square() + b + noise
    return data_x, data_y

num_points = 10
data_x, data_y = generate_data(num_points)

x = th.Variable(data_x.requires_grad_(), name="x")
y = th.Variable(data_y.requires_grad_(), name="y")
a = th.Vector(1, name="a")
b = th.Vector(1, name="b")

def quad_error_fn(optim_vars, aux_vars):
    a, b = optim_vars
    x, y = aux_vars
    est = a.data * x.data.square() + b.data
    err = y.data - est
    return err

optim_vars = a, b
aux_vars = x, y
cost_function = th.AutoDiffCostFunction(
    optim_vars, quad_error_fn, num_points, aux_vars=aux_vars, name="quadratic_cost_fn"
)
objective = th.Objective()
objective.add(cost_function)
optimizer = th.GaussNewton(
    objective,
    max_iterations=15,
    step_size=0.5,
)

theseus_inputs = {
    "a": 2 * torch.ones((1, 1)).requires_grad_(),
    "b": torch.ones((1, 1)).requires_grad_(),
}
aux_vars = {
    "x": data_x,
    "y": data_y,
}
theseus_optim = th.TheseusLayer(optimizer)
updated_inputs, info = theseus_optim.forward(
    theseus_inputs, aux_vars=aux_vars,
    track_best_solution=True, verbose=False,
    backward_mode=thnl.BackwardMode.FULL,
)
print('optimal a: ', updated_inputs['a'])

aux_vars = {
    "x": data_x + 10.0,
    "y": data_y,
}
updated_inputs, info = theseus_optim.forward(
    theseus_inputs, aux_vars=aux_vars,
    track_best_solution=True, verbose=False,
    backward_mode=thnl.BackwardMode.FULL,
)
print('== Only changing aux_vars["x"] (this should not be the same solution)')
print('optimal a: ', updated_inputs['a'])

data_x.data += 10.
aux_vars = {
    "x": data_x,
    "y": data_y,
}
updated_inputs, info = theseus_optim.forward(
    theseus_inputs, aux_vars=aux_vars,
    track_best_solution=True, verbose=False,
    backward_mode=thnl.BackwardMode.FULL,
)
print('== Globally updating data_x (this is the correct solution)')
print('optimal a: ', updated_inputs['a'])

Edge case unit tests for 2D Lie groups

If I understand correctly, this is to test different corner cases (e.g., near-zero tangent vectors; a sketch of such a case is below). If so, we should also add similar tests for SE2 and SO2 (in a new PR). But I agree that adding some small comments here would be useful.

Originally posted by @luisenp in #71 (comment)
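
For illustration, a rough sketch of the kind of near-zero corner case being discussed (not taken from the repo's test suite; the exp_map/log_map calls are assumed to follow the usual Lie group API):

import torch
import theseus as th

# Round-trip exp/log for SO2 with a near-zero tangent vector of shape (batch, 1).
tiny = torch.full((1, 1), 1e-8)
g = th.SO2.exp_map(tiny)
assert torch.allclose(g.log_map(), tiny, atol=1e-7)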

Does tutorial 2 use the Theseus derivatives through the NLLS? (Or just through the objective?)

This tutorial parameterizes a quadratic ax^2 + b, with a optimized by PyTorch autograd and b optimized with the Theseus NLLS for a given a. The key piece that enables a to be learned is that we pass it back into the same cost function the NLLS optimizer uses, except we take a gradient step of the cost function w.r.t. a, which doesn't use the derivative information of how b was computed through the NLLS optimizer.

Thus, if I understand correctly, this tutorial isn't using the derivatives through the NLLS optimization process. To try to understand this better, I added a torch.no_grad call around the NLLS optimizer to block the gradients through it (sketched below); it didn't change the output, and it was still able to fit the quadratics.
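
Roughly, the experiment looked like this (a sketch reusing the variable names from the quadratic-fit code elsewhere on this page; the exact tutorial code may differ):

# Wrapping the inner NLLS solve in torch.no_grad() blocks all gradients through it.
with torch.no_grad():
    updated_inputs, info = theseus_optim.forward(
        theseus_inputs, aux_vars=aux_vars, track_best_solution=True)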

\cc @vshobha @mhmukadam @luisenp

Derivatives w.r.t. the data in the quadratic fitting example

One interesting use case of the derivatives Theseus provides in the quadratic fitting example of tutorial 1 is that they can also be used to obtain the derivative of the solution w.r.t. the input data, enabling a form of sensitivity analysis: how sensitive are the loss/parameters to individual data points? We have a small example of this in Section 6.1 of cvxpylayers for logistic regression.

I quickly added these for the quadratic fitting example (da/dx and da/dy) to see what they would look like, and we get something reasonable. The following shows that taking these negative gradient steps would decrease the quadratic parameter a.

Do you think this would be interesting to include in one of the tutorials/examples? Maybe as a new section at the end of tutorial 2?

\cc @luisenp @mhmukadam @vshobha

Code

import torch
import theseus as th
import matplotlib.pyplot as plt

torch.manual_seed(0)

def generate_data(num_points=100, a=1, b=0.5, noise_factor=0.01):
    # Generate data: 100 points sampled from the quadratic curve listed above
    data_x = torch.rand((1, num_points))
    noise = torch.randn((1, num_points)) * noise_factor
    data_y = a * data_x.square() + b + noise
    return data_x, data_y

data_x, data_y = generate_data()

# data is of type Variable
x = th.Variable(data_x.requires_grad_(), name="x")
y = th.Variable(data_y.requires_grad_(), name="y")

# optimization variables are of type Vector with 1 degree of freedom (dof)
a = th.Vector(1, name="a")
b = th.Vector(1, name="b")

def quad_error_fn(optim_vars, aux_vars):
    a, b = optim_vars 
    x, y = aux_vars
    est = a.data * x.data.square() + b.data
    err = y.data - est
    return err

optim_vars = a, b
aux_vars = x, y
cost_function = th.AutoDiffCostFunction(
    optim_vars, quad_error_fn, 100, aux_vars=aux_vars, name="quadratic_cost_fn"
)
objective = th.Objective()
objective.add(cost_function)
optimizer = th.GaussNewton(
    objective,
    max_iterations=15,
    step_size=0.5,
)
theseus_optim = th.TheseusLayer(optimizer)

theseus_inputs = {
    "a": 2 * torch.ones((1, 1)).requires_grad_(),
    "b": torch.ones((1, 1)).requires_grad_(),
}
aux_vars = {
    "x": data_x,
    "y": data_y,
}
updated_inputs, info = theseus_optim.forward(
    theseus_inputs, aux_vars=aux_vars,
    track_best_solution=True, verbose=True)
print("Best solution:", info.best_solution)



da_dx = torch.autograd.grad(
    updated_inputs['a'], aux_vars['x'],
    retain_graph=True)[0].squeeze()
da_dy = torch.autograd.grad(
    updated_inputs['a'], aux_vars['y'],
    retain_graph=True)[0].squeeze()

# Plot the learned function
fig, ax = plt.subplots()
ax.scatter(data_x.detach(), data_y.detach());

a = info.best_solution['a'].squeeze().detach()
b = info.best_solution['b'].squeeze().detach()
x = torch.linspace(0., 1., steps=100)
y = a*x*x + b
ax.plot(x, y, color='k', lw=4, linestyle='--')

ax.set_xlabel('x')
ax.set_ylabel('y')

for i in range(data_x.shape[1]):
    data_xi = data_x[0,i].detach()
    data_yi = data_y[0,i].detach()
    ax.plot([data_xi, data_xi-da_dx[i]],
            [data_yi, data_yi-da_dy[i]],color='k')
ax.set_title('Negated derivatives of a w.r.t. the input data');

Features to improve obstacle avoidance functionality

🚀 Feature

  • Implicitly handle planar or 3D signed distance fields (SDF).
  • Support initializing from an occupancy grid (and calculating the SDF internally); a rough sketch of this computation follows this list. This will of course be slow for large and/or 3D grids (and we can add a comment noting it).
  • Flexibility for passing user defined custom obstacle cost functions (possibly overlaps with learnable cost function api).
  • Provide option to pass PlanarSDF if its data is constant, to avoid creating a new one for each factor.
  • Ensure that map edges are correctly considered: add a flag for providing user options to decide if out of map bounds is free or occupied. (see context below)
  • Evaluate the interplay of different options with outer loop optimization (e.g., when learned initial variables go out of bounds during the outer loop optimization).
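
As referenced in the occupancy-grid bullet above, a minimal sketch (not Theseus code; SciPy's distance transform and the cell_size parameter are our own choices for illustration) of deriving a signed distance field from a 2D occupancy grid:

import numpy as np
from scipy.ndimage import distance_transform_edt

def sdf_from_occupancy(occ: np.ndarray, cell_size: float) -> np.ndarray:
    # occ: boolean grid, True where occupied.
    dist_outside = distance_transform_edt(~occ)  # distance to the nearest occupied cell
    dist_inside = distance_transform_edt(occ)    # distance to the nearest free cell
    return (dist_outside - dist_inside) * cell_size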

Additional context

https://github.com/facebookresearch/theseus/blob/main/theseus/embodied/collision/signed_distance_field.py#L59

