imgeorgiev / diffrl

This project forked from nvlabs/diffrl

Learning Optimal Policies Through Contact in Differentiable Simulation

Home Page: https://adaptive-horizon-actor-critic.github.io/

License: Other

diffrl's Introduction

Adaptive Horizon Actor Critic (AHAC)

This repository contains the implementation for the paper Adaptive Horizon Actor-Critic for Policy Learning in Contact-Rich Differentiable Simulation (ICML 2024).

In this paper, we build on previous work in differentiable simulation policy optimization to create Adaptive Horizon Actor Critic (AHAC). Our approach deals with gradient error arising from stiff contact by dynamically adapting its model-based horizon to fit one robot gait and avoid excessive contact. This results in a more performant and easier-to-use algorithm than its predecessor, Short Horizon Actor Critic (SHAC), while also outperforming PPO by 40% across a set of high-dimensional locomotion tasks.

Watch the video

Installation

git clone https://github.com/imgeorgiev/DiffRL --recursive

Set up this project with Anaconda:

conda env create -f environment.yml
conda activate diffrl
pip install -e dflex
pip install -e .

For an unknown reason, you need to symlink the CUDA libraries for ninja to work:

ln -s $CONDA_PREFIX/lib $CONDA_PREFIX/lib64
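Running `ln -s` a second time fails because the link already exists, so the step can be made idempotent. A minimal Python sketch, assuming it is run with the conda prefix of the activated environment; the helper name `ensure_lib64_symlink` is hypothetical and not part of this repository:

```python
from pathlib import Path


def ensure_lib64_symlink(prefix):
    """Create <prefix>/lib64 -> <prefix>/lib unless lib64 is already there."""
    lib = Path(prefix) / "lib"
    lib64 = Path(prefix) / "lib64"
    if not lib.is_dir():
        raise FileNotFoundError(f"{lib} does not exist; is this a conda prefix?")
    # is_symlink() also catches a dangling link, which exists() reports as missing
    if not (lib64.exists() or lib64.is_symlink()):
        lib64.symlink_to(lib, target_is_directory=True)
    return lib64
```

Inside the activated environment this could be called as `ensure_lib64_symlink(os.environ["CONDA_PREFIX"])` after `import os`.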

If you want SVG as a baseline:

pip install -e externals/svg

Training

python train.py alg=ahac env=ant

where you can change alg and env freely based on the provided Hydra configurations.

The training script outputs TensorBoard logs by default. If you want to use wandb, add the flag general.run_wandb=True and specify wandb.project=<name> wandb.entity=<entity>.
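When launching several configurations from a script, it can help to assemble the key=value overrides programmatically. The sketch below is only an illustration: `build_train_command` is a hypothetical helper, the project and entity values are placeholders, and the only override keys used are the ones named in this README.

```python
import shlex
import sys


def build_train_command(alg, env, overrides=None):
    """Return the argv list for train.py with key=value overrides appended."""
    cmd = [sys.executable, "train.py", f"alg={alg}", f"env={env}"]
    for key, value in (overrides or {}).items():
        cmd.append(f"{key}={value}")
    return cmd


cmd = build_train_command(
    "ahac",
    "ant",
    {
        "general.run_wandb": True,
        "wandb.project": "my-project",  # placeholder, use your own project
        "wandb.entity": "my-entity",  # placeholder, use your own entity
    },
)
# Print the command in a form that can be pasted into a shell
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.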

Note that dflex is not fully deterministic due to GPU acceleration, so two runs with the same seed will not reproduce exactly the same results.
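Since identical seeds do not give identical runs, one common workaround is to repeat each configuration several times and report the mean and spread rather than a single number. A minimal sketch using only the standard library; the function name and the idea of collecting one final return per run are assumptions for illustration, not something this repository provides:

```python
from statistics import mean, stdev


def summarize_runs(final_returns):
    """Return (mean, sample standard deviation) over repeated runs."""
    if len(final_returns) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return mean(final_returns), stdev(final_returns)
```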

Testing

You can load a policy and evaluate it without training. This works only for the AHAC and SHAC algorithms.

python train.py alg=ahac env=ant general.train=False general.checkpoint=<policy_path>

You can also control the number of eval episodes with env.player.games_num=10.

Generating rendering files

The general.render flag indicates whether to export a video of the task execution. If it is set, the exported video is encoded in .usd format and stored in the examples/output folder. To visualize the exported .usd file, refer to USD at NVIDIA.

python train.py alg=ahac env=ant general.train=False general.render=True general.checkpoint=<policy_path> env.config.stochastic_init=False env.player.games_num=1 env.player.num_actors=1 env.config.num_envs=1 alg.eval_runs=1

Once you have generated a rendering file, you can load it in USD Composer to generate an image or video render like the one above. To install Omniverse, follow the Omniverse Install Page. Then install USD Composer from the Omniverse GUI. Start USD Composer and load the .usd files generated by the script above.
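After a few rendering runs the output folder can hold several files, so it may be convenient to locate the newest one before opening USD Composer. A minimal sketch, assuming the files end in `.usd` and live somewhere under `examples/output` as described above; `latest_usd` is a hypothetical helper, not part of this repository:

```python
from pathlib import Path


def latest_usd(output_dir="examples/output"):
    """Return the most recently modified .usd file under output_dir, or None."""
    files = [p for p in Path(output_dir).rglob("*.usd") if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)
```

Calling `latest_usd()` from the repository root returns a `Path` that can be opened in USD Composer, or `None` if nothing has been rendered yet.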

diffrl's People

Contributors

eanswer, imgeorgiev, krishpop, viktorm
