
This project is a fork of antonilo/rl_locomotion.


Code for training locomotion policies with RL

License: GNU General Public License v3.0


Cross-Modal Supervision (Policy Training Code using RMA)

This repo builds on the code from RMA: Rapid Motor Adaptation for Legged Robots to train the blind adaptive policy associated with the paper Learning Visual Locomotion with Cross-Modal Supervision. For more information, please check the RMA project page and CMS project webpage. For using the policy with vision on a real robot, please refer to this repository.

Paper, Video, and Datasets

If you use this code in an academic context, please cite the following two publications:

Papers:
RMA: Rapid Motor Adaptation for Legged Robots, Narration: Video
Learning Visual Locomotion With Cross-Modal Supervision, Narration: Video

  @InProceedings{kumar2021rma,
   author={Kumar, Ashish and Fu, Zipeng and Pathak, Deepak and Malik, Jitendra},
   title={{RMA: Rapid motor adaptation for legged robots}},
   booktitle={Robotics: Science and Systems},
   year={2021}
  }

  @InProceedings{loquercio2022learn,
 author={Loquercio, Antonio and Kumar, Ashish and Malik, Jitendra},
   title={{Learning Visual Locomotion with Cross-Modal Supervision}},
   booktitle={International Conference on Robotics and Automation},
   year={2023}
  }

Usage

This code uses reinforcement learning to train a policy to walk on complex terrain with minimal information. Training runs in the Raisim simulator; note that the simulator is CPU-based.

Raisim Install

Please follow the raisim installation guide. Note that we do not support the latest version of raisim: after cloning raisimLib, check out the commit f0bb440762c09a9cc93cf6ad3a7f8552c6a4f858.
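For example, assuming you are cloning the upstream raisimTech/raisimLib repository on GitHub (adjust the URL if you use a different mirror):

# clone raisimLib and pin it to the supported commit
git clone https://github.com/raisimTech/raisimLib.git
cd raisimLib
git checkout f0bb440762c09a9cc93cf6ad3a7f8552c6a4f858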

Training Environment Installation

Run the following commands to install the training environments:

cd raisimLib
git clone git@github.com:antonilo/rl_locomotion.git
rm -rf raisimGymTorch
mv rl_locomotion raisimGymTorch
cd raisimGymTorch
# You might want to create a new conda environment if you did not already create one for the vision part
conda create --name cms python=3.8
conda activate cms
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
pip install opencv-python matplotlib pandas wandb

# installation of the environments
python setup.py develop
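To sanity-check the installation (an optional check, assuming the installed package name matches the raisimGymTorch directory), try importing it from the cms environment:

python -c "import raisimGymTorch"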

Training a policy with privileged information

You can use the following code to train a policy with access to privileged information about the robot (e.g., mass, velocity, and motor strength) and about the environment (e.g., the terrain geometry). The optimization is guided by trajectories generated by a previous policy that we provide. You can control the strength of the imitation by changing the parameter RL_coeff in this file.
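As a rough illustration only (the names and values below are hypothetical; the actual RL_coeff lives in the file referenced above), such a coefficient typically blends the RL objective with the imitation objective:

import torch

# Hypothetical sketch of how an imitation coefficient blends two objectives.
# rl_loss stands in for the PPO surrogate loss, imitation_loss for the
# trajectory-imitation loss against the provided reference policy.
RL_coeff = 0.7  # hypothetical value; see the training file for the real one
rl_loss = torch.tensor(1.2, requires_grad=True)
imitation_loss = torch.tensor(0.4, requires_grad=True)

total_loss = RL_coeff * rl_loss + (1.0 - RL_coeff) * imitation_loss
total_loss.backward()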

To start training, you can use the following commands:

cd raisimLib/raisimGymTorch/raisimGymTorch/env/envs/rsg_a1_task
python runner.py --name random --gpu 1 --exptid 1

It will take approximately 4K iterations to train a good enough policy. If you want to make any changes to the training environment, feel free to edit this file. Note that every time you make changes, you need to recompile the environment by running these commands:

cd raisimLib/raisimGymTorch
python setup.py develop

If you wish to continue a previous run, use the following commands:

cd raisimLib/raisimGymTorch/raisimGymTorch/env/envs/rsg_a1_task
python runner.py --name random --gpu 1 --exptid 1 --loadid ITR_NBR --overwrite

Visualizing a policy

You can use the following commands to check whether your policy training worked. First, run the Unity renderer:

cd raisimLib/raisimUnity/linux
./raisimUnity.x86_64

In a separate terminal, run the policy:

conda activate cms
cd raisimLib/raisimGymTorch/raisimGymTorch/env/envs/rsg_a1_task
python viz_policy.py ../../../../data/rsg_a1_task/EXPT_ID POLICY_ID
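Here EXPT_ID is the experiment folder created during training and POLICY_ID is the id of a saved checkpoint. For instance, with the hypothetical values 1 and 4000:

python viz_policy.py ../../../../data/rsg_a1_task/1 4000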

You can now analyze the behavior!

Benchmark a policy

If you want to know how the policy performs over a set of controlled experiments, use the following commands:

conda activate cms
cd raisimLib/raisimGymTorch/raisimGymTorch/env/envs/rsg_a1_task
python evaluate_policy.py ../../../../data/rsg_a1_task/EXPT_ID POLICY_ID

This will generate a results file in the experiment folder. You can visualize the results in a nice table using this script:

python compute_results.py ../data/rsg_a1_task/EXPT_ID/evaluation_results.csv

If you want to make any changes to the evaluation, feel free to edit this file. Note that in the evaluation, the flag Eval is set to true. Remember to recompile any time you edit the environment file!

Distilling a policy with privileged information into a blind policy

A policy trained with privileged information cannot be used on a physical robot. Therefore, we distill the privileged policy into a blind one using a slightly different version of RMA optimized for walking on complex terrains.
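For intuition, here is a minimal RMA-style distillation sketch (all names and dimensions below are hypothetical; the actual implementation is in dagger.py): the student's adaptation module learns to regress the teacher's privileged latent from proprioceptive history.

import torch
import torch.nn as nn

# Hypothetical sketch of privileged-to-blind distillation (RMA phase 2).
# Dimensions and module names are illustrative, not the repo's actual ones.
priv_dim, hist_dim, z_dim = 17, 420, 8

teacher_encoder = nn.Linear(priv_dim, z_dim)     # stands in for the trained env-factor encoder
adaptation_module = nn.Linear(hist_dim, z_dim)   # student: state-action history -> latent estimate

privileged_obs = torch.randn(32, priv_dim)       # batch of privileged observations
history = torch.randn(32, hist_dim)              # matching proprioceptive histories

with torch.no_grad():                            # the teacher's latent is a fixed target
    z_teacher = teacher_encoder(privileged_obs)
z_student = adaptation_module(history)
loss = nn.functional.mse_loss(z_student, z_teacher)
loss.backward()                                  # gradients flow only into the adaptation module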

To start training, you can use the following commands:

cd raisimLib/raisimGymTorch/raisimGymTorch/env/envs/dagger_a1
python dagger.py --name cms_dagger --exptid 1 --loadpth ../../../../data/rsg_a1_task/EXPT_ID --loadid PRIV_POLICY_ID --gpu 1

It will take approximately 2K iterations to train a good enough policy. If you want to make any changes to the training environment, feel free to edit this file. Note that every time you make changes, you need to recompile the environment (see above).

Visualizing and evaluating a blind policy

You can follow exactly the same steps as for the privileged policy (but now running the commands from this folder) to visualize and benchmark a blind policy.

Using a blind policy on a real robot

If you want to use a policy you trained on a real robot, you should first move it into the models folder, change the path to the model in the launch_file, and update the policy id in the parameter_file.

