clvrai / mopa-rl

Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments (CoRL 2020)

Home Page: https://clvrai.com/mopa-rl

License: MIT License

Python 68.68% C++ 20.36% Shell 8.23% Dockerfile 2.04% Cython 0.69%
robotics robot-manipulation reinforcement-learning robot-learning motion-planners

mopa-rl's Introduction

Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments

[Project website] [Paper]

This project is a PyTorch implementation of Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments, published in CoRL 2020.

Deep reinforcement learning (RL) agents are able to learn contact-rich manipulation tasks by maximizing a reward signal, but require large amounts of experience, especially in environments with many obstacles that complicate exploration. In contrast, motion planners use explicit models of the agent and environment to plan collision-free paths to faraway goals, but suffer from inaccurate models in tasks that require contacts with the environment. To combine the benefits of both approaches, we propose motion planner augmented RL (MoPA-RL) which augments the action space of an RL agent with the long-horizon planning capabilities of motion planners.
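
As a rough illustration of what augmenting the action space can look like, the sketch below applies small joint displacements directly and hands large ones to a motion planner. The threshold `direct_limit`, the `plan_path` callable, and the straight-line stand-in planner are assumptions made for this example and are not taken from the repository.

```python
import numpy as np


def execute_augmented_action(q, action, direct_limit, plan_path):
    """Return the joint configurations visited for one agent action.

    q            : current joint configuration
    action       : joint displacement chosen by the policy
    direct_limit : hypothetical bound on the largest per-joint displacement
                   that is still applied directly
    plan_path    : hypothetical callable (start, goal) -> list of waypoints
    """
    q = np.asarray(q, dtype=float)
    action = np.asarray(action, dtype=float)
    goal = q + action
    if np.max(np.abs(action)) <= direct_limit:
        # Small displacement: apply it directly as a single step.
        return [goal]
    # Large displacement: ask the motion planner for a path to the goal.
    return list(plan_path(q, goal))


def straight_line_planner(start, goal, step=0.1):
    """Stand-in planner: linear interpolation without collision checking."""
    n = max(1, int(np.ceil(np.max(np.abs(goal - start)) / step)))
    return [start + (goal - start) * k / n for k in range(1, n + 1)]
```

A real planner would replace `straight_line_planner` with a collision-aware search. The dispatch logic is the only part this sketch is meant to show.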

Prerequisites

Installation

  1. Install MuJoCo 2.0 and add the following environment variables to ~/.bashrc or ~/.zshrc.
# Download mujoco 2.0
$ wget https://www.roboti.us/download/mujoco200_linux.zip -O mujoco.zip
$ unzip mujoco.zip -d ~/.mujoco
$ mv ~/.mujoco/mujoco200_linux ~/.mujoco/mujoco200

# Copy mujoco license key `mjkey.txt` to `~/.mujoco`

# Add mujoco to LD_LIBRARY_PATH
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco200/bin

# For GPU rendering (replace 418 with your nvidia driver version)
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia-418

# Only for a headless server
$ export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so:/usr/lib/nvidia-418/libGL.so
  2. Download this repository and install the Python dependencies
# Install system packages
sudo apt-get install libgl1-mesa-dev libgl1-mesa-glx libosmesa6-dev patchelf libopenmpi-dev libglew-dev python3-pip python3-numpy python3-scipy

# Download this repository
git clone https://github.com/clvrai/mopa-rl.git

# Install required python packages in your new env
cd mopa-rl
pip install -r requirements.txt
  3. Install OMPL
# Linux
sudo apt install libyaml-cpp-dev
sh ./scripts/misc/installEigen.sh  # install Eigen (from the home directory)

# Mac OS
brew install libyaml yaml-cpp
brew install eigen

# Build ompl
git clone git@github.com:ompl/ompl.git ../ompl
cd ../ompl
cmake .
sudo make install

# If OMPL is installed as /usr/local/include/ompl-x.x (x.x is the version), rename it to ompl
sudo mv /usr/local/include/ompl-x.x /usr/local/include/ompl
  4. Build the motion planner Python wrapper
cd ./mopa-rl/motion_planners
python setup.py build_ext --inplace
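
Before building, it can help to confirm that the MuJoCo paths from step 1 are in place. The following is a minimal sketch that checks only the locations named above. The function name `check_mujoco_setup` is made up for this example and is not part of the repository.

```python
import os
from pathlib import Path


def check_mujoco_setup(home, ld_library_path):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    mujoco_dir = Path(home) / ".mujoco" / "mujoco200"
    if not mujoco_dir.is_dir():
        problems.append(f"missing directory: {mujoco_dir}")
    if not (Path(home) / ".mujoco" / "mjkey.txt").is_file():
        problems.append("missing license key: ~/.mujoco/mjkey.txt")
    # LD_LIBRARY_PATH is a colon-separated list; ignore empty entries.
    entries = [entry for entry in ld_library_path.split(":") if entry]
    if str(mujoco_dir / "bin") not in entries:
        problems.append("LD_LIBRARY_PATH does not contain mujoco200/bin")
    return problems


if __name__ == "__main__":
    found = check_mujoco_setup(Path.home(), os.environ.get("LD_LIBRARY_PATH", ""))
    for problem in found:
        print(problem)
```

The check compares path strings literally, so an entry written with a trailing slash or through a symlink would be reported as missing.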

Available environments

  • PusherObstacle-v0: 2D Push
  • SawyerPushObstacle-v0: Sawyer Push
  • SawyerLiftObstacle-v0: Sawyer Lift
  • SawyerAssemblyObstacle-v0: Sawyer Assembly

How to run experiments

  1. Launch a virtual display (only for a headless server)
sudo /usr/bin/X :1 &
  2. Train policies
  • 2D Push
sh ./scripts/2d/baseline.sh  # baseline
sh ./scripts/2d/mopa.sh  # MoPA-SAC
sh ./scripts/2d/mopa_ik.sh  # MoPA-SAC IK
  • Sawyer Push
sh ./scripts/3d/push/baseline.sh  # baseline
sh ./scripts/3d/push/mopa.sh  # MoPA-SAC
sh ./scripts/3d/push/mopa_ik.sh  # MoPA-SAC IK
  • Sawyer Lift
sh ./scripts/3d/lift/baseline.sh  # baseline
sh ./scripts/3d/lift/mopa.sh  # MoPA-SAC
sh ./scripts/3d/lift/mopa_ik.sh  # MoPA-SAC IK
  • Sawyer Assembly
sh ./scripts/3d/assembly/baseline.sh  # baseline
sh ./scripts/3d/assembly/mopa.sh  # MoPA-SAC
sh ./scripts/3d/assembly/mopa_ik.sh  # MoPA-SAC IK
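
The script paths above follow a regular pattern, so a launcher can build them from a task and a method. This is a small sketch. The task keys in `SCRIPT_DIRS` are labels invented for this example, while the directory and script names are the ones listed above.

```python
# Hypothetical task labels mapped to the script directories listed above.
SCRIPT_DIRS = {
    "2d_push": "2d",
    "sawyer_push": "3d/push",
    "sawyer_lift": "3d/lift",
    "sawyer_assembly": "3d/assembly",
}
METHODS = ("baseline", "mopa", "mopa_ik")


def training_command(task, method):
    """Return the argument list that launches training for a task and method."""
    if task not in SCRIPT_DIRS:
        raise ValueError(f"unknown task: {task}")
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")
    return ["sh", f"./scripts/{SCRIPT_DIRS[task]}/{method}.sh"]
```

The returned list can be passed to `subprocess.run` from the repository root.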

Directories

The structure of the repository:

  • rl: Reinforcement learning code
  • env: Environment code for simulated experiments (2D Push and all Sawyer tasks)
  • config: Configuration files
  • util: Utility code
  • motion_planners: Motion planner code
  • scripts: Scripts for all experiments

Log directories:

  • logs/rl.ENV.DATE.PREFIX.SEED:
    • cmd.sh: A command used for running a job
    • git.txt: Log of git diff
    • params.json: Summary of parameters
    • video: Generated evaluation videos (every evaluate_interval)
    • wandb: Training summary of W&B, like tensorboard summary
    • ckpt_*.pt: Stored checkpoints (every ckpt_interval)
    • replay_*.pt: Stored replay buffers (every ckpt_interval)
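
To resume or evaluate a run, you usually want the most recent checkpoint in a log directory. The sketch below assumes that the `*` in `ckpt_*.pt` is an integer training step. That naming detail is an assumption and should be checked against the files your run actually produces.

```python
import re
from pathlib import Path


def latest_checkpoint(log_dir):
    """Return the ckpt_<step>.pt path with the largest step, or None if absent."""
    best_step, best_path = -1, None
    for path in Path(log_dir).glob("ckpt_*.pt"):
        # Assumption: the wildcard part is a non-negative integer step count.
        match = re.fullmatch(r"ckpt_(\d+)\.pt", path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path
```

Steps are compared as integers, so `ckpt_2000.pt` ranks above `ckpt_100.pt` even though it sorts lower as a string.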

Troubleshooting

Mujoco GPU rendering

To use GPU rendering for MuJoCo, add /usr/lib/nvidia-000 (replace 000 with your NVIDIA driver version) to LD_LIBRARY_PATH before installing mujoco-py. During mujoco-py compilation, the output should then show linuxgpuextension instead of linuxcpuextension. On Ubuntu 18.04, you may encounter a GL-related error while building mujoco-py. If so, open venv/lib/python3.7/site-packages/mujoco_py/gl/eglshim.c and comment out line 5 (#include <GL/gl.h>) and line 7 (#include <GL/glext.h>).

Virtual display on headless machines

Servers typically have no monitor attached. Use the following commands to set up a virtual display for rendering, then put DISPLAY=:1 in front of each command.

# Run the next line for Ubuntu
$ sudo apt-get install xserver-xorg libglu1-mesa-dev freeglut3-dev mesa-common-dev libxmu-dev libxi-dev

# Configure nvidia-x
$ sudo nvidia-xconfig -a --use-display-device=None --virtual=1280x1024

# Launch a virtual display
$ sudo /usr/bin/X :1 &

# Run a command with DISPLAY=:1
DISPLAY=:1 <command>
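
When jobs are launched from Python rather than from a shell, the same effect can be had by setting DISPLAY in the child environment. This is a minimal sketch. The helper name `run_with_display` is made up for this example.

```python
import os
import subprocess
import sys


def run_with_display(command, display=":1"):
    """Run a command with DISPLAY set, leaving the parent environment untouched."""
    env = dict(os.environ, DISPLAY=display)
    return subprocess.run(command, env=env, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # Print the DISPLAY value seen by a child Python process.
    result = run_with_display(
        [sys.executable, "-c", "import os; print(os.environ['DISPLAY'])"]
    )
    print(result.stdout.strip())
```

Because `check=True` is set, a non-zero exit status raises `subprocess.CalledProcessError` instead of failing silently.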

pybind11-dev not found

wget http://archive.ubuntu.com/ubuntu/pool/universe/p/pybind11/pybind11-dev_2.2.4-2_all.deb
sudo apt install ./pybind11-dev_2.2.4-2_all.deb

References

Citation

If you find this useful, please cite

@inproceedings{yamada2020mopa,
  title={Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments},
  author={Jun Yamada and Youngwoon Lee and Gautam Salhotra and Karl Pertsch and Max Pflueger and Gaurav S. Sukhatme and Joseph J. Lim and Peter Englert},
  booktitle={Conference on Robot Learning},
  year={2020}
}

Authors

Jun Yamada*, Youngwoon Lee*, Gautam Salhotra, Karl Pertsch, Max Pflueger, Gaurav S. Sukhatme, Joseph J. Lim, and Peter Englert at USC CLVR and USC RESL (*Equal contribution)

mopa-rl's Issues

Related GL issues

Hi! It seems that there is no #include <GL/gl.h> and #include <GL/glext.h> in the env/lib/python3.7/site-packages/mujoco_py/gl/eglshim.c.

My mujoco_py version is 2.0.2.5 and the first 9 lines in eglshim.c are:

#define EGL_EGLEXT_PROTOTYPES
#include "egl.h"
#include "eglext.h"
#include <GL/glew.h>

#include "mujoco.h"
#include "mjrender.h"

#include "glshim.h"

Looking forward to your reply! Thx!

Issue with running mopa-sac on SAC push

Hi,

I'm trying to run the script scripts/2d/mopa.sh and I get the following segmentation fault, and I'm not really sure where to start with addressing this. If anyone has any suggestions, that would be greatly appreciated.

Thanks!

[a53:232685] *** Process received signal ***
[a53:232685] Signal: Segmentation fault (11)
[a53:232685] Signal code: Address not mapped (1)
[a53:232685] Failing at address: (nil)
[a53:232685] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7fc8b0d30420]
[a53:232685] [ 1] /home/tarunc/mopa-rl/motion_planners/planner.cpython-37m-x86_64-linux-gnu.so(_ZN6MjOmpl26MujocoStateValidityChecker21addGlueTransformationERKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x59)[0x7fc8300bf2a9]
[a53:232685] [ 2] /home/tarunc/mopa-rl/motion_planners/planner.cpython-37m-x86_64-linux-gnu.so(_ZN13MotionPlanner16KinematicPlanner12isValidStateESt6vectorIdSaIdEE+0x1e9)[0x7fc8300bd3a9]
[a53:232685] [ 3] /home/tarunc/mopa-rl/motion_planners/planner.cpython-37m-x86_64-linux-gnu.so(+0x41937)[0x7fc8300d3937]
[a53:232685] [ 4] /home/tarunc/mopa-rl/motion_planners/planner.cpython-37m-x86_64-linux-gnu.so(+0x41e32)[0x7fc8300d3e32]
[a53:232685] [ 5] python(_PyMethodDescr_FastCallKeywords+0xdb)[0x5643905dfd4b]
[a53:232685] [ 6] python(+0x17faae)[0x5643905e0aae]
[a53:232685] [ 7] python(_PyEval_EvalFrameDefault+0x661)[0x564390624601]
[a53:232685] [ 8] python(_PyFunction_FastCallKeywords+0x187)[0x5643905998d7]
[a53:232685] [ 9] python(+0x17f9c5)[0x5643905e09c5]
[a53:232685] [10] python(_PyEval_EvalFrameDefault+0x661)[0x564390624601]
[a53:232685] [11] python(_PyFunction_FastCallKeywords+0x187)[0x5643905998d7]
[a53:232685] [12] python(+0x17f9c5)[0x5643905e09c5]
[a53:232685] [13] python(_PyEval_EvalFrameDefault+0x661)[0x564390624601]
[a53:232685] [14] python(_PyFunction_FastCallKeywords+0x187)[0x5643905998d7]
[a53:232685] [15] python(+0x17f9c5)[0x5643905e09c5]
[a53:232685] [16] python(_PyEval_EvalFrameDefault+0x661)[0x564390624601]
[a53:232685] [17] python(+0x1901a5)[0x5643905f11a5]
[a53:232685] [18] python(_PyMethodDef_RawFastCallKeywords+0xe9)[0x5643905aa639]
[a53:232685] [19] python(_PyEval_EvalFrameDefault+0x4428)[0x5643906283c8]
[a53:232685] [20] python(_PyFunction_FastCallKeywords+0x187)[0x5643905998d7]
[a53:232685] [21] python(+0x17f9c5)[0x5643905e09c5]
[a53:232685] [22] python(_PyEval_EvalFrameDefault+0x661)[0x564390624601]
[a53:232685] [23] python(_PyFunction_FastCallKeywords+0x187)[0x5643905998d7]
[a53:232685] [24] python(_PyEval_EvalFrameDefault+0x3f5)[0x564390624395]
[a53:232685] [25] python(_PyEval_EvalCodeWithName+0x255)[0x564390579e85]
[a53:232685] [26] python(+0x1d7d8e)[0x564390638d8e]
[a53:232685] [27] python(_PyMethodDef_RawFastCallKeywords+0xe9)[0x5643905aa639]
[a53:232685] [28] python(_PyEval_EvalFrameDefault+0x4428)[0x5643906283c8]
[a53:232685] [29] python(_PyEval_EvalCodeWithName+0x255)[0x564390579e85]
[a53:232685] *** End of error message ***
Segmentation fault (core dumped)

Some questions about the settings in the XML files

Hi! Thanks for your generous sharing!

When I read the XML files, I noticed there are a lot of duplications of the body, such as the indicator link, dummy link or joint, goal link or joint.

What are the differences between the above settings and the real link or joint? In other words, what's the role of the above settings in the RL simulation?

Looking forward to your reply!

ompl no state space issue

Hi! I'm trying to set up the motion planner wrapper. I've cloned ompl (from the official source) into the mopa-rl directory, and run the setup commands. I then cd'ed into motion-planners and tried running the command, but ended up with this issue:
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
./src/mujoco_ompl_interface.cpp:2:10: fatal error: ompl/base/StateSpace.h: No such file or directory
    2 | #include <ompl/base/StateSpace.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command '/usr/bin/gcc' failed with exit code 1
I'm not entirely sure where this is coming from, so if anyone had any insight, that would be greatly appreciated. Thanks!
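
One thing worth ruling out here is the versioned include directory mentioned in the installation notes (`ompl-x.x` under /usr/local/include). The sketch below only reports what is present under an include root. It does not diagnose this particular report, and the function name `locate_ompl_headers` is invented for this example.

```python
from pathlib import Path


def locate_ompl_headers(include_root="/usr/local/include"):
    """Return (plain_dir_exists, versioned_dir_names) for a given include root."""
    root = Path(include_root)
    # True when an unversioned "ompl" directory is present.
    plain = (root / "ompl").is_dir()
    # Names such as "ompl-1.5" that would need renaming per the install notes.
    versioned = sorted(p.name for p in root.glob("ompl-*") if p.is_dir())
    return plain, versioned


if __name__ == "__main__":
    print(locate_ompl_headers())
```

If the first value is false and the second list is non-empty, the rename step from the installation section may not have been applied.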

MoPA-RL: error: argument --gpu expected one argument

Hi, thank you for your generous sharing. I have installed all the libraries, but when I run sh ./scripts/2d/baseline.sh, I encounter the following error:
MoPA-RL: error: argument --gpu expected one argument
Do you know how to fix it? Looking forward to your reply, thanks!

Can't create a PyKinematicPlanner

When I try to run sh ./scripts/2d/mopa.sh, I get sh: 2: Syntax error: "(" unexpected and ERROR: Invalid activation key. After checking, I think something is wrong with PyKinematicPlanner. Could anyone help fix it?

Post-processing the obtained path

Hi!

Your paper states:

After the motion planning, the resulting path is smoothed using a shortcutting algorithm.

and cites the following work for the smoothing algorithm:

R. Geraerts and M. H. Overmars. Creating high-quality paths for motion planning. The International Journal of Robotics Research, 26(8):845–863, 2007.

But I couldn't find this post-processing step in this library.

Did I miss something?

Thanks!
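
For readers unfamiliar with the term, the sketch below shows a generic random-shortcutting pass over a waypoint path. It is not the repository's code, and it is not the specific variant from the cited paper. The `segment_is_valid` callable is a hypothetical stand-in for a collision check on the straight segment between two waypoints.

```python
import random


def shortcut_path(path, segment_is_valid, iterations=100, rng=None):
    """Shorten a waypoint path by repeatedly bypassing intermediate waypoints.

    path             : sequence of waypoints (any objects)
    segment_is_valid : hypothetical callable (a, b) -> bool, True when the
                       straight segment from a to b is collision-free
    """
    rng = rng or random.Random(0)
    path = list(path)
    for _ in range(iterations):
        if len(path) < 3:
            break  # nothing left to remove
        i, j = sorted(rng.sample(range(len(path)), 2))
        if j - i < 2:
            continue  # adjacent waypoints: no intermediate points to skip
        if segment_is_valid(path[i], path[j]):
            # Connect i to j directly and drop everything in between.
            path = path[: i + 1] + path[j:]
    return path
```

The first and last waypoints are never removed, so the start and goal of the path are preserved.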
