This project forked from ghiara/melts

Context-Based Meta-Reinforcement Learning with Bayesian Nonparametric Models (MELTS)

License: MIT License

Context-Based Meta-Reinforcement Learning with Bayesian Nonparametric Models

Official implementation of Context-Based Meta-Reinforcement Learning with Bayesian Nonparametric Models (MELTS).

Weights and videos for MELTS (DPMM) and MELTS (Gauss) on cheetah-eight-tasks are available at the following links:

https://drive.google.com/drive/folders/1P9HWUf7PHz-Z4bOq9n-QNlLblqq8LKpT?usp=sharing

https://videoviewsite.wixsite.com/melts

The implementation is based on rlkit, PEARL and rand_param_envs.

Requirements

  • To install locally, first install MuJoCo. For the task distributions in which the reward function varies (Cheetah, Ant), install MuJoCo200. For the task distributions in which different tasks correspond to different model parameters (Walker and Hopper), MuJoCo131 is required; install it the same way as MuJoCo200. Note that the MuJoCo131 setup is deprecated and will be removed.

  • Set LD_LIBRARY_PATH to include both the MuJoCo binaries ($HOME/.mujoco/mujoco200/bin) and the GPU drivers (something like /usr/lib/nvidia-390; you can find your driver version by running nvidia-smi).

  • For the remaining dependencies, we recommend using miniconda. Use the latest yml file (conda env create -f setup/melts.yml) to set up a conda environment, or install the packages via pip install -r setup/requirements.txt. Make sure the correct GPU driver is installed and that the versions of the CUDA toolkit and torch match. We use torch 1.7.0 with CUDA 11 for our evaluations (pip install torch==1.7.0+cu110 -f https://download.pytorch.org/whl/torch_stable.html).

  • We created versions of the standard MuJoCo environments in the folder submodules/meta_rand_envs. To run experiments, both submodules, meta_rand_envs and rand_param_envs (already included), must be installed in development mode in the created conda environment. To install meta_rand_envs, run the following commands (rand_param_envs is installed analogously):

cd submodules/meta_rand_envs
pip install -e .

This installation has been tested only on 64-bit Ubuntu 18.04.
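
Before starting a run, it can help to confirm that LD_LIBRARY_PATH really contains the directories listed above. The following is a minimal sketch, not part of the MELTS code base: the helper name missing_library_dirs is hypothetical, and the two example directories are the ones mentioned in the requirements (the driver path in particular will differ on your machine).

```python
import os


def missing_library_dirs(required_dirs, ld_library_path=None):
    """Return the entries of required_dirs that are absent from LD_LIBRARY_PATH.

    Hypothetical helper for illustration; it only compares path strings and
    does not check that the directories exist or contain the right libraries.
    """
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    # LD_LIBRARY_PATH is colon-separated; normalise entries so that a
    # trailing slash does not cause a false "missing" report.
    present = {os.path.normpath(p) for p in ld_library_path.split(":") if p}
    missing = []
    for directory in required_dirs:
        expanded = os.path.normpath(os.path.expanduser(directory))
        if expanded not in present:
            missing.append(directory)
    return missing


if __name__ == "__main__":
    # Example directories taken from the requirements above; adjust the
    # driver directory to the version reported by nvidia-smi.
    required = ["~/.mujoco/mujoco200/bin", "/usr/lib/nvidia-390"]
    for directory in missing_library_dirs(required):
        print("Not on LD_LIBRARY_PATH:", directory)
```

If the script prints nothing, both directories are already on the path.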

Training

To train the models in the paper, activate the python env and run these commands:

python runner.py configs/c8/c8-dpmm.json
python runner.py configs/c8/c8-gauss.json

  • The experiment results will be stored under ./output/[ENV]/[EXP NAME].
  • The important parameters for the environment are listed in the respective json file. Further parameters can be found in the file configs/default.py including small descriptions for each parameter.
  • By default the code uses the GPU. To use the CPU instead, set use_gpu=False in the corresponding config file.

Evaluation

We periodically evaluate the algorithm during training. Online evaluation is provided via tensorboard; the logs are written to the tensorboard folder of each experiment. Use tensorboard --logdir=./output/[ENV]/[EXP NAME]/tensorboard to visualize the learning curves and the current embeddings. For further visual demonstrations after training, add or adjust

"path_to_weights": "output/cheetah-multi-task/experiment_name/weights",
"showcase_itr": 2000,
"train_or_showcase": "showcase_all",

at the top level of the corresponding training json (here: c8-showcase.json). Then, activate the python env and run the modified json file, e.g. as

python runner.py configs/c8/c8-showcase.json

The generated videos are stored in a new folder under ./output/[ENV]/[EXP NAME]. Task inference results are obtained analogously by setting "train_or_showcase": "showcase_task_inference".
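
Instead of editing the training json by hand, the showcase file can also be derived from it with a few lines of Python. This is only a sketch under stated assumptions: the function name make_showcase_config and the output file name are hypothetical, and the code assumes that the three keys belong at the top level of the json, as described above.

```python
import json
from pathlib import Path


def make_showcase_config(train_config, out_config, weights_dir,
                         showcase_itr, mode="showcase_all"):
    """Copy a training json and set the three top-level showcase keys.

    Hypothetical helper for illustration. `mode` is written unchanged to
    "train_or_showcase"; the values named in this README are "showcase_all"
    and "showcase_task_inference".
    """
    config = json.loads(Path(train_config).read_text())
    config.update({
        "path_to_weights": str(weights_dir),
        "showcase_itr": int(showcase_itr),
        "train_or_showcase": mode,
    })
    # Write the derived config next to (or instead of) the original file.
    Path(out_config).write_text(json.dumps(config, indent=4) + "\n")
    return config


if __name__ == "__main__":
    source = Path("configs/c8/c8-dpmm.json")
    if source.exists():
        make_showcase_config(
            source,
            "configs/c8/c8-showcase.json",
            weights_dir="output/cheetah-multi-task/experiment_name/weights",
            showcase_itr=2000,
        )
```

The resulting file can then be passed to runner.py in the same way as the training configs.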
