License: MIT | Python 3.6+ | Code style: black

CUP: Critic-Guided Policy Reuse

This repository is the official implementation of CUP: Critic-Guided Policy Reuse, which has been accepted by NeurIPS 2022. Please create an issue if you run into any problems!

Contents

  1. Introduction

  2. Setup

  3. Usage

Introduction

Setup

  • Install dependencies:
    git init
    git add .
    git commit -m init
    pip install -r requirements/dev.txt
    cd ./src
    git clone git@github.com:NagisaZj/metaworld-cup.git
    git clone git@github.com:NagisaZj/mtenv.git
    cd ./mtenv
    pip install -e .
    cd ../metaworld-cup
    pip install -e .
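
After the editable installs above, a quick import check can confirm that both packages are visible to the interpreter. This is a minimal sketch, not part of the repository: the module names `mtenv` and `metaworld` are assumptions about what the two packages install, and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Assumed top-level import names; the real names are defined by each
# package's setup.py and may differ.
ASSUMED_MODULES = ("mtenv", "metaworld")


def missing_modules(names=ASSUMED_MODULES):
    """Return the module names that cannot be located on the current path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All assumed modules are importable.")
```

If a name is reported as not importable, re-running `pip install -e .` inside the corresponding cloned directory is the first thing to try.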
    

Usage

CUP:

CAUTION: Replace setup.load_dir, setup.load_dir_2, and setup.load_dir_3 with your own absolute paths to the corresponding directories.

CUDA_VISIBLE_DEVICES=7 OPENBLAS_NUM_THREADS=4 PYTHONPATH=. python3 -u main.py \
setup=metaworld \
env=metaworld-push-back \
env.task_idx=-1 \
env.fix_goal=0 \
agent=state_sac \
experiment.num_eval_episodes=1 \
experiment.num_train_steps=1000000 \
setup.seed=1695 \
experiment.eval_freq=5000 \
replay_buffer.batch_size=1280 \
agent.multitask.num_envs=1 \
agent.multitask.should_use_disentangled_alpha=True \
agent.encoder.type_to_select=identity \
agent.multitask.should_use_multi_head_policy=False \
agent.multitask.should_use_disjoint_policy=False \
agent.multitask.should_use_task_encoder=True \
agent.multitask.actor_cfg.should_condition_model_on_task_info=False \
agent.multitask.actor_cfg.should_condition_encoder_on_task_info=True \
agent.multitask.actor_cfg.should_concatenate_task_info_with_encoder=True \
setup.relabel_num_tasks=1 \
setup.relabel_range=10 \
setup.load_dir=/data3/zj/CUP/source_policies/90f2497ff4cee27c0d30fbc66e6ba205f94808ba4ea16e057df58e73_issue_None_seed_43_2/model \
setup.load_dir_2=/data3/zj/CUP/source_policies/90f2497ff4cee27c0d30fbc66e6ba205f94808ba4ea16e057df58e73_issue_None_seed_43_2/model \
setup.load_dir_3=/data3/zj/CUP/source_policies/90f2497ff4cee27c0d30fbc66e6ba205f94808ba4ea16e057df58e73_issue_None_seed_253/model \
setup.load=1 \
setup.load_log_std_bounds=[-20,2]
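
Since the three `setup.load_dir*` values must be absolute paths on your own machine, it can help to validate them before launching a long run. The sketch below is illustrative only: `load_dir_overrides` is a hypothetical helper, not something provided by this repository, and it only builds the `key=value` strings used in the command above.

```python
from pathlib import Path

# Override keys taken from the command above.
LOAD_DIR_KEYS = ("setup.load_dir", "setup.load_dir_2", "setup.load_dir_3")


def load_dir_overrides(dirs):
    """Validate three directories and return them as `key=value` overrides."""
    if len(dirs) != len(LOAD_DIR_KEYS):
        raise ValueError(
            f"expected {len(LOAD_DIR_KEYS)} directories, got {len(dirs)}"
        )
    overrides = []
    for key, raw in zip(LOAD_DIR_KEYS, dirs):
        path = Path(raw)
        if not path.is_absolute():
            raise ValueError(f"{key} must be an absolute path: {raw}")
        if not path.is_dir():
            raise FileNotFoundError(f"{key} does not exist: {raw}")
        overrides.append(f"{key}={path}")
    return overrides
```

The returned strings can be appended to the rest of the overrides in place of the hard-coded `/data3/...` paths.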

SAC baseline: Set use_expert to 0 in the corresponding config file (config/agent/state_sac.yaml), or override it on the command line with agent.multitask.use_expert=0, for example:

CUDA_VISIBLE_DEVICES=7 OPENBLAS_NUM_THREADS=4 PYTHONPATH=. python3 -u main.py \
setup=metaworld \
env=metaworld-push-back \
env.task_idx=-1 \
env.fix_goal=0 \
agent=state_sac \
experiment.num_eval_episodes=1 \
experiment.num_train_steps=1000000 \
setup.seed=1695 \
experiment.eval_freq=5000 \
replay_buffer.batch_size=1280 \
agent.multitask.num_envs=1 \
agent.multitask.should_use_disentangled_alpha=True \
agent.encoder.type_to_select=identity \
agent.multitask.should_use_multi_head_policy=False \
agent.multitask.should_use_disjoint_policy=False \
agent.multitask.should_use_task_encoder=True \
agent.multitask.actor_cfg.should_condition_model_on_task_info=False \
agent.multitask.actor_cfg.should_condition_encoder_on_task_info=True \
agent.multitask.actor_cfg.should_concatenate_task_info_with_encoder=True \
setup.relabel_num_tasks=1 \
setup.relabel_range=10 \
setup.load_dir=/data3/zj/CUP/source_policies/90f2497ff4cee27c0d30fbc66e6ba205f94808ba4ea16e057df58e73_issue_None_seed_43_2/model \
setup.load_dir_2=/data3/zj/CUP/source_policies/90f2497ff4cee27c0d30fbc66e6ba205f94808ba4ea16e057df58e73_issue_None_seed_43_2/model \
setup.load_dir_3=/data3/zj/CUP/source_policies/90f2497ff4cee27c0d30fbc66e6ba205f94808ba4ea16e057df58e73_issue_None_seed_253/model \
setup.load=1 \
setup.load_log_std_bounds=[-20,2] \
agent.multitask.use_expert=0
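
The two commands above differ only in the final `agent.multitask.use_expert=0` override, so both can be generated from one shared list. The following is a sketch under that assumption: `build_command` is a hypothetical helper, and `SHARED_OVERRIDES` is a shortened subset of the real override list, kept brief for illustration.

```python
import shlex

# A shortened subset of the overrides from the commands above.
SHARED_OVERRIDES = [
    "setup=metaworld",
    "env=metaworld-push-back",
    "agent=state_sac",
    "setup.seed=1695",
]


def build_command(overrides, use_expert=True):
    """Return the argv list for main.py; add the baseline override if asked."""
    argv = ["python3", "-u", "main.py", *overrides]
    if not use_expert:
        # The only difference between the CUP run and the SAC baseline.
        argv.append("agent.multitask.use_expert=0")
    return argv


cup_cmd = build_command(SHARED_OVERRIDES)
sac_cmd = build_command(SHARED_OVERRIDES, use_expert=False)

# Print a shell-safe version of the baseline command.
print(" ".join(shlex.quote(arg) for arg in sac_cmd))
```

To launch a run from Python, the list could be passed to `subprocess.run` together with an environment that sets `PYTHONPATH=.` and `CUDA_VISIBLE_DEVICES`, mirroring the shell prefix used above.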

Other available environments are listed in ./config/env.
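
To see which names can be passed as `env=...`, the config directory can be scanned from the repository root. This sketch assumes that each environment is a `.yaml` file directly under `config/env` and that the file stem is the value to pass (as with `metaworld-push-back`). Neither assumption is confirmed here, and `list_env_configs` is a hypothetical helper.

```python
from pathlib import Path


def list_env_configs(config_dir="config/env"):
    """Return sorted config names (file stems) found under config_dir."""
    root = Path(config_dir)
    if not root.is_dir():
        # Directory missing, e.g. when not run from the repository root.
        return []
    return sorted(p.stem for p in root.glob("*.yaml"))


if __name__ == "__main__":
    for name in list_env_configs():
        print(name)
```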
