zishunyu / actor-critic-alignment

Implementation of "Actor-Critic Alignment for Offline-to-Online Reinforcement Learning"

Home Page: https://proceedings.mlr.press/v202/yu23k.html

License: MIT License


actor-critic-alignment's Introduction

Actor-Critic Alignment (ACA)

Code for our ICML '23 paper: Actor-Critic Alignment for Offline-to-Online Reinforcement Learning

Installation

  1. Pull this repo

    git clone git@github.com:ZishunYu/ACA.git; cd ACA
    
  2. Create conda virtual env

    conda create --name ACA python=3.7.4; conda activate ACA
    
  3. Install MuJoCo200 following the official documentation

  4. Install d4rl

    git clone https://github.com/rail-berkeley/d4rl.git
    cd d4rl; pip3 install -e .; cd ..
    
  5. Install requirements

    pip3 install -r requirements.txt
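
After the installation steps above, a quick import check can confirm that the environment is usable before launching an experiment. The following is a minimal sketch, not part of the repository: the helper name `check_modules` is hypothetical, and `mujoco_py` is assumed to be the Python binding used with MuJoCo 200.

```python
import importlib.util
from typing import Dict, Iterable


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed module names: d4rl (step 4) and mujoco_py (assumed MuJoCo 200 binding).
    for name, found in check_modules(["d4rl", "mujoco_py"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

This only checks that the modules can be located; it does not import them, so shared-library errors such as those listed under Troubleshooting would still surface only at run time.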
    

Run ACA

  1. Download the offline pretrained models from here (Google Drive)
  2. Run an experiment with
    python3 run_aca.py --dataset hopper-medium-v2 --seed 1
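
To repeat the command above over several seeds or datasets, a small launcher might look like the sketch below. It uses only the `--dataset` and `--seed` flags shown above; the helper names `build_commands` and `run_all` are hypothetical, and the script is assumed to be started from the repository root.

```python
import subprocess
import sys
from typing import Iterable, List


def build_commands(datasets: Iterable[str], seeds: Iterable[int]) -> List[List[str]]:
    """Return one argv list per (dataset, seed) pair for run_aca.py."""
    seed_list = list(seeds)  # materialize so it can be reused for every dataset
    return [
        [sys.executable, "run_aca.py", "--dataset", dataset, "--seed", str(seed)]
        for dataset in datasets
        for seed in seed_list
    ]


def run_all(commands: Iterable[List[str]]) -> None:
    """Run the commands one after another, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

For example, `run_all(build_commands(["hopper-medium-v2"], [1, 2, 3]))` would launch three sequential runs; the seed values here are arbitrary.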
    

Troubleshooting

  1. For MuJoCo installation problems, see the official MuJoCo GitHub page.
  2. If you see ImportError: libpython3.7m.so.1.0: cannot open shared object file: No such file or directory, set the library paths before running the experiment:
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/PATH/TO/CONDA/envs/ACA/lib
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/YOUR_USER_NAME/.mujoco/mujoco200/bin
    
  3. If you see OSError: /some/path/mujoco/libmujoco200.so: undefined symbol: __glewBindBuffer, install libglfw3 and libglew2.0 with:
    conda install -c menpo glfw3
    conda install -c conda-forge glew==2.0.0
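
If the experiment is launched from a Python script rather than a shell, the library-path fix from item 2 can be applied to the child process environment instead. The sketch below is illustrative only: `with_library_dirs` is a hypothetical helper, and the directories are the same placeholders as in the export lines above.

```python
from typing import Dict, Iterable, Mapping


def with_library_dirs(env: Mapping[str, str], extra_dirs: Iterable[str]) -> Dict[str, str]:
    """Return a copy of env with extra_dirs appended to LD_LIBRARY_PATH."""
    new_env = dict(env)
    # Split the existing value and drop empty entries.
    current = [p for p in new_env.get("LD_LIBRARY_PATH", "").split(":") if p]
    for directory in extra_dirs:
        if directory not in current:  # avoid duplicate entries
            current.append(directory)
    new_env["LD_LIBRARY_PATH"] = ":".join(current)
    return new_env
```

The result can be passed as `env=with_library_dirs(os.environ, [...])` to `subprocess.run`. Changing `LD_LIBRARY_PATH` inside an already running Python process generally does not affect how that process resolves shared libraries, which is why the variable is set for the child process here.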
    

Reference

@InProceedings{pmlr-v202-yu23k,
  title = 	 {Actor-Critic Alignment for Offline-to-Online Reinforcement Learning},
  author =       {Yu, Zishun and Zhang, Xinhua},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {40452--40474},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/yu23k/yu23k.pdf},
  url = 	 {https://proceedings.mlr.press/v202/yu23k.html},
}

actor-critic-alignment's Issues

medium-replay pre-trained models

Thanks to @mikiya1213, it has been brought to my attention that the medium-replay models had incorrect model attribute names, which prevented online fine-tuning from them. I've re-uploaded the medium-replay models and tested them on my end.

Error in Antmaze environments

When I run run_aca.py on the antmaze environments, the following error occurs:
AttributeError: module 'd3rlpy.preprocessing.reward_scalers' has no attribute 'ConstantShiftRewardScaler'

My conda environment satisfies all the requirements listed in requirements.txt. Could the d3rlpy requirement be wrong?
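
One way to narrow down an error like this is to check whether the installed package exposes the attribute at all, independent of run_aca.py. The helper below is a hypothetical diagnostic sketch, not code from the repository; the module and attribute names to pass in are the ones quoted in the error message.

```python
import importlib


def module_has_attribute(module_name: str, attribute: str) -> bool:
    """Return True if module_name imports cleanly and exposes attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module (or one of its parents) is not installed or fails to import.
        return False
    return hasattr(module, attribute)
```

Calling `module_has_attribute("d3rlpy.preprocessing.reward_scalers", "ConstantShiftRewardScaler")` in the ACA environment would then show whether the installed d3rlpy provides that class. A `False` result would point to a version mismatch rather than a bug in the experiment script, though it would not by itself say which version is needed.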
