actor-critic-alignment's Introduction

Actor-Critic Alignment (ACA)

Code for our ICML'23 paper: Actor-Critic Alignment for Offline-to-Online Reinforcement Learning

Installation

Pull this repo

```
git clone git@github.com:ZishunYu/ACA.git; cd ACA
```

Create conda virtual env

```
conda create --name ACA python=3.7.4; conda activate ACA
```

Install MuJoCo200 following the official documentation

Install d4rl

```
git clone https://github.com/rail-berkeley/d4rl.git
cd d4rl; pip3 install -e .; cd ..
```

Install requirements
```
pip3 install -r requirements.txt
```

Run ACA

Download the offline pretrained models (hosted on Google Drive; the link is in the original repository README)

Run an experiment with

```
python3 run_aca.py --dataset hopper-medium-v2 --seed 1
```
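
To repeat the run over several seeds or datasets, the command above can be generated in a loop. The following is a minimal sketch, not part of the repository: the helper names `build_commands` and `run_all` are hypothetical, and it assumes only the `--dataset` and `--seed` flags shown above and that a matching pretrained model has been downloaded for every dataset listed.

```python
import itertools
import subprocess
from typing import List, Sequence


def build_commands(datasets: Sequence[str], seeds: Sequence[int]) -> List[List[str]]:
    """Return one run_aca.py command per (dataset, seed) pair."""
    return [
        ["python3", "run_aca.py", "--dataset", dataset, "--seed", str(seed)]
        for dataset, seed in itertools.product(datasets, seeds)
    ]


def run_all(commands: Sequence[Sequence[str]], dry_run: bool = True) -> None:
    """Print each command; execute them one after another unless dry_run is set."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sweep at the first failing run.
            subprocess.run(list(cmd), check=True)


if __name__ == "__main__":
    # hopper-medium-v2 is the dataset from the command above. Add other
    # dataset names only if their pretrained models are available locally.
    run_all(build_commands(["hopper-medium-v2"], [1, 2, 3]), dry_run=True)
```

With `dry_run=True` the script only prints the commands, which makes it easy to inspect the sweep before launching it from the ACA directory.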

Troubleshooting

For MuJoCo installation issues, see the official MuJoCo GitHub page

If you see ImportError: libpython3.7m.so.1.0: cannot open shared object file: No such file or directory, set the library path before running the experiment:

```
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/PATH/TO/CONDA/envs/ACA/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/YOUR_USER_NAME/.mujoco/mujoco200/bin
```
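
To confirm that both directories actually ended up on `LD_LIBRARY_PATH`, a small check like the one below may help. It is a sketch under stated assumptions: `missing_lib_dirs` is a hypothetical helper, the conda environment is assumed to be active (so `CONDA_PREFIX` is set), and MuJoCo is assumed to live under `~/.mujoco/mujoco200` as in the exports above.

```python
import os
from typing import List, Optional


def missing_lib_dirs(required: List[str], ld_library_path: Optional[str] = None) -> List[str]:
    """Return the required directories that are absent from LD_LIBRARY_PATH."""
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    # Normalise entries so that "/a/lib/" and "/a/lib" compare equal.
    present = {os.path.normpath(p) for p in ld_library_path.split(":") if p}
    return [d for d in required if os.path.normpath(d) not in present]


if __name__ == "__main__":
    required = [os.path.expanduser("~/.mujoco/mujoco200/bin")]
    # Assumption: the ACA env is active, so CONDA_PREFIX points at it.
    conda_prefix = os.environ.get("CONDA_PREFIX")
    if conda_prefix:
        required.append(os.path.join(conda_prefix, "lib"))
    for directory in missing_lib_dirs(required):
        print("Not on LD_LIBRARY_PATH:", directory)
```

The check only compares path strings; it does not verify that the shared objects inside those directories can be loaded.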

If you see OSError: /some/path/mujoco/libmujoco200.so: undefined symbol: __glewBindBuffer, try installing libglfw3 and libglew2.0 with:
```
conda install -c menpo glfw3
conda install -c conda-forge glew==2.0.0
```

Reference

@InProceedings{pmlr-v202-yu23k,
  title = 	 {Actor-Critic Alignment for Offline-to-Online Reinforcement Learning},
  author =       {Yu, Zishun and Zhang, Xinhua},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {40452--40474},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/yu23k/yu23k.pdf},
  url = 	 {https://proceedings.mlr.press/v202/yu23k.html},
}

actor-critic-alignment's Issues

medium-replay pre-trained models

Thanks to @mikiya1213, it has been brought to my attention that the medium-replay models had incorrect model attribute naming, which prevented online fine-tuning from them. I've re-uploaded the medium-replay models and tested them on my end.

Error in Antmaze environments

When I run run_aca.py on antmaze, the following error occurs:
AttributeError: module 'd3rlpy.preprocessing.reward_scalers' has no attribute 'ConstantShiftRewardScaler'

My conda environment satisfies all the requirements listed in requirements.txt. Could the d3rlpy requirement be wrong?
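
One way to narrow this down is to print the installed d3rlpy version and check whether the attribute named in the traceback exists in that install. The snippet below is a minimal diagnostic sketch: `has_attribute` and `installed_version` are hypothetical helpers, and it makes no claim about which d3rlpy release provides `ConstantShiftRewardScaler`.

```python
import importlib
from typing import Optional


def has_attribute(module_name: str, attribute: str) -> Optional[bool]:
    """True or False if the module imports; None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attribute)


def installed_version(package_name: str) -> str:
    """Return the package's __version__ string, if it exposes one."""
    try:
        package = importlib.import_module(package_name)
    except ImportError:
        return "not installed"
    return str(getattr(package, "__version__", "unknown"))


if __name__ == "__main__":
    print("d3rlpy version:", installed_version("d3rlpy"))
    print(
        "ConstantShiftRewardScaler available:",
        has_attribute("d3rlpy.preprocessing.reward_scalers", "ConstantShiftRewardScaler"),
    )
```

If the second line prints `False`, the installed d3rlpy does not match what run_aca.py expects, and the printed version can be compared against the one in requirements.txt.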
