megayeye / reinforcement-learning-algorithms

This project forked from tianhongdai/reinforcement-learning-algorithms

This repository contains most of the classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, A3C, PPO, and TRPO. (More algorithms are still in progress.)

License: MIT License

Python 100.00%

Deep Reinforcement Learning Algorithms

This repository implements the classic deep reinforcement learning algorithms. Its aim is to provide clear code for people learning deep reinforcement learning. More algorithms will be added in the future, and the existing code will continue to be maintained.

Deep Q-Learning Network (DQN)
Double DQN (DDQN)
Dueling Network Architecture (Dueling DQN)
Deep Deterministic Policy Gradient (DDPG)
Advantage Actor-Critic (A2C)
Trust Region Policy Optimization (TRPO)
Proximal Policy Optimization (PPO)
Actor Critic using Kronecker-Factored Trust Region (ACKTR)

Update Information

2018-10-17 - In this update, most of the algorithms have been improved, and more experiments with plots have been added (except for DDPG). PPO now supports Atari games and MuJoCo environments. TRPO is much more stable and achieves better results.

TODO List

Add prioritized experience replay.
Remove the dependency on openai-baselines' pre-processing functions.
Improve DDPG.

Requirements

python-3.5.2
openai-gym
mujoco-py-1.50.1.56
pytorch-0.4.0
openai-baselines
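
Before training, it can help to confirm that the packages above are importable. The following is a minimal sketch, not part of the repository: the helper name find_missing is hypothetical, and the import names (gym, mujoco_py, torch, baselines) are the usual module names for these packages, assumed here. It only checks that each module can be located; it does not verify the pinned versions.

```python
import importlib.util

# Assumed mapping from the package names listed above to their import names.
REQUIRED_MODULES = {
    "openai-gym": "gym",
    "mujoco-py": "mujoco_py",
    "pytorch": "torch",
    "openai-baselines": "baselines",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the package names whose import module cannot be located."""
    missing = []
    for package, module in modules.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

mujoco-py is only needed for the MuJoCo environments, so a report that it is missing may be acceptable if you only plan to run the Atari experiments.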

Installation

install PyTorch

Please go to the official website to install it: https://pytorch.org/

We recommend using an Anaconda virtual environment to manage your packages.

install openai-baselines (openai-baselines changes quickly, so please use the older version shown below; this will be resolved in the future.)

# clone the openai baselines
git clone https://github.com/openai/baselines.git
cd baselines
git checkout 366f486
pip install -e .

Instructions

select the algorithm you want to run

cd <the-rl-algorithm>

All of the parameters are defined in arguments.py, so you can train your model with suitable hyper-parameters.
train the networks

python train_network.py --env-name=<env-name> --cuda --<other-flags>

Note: TRPO does not support the GPU, so omit --cuda when training it.
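
To illustrate how flags such as those in the training command above are typically declared, here is a minimal argparse sketch. It is not the repository's actual arguments.py: only --env-name and --cuda come from this README, while --lr, --gamma, and every default value (including the environment id) are assumptions chosen for illustration.

```python
import argparse


def get_args(argv=None):
    """Parse training flags; defaults are placeholders, not the repository's values."""
    parser = argparse.ArgumentParser(description="RL training arguments (illustrative)")
    # Flags that appear in the commands in this README.
    parser.add_argument("--env-name", type=str, default="PongNoFrameskip-v4",
                        help="gym environment id (default is an assumption)")
    parser.add_argument("--cuda", action="store_true",
                        help="train on the GPU when set")
    # Hypothetical hyper-parameters, included only to show the pattern.
    parser.add_argument("--lr", type=float, default=1e-4,
                        help="learning rate (assumed flag)")
    parser.add_argument("--gamma", type=float, default=0.99,
                        help="discount factor (assumed flag)")
    # Passing argv explicitly makes the parser easy to call from other code.
    return parser.parse_args(argv)


# Example: parse an explicit argument list instead of sys.argv.
args = get_args(["--env-name=Hopper-v2", "--cuda"])
print(args.env_name, args.cuda, args.lr, args.gamma)
```

Because --cuda is declared with action="store_true", leaving it off the command line keeps training on the CPU, which is the case for TRPO.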

test the networks

python demo.py --env-name=<env-name>

download the pre-trained models
Please download them from Google Drive, then put the saved_models folder under the corresponding algorithm's folder.