Status: Archive (code is provided as-is, no updates expected)

Phasic Policy Gradient

[Paper]

This is code for training agents using Phasic Policy Gradient (citation).

Supported platforms:

macOS 10.14 (Mojave)
Ubuntu 16.04

Supported Pythons:

3.7 64-bit

Install

If you don't have miniconda, you can get it from https://docs.conda.io/en/latest/miniconda.html. Alternatively, install the dependencies listed in environment.yml manually.

git clone https://github.com/dachr8/phasic-policy-gradient.git
conda env update --name ppg --file phasic-policy-gradient/environment.yml
conda activate ppg
pip install -e phasic-policy-gradient
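
After activating the environment, it can help to confirm that the interpreter matches the supported Python listed above (3.7, 64-bit). The following is a minimal sketch of such a check; the helper name `is_supported_python` is hypothetical and not part of this repository.

```python
import struct
import sys


def is_supported_python(version_info, pointer_bits):
    """Return True for Python 3.7 on a 64-bit build (hypothetical helper)."""
    return tuple(version_info[:2]) == (3, 7) and pointer_bits == 64


# The size of a C pointer in bytes, times 8, gives the interpreter's bitness.
pointer_bits = struct.calcsize("P") * 8

if is_supported_python(sys.version_info, pointer_bits):
    print("Python 3.7 64-bit detected")
else:
    print("Warning: this interpreter is not Python 3.7 64-bit")
```

Other versions may still work, but only the configuration listed under "Supported Pythons" is stated as supported.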

Reproduce

PPG with default hyperparameters:

nohup mpiexec -np 4 python -m phasic_policy_gradient.train > /tmp/ppg.out &

PPO baseline:

nohup mpiexec -np 4 python -m phasic_policy_gradient.train --n_epoch_pi 3 --n_epoch_vf 3 --n_aux_epochs 0 --arch shared --log_dir '/tmp/ppo' > /tmp/ppo.out &

PPG, using L_KL instead of L_clip:

nohup mpiexec -np 4 python -m phasic_policy_gradient.train --clip_param 0 --kl_penalty 1 --log_dir '/tmp/ppgkl' > /tmp/ppgkl.out &

PPG, single network variant:

nohup mpiexec -np 4 python -m phasic_policy_gradient.train --arch detach --log_dir '/tmp/ppg_single_network' > /tmp/ppg_single_network.out &
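
The four commands above differ only in their trailing flags. As a rough sketch, they could be generated from a single table like this. The names `VARIANT_FLAGS`, `build_train_command`, and `to_shell` are hypothetical helpers, not part of the package; the flags themselves are copied from the commands above. The script only prints the commands and does not launch anything.

```python
import shlex

# Extra flags for each variant, copied from the commands in this section.
VARIANT_FLAGS = {
    "ppg": [],
    "ppo": ["--n_epoch_pi", "3", "--n_epoch_vf", "3", "--n_aux_epochs", "0",
            "--arch", "shared", "--log_dir", "/tmp/ppo"],
    "ppgkl": ["--clip_param", "0", "--kl_penalty", "1",
              "--log_dir", "/tmp/ppgkl"],
    "ppg_single_network": ["--arch", "detach",
                           "--log_dir", "/tmp/ppg_single_network"],
}


def build_train_command(variant, n_procs=4):
    """Return the mpiexec argument list for one variant (hypothetical helper)."""
    base = ["mpiexec", "-np", str(n_procs),
            "python", "-m", "phasic_policy_gradient.train"]
    return base + VARIANT_FLAGS[variant]


def to_shell(args):
    """Quote an argument list into a single shell-safe string."""
    return " ".join(shlex.quote(a) for a in args)


# Dry run: print each command instead of executing it.
for name in VARIANT_FLAGS:
    print(to_shell(build_train_command(name)))
```

Each printed line can be wrapped in `nohup ... > /tmp/<name>.out &` as in the commands above to run it in the background.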

Visualize

Run the following commands from the project directory.

PPG with default hyperparameters (tmp/ppg-run0):

python -m phasic_policy_gradient.graph --experiment_name ppg --save

PPO baseline (tmp/ppo-run0):

python -m phasic_policy_gradient.graph --experiment_name ppo --save

PPG, using L_KL instead of L_clip (tmp/ppgkl-run0):

python -m phasic_policy_gradient.graph --experiment_name ppgkl --save

PPG, single network variant (tmp/ppgsingle-run0):

python -m phasic_policy_gradient.graph --experiment_name ppg_single_network --save
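
To plot every experiment in one pass, the commands above could be driven from a short script. This is a sketch under assumptions: `build_graph_command` and `run_all` are hypothetical helpers, and only the `--experiment_name` and `--save` flags shown above are used. The runner is injectable so the loop can be tried without actually invoking the graph module.

```python
import subprocess

# Experiment names taken from the commands in this section.
EXPERIMENT_NAMES = ["ppg", "ppo", "ppgkl", "ppg_single_network"]


def build_graph_command(experiment_name, save=True):
    """Return the argument list for one graph invocation (hypothetical helper)."""
    args = ["python", "-m", "phasic_policy_gradient.graph",
            "--experiment_name", experiment_name]
    if save:
        args.append("--save")
    return args


def run_all(names, runner=None):
    """Call runner(args) for each experiment; defaults to subprocess.run."""
    if runner is None:
        # check=True raises CalledProcessError if a plot command fails.
        def runner(args):
            subprocess.run(args, check=True)
    for name in names:
        runner(build_graph_command(name))


# Dry run: show the commands without executing them.
run_all(EXPERIMENT_NAMES, runner=lambda args: print(" ".join(args)))
```

Calling `run_all(EXPERIMENT_NAMES)` from the project directory would execute the four commands in sequence.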
