master-thesis-experiments

This repository implements Proximal Policy Optimization (PPO) (Schulman et al., https://arxiv.org/pdf/1707.06347.pdf).
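
For orientation, the clipped surrogate objective from the paper can be sketched in plain Python. This is a minimal illustration, not the code from this repository: the function name `clipped_surrogate` and the default `epsilon=0.2` are assumptions chosen for the example.

```python
def clipped_surrogate(ratios, advantages, epsilon=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates, one per sample
    epsilon:    clip range (hypothetical default, not taken from this repo)
    """
    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the ratio to the interval [1 - epsilon, 1 + epsilon].
        clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

With a positive advantage, a ratio above `1 + epsilon` earns no extra objective value, which removes the incentive to move the policy far from the old one in a single update.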

Supported environments

OpenAI Gym:

"Hopper-v2", "Walker2d-v2", "FetchReach-v1", "InvertedDoublePendulum-v2"

Setup

Set up a Python 3.7 virtual environment of your choice. I recommend Anaconda.

Install all python dependencies.

If you use Anaconda, you can utilize the environment.yml file as follows:

$ conda env create -f environment.yml

If you use virtualenv, you must install all dependencies manually.

Install Jupyter Notebook to evaluate experiments after running them.

Acquire a valid MuJoCo license and install the MuJoCo version 2.0 binaries for Linux or OSX.

Usage

Get help:

$ python run.py --help

Run with default arguments:

$ python run.py

Evaluate a specific experiment with 6 different trials in parallel:

First, uncomment or modify the experiment of choice in "experiments.py". By default, the experiment "test" runs a clipped PPO policy for 100000 steps on "Hopper-v2". Editing the hyper-parameters of choice should be straightforward.

Then run the experiments file.

$ python experiments.py 

If you want to track the learning progress with TensorBoard, open a new shell and run:

$ tensorboard --logdir . --port 6999

Logs

Tensorboard directory:

data/tensorboard/DATE

Logs of run.py:

data/policy/logs/ENVIRONMENT/DATE

Logs of experiments.py:

data/policy/results/EXPERIMENT/ENVIRONMENT/DATE

Generate graphs from the logs:

Open graphs.ipynb with your Jupyter Notebook App.

Plot the experiment of choice. Each plot will be generated from the last N runs of the directory:

data/policy/results/EXPERIMENT/ENVIRONMENT/
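
Selecting the last N runs from that directory could look like the sketch below. It is an illustration only, not the notebook's actual code: the helper name `last_n_runs` is hypothetical, and it assumes the DATE directory names sort chronologically when sorted as strings.

```python
from pathlib import Path


def last_n_runs(results_root, experiment, environment, n):
    """Return the n most recent run directories, oldest first.

    Assumes every run lives in its own DATE-named subdirectory of
    results_root/EXPERIMENT/ENVIRONMENT/ and that sorting the names
    as strings matches chronological order (an assumption).
    """
    base = Path(results_root) / experiment / environment
    runs = sorted(p for p in base.iterdir() if p.is_dir())
    # Guard n <= 0 explicitly, since runs[-0:] would return every run.
    return runs[-n:] if n > 0 else []
```

For example, `last_n_runs("data/policy/results", "test", "Hopper-v2", 6)` would return the six most recent run directories of the "test" experiment.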

Example graph for the "test" experiment for 6 evaluated runs on the Hopper-v2 environment:

env = "Hopper-v2"
cats = ["test"]
exp = "test"
labels = ["test"]
legend_title = "Test Graph"

ax = plot_resampled(env, cats)  # create resampled seaborn plot
ax.set_ylabel("Average Reward")
ax.set_xlabel(r"Timesteps ($\times 10^6$)")
ax.legend(title=legend_title, loc="lower right", fancybox=True, framealpha=0.5, labels=labels)
# label the x ticks in millions of timesteps
ax.set_xticklabels(["{:,.1f}".format(x) for x in ax.get_xticks() / 1_000_000])
ax.set_title(env)
figure = ax.get_figure()
figure.savefig(f"{exp}-{env}.pdf", bbox_inches="tight")  # save figure to "test-Hopper-v2.pdf"
