Deep RL - Hindsight Experience Replay (HER)

This repository holds deep RL solutions to the bit-flipping environment that use Hindsight Experience Replay (HER).
The original paper is by Andrychowicz et al.

Bit-Flip Environment

In this environment, we are given a starting state, which is a binary vector of size n, and a goal state of the same size.
With each action, the user flips one of the bits in the current state. Every step yields a reward of -1,
except a step that makes the current state equal to the goal, which yields a reward of 0.
The environment is implemented in the bit_flip_env.py file.
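
The rules above can be sketched in a few lines of Python. This is a minimal illustration, not the code in bit_flip_env.py: the class name BitFlipEnv, the reset/step method names, and the seed parameter are all assumptions made for this example.

```python
import random


class BitFlipEnv:
    """Hypothetical minimal bit-flip environment (not the repository's implementation)."""

    def __init__(self, n, seed=None):
        self.n = n
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Sample a random start state and a random goal, both binary vectors of size n.
        self.state = [self.rng.randint(0, 1) for _ in range(self.n)]
        self.goal = [self.rng.randint(0, 1) for _ in range(self.n)]
        return list(self.state), list(self.goal)

    def step(self, action):
        # An action is the index of the bit to flip in the current state.
        self.state[action] ^= 1
        done = self.state == self.goal
        # Sparse binary reward: 0 when the goal is reached, -1 otherwise.
        reward = 0 if done else -1
        return list(self.state), reward, done
```

Because the reward carries no information about how close the state is to the goal, a random policy rarely sees a reward of 0 once n grows large.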

Bit-Flip Dynamic Environment

To check how well HER copes with dynamic environments, we added a dynamic option to the bit-flipping domain.
With every step the user makes, one of the goal's bits flips with probability 0.3,
making the goal harder to predict. The flipped bit is chosen uniformly at random.
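
A rough sketch of this goal update is shown here. The function name maybe_flip_goal and its parameters are hypothetical; only the probability 0.3 and the uniform choice of bit come from the description above.

```python
import random


def maybe_flip_goal(goal, rng, flip_prob=0.3):
    """Return a copy of goal in which one bit is flipped with probability flip_prob."""
    new_goal = list(goal)
    if rng.random() < flip_prob:
        # The bit to flip is chosen uniformly over all positions.
        idx = rng.randrange(len(new_goal))
        new_goal[idx] ^= 1
    return new_goal


rng = random.Random(0)
goal = [0, 1, 0, 1]
goal = maybe_flip_goal(goal, rng)  # would be called once per environment step
```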

Hindsight Experience Replay (HER)

The algorithm, described in detail by Andrychowicz et al., can deal with sparse binary rewards (as we get in the bit-flipping domain).
The problem with sparse rewards is that in very large state spaces we might never see a successful episode, which makes learning very hard.
In this algorithm, we create new "fake" episodes from unsuccessful ones by changing their original goal to one of the states they actually reached.
This way, we add successes to the experience replay buffer and can learn from them. It is basically the same as learning from mistakes.
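
The relabeling step can be illustrated as follows. This sketch uses the final state of the episode as the substitute goal, which is one of several possible choices; which reached state this repository picks is not stated here, and the function name her_relabel and the tuple layout are assumptions for the example.

```python
def her_relabel(episode):
    """Relabel an episode with the last state it reached as the goal.

    episode: list of (state, action, next_state, goal) tuples, states as tuples of bits.
    Returns a list of (state, action, reward, next_state, new_goal, done) tuples.
    """
    new_goal = episode[-1][2]  # final state actually reached
    relabeled = []
    for state, action, next_state, _original_goal in episode:
        done = next_state == new_goal
        # Recompute the sparse reward with respect to the substituted goal.
        reward = 0 if done else -1
        relabeled.append((state, action, reward, next_state, new_goal, done))
    return relabeled


# A failed episode: the goal was (0, 1) but the agent ended in (1, 1).
failed = [
    ((0, 0), 0, (1, 0), (0, 1)),
    ((1, 0), 1, (1, 1), (0, 1)),
]
fake_success = her_relabel(failed)
```

Both the original and the relabeled transitions would then be stored in the replay buffer, so the buffer contains at least one transition with reward 0 per relabeled episode.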

Hindsight Experience Replay with Dynamical goals (DHER)

The concept here is very similar to HER, and is described by Fang et al.
This algorithm also takes into account that the goal made some transitions over time, and uses the goal's trajectory to learn how to reach it.

Scripts Usage:

All the scripts below take arguments that can be changed (all default to our choice of parameters).
To see all arguments for a script, run: python <SCRIPT NAME>.py --help
Example for running a script: python main.py

Train scripts:

To train the model that solves the bit-flipping environment, run the following script: main.py.
Note that the argument --state-size <NUMBER> is necessary in order to see the effect of different sizes on the model.
Adding the argument --HER or --DHER uses the respective algorithm.
Adding the argument --dynamic uses the dynamic mode of the environment. The model's architecture is specified in: dqn.py
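
For orientation, the flags named above could be declared with argparse roughly as follows. This is a hypothetical sketch, not the parser in main.py: the function name build_parser, the help texts, and the default value of 10 are assumptions, and the real script may define further arguments.

```python
import argparse


def build_parser():
    # Hypothetical parser covering only the flags mentioned in this README.
    parser = argparse.ArgumentParser(description="Train a DQN on the bit-flipping environment")
    parser.add_argument("--state-size", type=int, default=10,
                        help="number of bits n (the default here is a placeholder)")
    parser.add_argument("--HER", action="store_true",
                        help="use Hindsight Experience Replay")
    parser.add_argument("--DHER", action="store_true",
                        help="use HER with dynamic goals")
    parser.add_argument("--dynamic", action="store_true",
                        help="use the dynamic mode of the environment")
    return parser


# Equivalent to running: python main.py --state-size 20 --HER
args = build_parser().parse_args(["--state-size", "20", "--HER"])
```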

Evaluation scripts:

To test the models, run the following script: evaluate_model.py with the relevant --state-size argument.
We added a trained model in the bit_flip_model.pkl file, with the size n=10.

Results

In the figure below, we show how the state size affects the success ratio of the different algorithms.
As can be seen, using HER allows us to overcome the sparse binary reward problem and maintain a high success rate even for very large state sizes.
This holds even when compared to the normal DQN with added reward shaping.

Figure: Results of using HER
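
For readers unfamiliar with reward shaping, one common form for this domain is a dense reward based on the distance to the goal. The sketch below is only an illustration of the idea: the exact shaping used for the DQN baseline is not specified in this README, and the function name shaped_reward is hypothetical.

```python
def shaped_reward(state, goal):
    """Dense reward: minus the number of bits that still differ from the goal."""
    # For binary vectors this Hamming distance equals the squared Euclidean distance.
    return -sum(s != g for s, g in zip(state, goal))


print(shaped_reward([0, 1, 1], [1, 1, 0]))  # two bits differ, so this prints -2
```

Unlike the sparse -1/0 reward, this signal changes with every step, which gives the agent a gradient to follow without any relabeling.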

Example

In the following example, we can observe how the domain is solved step by step using the HER algorithm.

Figure: Example of using HER to solve the domain with size 10

Contributors

orilinial
