Evolution Strategies as a Scalable Alternative to Reinforcement Learning

This repository contains code to train agents on any Gym, pyBullet, or MuJoCo environment using an Evolution Strategy (ES) algorithm. It is adapted from this OpenAI implementation of the distributed ES introduced in Evolution Strategies as a Scalable Alternative to Reinforcement Learning, Salimans et al. 2017.

This code was used to create the non-plastic baselines for our paper Meta-Learning through Hebbian Plasticity in Random Networks.

How to run

First, install dependencies. Use Python >= 3.8:

# clone project   
git clone https://github.com/enajx/ES

# install dependencies   
cd ES 
pip install -r requirements.txt

Next, use train_static.py to train an agent. You can train on any OpenAI Gym or pyBullet environment:


# train agent to solve the racing car
python train_static.py --environment CarRacing-v0


# train agent specifying evolution parameters, e.g.
python train_static.py --environment CarRacing-v0 --generations 300 --popsize 200 --print_every 1 --lr 0.2 --sigma 0.1 --decay 0.995 --threads -1

Use python train_static.py --help to display all the training options:


train_static.py [--environment] [--popsize] [--print_every] [--lr] [--decay] [--sigma] [--generations] [--folder] [--threads]

arguments:
  --environment   Environment: any OpenAI Gym or pyBullet environment may be used
  --popsize       Population size.
  --print_every   Print and save every N steps.
  --lr            ES learning rate.
  --decay         ES decay.
  --sigma         ES sigma: modulates the amount of noise used to populate each new generation
  --generations   Number of generations that the ES will run.
  --folder        folder to store the evolved weights
  --threads       Number of threads used to run evolution in parallel.
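
To make the roles of --popsize, --lr, --sigma and --decay more concrete, here is a minimal NumPy sketch of an ES loop in the style of Salimans et al. 2017. It is not the code in train_static.py: the names es_generation and toy_fitness are hypothetical, the reward standardisation is one common choice among several, and the sketch assumes that --decay multiplies the learning rate once per generation. The parameter values are illustrative only.

```python
import numpy as np


def es_generation(theta, fitness_fn, rng, popsize, lr, sigma):
    """Run one ES generation and return the updated parameter vector."""
    # Sample one Gaussian perturbation per population member.
    noise = rng.standard_normal((popsize, theta.size))
    # Evaluate each perturbed candidate theta + sigma * eps.
    rewards = np.array([fitness_fn(theta + sigma * eps) for eps in noise])
    # Standardise rewards so the step size does not depend on reward scale.
    advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    # Estimate the gradient of expected reward and take an ascent step.
    grad = noise.T @ advantages / (popsize * sigma)
    return theta + lr * grad


def toy_fitness(params):
    # Hypothetical stand-in for an episode return: maximal at params == 3.
    return -np.sum((params - 3.0) ** 2)


rng = np.random.default_rng(0)
theta = np.zeros(5)
lr, decay = 0.05, 0.99
for generation in range(300):
    theta = es_generation(theta, toy_fitness, rng, popsize=50, lr=lr, sigma=0.1)
    lr *= decay  # assumption: --decay shrinks the learning rate each generation

print(theta)
```

In an actual run, fitness_fn would be the return of one or more episodes played with the perturbed weights, which is the part that can be spread across --threads workers.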

Once trained, use evaluate_static.py to test the evolved agent:


python evaluate_static.py --environment CarRacing-v0 --path_weights weights.dat

When running on a headless server, some environments (e.g. CarRacing-v0) require a virtual display. In that case, run:


xvfb-run -a -s "-screen 0 1400x900x24 +extension RANDR" -- python train_static.py --environment CarRacing-v0

Citation

If you use the code for academic or commercial purposes, please cite the associated paper:

@inproceedings{Najarro2020,
	title = {{Meta-Learning through Hebbian Plasticity in Random Networks}},
	author = {Najarro, Elias and Risi, Sebastian},
	booktitle = {Advances in Neural Information Processing Systems},
	year = {2020},
	url = {https://arxiv.org/abs/2007.02686}
}

Some notes on training performance

In the paper we tested the CarRacing-v0 and AntBulletEnv-v0 environments. For both we wrote custom functions to bound the action activations; all other environments use simple clipping to bound their actions. Environments with a continuous action space (i.e. Box) are likely to benefit from continuous scaling of their actions rather than clipping, either with a custom activation function or with Gym's RescaleAction wrapper.
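
The difference between clipping and continuous scaling can be sketched as follows. This is an illustration, not the custom functions used in the repository: clip_action and scale_action are hypothetical helpers, tanh is just one possible squashing function, and the raw values are made up. The bounds are those of CarRacing-v0's Box action space (steering, gas, brake).

```python
import numpy as np


def clip_action(raw, low, high):
    """Hard-clip raw network outputs to the action bounds."""
    return np.clip(raw, low, high)


def scale_action(raw, low, high):
    """Squash raw outputs with tanh, then map [-1, 1] linearly onto [low, high]."""
    squashed = np.tanh(raw)
    return low + 0.5 * (squashed + 1.0) * (high - low)


# Bounds of the CarRacing-v0 action space: steering, gas, brake.
low = np.array([-1.0, 0.0, 0.0])
high = np.array([1.0, 1.0, 1.0])
raw = np.array([2.5, -0.3, 4.0])  # hypothetical unbounded network outputs

print(clip_action(raw, low, high))
print(scale_action(raw, low, high))
```

With clipping, every raw output beyond a bound maps to the same action, so perturbing the weights in that region does not change behaviour. The scaled version stays sensitive to the raw output across its whole range, which may give ES a more informative fitness signal.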

Another factor that greatly affects performance when computational resources are limited is the choice of a suitable early-stop mechanism, so that fewer CPU cycles are wasted on unpromising episodes. For example, for the CarRacing-v0 environment we use 20 consecutive steps with negative reward as an early-stop signal.
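
A minimal sketch of such an early-stop rule is shown below. The names run_episode and toy_step are hypothetical, and toy_step only stands in for a real environment step; the repository's own implementation may differ in its details.

```python
def run_episode(step_fn, max_steps=1000, patience=20):
    """Accumulate reward until the episode ends or `patience` consecutive
    steps return a negative reward. Returns (total_reward, steps_taken)."""
    total_reward = 0.0
    negative_streak = 0
    for t in range(max_steps):
        reward, done = step_fn(t)
        total_reward += reward
        # Extend the streak on a negative reward, otherwise reset it.
        negative_streak = negative_streak + 1 if reward < 0 else 0
        if done or negative_streak >= patience:
            return total_reward, t + 1
    return total_reward, max_steps


def toy_step(t):
    # Hypothetical stand-in for env.step: reward turns negative from step 50 on.
    reward = 1.0 if t < 50 else -0.1
    return reward, False


total, steps = run_episode(toy_step)
print(steps)  # 70: 50 rewarded steps plus 20 consecutive negative ones
```

Here the episode is cut after 70 steps instead of running for all 1000, which is where the savings come from when most of the population performs poorly.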

Finally, some pixel-based environments would likely benefit from grayscaling and frame stacking rather than feeding the network the three RGB channels as our implementation does, e.g. by using Gym's frame stack wrapper or the Atari preprocessing wrapper.
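
As a rough illustration of the grayscale plus stacked frames idea, the sketch below implements it directly in NumPy rather than through Gym's wrappers. The helper to_grayscale and the class FrameStack are hypothetical names written for this example, the stack depth of 4 is an arbitrary choice, and random pixels stand in for real 96x96 RGB observations such as those of CarRacing-v0.

```python
from collections import deque

import numpy as np


def to_grayscale(frame):
    """Convert an (H, W, 3) RGB frame to (H, W) using ITU-R BT.601 luma weights."""
    return frame[..., :3] @ np.array([0.299, 0.587, 0.114])


class FrameStack:
    """Keep the last `k` grayscale frames as a (k, H, W) array."""

    def __init__(self, k=4):
        self.frames = deque(maxlen=k)

    def reset(self, frame):
        gray = to_grayscale(frame)
        # Fill the buffer with copies of the first frame of the episode.
        for _ in range(self.frames.maxlen):
            self.frames.append(gray)
        return np.stack(self.frames)

    def push(self, frame):
        # The deque drops the oldest frame automatically.
        self.frames.append(to_grayscale(frame))
        return np.stack(self.frames)


# Random pixels stand in for 96x96 RGB observations.
rng = np.random.default_rng(0)
stack = FrameStack(k=4)
obs = stack.reset(rng.integers(0, 256, size=(96, 96, 3), dtype=np.uint8))
obs = stack.push(rng.integers(0, 256, size=(96, 96, 3), dtype=np.uint8))
print(obs.shape)  # (4, 96, 96)
```

The stacked observation lets a feed-forward network infer motion from consecutive frames, while the per-frame input shrinks from three channels to one.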
