async-deep-flappybird's Introduction

Asynchronous Deep ReinFlappyBird

This repository contains an implementation of Asynchronous Advantage Actor-Critic (A3C) that teaches an agent to play Flappy Bird.

Performance

Coming soon!

Technical Details

In my tests, training ran at the following speeds on a CPU (Intel Xeon E5620, 2.40 GHz) and a GPU (NVIDIA GTX 1070).

      FF             LSTM
CPU   57 steps/s     TBA
GPU   400 steps/s    300 steps/s

Settings

Here are some of the available flags you can set when you train an agent. For the full list, see a3c.py.

Agent settings

  • mode / [train, display, visualize] - Which mode you want to activate when you start a session.
  • use_gpu / [True, False] - Whether to use a GPU to speed up training.
  • parallel_agent_size - Number of parallel agents to use during training.
  • action_size - Number of available actions.
  • agent_type / [FF, LSTM] - What type of A3C to train the agent with.
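
As a rough illustration of what parallel_agent_size controls, here is a minimal sketch of several worker threads sharing one global step counter, which is the general shape of an asynchronous A3C training loop. It is not the code in a3c.py: the function name run_parallel_agents and the steps_per_update parameter are hypothetical, and the environment and network updates are replaced by a comment.

```python
import threading


def run_parallel_agents(parallel_agent_size, max_time_step, steps_per_update=5):
    """Run worker threads that advance a shared global step counter.

    Hypothetical sketch: a real agent would play the game and update a
    shared network where the comment below indicates.
    """
    global_step = 0
    lock = threading.Lock()
    steps_per_agent = [0] * parallel_agent_size

    def worker(agent_index):
        nonlocal global_step
        while True:
            # Claim the next chunk of steps under the lock so that the
            # counter never exceeds max_time_step.
            with lock:
                remaining = max_time_step - global_step
                if remaining <= 0:
                    return
                claimed = min(steps_per_update, remaining)
                global_step += claimed
            # A real agent would play `claimed` environment steps here and
            # then apply its gradients to the shared network.
            steps_per_agent[agent_index] += claimed

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(parallel_agent_size)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return global_step, steps_per_agent
```

Calling run_parallel_agents(4, 1000) returns the final global step count together with the number of steps each of the four threads claimed. How the work is split between threads varies from run to run.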

Training and Optimizer settings

The current settings are based on or borrowed from the [implementation](https://github.com/miyosuda/async_deep_reinforce) by @miyosuda. They have not yet been tuned for Flappy Bird and are used as is for now. Let me know if you find settings that perform better than the current ones!

  • max_time_step - 40 000 000 - Maximum training steps.
  • initial_alpha_low - -5 - LogUniform low limit for learning rate (represents x in 10^x).
  • initial_alpha_high - -3 - LogUniform high limit for learning rate (represents x in 10^x).
  • gamma - 0.99 - Discount factor for rewards.
  • entropy_beta - 0.01 - Entropy regularization constant.
  • grad_norm_clip - 40.0 - Gradient norm clipping.
  • rmsp_alpha - 0.99 - Decay parameter for RMSProp.
  • rmsp_epsilon - 0.1 - Epsilon parameter for RMSProp.
  • local_t_max - 5 - Repeat step size.
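
Two of the settings above can be illustrated with a short sketch. The first function shows one way to read initial_alpha_low and initial_alpha_high: draw an exponent x between the two limits and use 10^x as the learning rate. The second shows how gamma discounts the rewards of a short rollout. Both are assumptions for illustration, not the code in a3c.py: the function names are hypothetical, the README does not say whether the learning rate is sampled randomly or picked at a fixed point in the range, and the rollout is assumed to contain at most local_t_max steps, as is usual for A3C.

```python
import random


def log_uniform_learning_rate(low_exp, high_exp, rng=random):
    """Draw a learning rate 10**x with x uniform in [low_exp, high_exp]."""
    x = rng.uniform(low_exp, high_exp)
    return 10.0 ** x


def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """Compute returns R_t = r_t + gamma * R_{t+1} for one rollout.

    The recursion is seeded with bootstrap_value, the critic's estimate for
    the state after the last step (0.0 if the episode ended there).
    """
    returns = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


if __name__ == "__main__":
    # With the defaults listed above, the rate lies between 1e-5 and 1e-3.
    print(log_uniform_learning_rate(-5, -3))
    # A rollout of up to local_t_max = 5 steps that ends the episode.
    print(discounted_returns([0.0, 0.0, 0.0, 0.0, 1.0], bootstrap_value=0.0))
```

In an actor-critic update, subtracting the critic's value estimate from each return gives the advantage used to weight the policy gradient.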

Logging

  • log_level - Log level [NONE, FULL]
  • average_summary - How many episodes to average summary over.

Display

  • display_episodes - Number of episodes to display.
  • average_summary - How many episodes to average summary over.
  • display_log_level / [NONE, MID, FULL] - Display log level. NONE prints the end summary, MID prints a summary for each episode, and FULL prints the π-values, state value and reward for every state.

Getting started

To start a training session with the default parameters, run:

$ python a3c.py

To monitor progress and compare experiments in real time, navigate to your async-deep-flappybird folder and start TensorBoard by running:

$ tensorboard --logdir summaries/

Enjoy!

Credit

A3C - The A3C implementation used is a modified version of the implementation by @miyosuda.

Flappy Bird - The Flappy Bird implementation is based on a version by @yenchenlin with some minor adjustments.

2016, Babak Toghiani-Rizi

async-deep-flappybird's Issues

What is the magic to make PyGame capable for multithreading?

Hi,

Thank you for your wonderful work on FlappyBird A3C!

I surveyed some other projects on GitHub that train Flappy Bird with A3C, and tried my own as well. Many people argue that PyGame does not support multithreading, so to use A3C you have to use multiprocessing and implement the troublesome communication between processes. But I see you are simply using multithreading here, and it works well. Could you share how you achieved it?

Thank you!

How did you run 400 steps/sec with GTX 1070?

Hi,

I recently ran this on a Tesla P40 and got only 237 steps/sec. If a GTX 1070 reaches 400 steps/sec, I may have made a mistake somewhere.

I simply ran the code with python a3c.py --use_gpu True. Is that all, or did I miss something?

Thank you!
