

This project forked from philippew83440/mcts-nnet


Monte Carlo Tree Search with Reinforcement Learning for Motion Planning

License: MIT License

Julia 19.62% Python 79.91% MATLAB 0.47%


Monte Carlo Tree Search with Reinforcement Learning for Motion Planning

This repo contains the code used in the paper "Monte Carlo Tree Search with Reinforcement Learning for Motion Planning", IEEE ITSC 2020

The following algorithms are implemented and benchmarked:

  • Rules-based (reflex) method: a simple emergency braking method
  • Tree Search: Uniform Cost Search (A*) and Dynamic Programming
  • MPC: Model Predictive Control
  • Sampling based Tree Search with MCTS: Monte Carlo Tree Search
  • DDQN: Double Deep Q-learning using Deep Neural Networks
  • MCTS-NNET: MCTS combined with DDQN.
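To illustrate the DDQN entry in the list above, here is a minimal sketch of the standard Double DQN bootstrap target for a single transition. It is not taken from this repo: the function name `ddqn_target` and its arguments are hypothetical, and the Q-values are passed in as plain lists rather than computed by a network.

```python
from typing import Sequence


def ddqn_target(reward: float, done: bool, gamma: float,
                q_online_next: Sequence[float],
                q_target_next: Sequence[float]) -> float:
    """Double DQN target for one transition (hypothetical helper).

    q_online_next: Q(s', .) from the online network.
    q_target_next: Q(s', .) from the target network.
    """
    if done:
        # No bootstrapping past a terminal state.
        return reward
    # The online network selects the greedy next action ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... and the target network evaluates that action.
    return reward + gamma * q_target_next[best_action]
```

For example, `ddqn_target(1.0, False, 0.9, [0.2, 0.8, 0.5], [1.0, 0.5, 2.0])` selects action 1 with the online values and evaluates it with the target values, rather than taking the maximum of the target values directly as plain DQN would.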

Ultimately, we find that combining MCTS planning and DQN learning in a single solution provides the best performance while still making decisions in real time. Here, a pre-trained DQN network guides the tree search by providing fast and reliable estimates of Q-values and state values. We call this model MCTS-NNET, as it draws on the insights of AlphaGo Zero.
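The idea of a network guiding the tree search can be sketched as follows. This is a simplified illustration under stated assumptions, not the repo's implementation: the PUCT-style selection rule, the use of softmax(Q) as action priors, the use of max Q as the leaf value, and all names (`Node`, `mcts_nnet_action`, `toy_q`, `toy_step`) are assumptions made for the example. `toy_q` is a hand-written stand-in for the pre-trained network, and `toy_step` is a toy deterministic model.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

QFn = Callable[[int], List[float]]                      # stand-in for the network
StepFn = Callable[[int, int], Tuple[int, float, bool]]  # (state, action) -> (next, reward, done)


@dataclass
class Node:
    state: int
    terminal: bool = False
    priors: List[float] = field(default_factory=list)
    children: Dict[int, "Node"] = field(default_factory=dict)
    rewards: Dict[int, float] = field(default_factory=dict)
    visits: Dict[int, int] = field(default_factory=dict)
    q: Dict[int, float] = field(default_factory=dict)


def softmax(xs: List[float]) -> List[float]:
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def expand(node: Node, q_fn: QFn) -> float:
    """Query the Q-function once: priors = softmax(Q), leaf value = max Q."""
    q_values = q_fn(node.state)
    node.priors = softmax(q_values)
    for a in range(len(q_values)):
        node.visits[a] = 0
        node.q[a] = 0.0
    return max(q_values)


def simulate(node: Node, q_fn: QFn, step_fn: StepFn,
             gamma: float, c_puct: float) -> float:
    """Run one simulation from `node` and return its discounted return."""
    if node.terminal:
        return 0.0
    if not node.priors:
        # New leaf: no random rollout, the Q-function supplies the value.
        return expand(node, q_fn)
    total = sum(node.visits.values())

    def score(a: int) -> float:
        # PUCT-style score: running mean return plus prior-weighted exploration.
        bonus = c_puct * node.priors[a] * math.sqrt(total + 1) / (1 + node.visits[a])
        return node.q[a] + bonus

    a = max(node.visits, key=score)
    if a not in node.children:
        nxt, reward, done = step_fn(node.state, a)
        node.children[a] = Node(nxt, terminal=done)
        node.rewards[a] = reward
    ret = node.rewards[a] + gamma * simulate(node.children[a], q_fn, step_fn, gamma, c_puct)
    node.visits[a] += 1
    node.q[a] += (ret - node.q[a]) / node.visits[a]  # incremental mean
    return ret


def mcts_nnet_action(state: int, q_fn: QFn, step_fn: StepFn, n_sims: int = 200,
                     gamma: float = 0.95, c_puct: float = 1.0) -> int:
    """Return the most visited root action after n_sims simulations."""
    root = Node(state)
    for _ in range(n_sims):
        simulate(root, q_fn, step_fn, gamma, c_puct)
    return max(root.visits, key=root.visits.get)


def toy_q(state: int) -> List[float]:
    """Hypothetical Q estimates: action 0 = wait, action 1 = advance."""
    return [0.0, 1.0]


def toy_step(state: int, action: int) -> Tuple[int, float, bool]:
    """Toy model: reaching position 3 ends the episode with reward 1."""
    if action == 1:
        nxt = state + 1
        return nxt, (1.0 if nxt == 3 else -0.01), nxt == 3
    return state, -0.5, False  # waiting is penalized in this toy problem
```

Calling `mcts_nnet_action(0, toy_q, toy_step)` runs 200 simulations from position 0 and returns the most visited action at the root. Replacing random rollouts with a single Q-function query per new leaf is what keeps each simulation cheap in this sketch.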

Our results show that MCTS-NNET achieves a 98% success rate, compared with an 84% success rate for DQN and an 85% success rate for MCTS alone. It does so with an inference time of 4 ms.

Presentation video of the paper

Julia source code (optimized for speed)

The code was initially developed in Python and later optimized for speed in Julia.
The Julia versions are much faster than the Python versions.

cd julia
julia scn.jl mcts
julia scn.jl mpc
julia scn.jl ucs
julia scn.jl dp

Python source code

Models and algorithms:

Utilities used (from Stanford CS221 and CS230 courses):

Mcts-nnet inference

Run the baseline (reflex-based), dqn, mcts, mpc and mcts-nnet algorithms on 100 tests:

cd mdp
python3 test_algo.py --algo baseline
python3 test_algo.py --algo dqn
python3 test_algo.py --algo mcts
python3 test_algo.py --algo mpc
python3 test_algo.py --algo mcts-nnet

Collision Avoidance Scenario

Mcts-nnet training

cd mdp
python3 train_dqn.py

Training results are stored in mdp/models.
See, for example, the trained model dnn-0.31.pth.tar.

