batra98 / monte_carlo-td-function_approximation

Use Monte-Carlo (MC) Methods and Temporal Difference (TD) Learning on a couple of games and toy problems.

License: MIT License

Jupyter Notebook 50.66% Python 1.24% HTML 48.10%
monte-carlo-methods temporal-difference-algorithms reinforcement-learning-algorithms

Monte-Carlo, TD Methods and Function Approximation

Introduction

In this assignment, we use Monte-Carlo (MC) methods and Temporal Difference (TD) learning on a couple of games and toy problems. The problems are as follows:

  1. Train an agent that plays Tic-Tac-Toe using Monte-Carlo methods.
  2. Train an agent that generates the optimal policy through TD methods in the Frozen-Lake environment.
  3. Build a Deep Q-Learning Network (DQN) that can play Atari Breakout and achieve the best score. I was not able to implement this component of the assignment, so I instead built a DQN that plays the cart-pole game.

Details of the problems are included in the respective folders.
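
As a rough illustration of the Monte-Carlo idea behind problem 1, the sketch below estimates state values by averaging first-visit returns over complete episodes. It is a generic textbook-style example and not the code in Q_1: the function names `first_visit_returns` and `mc_value_estimate` and the `(state, reward)` episode layout are assumptions made only for this illustration.

```python
from collections import defaultdict


def first_visit_returns(episode, gamma=1.0):
    """Map each state to the discounted return following its first visit.

    `episode` is a list of (state, reward) pairs, where `reward` is the
    reward received after acting in `state` (hypothetical layout).
    """
    # Record the first time step at which each state appears.
    first_index = {}
    for t, (state, _) in enumerate(episode):
        first_index.setdefault(state, t)

    # Walk the episode backwards, accumulating the discounted return.
    returns = {}
    g = 0.0
    for t in range(len(episode) - 1, -1, -1):
        state, reward = episode[t]
        g = reward + gamma * g
        if first_index[state] == t:
            returns[state] = g
    return returns


def mc_value_estimate(episodes, gamma=1.0):
    """Average first-visit returns across episodes to estimate V(s)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for episode in episodes:
        for state, g in first_visit_returns(episode, gamma).items():
            totals[state] += g
            counts[state] += 1
    return {state: totals[state] / counts[state] for state in totals}
```

A Tic-Tac-Toe agent would typically apply the same averaging to board positions (or position-action pairs), with the reward arriving only at the end of a game.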

๐Ÿ“ File Structure

.
├── Q_1
│   ├── Mc_OffPolicy_agent.dat
│   ├── Mc_OnPolicy_agent
│   ├── Monte-Carlo_Methods(3).html
│   ├── Monte-Carlo_Methods.ipynb
│   ├── __pycache__
│   ├── base_agent.py
│   ├── best_td_agent.dat
│   ├── gym-tictactoe
│   ├── human_agent.py
│   ├── mc_agents.py
│   └── td_agent.py
├── Q_2
│   ├── Expected_Sarsa.py
│   ├── Frozen_Lake_Through_TD_Methods.html
│   ├── Frozen_Lake_Through_TD_Methods.ipynb
│   ├── Q_Learning.py
│   ├── Sarsa.py
│   ├── __pycache__
│   └── frozen_lake.py
├── Q_3
│   ├── DQN_Agent.py
│   ├── Function_Approximation_DQN.html
│   ├── Function_Approximation_DQN.ipynb
│   ├── __pycache__
│   └── cartpole-dqn.h5
├── README.md
└── assignment.pdf

7 directories, 21 files
  • Q_* - contains the files for the respective problems, along with the trained models.
  • assignment.pdf - contains the problem statements for the assignment.
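
The scripts in Q_2 are named after three TD control methods: Sarsa, Q-learning, and Expected Sarsa. Their contents are not reproduced here, so the following is only a sketch of the standard one-step targets those names usually refer to. Every function name is hypothetical, and the epsilon-greedy policy used for the Expected Sarsa expectation is an assumption.

```python
def sarsa_target(reward, gamma, q_next, next_action):
    """On-policy target: bootstrap from the action actually taken next."""
    return reward + gamma * q_next[next_action]


def q_learning_target(reward, gamma, q_next):
    """Off-policy target: bootstrap from the greedy next action."""
    return reward + gamma * max(q_next)


def expected_sarsa_target(reward, gamma, q_next, epsilon):
    """Bootstrap from the expected next value under an epsilon-greedy policy.

    Assumes an epsilon-greedy policy with ties broken by lowest action index.
    """
    n = len(q_next)
    greedy = max(range(n), key=lambda a: q_next[a])
    probs = [epsilon / n] * n          # exploration mass, spread uniformly
    probs[greedy] += 1.0 - epsilon     # remaining mass on the greedy action
    expected = sum(p * q for p, q in zip(probs, q_next))
    return reward + gamma * expected


def td_update(q_value, target, alpha):
    """Move the current estimate a fraction alpha toward the TD target."""
    return q_value + alpha * (target - q_value)
```

Here `q_next` is the list of action values for the next state. For a terminal next state the target is simply `reward`, which a caller can obtain by passing a list of zeros. The three methods differ only in the target; the update step is shared.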

Future Work

At the time of doing the assignment, I didn't have sufficient knowledge of deep learning (DL) to implement the last part of the assignment. I would like to complete this part now.
