maximecharpentierdata / connect-four-rl

Deep Reinforcement Learning for Connect-4

License: MIT License

Languages: Python 87.60%, Jupyter Notebook 12.40%
Topics: reinforcement-learning, reinforcement-learning-environments, connect-4, deep-learning, deep-reinforcement-learning

connect-four-rl's Introduction

Hi there 👋

  • 👨‍🎓 I graduated from the CentraleSupélec engineering school (Paris-Saclay University) in Data Sciences.
  • 📊 I chose to focus on Data Sciences and Data Engineering because I think data and modeling can be powerful resources for tackling challenges and optimizing processes.
  • 💻 I’m fluent in Python, Pandas/Scikit-learn/TensorFlow/PyTorch and in UNIX environments. I am trying to improve my DevOps practices and I want to build my data engineering skills (GCP, AWS, Kubernetes...) in order to master the full value chain of data.
  • 🏛️ 🌍 As a 21st-century engineer, I feel involved in the main challenges of our time, especially the energy transition, and I try to educate myself and reduce my impact.
  • 💪 I love (and need) to feel strongly involved in the projects I am working on. This involvement is the fuel that keeps me going and progressing.
  • 🏃 I like running, cooking, and horror movies.

connect-four-rl's People

Contributors

castavo, corentinbrtx, maximecharpentierdata

Forkers

kluivert-paulo

connect-four-rl's Issues

Batched MSE

Use batching when computing the MSE, for example by averaging over several episodes instead of a single one.
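
A minimal sketch of what the batching could look like, in plain Python so it stays independent of the training code. The `Episode` layout (a list of `(predicted, target)` pairs) and the name `batched_mse` are assumptions for illustration, not taken from the repository.

```python
from typing import List, Sequence, Tuple

# Assumed layout: one (predicted value, target value) pair per step.
Episode = Sequence[Tuple[float, float]]


def batched_mse(episodes: Sequence[Episode]) -> float:
    """Mean squared error over every step of every episode in the batch."""
    squared_errors: List[float] = [
        (predicted - target) ** 2
        for episode in episodes
        for predicted, target in episode
    ]
    if not squared_errors:
        raise ValueError("batch contains no steps")
    return sum(squared_errors) / len(squared_errors)
```

Averaging over steps gives longer episodes more weight than shorter ones. Averaging the per-episode MSEs instead would weight every episode equally, so the choice depends on what the loss should emphasize.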

Improve parallelisation

I don't know whether we would benefit from this, but we could force PyTorch to run on a single CPU so that several models can run concurrently.
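
One way this could be tried is to start each run as its own process with the thread-count environment variables set to 1. The sketch below assumes a PyTorch CPU build that uses OpenMP or MKL, which read `OMP_NUM_THREADS` and `MKL_NUM_THREADS`. The helper names are hypothetical, and the commands are placeholders for the real training entry points.

```python
import os
import subprocess
import sys
from typing import Dict, List, Optional

# Environment variables read by the OpenMP and MKL thread pools.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS")


def single_thread_env(base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy an environment and cap the numeric libraries at one thread."""
    env = dict(os.environ if base is None else base)
    for name in THREAD_ENV_VARS:
        env[name] = "1"
    return env


def run_concurrently(commands: List[List[str]]) -> List[int]:
    """Start every command at once, then wait and collect the exit codes."""
    env = single_thread_env()
    processes = [subprocess.Popen(command, env=env) for command in commands]
    return [process.wait() for process in processes]


if __name__ == "__main__":
    # Placeholder jobs; a real run would call the training script instead.
    jobs = [[sys.executable, "-c", f"print('run {i} done')"] for i in range(2)]
    print(run_concurrently(jobs))
```

Inside a training script, calling `torch.set_num_threads(1)` before any computation has a similar effect for PyTorch's intra-op parallelism. Whether either approach speeds things up would need to be measured.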

Grid searches

Actually run grid searches, even if we don't yet have the best way of doing them.
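
As a starting point, an exhaustive grid search needs little more than `itertools.product`. In this sketch the hyperparameter names and the `evaluate` callback are assumptions. In practice `evaluate` would train an agent with the given configuration and return a score such as a win rate.

```python
import itertools
from typing import Any, Callable, Dict, Iterator, Optional, Sequence, Tuple

Config = Dict[str, Any]


def iter_grid(grid: Dict[str, Sequence[Any]]) -> Iterator[Config]:
    """Yield one configuration per combination of the listed values."""
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def grid_search(
    grid: Dict[str, Sequence[Any]],
    evaluate: Callable[[Config], float],
) -> Tuple[Config, float]:
    """Evaluate every configuration and return the highest-scoring one."""
    best_config: Optional[Config] = None
    best_score = float("-inf")
    for config in iter_grid(grid):
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    if best_config is None:
        raise ValueError("grid produced no configurations")
    return best_config, best_score
```

The number of runs grows as the product of the list lengths, so a grid with three values for each of four hyperparameters already means 81 trainings.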

Multi-agent problem

I think we have a problem with our multi-agent implementation. In our implementation, player 2 chooses its action before seeing the action taken by player 1, so we need to go back to a scheme where one action corresponds to the action of a single player.

But in that case, how do we tell a player that it has lost?
We can either return -1 to it on its next action in all cases, or store its previous state and action and have it redo its last step, this time returning -1, or ... ?

What do you think?
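
One possible answer, sketched below, is a variant of the second option: each player's last (state, action) pair is kept pending until the opponent has replied, and it only becomes a transition once its outcome is known. If the opponent's reply ends the game, the pending move is closed with reward -1. All names here (`play_episode`, `step_fn`, the transition layout) are hypothetical and not taken from the repository.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# (state, action, reward, next_state, done)
Transition = Tuple[Any, int, float, Any, bool]
# step_fn(state, player, action) -> (next_state, winner or None, done)
StepFn = Callable[[Any, int, int], Tuple[Any, Optional[int], bool]]
Policy = Callable[[Any], int]


def play_episode(
    initial_state: Any,
    step_fn: StepFn,
    policies: Dict[int, Policy],
    max_turns: int = 1000,
) -> Dict[int, List[Transition]]:
    """Alternate single-player moves and return each player's transitions."""
    transitions: Dict[int, List[Transition]] = {0: [], 1: []}
    # Last (state, action) of each player, still waiting for its outcome.
    pending: Dict[int, Optional[Tuple[Any, int]]] = {0: None, 1: None}
    state, player = initial_state, 0
    for _ in range(max_turns):
        # The opponent has replied and the game goes on: close the
        # current player's previous move with a neutral reward.
        if pending[player] is not None:
            prev_state, prev_action = pending[player]
            transitions[player].append((prev_state, prev_action, 0.0, state, False))
        action = policies[player](state)
        next_state, winner, done = step_fn(state, player, action)
        if done:
            opponent = 1 - player
            reward = 1.0 if winner == player else 0.0
            transitions[player].append((state, action, reward, next_state, True))
            # The opponent never moves again, so its pending move is
            # closed here: -1 if it lost, 0 on a draw.
            if pending[opponent] is not None:
                opp_state, opp_action = pending[opponent]
                opp_reward = -1.0 if winner == player else 0.0
                transitions[opponent].append(
                    (opp_state, opp_action, opp_reward, next_state, True)
                )
            return transitions
        pending[player] = (state, action)
        state, player = next_state, 1 - player
    return transitions
```

With this layout each player observes the board only after the opponent has moved, and the loser's final transition carries the -1 without replaying a step. The sketch assumes the move that ends the game never loses for the player making it, which holds for Connect-4.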

Better stochasticity

Make randomized opponents behave better in some way, or rather make good opponents non-deterministic in some way.
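
One common way to make a strong deterministic opponent non-deterministic is to wrap it in an epsilon-greedy policy: with probability epsilon it plays a random legal move, otherwise it plays its usual move. The sketch below assumes a policy is a function from a state to a column index and that a `legal_moves` function is available. Both are illustrative names, not the repository's API.

```python
import random
from typing import Any, Callable, Optional, Sequence

Policy = Callable[[Any], int]


def make_epsilon_greedy(
    policy: Policy,
    legal_moves: Callable[[Any], Sequence[int]],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> Policy:
    """Wrap a deterministic policy so it sometimes plays a random legal move."""
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be between 0 and 1")
    generator = rng if rng is not None else random.Random()

    def noisy_policy(state: Any) -> int:
        # Explore with probability epsilon, otherwise defer to the base policy.
        if generator.random() < epsilon:
            return generator.choice(list(legal_moves(state)))
        return policy(state)

    return noisy_policy
```

Setting epsilon to 0 recovers the original opponent and setting it to 1 gives a fully random one, so the same wrapper covers the whole range between the two kinds of opponent mentioned above.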
