
poja / cattus

Cattus is a chess engine based on DeepMind's AlphaZero paper, written in Rust. It uses a neural network to evaluate positions and MCTS as the search algorithm.

Languages: Rust 82.14%, Python 17.86%
Topics: ai, artificial-intelligence, chess

cattus's People

Contributors: barakugav, poja

cattus's Issues

Remove "model_id"

@barakugav
It's not so important - I'm just (ab)using this "Issues" platform to discuss something in the code:

What's the goal of the model_id function? Can I remove it and just name the directories with the self-play data according to the time they were created? Or if we want to label them by the model that is used, can we use the date of the model creation as its "id"? Why make this complicated hash function?
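For illustration, here is a minimal sketch of naming a self-play data directory by its creation time using only the standard library; `data_dir_name` is a hypothetical helper, not a function that exists in the repo.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Hypothetical helper: name the self-play data directory by its creation time
// instead of deriving an id from a hash of the model.
fn data_dir_name() -> String {
    let secs = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is before the UNIX epoch")
        .as_secs();
    format!("selfplay_{secs}")
}

fn main() {
    println!("{}", data_dir_name()); // e.g. "selfplay_1700000000"
}
```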

Create evaluation mechanism

We create and train models; we should be able to evaluate them during the training loop.
This is required to understand the rate of learning.
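As a rough sketch of what such a mechanism could look like, the following pits a candidate model against the current one over a number of games and reports the candidate's score. `play_game` is a placeholder for the actual engine-vs-engine game, and all names here are illustrative, not the repo's real API.

```rust
// Result of one evaluation game, from the candidate model's point of view.
#[derive(Clone, Copy, PartialEq)]
enum GameResult {
    CandidateWin,
    Draw,
    CandidateLoss,
}

// Placeholder: in the real engine this would play a full game between the
// candidate model and the current best model.
fn play_game(_candidate_plays_white: bool) -> GameResult {
    GameResult::Draw
}

// Play `games` games, alternating colors to remove first-move bias, and
// return the fraction of points scored by the candidate (win=1, draw=0.5).
fn evaluate(games: usize) -> f64 {
    let mut score = 0.0;
    for i in 0..games {
        score += match play_game(i % 2 == 0) {
            GameResult::CandidateWin => 1.0,
            GameResult::Draw => 0.5,
            GameResult::CandidateLoss => 0.0,
        };
    }
    score / games as f64
}
```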

Correctly update score_w when backpropagating

From Wikipedia:
If white loses the simulation, all nodes along the selection incremented their simulation count (the denominator), but among them only the black nodes were credited with wins (the numerator). If instead white wins, all nodes along the selection would still increment their simulation count, but among them only the white nodes would be credited with wins. In games where draws are possible, a draw causes the numerator for both black and white to be incremented by 0.5 and the denominator by 1.
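A minimal sketch of that rule, assuming per-node fields like these (the actual Cattus node type will differ):

```rust
// Hypothetical node bookkeeping; field names are illustrative.
struct Node {
    simulations: u64, // denominator: how many simulations passed through this node
    score_w: f64,     // numerator: points credited to this node's color
    color_white: bool, // color credited at this node
}

// Back-propagate one game result along the selected path, following the rule
// quoted above: every node's simulation count grows by one, full points go
// only to nodes of the winning color, and a draw adds 0.5 to both colors.
// `white_won` is Some(true) for a white win, Some(false) for a black win,
// and None for a draw.
fn backpropagate(path: &mut [Node], white_won: Option<bool>) {
    for node in path.iter_mut() {
        node.simulations += 1;
        node.score_w += match white_won {
            Some(true) => if node.color_white { 1.0 } else { 0.0 },
            Some(false) => if node.color_white { 0.0 } else { 1.0 },
            None => 0.5, // draw
        };
    }
}
```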

make get_legal_moves fast

Currently we iterate over all tiles on each call; we could maintain some data structure within the position to iterate over the pieces faster.
We need to measure the performance bottleneck before optimizing this.
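One possible data structure, sketched here with assumed field names (not the actual Cattus position type), is a per-side occupancy bitboard, so move generation iterates only over set bits rather than scanning all 64 tiles:

```rust
// Minimal sketch: one occupancy bitboard per side.
struct Position {
    occupied_by_side: [u64; 2], // bit i set => square i holds a piece of that side
}

impl Position {
    // Call `f` with the square index of every piece of the given side,
    // visiting only occupied squares.
    fn for_each_own_piece(&self, side: usize, mut f: impl FnMut(u8)) {
        let mut bb = self.occupied_by_side[side];
        while bb != 0 {
            let sq = bb.trailing_zeros() as u8; // index of the lowest set bit
            f(sq);
            bb &= bb - 1; // clear the lowest set bit
        }
    }
}
```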

Score range

The network output range is [-1, 1] while our current convention in the MCTS implementation is [0, 1]; the two need to be aligned.
I think [-1, 1] is more intuitive, in which case we should change the draw value to 0 and invert scores with "-score" rather than "1 - score".
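For reference, a small sketch of the two conventions and their corresponding "flip to the opponent's point of view" operations; these are illustrative helpers, not existing functions in the repo:

```rust
// Map a score from the network's [-1, 1] range to the MCTS [0, 1] convention.
fn to_unit_range(score_pm1: f32) -> f32 {
    (score_pm1 + 1.0) / 2.0
}

// Flip to the other player's point of view in the [-1, 1] convention:
// a draw (0.0) stays 0.0.
fn flip_pm1(score: f32) -> f32 {
    -score
}

// Flip in the [0, 1] convention: a draw (0.5) stays 0.5.
fn flip_01(score: f32) -> f32 {
    1.0 - score
}
```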

MCTS: expand all children, simulate only one

Currently we expand all children and simulate them all; we should simulate only one.
In selection we assume that if a node has any child, then all its children exist and have been simulated; after this change, selection needs to consider non-simulated children first.
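A possible shape for the adjusted selection step, with assumed field names: prefer any child that has not been simulated yet, and only otherwise fall back to the heuristic.

```rust
// Illustrative child statistics; the real node type will differ.
struct Child {
    simulations: u64,
    heuristic: f32,
}

// Pick an unsimulated child first; otherwise take the child with the best
// selection heuristic. Returns None only for a childless node.
fn select_child(children: &[Child]) -> Option<usize> {
    if let Some(idx) = children.iter().position(|c| c.simulations == 0) {
        return Some(idx); // visit non-simulated children before anything else
    }
    children
        .iter()
        .enumerate()
        .max_by(|(_, a), (_, b)| a.heuristic.total_cmp(&b.heuristic))
        .map(|(idx, _)| idx)
}
```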

Training: learn from random games produced by many models

Currently each training iteration learns only from the self-play games of the last model.
We should learn from games produced by all models, and always take X random games from the Y latest ones.
This will allow us to reduce the number of self-play games without damaging the training.

Use the same directory for all self-play games.
Add X and Y to the configuration file.
Set 'epoches=1'
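A hypothetical sketch of the sampling step, assuming the game files are already sorted newest-first and that the `rand` crate is available; the `x` and `y` parameters correspond to X and Y above:

```rust
use rand::seq::SliceRandom;

// Take the Y most recent game files and sample X of them at random.
fn sample_training_games(all_games_newest_first: &[String], x: usize, y: usize) -> Vec<String> {
    let pool = &all_games_newest_first[..y.min(all_games_newest_first.len())];
    let mut rng = rand::thread_rng();
    pool.choose_multiple(&mut rng, x.min(pool.len()))
        .cloned()
        .collect()
}
```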

MCTS: use the network per move probability output as initial node score

Currently, if no simulations were performed for a leaf, the function calc_selection_heuristic returns f32::MAX.
Instead, the first time a node is expanded, the two-headed network should be run, producing a scalar value in [-1, 1] for propagation and a per-move probability which we should assign to the newly expanded leaves and use in calc_selection_heuristic instead of f32::MAX.
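One common way to use that prior is an AlphaZero-style PUCT term; the sketch below is illustrative and does not match the repo's actual calc_selection_heuristic signature:

```rust
// PUCT-style selection score: exploitation term Q plus an exploration term U
// weighted by the policy-head prior for this move. `child_value` is the sum
// of backed-up values for the child, `prior` is the network's per-move
// probability stored at expansion time, and `c_puct` is a tunable constant.
fn puct_score(child_value: f32, child_visits: u64, parent_visits: u64, prior: f32, c_puct: f32) -> f32 {
    let q = if child_visits == 0 { 0.0 } else { child_value / child_visits as f32 };
    let u = c_puct * prior * (parent_visits as f32).sqrt() / (1.0 + child_visits as f32);
    q + u
}
```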

MCTS: use our own tree implementation instead of petgraph

We will suffer from the overhead of accessing an element in petgraph; our own tree implementation could use pointers.
We need to check how painful this is in Rust and whether it is worth it.
This is a task for the future; we are fine for now.
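A middle ground worth noting is an index-based arena, which avoids both petgraph's generic graph overhead and raw pointers; a minimal sketch:

```rust
// Minimal index-based arena tree: nodes live in one flat Vec and reference
// each other by index, so no unsafe pointers are needed.
struct Node {
    parent: Option<usize>,
    children: Vec<usize>,
    // per-node MCTS statistics would live here
}

struct Tree {
    nodes: Vec<Node>,
}

impl Tree {
    // Append a new child of `parent` and return its index.
    fn add_child(&mut self, parent: usize) -> usize {
        let idx = self.nodes.len();
        self.nodes.push(Node { parent: Some(parent), children: Vec::new() });
        self.nodes[parent].children.push(idx);
        idx
    }
}
```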

Solve Hex! and figure out hyperparams

Train a two-headed network until the engine wins against us consistently.
Understand how long such training takes and with which hyperparameters:
learning rate
number of games generated in self-play
temperature for softmax, and for how many moves we should use softmax
model structure
whether we should take a single position from each game or more

A lot of this can be taken from lc0.
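Purely as an illustration, these knobs could be gathered into a single configuration struct; the default values below are placeholders, not tuned numbers:

```rust
// Hypothetical configuration struct collecting the hyperparameters listed
// above; all defaults are placeholders to be determined by experiments.
struct TrainingConfig {
    learning_rate: f64,
    self_play_games_per_iteration: usize,
    softmax_temperature: f64,
    temperature_moves: usize, // use softmax sampling only for the first N moves
    positions_per_game: usize,
}

impl Default for TrainingConfig {
    fn default() -> Self {
        Self {
            learning_rate: 1e-3,
            self_play_games_per_iteration: 100,
            softmax_temperature: 1.0,
            temperature_moves: 30,
            positions_per_game: 1,
        }
    }
}
```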
