poja / cattus
Cattus is a chess engine based on the DeepMind AlphaZero paper, written in Rust. It uses a neural network to evaluate positions and MCTS as the search algorithm.
@barakugav
It's not so important - I'm just (ab)using this "Issues" platform to discuss something in the code:
What's the goal of the model_id function? Can I remove it and just name the directories with the self-play data according to the time they were created? Or, if we want to label them by the model that is used, can we use the date of the model's creation as its "id"? Why use this complicated hash function?
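For illustration only, a minimal sketch of the timestamp-naming idea using only the standard library (the function name and directory naming scheme are made up, not the project's actual layout):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical alternative to model_id: name a self-play data
/// directory by the wall-clock time it was created.
fn self_play_dir_name() -> String {
    let secs = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("clock before UNIX epoch")
        .as_secs();
    format!("self_play_{}", secs)
}

fn main() {
    // Prints something like "self_play_1700000000".
    println!("{}", self_play_dir_name());
}
```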
Like hard-coded paths
CPU/GPU
"cargo run" always run debug by default
We create and train models; we should be able to evaluate them during the training loop.
This is required to understand the rate of learning.
mini-batches of 2,048 positions each, sampled uniformly at random from all positions of the most recent 500,000 games
Maybe GitHub pull requests with the GitHub CLI.
From Wikipedia:
If white loses the simulation, all nodes along the selection incremented their simulation count (the denominator), but among them only the black nodes were credited with wins (the numerator). If instead white wins, all nodes along the selection would still increment their simulation count, but among them only the white nodes would be credited with wins. In games where draws are possible, a draw causes the numerator for both black and white to be incremented by 0.5 and the denominator by 1.
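A minimal sketch of that backpropagation step (the node layout and field names are assumptions, not the engine's actual structures): every node on the selected path gets +1 to its simulation count, its win count gets +1 if the player it represents won, and +0.5 on a draw.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Color { White, Black }

#[derive(Clone, Copy)]
enum GameResult { Win(Color), Draw }

struct Node {
    color: Color,     // the player this node's move belongs to
    simulations: u32, // denominator
    wins: f64,        // numerator
}

/// Backpropagate one simulation result along the selected path (root..leaf).
fn backpropagate(path: &mut [Node], result: GameResult) {
    for node in path.iter_mut() {
        node.simulations += 1;
        match result {
            GameResult::Win(winner) if winner == node.color => node.wins += 1.0,
            GameResult::Draw => node.wins += 0.5,
            _ => {}
        }
    }
}
```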
Currently fixed to '0.5 * (0.0001)'. Add it to the config and propagate it to model creation.
Evaluation should consist of 400 games. If the new player wins >55% of them, it becomes the best player.
Model.fit returns some sort of history containing metrics, which might be important information.
Currently we iterate over all tiles on each call; we can maintain some data structure within the position to iterate over them faster.
Need to measure the performance bottleneck before optimizing this.
The network output range is [-1, 1] while our current convention in the MCTS implementation is [0, 1]; we need to align the two.
I think [-1, 1] is more intuitive, in which case we should change the draw value to 0 and invert scores with "-score" rather than "1 - score".
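For reference, a small sketch of the two conventions (helper names are made up): in the [0, 1] convention a draw is 0.5 and the opponent's view of a score s is 1 - s; in the [-1, 1] convention a draw is 0 and the opponent's view is -s.

```rust
/// [0, 1] convention: 0 = loss, 0.5 = draw, 1 = win; flip with 1 - s.
fn flip_unit_interval(score: f32) -> f32 {
    1.0 - score
}

/// [-1, 1] convention: -1 = loss, 0 = draw, 1 = win; flip with -s.
fn flip_symmetric(score: f32) -> f32 {
    -score
}

/// Map a network output in [-1, 1] to the current MCTS range [0, 1].
fn symmetric_to_unit_interval(score: f32) -> f32 {
    (score + 1.0) / 2.0
}
```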
bitmap of planes
input layer, output layers, no hidden layers
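As an illustration of the "bitmap of planes" input, a rough sketch assuming the chess crate's Board/BitBoard API and a 12-plane layout (6 piece types x 2 colors); the plane order here is an arbitrary choice, not the project's actual encoding:

```rust
use chess::{Board, Color, ALL_PIECES};

/// Encode a position as 12 binary 8x8 planes, one per (color, piece type).
fn encode_planes(board: &Board) -> Vec<[f32; 64]> {
    let mut planes = Vec::with_capacity(12);
    for color in [Color::White, Color::Black] {
        for piece in ALL_PIECES {
            let mut plane = [0.0f32; 64];
            // Squares occupied by this piece type of this color.
            let bb = *board.pieces(piece) & *board.color_combined(color);
            for sq in bb {
                plane[sq.to_index()] = 1.0;
            }
            planes.push(plane);
        }
    }
    planes
}
```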
See https://docs.google.com/document/d/1QFNRRoPtywh1sWIAnhjxpQQmwHu7MICuaK1KHbNEvao for the exact MCTS selection calculation.
Currently we are expanding all children and simulating them all; we need to simulate only one.
In selection we assume that if a node has any child, then all its children exist and have been simulated; after this change we need to consider non-simulated children first.
https://docs.rs/chess/ is a nice option for a chess library, supporting legal-move queries, game-status detection (checkmate, stalemate, etc.), and so on.
We should write a wrapper around the library we choose, enabling our generic MCTS implementation to use it.
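As an illustration only, a sketch of what such a wrapper might look like, using the chess crate's Board, ChessMove, MoveGen, and BoardStatus types; the GamePosition trait itself is hypothetical, standing in for whatever abstraction the generic MCTS ends up needing:

```rust
use chess::{Board, BoardStatus, ChessMove, MoveGen};

/// Hypothetical abstraction our generic MCTS could be written against.
trait GamePosition: Sized {
    type Move;
    fn legal_moves(&self) -> Vec<Self::Move>;
    fn apply(&self, mv: &Self::Move) -> Self;
    fn is_terminal(&self) -> bool;
}

/// Thin wrapper around the `chess` crate's `Board`.
struct ChessPosition {
    board: Board,
}

impl GamePosition for ChessPosition {
    type Move = ChessMove;

    fn legal_moves(&self) -> Vec<ChessMove> {
        MoveGen::new_legal(&self.board).collect()
    }

    fn apply(&self, mv: &ChessMove) -> Self {
        ChessPosition { board: self.board.make_move_new(*mv) }
    }

    fn is_terminal(&self) -> bool {
        self.board.status() != BoardStatus::Ongoing
    }
}
```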
Currently each training iteration learns only from the self-play games of the last model.
We should learn from games produced by all models, and always take X random games from the Y latest ones.
This will allow us to reduce the number of self-play games without damaging the training.
Use the same directory for all self-play games.
Add X and Y to the configuration file.
Set 'epochs=1'.
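A rough sketch of that sampling step, assuming games are stored one directory per model under a common root and assuming the rand crate's 0.8-style API; the paths and function name are illustrative, not the project's actual layout:

```rust
use rand::seq::SliceRandom;
use std::fs;
use std::path::PathBuf;

/// Take `games_per_iter` (X) random game files from the `latest_models` (Y)
/// most recent model directories under `root`.
fn sample_training_games(
    root: &str,
    latest_models: usize,
    games_per_iter: usize,
) -> std::io::Result<Vec<PathBuf>> {
    // Model directories, assumed to sort chronologically by name.
    let mut model_dirs: Vec<PathBuf> = fs::read_dir(root)?
        .filter_map(|e| e.ok().map(|e| e.path()))
        .filter(|p| p.is_dir())
        .collect();
    model_dirs.sort();

    // Collect all game files from the Y latest model directories.
    let mut games = Vec::new();
    for dir in model_dirs.iter().rev().take(latest_models) {
        for entry in fs::read_dir(dir)? {
            games.push(entry?.path());
        }
    }

    // Pick X of them uniformly at random.
    let mut rng = rand::thread_rng();
    Ok(games
        .choose_multiple(&mut rng, games_per_iter)
        .cloned()
        .collect())
}
```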
Currently, if no simulations were performed for a leaf, the function calc_selection_heuristic returns f32::MAX.
Instead, the first time a node is expanded, the two-headed network should be run, producing a scalar value in [-1, 1] for propagation and per-move probabilities, which we should assign to the newly expanded leaves and use in calc_selection_heuristic instead of f32::MAX.
Instead of only a [-1, 1] score, the network should contain two "heads": the first outputting a scalar [-1, 1] score (which is the "simulation" value) and the second outputting per-move probability values.
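For illustration, a sketch of a PUCT-style selection heuristic in the spirit of AlphaZero, where the policy head's per-move prior replaces the f32::MAX placeholder for unvisited children (the struct fields and function signature are assumptions, not the current code):

```rust
struct Child {
    prior: f32,     // P(s, a) from the policy head
    visits: u32,    // N(s, a)
    value_sum: f32, // sum of backed-up values in [-1, 1]
}

/// PUCT score: Q(s, a) + c_puct * P(s, a) * sqrt(N(s)) / (1 + N(s, a)).
/// Unvisited children are ranked by their prior instead of f32::MAX.
fn calc_selection_heuristic(child: &Child, parent_visits: u32, c_puct: f32) -> f32 {
    let q = if child.visits == 0 {
        0.0
    } else {
        child.value_sum / child.visits as f32
    };
    let u = c_puct * child.prior * (parent_visits as f32).sqrt()
        / (1.0 + child.visits as f32);
    q + u
}
```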
We will suffer from the overhead of accessing an element in the petgraph; our own tree implementation could use pointers.
Need to check how much pain this requires in Rust and whether it is worth it.
This is a task for the future, we are good for now.
Currently using KL divergence, which doesn't really match the cross-entropy loss.
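For reference, the standard identity relating the two: with fixed MCTS visit-count targets p, the entropy term H(p) is constant with respect to the model parameters, so only the reported loss value differs.

$$ H(p, q) = H(p) + D_{\mathrm{KL}}(p \,\|\, q) $$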
Train a two-headed network until the engine wins against us consistently.
Understand how long such training requires and with what hyperparameters:
learning rate
games generated in self-play
temperature for the softmax, and for how many moves we should use it (see the sketch after this list)
model structure
should we take a single position from each game, or more
A lot of this can be taken from lc0.
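As an illustration of the temperature knob, a sketch of sampling a move from MCTS visit counts with probability proportional to N(a)^(1/tau), in the style of AlphaZero self-play; the function and parameter names are made up, and the rand crate's 0.8-style API is assumed:

```rust
use rand::Rng;

/// Sample a move index from visit counts with temperature `tau`.
/// tau = 1.0 keeps the visit distribution; tau -> 0 approaches argmax.
/// Assumes `visit_counts` is non-empty and has at least one visit.
fn sample_move(visit_counts: &[u32], tau: f32, rng: &mut impl Rng) -> usize {
    if tau <= f32::EPSILON {
        // Deterministic: pick the most visited move.
        return visit_counts
            .iter()
            .enumerate()
            .max_by_key(|(_, &n)| n)
            .map(|(i, _)| i)
            .unwrap();
    }
    let weights: Vec<f32> = visit_counts
        .iter()
        .map(|&n| (n as f32).powf(1.0 / tau))
        .collect();
    let total: f32 = weights.iter().sum();
    let mut r = rng.gen_range(0.0..total);
    for (i, w) in weights.iter().enumerate() {
        if r < *w {
            return i;
        }
        r -= w;
    }
    weights.len() - 1
}
```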