official-stockfish / wdl_model
Fit a chess Win-Draw-Loss model from played games
License: GNU General Public License v3.0
It'd be great to explicitly state the license of this repository.
On Linux I get segmentation faults when --fixFEN
finds missing keys in the metadata, say. In itself this is not a problem, as the code should exit anyway, but a more graceful way to stop would be nice. I think the segmentation faults come from the parallel execution of many PGN analyses, and maybe the exit(1)
leads to some unexpected states there. (Sadly, I do not know how to fix this.)
Sample output for me:
Missing "book_depth" key in metadata for .epd book for test pgns/23-09-23/650f26ffadc82c88993ddd80/650f26ffadc82c88993ddd80
pt pt 00
scoreWDLstat: external/chess.hpp:1872: virtual void chess::Board::placePiece(chess::Piece, chess::Square): Assertion `board_[sq] == Piece::NONE' failed.
file_from -1
Segmentation fault (core dumped)
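One way to stop more gracefully would be to raise an exception in the worker instead of calling exit(1), and let the main thread collect it. The actual tool is C++, so this is only a minimal Python sketch of the pattern; `analyse_pgn` and `MetadataError` are hypothetical names:

```python
from concurrent.futures import ThreadPoolExecutor


class MetadataError(Exception):
    """Raised by a worker when required metadata keys are missing."""


def analyse_pgn(path):
    # hypothetical worker: raise instead of exit(1) on missing metadata
    if path.endswith("bad"):
        raise MetadataError(f'Missing "book_depth" key in metadata for {path}')
    return path


def run_all(paths):
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(analyse_pgn, p) for p in paths]
        # .result() re-raises any worker exception in the main thread,
        # so the process can shut the pool down cleanly before exiting
        return [f.result() for f in futures]
```

Exiting via a single exception in the main thread lets the executor's context manager join the remaining workers instead of tearing threads down mid-run.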
On a fresh clone, I get as output for python download_fishtest_pgns.py --path pgns --subdirs --page 2
Found 0 fully downloaded tests in pgns/ already.
Downloading pgns to pgns/23-08-30/64efc1bdb0db1f4c8581ee29/ ...
Fetching 453 missing pgn files ...
Downloading pgns to pgns/23-08-31/64f0565fb0db1f4c8581ffc5/ ...
Then killing the process and checking the directory pgns/23-08-30/64efc1bdb0db1f4c8581ee29
shows it is empty, whereas the second directory starts to get filled.
Will try to investigate tomorrow.
As of this morning, I get a segmentation fault when running the update script.
> ./updateWDL.sh
started at: Mon 23 Oct 09:08:37 CEST 2023
Look recursively in directory pgns for games from SPRT tests using books matching "UHO_4060_v3.epd" for SF revisions between 70ba9de85cddc5460b1ec53e0a99bee271e26ece (from 2023-09-22 19:26:16 +0200) and HEAD (from 2023-10-22 16:16:02 +0200).
./updateWDL.sh: line 59: 2154418 Segmentation fault (core dumped) ./scoreWDLstat --dir $pgnpath -r --matchRev $regex_pattern --matchBook "$bookname" --fixFEN --SPRTonly -o updateWDL.json &> scoreWDLstat.log
What do you think about upgrading the build system to something newer like CMake/Meson?
The Makefile works fine, there's no problem with it, but maybe it's time to try out new tools?
I'd personally try out Meson; I can write up a patch in the next few days and then we can still decide? :D
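For a single C++ executable the Meson definition would stay very small. A rough sketch, assuming the sources and dependencies (the file name, C++ standard, and the zlib/threads dependencies are guesses here, not taken from the Makefile):

```meson
# meson.build — illustrative sketch only
project('scoreWDLstat', 'cpp',
  default_options: ['cpp_std=c++17'])

threads = dependency('threads')
zlib = dependency('zlib')  # assumed, for reading .pgn.gz files

executable('scoreWDLstat', 'scoreWDLstat.cpp',
  dependencies: [threads, zlib])
```

Building would then be `meson setup build && meson compile -C build`.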
That is because cutechess-cli saves pgns with FEN move counters 0 1.
See the discussion on Discord.
I have pushed a fix to PR #153, which I guess should be merged now. We can look again at the precise data retrieval from the SF source code for the dynamic rescaling once this is in the SF code.
This is not an issue per se, but just a convenient place to regularly check how our material based fitting works.
Below I report on the fits from ./updateWDL.sh --firstrev b59786e750a59d3d7cff2630cf284553f607ed29 (based on move) and from python scoreWDL.py updateWDL.json --plot save --pgnName update_material.png --momType "material" --momTarget 62 --moveMin 8 --moveMax 120 --materialMin 10 --materialMax 78 --modelFitting optimizeProbability (based on material), applied to the same json data.
json data: updateWDL.json.gz
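For reference, the fitted WDL model is logistic in the engine score, with the parameters a and b themselves polynomials in the move (or material) counter. A hedged sketch of that shape; the coefficient values in the test are placeholders, not the fitted ones:

```python
import math


def poly(coeffs, x):
    """Evaluate a polynomial, highest-degree coefficient first (Horner)."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


def win_probability(score_cp, mom, a_coeffs, b_coeffs):
    """Logistic win probability for an engine score in centipawns,
    where a(mom) and b(mom) are polynomials in the move/material
    counter `mom`. Illustrative sketch of the model shape only."""
    a = poly(a_coeffs, mom)
    b = poly(b_coeffs, mom)
    return 1.0 / (1.0 + math.exp((a - score_cp) / b))
```

At score_cp == a(mom) the win probability is exactly 0.5, which is what the --momTarget normalization pins down.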
Not sure if this is the best place to discuss this, but once the latest PRs are merged, we are basically ready to create a WDL tracker. Here are some questions we could try to agree on:
Running on a recent test, I get this output, i.e. 177333 games:
$ ./scoreWDLstat --dir ./pgns/23-10-21/6533f394de6d262d08d3a55e/ -r
Looking (recursively) for pgn files in ./pgns/23-10-21/6533f394de6d262d08d3a55e/
Found 96 .pgn(.gz) files in total.
Found 96 .pgn(.gz) files, creating 96 chunks for processing.
Progress: 96/96
Time taken: 0.1s
Wrote 2788919 scored positions from 177333 games to scoreWDLstat.json for analysis.
Yet, the test's metadata shows it is only 20000 games, which is confirmed by:
$ zcat ./pgns/23-10-21/6533f394de6d262d08d3a55e/*.pgn.gz | grep 'Result' | wc -l
20000
There are also exactly 96 pgn.gz files in that directory.
In the code, the total_games counter is only incremented in the header() function, so I'm a bit at a loss here. Can the header function be called multiple times per game?
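The zcat-based cross-check above is easy to reproduce in Python for any test directory, which would help bisect where the inflated count comes from (the directory path here is illustrative):

```python
import glob
import gzip


def count_games(directory):
    """Count games across all .pgn.gz files in a directory by their
    [Result ...] header, mirroring: zcat *.pgn.gz | grep 'Result' | wc -l"""
    total = 0
    for path in glob.glob(f"{directory}/*.pgn.gz"):
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as f:
            total += sum(1 for line in f if line.startswith("[Result "))
    return total
```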
Just leaving this here so we don't forget: we could (should) also filter out WDL data from games that were lost due to crashes and time losses, I think. They can be identified by [Termination "time forfeit"] and [Termination "abandoned"] in their pgn data.
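A minimal sketch of such a filter on the raw PGN tag section; `parse_headers` and `keep_game` are hypothetical helpers, not names from the repo:

```python
import re

# matches PGN tag pairs like [Termination "time forfeit"]
TAG_RE = re.compile(r'\[(\w+) "([^"]*)"\]')


def parse_headers(pgn_text):
    """Collect the PGN tag pairs of one game into a dict."""
    return dict(TAG_RE.findall(pgn_text))


def keep_game(pgn_text):
    """Drop games that ended by time loss or crash; keep everything
    else, including games without a Termination tag."""
    term = parse_headers(pgn_text).get("Termination")
    return term not in ("time forfeit", "abandoned")
```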
Compiling with -fsanitize=thread shows a data race.
Once this is in fishtest, we should adopt the bulk download: official-stockfish/fishtest#1818