
official-stockfish / wdl_model


Fit a chess Win-Draw-Loss model from played games

License: GNU General Public License v3.0

Python 53.49% Makefile 1.21% C++ 36.64% Shell 8.66%

wdl_model's People

Contributors

dede1751, disservin, peregrineshahin, robbai, robertnurnberg, vondele

wdl_model's Issues

License

It'd be great to explicitly state the license of this repository.

std::exit while pgns are being processed may lead to crashes

On Linux I get segmentation faults when, say, --fixFEN finds missing keys in the metadata. In itself this is not a problem, as the code should exit anyway, but a more graceful way to stop would be nice. I think the segmentation faults come from the parallel execution of many PGN analyses, and maybe the exit(1) leaves some of them in an unexpected state. (Sadly, I do not know how to fix this.)

Sample output for me:

Missing "book_depth" key in metadata for .epd book for test pgns/23-09-23/650f26ffadc82c88993ddd80/650f26ffadc82c88993ddd80
pt pt 00
scoreWDLstat: external/chess.hpp:1872: virtual void chess::Board::placePiece(chess::Piece, chess::Square): Assertion `board_[sq] == Piece::NONE' failed.
file_from -1
Segmentation fault      (core dumped)
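To illustrate the graceful-stop idea (the real analyzer is C++; this is just a minimal Python sketch with hypothetical names), a worker can raise an exception instead of calling exit(1), so that the main thread is the only place that shuts the pool down:

```python
import concurrent.futures

class MetadataError(Exception):
    pass

def analyse(pgn_name, metadata):
    # Hypothetical worker: instead of exit(1) while sibling workers are
    # still running, raise and let the future carry the error out.
    if "book_depth" not in metadata:
        raise MetadataError(f'Missing "book_depth" key in metadata for {pgn_name}')
    return f"processed {pgn_name}"

def run_all(jobs):
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(analyse, name, meta) for name, meta in jobs]
        try:
            return [f.result() for f in futures]
        except MetadataError as e:
            # cancel pending work, then stop cleanly from one place
            for f in futures:
                f.cancel()
            print(e)
            return None
```

The point is that the error propagates through the futures rather than tearing the process down mid-flight, so all workers are joined before the program exits.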

downloads fail without error message

On a fresh clone, I get as output for python download_fishtest_pgns.py --path pgns --subdirs --page 2

Found 0 fully downloaded tests in pgns/ already.
Downloading pgns to pgns/23-08-30/64efc1bdb0db1f4c8581ee29/ ...
  Fetching 453 missing pgn files ...
Downloading pgns to pgns/23-08-31/64f0565fb0db1f4c8581ffc5/ ...

Then killing the process and checking the directory pgns/23-08-30/64efc1bdb0db1f4c8581ee29 gives an empty directory, whereas the second directory starts to get filled.

Will try to investigate tomorrow.
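One possible cause (an assumption, not confirmed from the script) is that a failed HTTP response is swallowed without checking the status. A minimal sketch of a download helper that fails loudly, with a hypothetical name and signature:

```python
import urllib.error
import urllib.request

def fetch_pgn(url, dest_path):
    """Hypothetical helper: download one pgn file, reporting failures
    instead of silently leaving the directory empty."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            if resp.status != 200:
                print(f"  Failed to fetch {url}: HTTP {resp.status}")
                return False
            data = resp.read()
    except urllib.error.URLError as e:
        print(f"  Failed to fetch {url}: {e}")
        return False
    with open(dest_path, "wb") as f:
        f.write(data)
    return True
```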

seg fault with scoreWDLstat

As of this morning, I get a segmentation fault when running the update script.

> ./updateWDL.sh 
started at:  Mon 23 Oct 09:08:37 CEST 2023
Look recursively in directory pgns for games from SPRT tests using books matching "UHO_4060_v3.epd" for SF revisions between 70ba9de85cddc5460b1ec53e0a99bee271e26ece (from 2023-09-22 19:26:16 +0200) and HEAD (from 2023-10-22 16:16:02 +0200).
./updateWDL.sh: line 59: 2154418 Segmentation fault      (core dumped) ./scoreWDLstat --dir $pgnpath -r --matchRev $regex_pattern --matchBook "$bookname" --fixFEN --SPRTonly -o updateWDL.json &> scoreWDLstat.log

scoreWDLstat.log

Food for Thought - Build System

What do you think about upgrading the build system to something newer like CMake or Meson?

The Makefile works fine, there's no problem with it, but maybe it's time to try out new tools. Personally, I'd try out Meson; I can write up a patch in the next few days, and then we can still decide? :D
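For a sense of scale, a Meson setup could be quite small. A hypothetical sketch (target and source file names are assumed, not taken from the repository):

```meson
project('wdl_model', 'cpp',
  default_options: ['cpp_std=c++17', 'optimization=3'])

zlib = dependency('zlib')
threads = dependency('threads')

executable('scoreWDLstat',
  'scoreWDLstat.cpp',
  dependencies: [zlib, threads])
```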

sf refactoring broke the update script

I have pushed a fix to the PR #153 which I guess should be merged now. We can look again at the precise data retrieval from SF source code for the dynamic rescaling once this is in SF code.

monitor material based fitting

This is not an issue per se, but just a convenient place to regularly check how our material based fitting works.

Below I report on the fits from ./updateWDL.sh --firstrev b59786e750a59d3d7cff2630cf284553f607ed29 (based on move) and from python scoreWDL.py updateWDL.json --plot save --pgnName update_material.png --momType "material" --momTarget 62 --moveMin 8 --moveMax 120 --materialMin 10 --materialMax 78 --modelFitting optimizeProbability applied to the same json data (based on material).

update_move

update_material

json data: updateWDL.json.gz
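For context, the functional form being fitted is a logistic in the engine eval whose shift a and scale b are low-degree polynomials in the "move or material" counter, as in Stockfish's win_rate_model. A minimal sketch (the helper name and coefficient conventions are assumptions):

```python
import math

def win_probability(eval_cp, mom, a_coeffs, b_coeffs):
    """Win probability for an eval in centipawns at counter value mom.
    a_coeffs/b_coeffs are polynomial coefficients, highest degree first."""
    a = sum(c * mom**i for i, c in enumerate(reversed(a_coeffs)))
    b = sum(c * mom**i for i, c in enumerate(reversed(b_coeffs)))
    return 1 / (1 + math.exp((a - eval_cp) / b))
```

At eval_cp == a the model predicts exactly a 50% win rate, which is why the fits above target a particular move (or material) count for the anchor.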

coordinating efforts to track the WDL model

Not sure if this is the best place to discuss this, but once the latest PRs are merged, we are basically ready to create some WDL tracker. Here are some questions we could try to agree on:

  • Should each of us create their own local copies of fishtest .pgn files? Or best if @vondele does this somehow centrally? (Is there any point of hosting these pgn's on kaggle?)
  • Do we create a new repo for tracking, or make it part of this repo?
  • How do we deal with non-functional SF commits? We can now filter the WDL data by commit, but for the analysis it would make sense to merge data of non-functional commits with the last previous functional commit.
  • For creating a valid WDL_model-in-time data point, do we require a minimum number of positions? (Here I think of two functional commits in quick succession, meaning there won't be enough meaningful data for the first commit.)

Miscounting the number of games in a pgn collection?

Running on a recent test, I get this output, i.e. 177333 games:

$ ./scoreWDLstat --dir ./pgns/23-10-21/6533f394de6d262d08d3a55e/ -r
Looking (recursively) for pgn files in ./pgns/23-10-21/6533f394de6d262d08d3a55e/
Found 96 .pgn(.gz) files in total.
Found 96 .pgn(.gz) files, creating 96 chunks for processing.
Progress: 96/96
Time taken: 0.1s
Wrote 2788919 scored positions from 177333 games to scoreWDLstat.json for analysis.

Yet, the test meta-data shows it is only 20000 games, which is confirmed by:

$ zcat ./pgns/23-10-21/6533f394de6d262d08d3a55e/*.pgn.gz | grep 'Result' | wc -l
20000

There are also exactly 96 pgn.gz files in that directory.

In the code, the total_games counter is only incremented in the header() function, so I'm a bit at a loss here. Can the header function be called multiple times per game?
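For what it's worth, 177333 / 20000 ≈ 8.9, which would be consistent with the counter being incremented once per header *tag pair* rather than once per game (a fishtest game has roughly nine tag pairs) — though that is only a guess. The zcat check above can be reproduced in Python like this:

```python
import glob
import gzip

def count_games(pgn_dir):
    """Count games as the zcat | grep check does: one [Result "..."]
    tag pair per game. Counting every header tag instead would
    overcount by roughly the number of tags per game."""
    total = 0
    for path in glob.glob(f"{pgn_dir}/*.pgn.gz"):
        with gzip.open(path, "rt", errors="ignore") as f:
            total += sum(1 for line in f if line.startswith("[Result "))
    return total
```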

filter out crashes and time losses

Just leaving this here so we don't forget: I think we could (and probably should) also filter out WDL data from games that were lost due to crashes or time losses. They can be identified by [Termination "time forfeit"] and [Termination "abandoned"] in their pgn data.
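A minimal sketch of the proposed filter (the tag values come from the issue text; the helper name and the headers-as-dict representation are assumptions):

```python
# Games decided by crashes or time losses should not contribute WDL data.
SKIP_TERMINATIONS = {"time forfeit", "abandoned"}

def keep_game(headers):
    """headers: dict of pgn tag pairs for one game."""
    return headers.get("Termination", "") not in SKIP_TERMINATIONS
```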

data race

Compiling with -fsanitize=thread shows a data race.
