Comments (13)

robertnurnberg commented on June 19, 2024

One way to fix historical data (for classical chess, where we can be sure an 8-move book was used) is to run the command find . -type f -name "*.pgn" -exec sed -i '/FEN/ s/0 1/0 9/g' {} + in the directory containing the pgns.
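
A stricter variant of the sed one-liner above can be sketched in Python. This is only an illustration: the helper name fix_fen_line and the default start_move=9 are assumptions, and unlike the sed command it only rewrites a trailing "0 1" at the end of a FEN tag rather than every "0 1" on the line.

```python
import re

# A PGN FEN tag ending in halfmove clock 0 and fullmove counter 1, e.g.
# [FEN "... w KQkq - 0 1"]
FEN_TAG = re.compile(r'^(\[FEN ".* )0 1("\]\s*)$')


def fix_fen_line(line: str, start_move: int = 9) -> str:
    """Rewrite the fullmove counter of a FEN tag line; other lines pass through.

    start_move=9 assumes an 8-move book, as discussed for classical chess.
    """
    # A function replacement avoids ambiguity between group references and digits.
    return FEN_TAG.sub(lambda m: f"{m.group(1)}0 {start_move}{m.group(2)}", line)


if __name__ == "__main__":
    print(fix_fen_line('[FEN "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"]'))
```

Applied line by line to a .pgn file, this leaves move text and other tags untouched, at the cost of being slower than sed on large archives.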

By the way, it may be a good idea to use a separate frc subdir in our download script for chess960. I am not sure how to catch that in the LTC overview page on fishtest; if someone could point me to an example html file, I could try to modify our download script accordingly.

from wdl_model.

Disservin commented on June 19, 2024

frc, https://tests.stockfishchess.org/tests/view/64940b3cdc7002ce609c99f5
dfrc, https://tests.stockfishchess.org/tests/view/64940b4cdc7002ce609c99fa

robertnurnberg commented on June 19, 2024

Thanks. So at the moment the date of the test is extracted from https://tests.stockfishchess.org/tests/finished?ltc_only=1, together with the testID. Ideally we would also collect the (d)frc info from there. Is that possible?

vondele commented on June 19, 2024

One can fetch the book used from the test info page, e.g. https://tests.stockfishchess.org/tests/view/64f9a5910de4a3bb72fbe574. If the book name contains FRC, it is treated as FRC.
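
The book-name rule could be sketched as follows, under stated assumptions: the function names, the case-insensitive match, and the subdirectory names "frc" and "classical" are all hypothetical and not taken from the download script.

```python
def is_frc_book(book_name: str) -> bool:
    """Treat a test as (D)FRC if its book name contains 'FRC'.

    Matching case-insensitively is an assumption made for this sketch.
    """
    return "frc" in book_name.lower()


def pgn_subdir(book_name: str) -> str:
    """Pick a download subdirectory; both directory names are hypothetical."""
    return "frc" if is_frc_book(book_name) else "classical"


if __name__ == "__main__":
    # Example inputs only; the actual book name comes from the test info page.
    for name in ("UHO_Lichess_4852_v1.epd", "DFRC_openings.epd"):
        print(name, "->", pgn_subdir(name))
```

Note that a DFRC book name also contains FRC, so both variants end up in the same subdirectory with this rule.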

Some test info (e.g. a json containing key information) could now be saved along with the pgns (since we store them in a separate dir), which would allow for taking some steps related to that information.

Concerning the use of 8 moves deep lines to start, yes, I was aware of that. The point is, it is related to the book used, and maybe even the software used to play the games. Game ply seems to work pretty well for the WDL model, but indeed suffers from this limitation. Ultimately there is a limit to what can be put in the model (e.g. the same FEN could be a win or a draw depending on the move counter).

robertnurnberg commented on June 19, 2024

> Concerning the use of 8 moves deep lines to start, yes, I was aware of that. The point is, it is related to the book used, and maybe even the software used to play the games. Game ply seems to work pretty well for the WDL model, but indeed suffers from this limitation. Ultimately there is a limit to what can be put in the model (e.g. the same FEN could be a win or a draw depending on the move counter).

I believe the 8 move offset could and should be fixed, agreed? I.e. both the fitting of the model, and the playing SF should use move counter from start position. Playing SF already does this in 99% of the cases, i.e. when used correctly by competent users.

The alternatives to fixing this would be either to go to a material based model (not sure how robust that would be at present), or to have a flat eval to wdl conversion w/o moves or material information.

vondele commented on June 19, 2024

I think the 8 moves offset should be fixed, but I'm not sure how this can be most cleanly done. In this case, fixing things basically requires some knowledge of the book. Probably, the code that downloads the pgns could start keeping some kind of side-info in a .json next to the pgns that documents what is there. Things like, e.g.,

  • starting move
  • NormalizeToPawn actual value
  • Elo difference implied by the test

are all things that in principle feed into the model.
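
A minimal sketch of such a side-info file, assuming a hypothetical file name (side_info.json) and hypothetical key names for the three items listed above; no such schema exists in the repo as far as this thread shows.

```python
import json
from pathlib import Path

SIDE_INFO_NAME = "side_info.json"  # hypothetical file name


def write_side_info(pgn_dir, start_move, normalize_to_pawn, elo_diff):
    """Store key test information in a .json file inside the pgn directory."""
    info = {
        "start_move": start_move,                # first move number after the book
        "normalize_to_pawn": normalize_to_pawn,  # NormalizeToPawn value in use
        "elo_diff": elo_diff,                    # Elo difference implied by the test
    }
    path = Path(pgn_dir) / SIDE_INFO_NAME
    path.write_text(json.dumps(info, indent=2))
    return path


def read_side_info(pgn_dir):
    """Load the side-info dictionary written by write_side_info."""
    return json.loads((Path(pgn_dir) / SIDE_INFO_NAME).read_text())
```

The fitting code could then read the starting move from this file instead of assuming a fixed 8-move offset.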

I would not switch to a material based model so far, that's a bigger change.

robertnurnberg commented on June 19, 2024

A quick fix for now is to only use classical chess for fitting, and use the sed-one-liner from #34 (comment). Long term we should switch to fastchess for fishtest, which will store the correct FEN (w/ correct move counters) in .pgn.
Then the only extra fix needed is in @Disservin's cpp code to read the move counter from .pgn and use that. (I can look at that over the weekend.)

The .json thing would be a bigger change, and require more changes to both the download script and the cpp code.

Disservin commented on June 19, 2024

Have you tested how long that sed takes with 40 GB of files?

robertnurnberg commented on June 19, 2024

Not yet, but I would hope less time than the download.

Disservin commented on June 19, 2024

Regarding the analysis code, you only need to add the current fullmoves * 2 to the ply, I guess?

Might be a dumb question, but how will the new model behave when the moves are shifted by 8?
How good will the fitted equation be for < 8?

robertnurnberg commented on June 19, 2024

I'd store moves from now on and not plies. Read the counter from the board, and only increase it if the side to move is white. Or always read it from the board.
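
The bookkeeping described here can be sketched in Python, assuming a standard six-field FEN as the starting position. The function names are hypothetical, and this is not the analysis code from the repo.

```python
def parse_fen_counters(fen: str):
    """Return (side_to_move, fullmove_number) from a six-field FEN string."""
    fields = fen.split()
    return fields[1], int(fields[5])


def move_number_after(fen: str, plies_played: int) -> int:
    """Fullmove number of the position reached after plies_played half-moves."""
    side, fullmove = parse_fen_counters(fen)
    # The counter increases after each Black move, i.e. whenever White is to move again.
    if side == "w":
        return fullmove + plies_played // 2
    return fullmove + (plies_played + 1) // 2


if __name__ == "__main__":
    start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 9"
    print([move_number_after(start, k) for k in range(4)])
```

With the fullmove counter set to 9 in the FEN, the first position after the book is reported as move 9 rather than move 1.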

robertnurnberg commented on June 19, 2024

Fitting will hardly change. But we may want to move the anchor to move 40. That's for Joost to decide.

robertnurnberg commented on June 19, 2024

I think this issue can be closed, as all things this repo has influence over have been fixed. The only piece of the puzzle left is to get pgns from fishtest with correct move counters. (Or convert the pgns manually.)
