Exploration with pseudo counts (atari, open issue)

kaixhin commented on August 15, 2024
Exploration with pseudo counts

from atari.

Comments (5)

lake4790k commented on August 15, 2024

My first step is to implement a CTS-based probability measure for small bitmaps (with 1-bit pixels), using the location-dependent model described in the paper. I expect it to assign reasonable probabilities: close to 1 for patterns that have already been processed, above 0.5 for patterns similar to those, and close to 0 for dissimilar ones.
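To make the expectation above concrete, here is a minimal Python sketch of a location-dependent model over 1-bit bitmaps. It is not CTS: each pixel gets a plain KT estimator keyed on its position and one fixed context, with no mixing over context subsets. The class name `PixelFactorModel`, the neighbour set (left, up, up-left, up-right) and the zero padding at the borders are assumptions made for illustration only.

```python
import math
from collections import defaultdict

# Causal neighbours as (dy, dx): left, up, up-left, up-right.
# This neighbourhood is an assumption of the sketch.
NEIGHBOURS = ((0, -1), (-1, 0), (-1, -1), (-1, 1))


class PixelFactorModel:
    """Location-dependent density model over 1-bit bitmaps (hypothetical name)."""

    def __init__(self, height, width):
        self.height = height
        self.width = width
        # counts[(y, x, context)] -> [times the bit was 0, times the bit was 1]
        self.counts = defaultdict(lambda: [0, 0])

    def _key(self, bitmap, y, x):
        context = []
        for dy, dx in NEIGHBOURS:
            ny, nx = y + dy, x + dx
            inside = 0 <= ny < self.height and 0 <= nx < self.width
            context.append(bitmap[ny][nx] if inside else 0)  # pad borders with 0
        return (y, x, tuple(context))

    def log_prob(self, bitmap):
        """Log probability of the bitmap as a sum of per-pixel log factors."""
        total = 0.0
        for y in range(self.height):
            for x in range(self.width):
                zeros, ones = self.counts.get(self._key(bitmap, y, x), (0, 0))
                seen = ones if bitmap[y][x] else zeros
                # KT estimate: (count + 1/2) / (total + 1)
                total += math.log((seen + 0.5) / (zeros + ones + 1.0))
        return total

    def update(self, bitmap):
        """Record one observation of the bitmap."""
        for y in range(self.height):
            for x in range(self.width):
                self.counts[self._key(bitmap, y, x)][bitmap[y][x]] += 1


pattern = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
similar = [[1, 0, 1], [0, 0, 0], [1, 0, 1]]      # centre bit flipped
dissimilar = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # every bit flipped

model = PixelFactorModel(3, 3)
for _ in range(50):
    model.update(pattern)

p_seen = math.exp(model.log_prob(pattern))
p_similar = math.exp(model.log_prob(similar))
p_dissimilar = math.exp(model.log_prob(dissimilar))
print(p_seen, p_similar, p_dissimilar)
```

With this simplified version the ordering comes out as hoped (seen > similar > dissimilar), but the "similar" bitmap lands far below 0.5, because every pixel whose context contains the flipped bit falls back to an unseen context. Mixing over shorter contexts, as a context tree method does, is presumably what softens that drop.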

Kaixhin commented on August 15, 2024

Good luck! Finally got round to reading the paper and noticed some extras in the appendix. Seems like for completeness we'll need to add a stochastic ALE setting for this paper and the PAL paper, plus remove the terminal signal on life loss for this paper. Looks like that can make a huge difference to the reported results.

Kaixhin commented on August 15, 2024

FYI there's another (new) paper from DeepMind with similar goals...

lake4790k commented on August 15, 2024

The paper refers to a number of other papers with regard to CTS usage, saying "similar to this and that", but in the end the referenced papers do quite different things, so it is best to look only at the method in the pseudo count paper. They also refer to the Skipping CTS paper but always talk about CTS, so I use plain CTS for now.

Managed to adapt the CTS code to give reasonable probs for 1-bit pixel bitmaps with the neighbour factors in the paper. The paper does not describe exactly how they handle the multiple bits of a single pixel, and that could be done in a number of ways (for a single bit, look at the same bit in the neighbouring pixels, or look at all bits in the neighbouring pixels). I'll add different options for that and provide a native lib and an ffi interface that can be invoked in ER and async to compute the pseudo counts from the probabilities.
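For the last step (pseudo counts from probabilities), the pseudo count paper defines N = rho * (1 - rho') / (rho' - rho), where rho is the model's probability of the frame before it is observed and rho' is the probability the model would assign to it after one more update on that frame. A minimal Python sketch of that formula follows, written in log space because screen probabilities get very small. The function name `pseudo_count` and the handling of a non-positive gain are assumptions of this sketch, not something taken from the native lib.

```python
import math


def pseudo_count(log_rho, log_rho_next):
    """Pseudo count from log rho (before the update) and log rho' (after it).

    Uses the algebraically equivalent form (1 - rho') / (rho' / rho - 1),
    so only the difference of the two log probabilities is exponentiated.
    """
    gain = log_rho_next - log_rho
    if gain <= 0.0:
        # Assumption of this sketch: if the update did not raise the
        # probability, treat the frame as not novel at all.
        return math.inf
    return (1.0 - math.exp(log_rho_next)) / math.expm1(gain)


# Sanity check with an empirical count model: a frame seen 3 times out of 10
# has rho = 3/10 and, after one more observation, rho' = 4/11.
n_hat = pseudo_count(math.log(3 / 10), math.log(4 / 11))
print(n_hat)
```

For a model that just counts frames, the formula gives back the true count (3 in the check above), which is the property the pseudo count is built around.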

lake4790k commented on August 15, 2024

Kind of finished a separate module with the native probability tree for 8-bit screens. Was not easy, but probably now comes the difficult part... For example, the probability of the screen is the product of the probabilities of the pixels. Different implementations (CTW and CTS) compute slightly different per-factor probabilities, but with 42 * 42 * 8 factors the resulting product can be quite different (e.g. 0.99 ^ (42 * 42 * 8) vs 0.99999 ^ (42 * 42 * 8)). Probably one would need to do exactly as DM does to make it work... let's try anyway.
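A quick Python check of how far apart those two products are, assuming (purely for illustration) that every one of the 42 * 42 * 8 factors has the same probability. Summing logs rather than multiplying probabilities is used here as a standard way to keep the numbers manageable; it is not a description of what the native module does.

```python
import math

N_FACTORS = 42 * 42 * 8  # one factor per bit of a 42 x 42 screen with 8-bit pixels


def screen_log_prob(per_factor_prob, n_factors=N_FACTORS):
    """Log of the product of n_factors identical per-factor probabilities."""
    return n_factors * math.log(per_factor_prob)


log_p_low = screen_log_prob(0.99)
log_p_high = screen_log_prob(0.99999)

# The first product is vanishingly small, the second stays close to 1.
print(N_FACTORS, log_p_low, math.exp(log_p_low))
print(N_FACTORS, log_p_high, math.exp(log_p_high))
```

A per-factor difference in the third decimal place turns into a gap of roughly 140 nats at the screen level, so small deviations between implementations are enough to change the resulting probabilities, and with them the pseudo counts, considerably.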