I read from <a href="https://github.com/yenchenlin/DeepLearningFlappyBird/blob/master/

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Why does the program use only two states? (deeplearningflappybird, closed)

guotong1988 commented on April 28, 2024

Why does the program use only two states?

from deeplearningflappybird.

Comments (8)

ColdCodeCool commented on April 28, 2024

@guotong1988 I think you should learn the basic concepts of reinforcement learning first. It is essentially dynamic programming: the state changes at every time step. Start with Markov decision processes (MDPs) and the Bellman equation.

guotong1988 commented on April 28, 2024

"the state changes from time to time"
Thank you.
Could you please take a look at my other question as well? It is also in the issues.

guotong1988 commented on April 28, 2024

Put the other way around: why not use just one state, instead of two?

ColdCodeCool commented on April 28, 2024

@guotong1988 no, you cannot use only one state. Intuitively, you have to interact with the environment by acting in order to learn anything. Once your action is done, you are in another state and you receive a reward or punishment from the environment; that feedback is what you learn from.
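
The point above, that learning needs the state before the action and the state after it, can be sketched with a tabular Q-learning update. This is a simplified stand-in, not the repository's code (which trains a neural network); `q_update`, `ALPHA`, `GAMMA`, the action names, and the state labels are all hypothetical.

```python
from collections import defaultdict

ALPHA = 0.5                 # learning rate (hypothetical value)
GAMMA = 0.9                 # discount factor (hypothetical value)
ACTIONS = ("noop", "flap")  # hypothetical action names

# Maps (state, action) -> estimated return; unseen pairs default to 0.0.
q = defaultdict(float)


def q_update(q, s, a, r, s_next, terminal):
    """Update Q(s, a) from one transition (s, a, r, s_next)."""
    # The target needs the next state: reward plus the best value reachable from it.
    best_next = 0.0 if terminal else max(q[(s_next, b)] for b in ACTIONS)
    target = r + GAMMA * best_next
    q[(s, a)] += ALPHA * (target - q[(s, a)])
    return q[(s, a)]


# One transition: in state "s0" the agent flaps, gets reward 1.0, lands in "s1".
first = q_update(q, "s0", "flap", 1.0, "s1", terminal=False)
# A crash from "s1": terminal transition with a negative reward.
second = q_update(q, "s1", "noop", -1.0, "s2", terminal=True)
```

Without `s_next` there is nothing to compute the target from, which is why each stored experience carries two states.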

ColdCodeCool commented on April 28, 2024

@guotong1988 for a comprehensive understanding, you should learn MDP theory first.

guotong1988 commented on April 28, 2024

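The key point is that these two states are adjacent.
In other words, what happens in the second state is determined by the previous several steps, isn't it?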

ColdCodeCool commented on April 28, 2024

@guotong1988 as I said, you really need to learn MDPs first. The Markov property says that the current state captures all relevant information from the history, so the future state depends only on the current state. In mathematical form: P[s_{t+1} | s_t] = P[s_{t+1} | s_1, ..., s_t].
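
A toy illustration of the Markov property, assuming a made-up falling-bird model (not the game's actual physics; `step` and `GRAVITY` are hypothetical): when the state is (position, velocity), the next state is a function of the current state alone, whereas position by itself is not enough.

```python
GRAVITY = 1  # hypothetical constant for this toy model


def step(pos, vel):
    """One tick: the next (pos, vel) depends only on the current (pos, vel)."""
    vel = vel + GRAVITY
    return pos + vel, vel


# Two different histories can reach the same position with different velocities.
state_a = (10, 0)  # bird at height 10, momentarily at rest
state_b = (10, 3)  # bird at height 10, already moving

next_a = step(*state_a)
next_b = step(*state_b)

# Same position, different futures: position alone is not a Markov state.
same_position = state_a[0] == state_b[0]
different_future = next_a[0] != next_b[0]
```

Here `same_position` and `different_future` are both true, so an observation that drops velocity loses information the future depends on.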

guotong1988 commented on April 28, 2024

The answer: one state contains 4 frames.
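
To make that concrete, here is a minimal sketch of frame stacking. It is not the repository's code: `FrameStack`, `reset`, and `step` are hypothetical names, the frames are placeholder strings rather than images, and repeating the first frame at reset is an assumption of this sketch.

```python
from collections import deque


class FrameStack:
    """Keeps the most recent k frames and exposes them as one state."""

    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, first_frame):
        # No history exists at episode start, so repeat the first frame k times.
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(first_frame)
        return self.state()

    def step(self, new_frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(new_frame)
        return self.state()

    def state(self):
        # Return a tuple snapshot so stored states are not changed by later frames.
        return tuple(self.frames)


stack = FrameStack(k=4)
s_t = stack.reset("f0")
for name in ("f1", "f2", "f3"):
    s_t = stack.step(name)
s_t1 = stack.step("f4")

print(s_t)   # ('f0', 'f1', 'f2', 'f3')
print(s_t1)  # ('f1', 'f2', 'f3', 'f4')
```

The two adjacent states share three frames, so together they cover five consecutive frames of history rather than two.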
