Comments (8)
@guotong1988 I think you should learn the basic concepts of reinforcement learning first. It is basically a dynamic programming problem, and the state changes from time to time. You'd better learn about Markov decision processes and the Bellman equation first.
from deeplearningflappybird.
the state changes from time to time
thank you
Could you please have a look at my other question? Thanks!
the question is also in the issues
Thinking about it the other way around: why not use just one state, rather than two states?
@guotong1988 No, you cannot use only one state. Intuitively, you have to interact with the environment by acting in order to learn anything. Once your action is done, you are in another state and you receive a reward or punishment from the environment, and that is what you learn from.
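To illustrate why a learning step needs both the state before the action and the state after it, here is a minimal sketch of a tabular Q-learning update. It is not the code from this repository: the names `q_update`, `ACTIONS`, `GAMMA`, `ALPHA` and the string state labels are assumptions made for the example.

```python
from collections import defaultdict

GAMMA = 0.99      # discount factor (assumed value)
ALPHA = 0.1       # learning rate (assumed value)
ACTIONS = (0, 1)  # hypothetical encoding: 0 = do nothing, 1 = flap


def q_update(q, s, a, r, s_next, terminal):
    """Apply one Q-learning step to the transition (s, a, r, s_next).

    The target uses the reward r plus the discounted best value of the
    NEXT state, so the update cannot be computed from s alone.
    """
    best_next = 0.0 if terminal else max(q[(s_next, b)] for b in ACTIONS)
    target = r + GAMMA * best_next
    q[(s, a)] += ALPHA * (target - q[(s, a)])
    return q[(s, a)]


# Q-table with every value initialised to 0.0
q = defaultdict(float)

# One transition: in state "s0" the agent flaps, gets reward 1.0, lands in "s1"
new_value = q_update(q, s="s0", a=1, r=1.0, s_next="s1", terminal=False)
```

The pair `(s, s_next)` is what the thread calls the two states: the reward and the value of `s_next` are the feedback that corrects the estimate for `s`.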
@guotong1988 For a comprehensive understanding, you should learn MDP theory first.
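The key point is that these two states are right next to each other.
In other words, what happens in the second state is determined by the previous several steps.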
@guotong1988 Like I said, you really need to learn MDP theory first. The Markov property says that the current state captures all relevant information from the history, so the future state depends only on the current state. In mathematical form, P[s_{t+1}|s_{t}] = P[s_{t+1}|s_1,...,s_t].
The answer: one state contains 4 frames.
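As a rough illustration of what "one state contains 4 frames" can look like, the sketch below builds each state as a sliding window over the most recent frames. It is an assumption-laden example, not the repository's implementation: `initial_state`, `next_state` and `STACK_SIZE` are hypothetical names, and the frames are placeholder strings rather than images.

```python
from collections import deque

STACK_SIZE = 4  # frames per state, as stated in the answer above


def initial_state(first_frame):
    """Start an episode by repeating the first frame STACK_SIZE times."""
    return deque([first_frame] * STACK_SIZE, maxlen=STACK_SIZE)


def next_state(state, new_frame):
    """Build s_{t+1} from s_t: drop the oldest frame, append the newest.

    A copy is returned so that s_t itself is left unchanged.
    """
    nxt = deque(state, maxlen=STACK_SIZE)
    nxt.append(new_frame)
    return nxt


# Placeholder frames; in practice these would be preprocessed game images.
s_t = initial_state("f0")
s_t1 = next_state(s_t, "f1")
s_t2 = next_state(s_t1, "f2")

print(list(s_t1))  # ['f0', 'f0', 'f0', 'f1']
print(list(s_t2))  # ['f0', 'f0', 'f1', 'f2']
```

Two adjacent states built this way share three of their four frames. Because each state already carries a short history, information such as the bird's direction of motion sits inside the state itself, which is one way to reconcile "the second state depends on the previous several steps" with the Markov property.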