Comments (2)
The reward should be a function of only the previous state and action. To get good rewards, you have to plan ahead to get into states where there are good actions to take.
from gym.
Can we come back to this question? I think the reward is actually a function of the current state and action, not previous. A few examples from the simpler environments where it's really obvious:
Also, I'd argue that returning the reward due to the previous action is conceptually problematic: for the first step() call there is no previous action, and the last action's reward would end up being discarded since there's no next step.
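To make the timing concrete, here is a minimal sketch under stated assumptions. `ChainEnv` and `rollout` are hypothetical names for a toy environment with a gym-style `reset()`/`step()` interface. They are not part of gym, and the reward rule is made up for illustration. The reward returned by `step(action)` is computed from the state the agent was in when it chose `action`. Seen from the returned observation that is the "previous" state; seen from the caller it is the "current" one.

```python
class ChainEnv:
    """Hypothetical toy environment (not part of gym) with a gym-style interface."""

    def __init__(self, length=3):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # The reward is a function of the state *before* the transition
        # and of the action passed to this call.
        state_before = self.state
        reward = 1.0 if (state_before == self.length - 1 and action == 1) else 0.0
        # Apply the transition: action 1 moves one cell forward, anything else stays.
        if action == 1:
            self.state = min(self.state + 1, self.length)
        done = self.state == self.length
        return self.state, reward, done, {}


def rollout(env, policy):
    """Collect (state, action, reward, next_state) tuples for one episode."""
    obs = env.reset()
    transitions = []
    done = False
    while not done:
        action = policy(obs)
        next_obs, reward, done, _ = env.step(action)
        # The reward is stored with the (obs, action) pair that produced it.
        transitions.append((obs, action, reward, next_obs))
        obs = next_obs
    return transitions


transitions = rollout(ChainEnv(), lambda obs: 1)
for t in transitions:
    print(t)
```

With this convention every `step()` call, including the first, yields a reward for the action just taken, and the final action's reward is returned together with `done`, so nothing is discarded.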