Comments (2)
@zita-ch
Each reward scale is multiplied by env.dt
(default is 0.02). The rewards are then summed in the env.episode_sums
dictionary and printed here.
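A minimal sketch of that scaling and accumulation, in plain Python rather than the actual legged_gym code. The helper names `scale_rewards` and `accumulate` and the reward term `tracking_lin_vel` are hypothetical and used only for illustration; only `dt = 0.02` and the idea of a per-term `episode_sums` dictionary come from the explanation above.

```python
# Hypothetical stand-alone sketch; not the legged_gym implementation.
DT = 0.02  # env.dt, default value mentioned above


def scale_rewards(reward_scales, dt=DT):
    """Multiply every reward scale by dt once, before training starts."""
    return {name: scale * dt for name, scale in reward_scales.items()}


def accumulate(episode_sums, raw_terms, scaled_scales):
    """Add one step of scaled reward terms to the running episode sums."""
    for name, value in raw_terms.items():
        episode_sums[name] = episode_sums.get(name, 0.0) + value * scaled_scales[name]
    return episode_sums


# Assumed example: one term with scale 1.0 that returns 1.0 at every step.
scales = scale_rewards({"tracking_lin_vel": 1.0})
sums = {}
for _ in range(1000):  # 1000 steps * 0.02 s = one full 20 s episode
    accumulate(sums, {"tracking_lin_vel": 1.0}, scales)

print(sums["tracking_lin_vel"])  # close to 20.0, up to floating-point rounding
```

With the scale folded into dt, a term that returns 1.0 every step sums to roughly the episode duration in seconds.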
from legged_gym.
I got it. Thanks for your detailed explanation.
So we first average the episode reward over the envs that were reset, and then divide that mean by max_episode_length (in seconds, default 20). At the very beginning of training, however, most of the reset envs do not survive to the last step of the horizon, so the episode rewards during the first few dozen iterations are very small, e.g., reward * 0.02/20.
I believe this makes sense, but it can be a little misleading. Typically the episode reward is not averaged over the horizon length, especially when there are resets, although this does not affect how the learning progress is displayed. Sometimes we may just want to observe the value averaged over the real episode length (which can be much smaller than max_episode_length). I suggest that in the next update you add some explanation in the tensorboard tab or in code comments, or add another tensorboard tab.
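To make the difference concrete, here is a small sketch comparing the two normalizations under assumed numbers. The function names `logged_value` and `per_actual_second` are hypothetical, and the example envs (falling after 50 steps, earning 1.0 per step with scale 1.0) are made up for illustration; only dt = 0.02 and the 20 s horizon come from the thread.

```python
# Hypothetical comparison; not the legged_gym logging code.
DT = 0.02                    # env.dt
MAX_EPISODE_LENGTH_S = 20.0  # horizon in seconds


def logged_value(episode_sums, max_episode_length_s=MAX_EPISODE_LENGTH_S):
    """Mean episode sum over the reset envs, divided by the horizon length."""
    return sum(episode_sums) / len(episode_sums) / max_episode_length_s


def per_actual_second(episode_sums, episode_steps, dt=DT):
    """Mean over the reset envs of episode sum divided by the real episode duration."""
    rates = [s / (n * dt) for s, n in zip(episode_sums, episode_steps)]
    return sum(rates) / len(rates)


# Assumed early-training case: two envs fall after 50 steps (1 s),
# each earning a raw reward of 1.0 per step with scale 1.0.
steps = [50, 50]
sums = [n * 1.0 * DT for n in steps]  # dt-scaled sums, about 1.0 each

early = logged_value(sums)             # about 0.05: looks tiny
real = per_actual_second(sums, steps)  # about 1.0: reward rate while alive
print(early, real)
```

The first number shrinks whenever episodes end early, while the second stays at the per-second rate actually achieved, which is the quantity I would sometimes like to see in a separate tab.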
Your work is quite inspiring, and I am now sticking with your DRL framework.
Best wishes ;)