Comments (1)
The instability of iLQR comes from the eta update.
Eta penalizes one of the KL divergence terms in iLQR, and it is tuned by comparing kl_div with kl_step.
The problem is:
- when mc cost increases ==> new_mult < 1 ==> step decreases (since the actual improvement becomes much smaller than the predicted improvement, the algorithm tries to reduce the step size)
- step decreases ==> con > 0 (since kl_step = step * kl_base, the theoretical bound becomes stricter. You referred to kl_step in the code as epsilon, which is not correct, since epsilon controls the other KL divergence term)
- con > 0 ==> eta increases (since the stricter constraint means the current KL divergence now violates it, more penalty is added, i.e. eta increases)
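The chain above can be sketched in a few lines of Python. This is a simplified sketch written from my reading of the code, not the actual gps implementation: the function names `new_step_mult` and `eta_direction`, the clamp values, and the example numbers are all assumptions made for illustration.

```python
# Simplified sketch of the feedback chain described above.
# Function names, clamp values, and the numbers below are illustrative
# assumptions, not the exact gps implementation.

def new_step_mult(predicted_impr, actual_impr, step_mult,
                  min_step=0.01, max_step=10.0):
    """Rescale the step multiplier from predicted vs. actual improvement."""
    # Assumed quadratic model of improvement in the KL step, whose
    # optimum gives new_mult = pred / (2 * (pred - actual)).
    new_mult = predicted_impr / (2.0 * max(1e-4, predicted_impr - actual_impr))
    new_mult = max(0.1, min(5.0, new_mult))  # assumed clamp range
    return max(min(new_mult * step_mult, max_step), min_step)


def eta_direction(kl_div, kl_step):
    """Return which way eta moves, given the constraint violation con."""
    con = kl_div - kl_step
    return "increase" if con > 0 else "decrease"


kl_base = 1.0
old_step = 1.0
kl_div = old_step * kl_base  # assume the last policy sat right on the old bound

# The cost went up: the actual improvement is negative, the predicted one positive.
step = new_step_mult(predicted_impr=1.0, actual_impr=-1.0, step_mult=old_step)
kl_step = step * kl_base     # the bound shrinks together with the step

direction = eta_direction(kl_div, kl_step)
```

With these numbers the step shrinks from 1.0 to 0.25, so the same kl_div now exceeds kl_step and eta is pushed up, which is the behaviour questioned in this comment.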
In summary:
when the actual cost increases ==> the penalty on the KL divergence increases.
This is not reasonable, since putting more effort into the KL divergence term makes the loss term even larger. After several iterations, the robot started waving around wildly.
The first several iterations are normal, though. I guess the scaling of the improvement in the new_mult calculation matters.
Any comment on this?