Comments (3)
I'd be more than happy to send a pull request!
Please do =) It is indeed missing.
I think you just need to extend the PPO sampler (have a `sample_ppo_lstm()` that will call `sample_ppo_params()`).
`"ppo_lstm": sample_ppo_params,` is actually already a quick and good solution.
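Roughly, the two variants could look like the sketch below. It is only an illustration: `FakeTrial`, the simplified body of `sample_ppo_params`, the layer sizes, the `HYPERPARAMS_SAMPLER` dictionary name and the `lstm_hidden_size` choices are assumptions made to keep the example runnable, not the zoo's actual code.

```python
import random
from typing import Any, Callable, Dict, Sequence


class FakeTrial:
    """Tiny stand-in for an Optuna trial, only here to keep the sketch runnable."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self.params: Dict[str, Any] = {}

    def suggest_categorical(self, name: str, choices: Sequence[Any]) -> Any:
        value = self._rng.choice(list(choices))
        self.params[name] = value
        return value


def sample_ppo_params(trial: FakeTrial) -> Dict[str, Any]:
    # Heavily simplified placeholder for the existing PPO sampler.
    net_arch = trial.suggest_categorical("net_arch", ["medium", "large"])
    layer_sizes = {"medium": [256, 256], "large": [512, 512]}  # illustrative sizes
    return {"policy_kwargs": {"net_arch": layer_sizes[net_arch]}}


def sample_ppo_lstm_params(trial: FakeTrial) -> Dict[str, Any]:
    # Reuse the PPO search space, then add LSTM-specific entries on top.
    hyperparams = sample_ppo_params(trial)
    # Assumed search values for the recurrent policy's hidden size.
    lstm_hidden_size = trial.suggest_categorical("lstm_hidden_size", [64, 128, 256])
    hyperparams["policy_kwargs"]["lstm_hidden_size"] = lstm_hidden_size
    return hyperparams


HYPERPARAMS_SAMPLER: Dict[str, Callable[[FakeTrial], Dict[str, Any]]] = {
    "ppo": sample_ppo_params,
    # Quick fix would be: "ppo_lstm": sample_ppo_params,
    "ppo_lstm": sample_ppo_lstm_params,
}
```

The quick fix only needs the one dictionary entry; the wrapper is worth it once there are LSTM-only parameters to tune.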
from rl-baselines3-zoo.
In my implementation, where `sample_ppo_lstm_params()` calls `sample_ppo_params()`, I am running into a limitation of Optuna's `trial.suggest_categorical()`.
I am sampling `net_arch` from `["tiny", "medium", "large"]` for the LSTM, but vanilla PPO samples it from only `["medium", "large"]`.
Optuna cannot suggest the second categorical because it does not support multiple parameters with the same name but different value spaces, and it does not override the first one either. There are some discussions about implementing this, but they are old.
Using a new name for the LSTM's `net_arch` (e.g. `net_arch_lstm`) wastes search space and is an inelegant solution.
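To make the clash concrete, here is a toy model of the behaviour I am describing. `ToyTrial` is a made-up stand-in, not Optuna's class: it only mimics the rule that a categorical parameter name is tied to one fixed set of choices, and the exact exception type and message in Optuna may differ.

```python
from typing import Any, Dict, Sequence, Tuple


class ToyTrial:
    """Hypothetical trial that ties each parameter name to one set of choices."""

    def __init__(self) -> None:
        self._choices: Dict[str, Tuple[Any, ...]] = {}
        self.params: Dict[str, Any] = {}

    def suggest_categorical(self, name: str, choices: Sequence[Any]) -> Any:
        choices = tuple(choices)
        if name in self._choices:
            if self._choices[name] != choices:
                raise ValueError(
                    f"'{name}' was already suggested with choices {self._choices[name]}"
                )
            # Same name, same choices: return the value picked earlier.
            return self.params[name]
        self._choices[name] = choices
        self.params[name] = choices[0]  # a real sampler would pick among the choices
        return self.params[name]


def sample_ppo_params(trial: ToyTrial) -> Dict[str, Any]:
    return {"net_arch": trial.suggest_categorical("net_arch", ["medium", "large"])}


def sample_ppo_lstm_params(trial: ToyTrial) -> Dict[str, Any]:
    hyperparams = sample_ppo_params(trial)
    # Second suggestion under the same name, but with a wider set of choices.
    hyperparams["net_arch"] = trial.suggest_categorical(
        "net_arch", ["tiny", "medium", "large"]
    )
    return hyperparams


try:
    sample_ppo_lstm_params(ToyTrial())
    clash = False
except ValueError:
    clash = True
```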
I have three possible solutions to this:
1. Not including `"tiny"` in `sample_ppo_lstm_params()`:
   - does not impact the implementation of PPO
   - some environments benefit from a smaller neural network within the LSTM and do not require bigger neural networks
2. Including `"tiny"` in `sample_ppo_params()`:
   - simplest, most straightforward implementation
   - increases the search space for PPO, although if performance is not adequate it will not be explored
3. Passing a flag to `sample_ppo_params()`:
   - retains all old functionality and does not impact PPO's search space
   - possibly confusing to read and understand in one go
All three assume that I am extending the function and updating `"policy_kwargs"` in the returned dictionary; if I just write a new function, these issues simply do not exist.
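For what it's worth, solution 3 could be as small as the sketch below. The `include_tiny` flag name, the simplified sampler body and the recording stand-in for the trial are all placeholders I made up for illustration, not existing zoo or Optuna code.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def sample_ppo_params(trial: Any, include_tiny: bool = False) -> Dict[str, Any]:
    # `include_tiny` is a hypothetical flag; the default keeps PPO's search space unchanged.
    net_arch_choices = ["medium", "large"]
    if include_tiny:
        net_arch_choices = ["tiny"] + net_arch_choices
    net_arch = trial.suggest_categorical("net_arch", net_arch_choices)
    return {"policy_kwargs": {"net_arch": net_arch}}


def sample_ppo_lstm_params(trial: Any) -> Dict[str, Any]:
    # "net_arch" is suggested exactly once per call, so there is no name clash.
    return sample_ppo_params(trial, include_tiny=True)


# Stand-in trial that records the choices it is offered and returns the first one.
offered: List[List[str]] = []


def _suggest(name: str, choices: List[str]) -> str:
    offered.append(list(choices))
    return choices[0]


trial = SimpleNamespace(suggest_categorical=_suggest)
ppo = sample_ppo_params(trial)
lstm = sample_ppo_lstm_params(trial)
```

Solutions 1 and 2 are the same code without the flag, with `"tiny"` either left out everywhere or added to the shared list.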
Which solution do you think is ideal, @araffin? Do you have any suggestions of your own? I can look into implementing them.
from rl-baselines3-zoo.
Solution 1 or 2 are fine for me.
from rl-baselines3-zoo.