🐛 Bug There is no direct way to optimize hyperparameters for ppo_

I'd be more than happy to send a pull request!

[Bug]: ppo_lstm not implemented in hyperparams_opt.py about rl-baselines3-zoo HOT 3 CLOSED

technocrat13 commented on June 19, 2024

[Bug]: ppo_lstm not implemented in hyperparams_opt.py

from rl-baselines3-zoo.

Comments (3)

araffin commented on June 19, 2024

I'd be more than happy to send a pull request!

Please do =) It is indeed missing.

I think you just need to extend the ppo sampler (have a sample_ppo_lstm() that will call sample_ppo_params()).
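A minimal sketch of that suggestion, under stated assumptions: `StubTrial` is a hypothetical stand-in for an Optuna trial, `sample_ppo_params()` is reduced to a single parameter, and the `trial`-only signature, the `HYPERPARAMS_SAMPLER` registry name, and the choice of `enable_critic_lstm` as the extra LSTM parameter are illustrative rather than taken from the repository.

```python
import random


class StubTrial:
    """Hypothetical stand-in for an Optuna trial; only what this sketch needs."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.params = {}  # parameter name -> sampled value

    def suggest_categorical(self, name, choices):
        # Sample once per name and return the same value on repeated calls.
        if name not in self.params:
            self.params[name] = self._rng.choice(list(choices))
        return self.params[name]


def sample_ppo_params(trial):
    """Simplified placeholder for the existing PPO sampler."""
    net_arch = trial.suggest_categorical("net_arch", ["medium", "large"])
    return {"policy_kwargs": {"net_arch": net_arch}}


def sample_ppo_lstm_params(trial):
    """Reuse the PPO sampler, then add one LSTM-specific parameter."""
    hyperparams = sample_ppo_params(trial)
    # Illustrative extra parameter; a real sampler might add several.
    enable_critic_lstm = trial.suggest_categorical("enable_critic_lstm", [False, True])
    hyperparams["policy_kwargs"]["enable_critic_lstm"] = enable_critic_lstm
    return hyperparams


# Registry mapping algorithm names to samplers (name assumed for illustration).
HYPERPARAMS_SAMPLER = {
    "ppo": sample_ppo_params,
    "ppo_lstm": sample_ppo_lstm_params,
}

params = HYPERPARAMS_SAMPLER["ppo_lstm"](StubTrial())
```

Because the wrapper only adds keys to the dictionary returned by the PPO sampler, any later change to the PPO search space carries over to the LSTM variant without further edits.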

"ppo_lstm": sample_ppo_params, is actually already a quick and good solution.

technocrat13 commented on June 19, 2024

In my implementation, where sample_ppo_lstm_params() calls sample_ppo_params(), I am running into a limitation of Optuna's trial.suggest_categorical().

I am sampling net_arch from ["tiny", "medium", "large"] for the LSTM, but vanilla PPO samples it only from ["medium", "large"].

Optuna cannot suggest this categorical variable because it does not support multiple parameters with the same name but different value spaces; it does not even override the earlier definition. There have been some discussions about supporting this, but they are old.
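The conflict can be illustrated without Optuna itself. In the sketch below, `StrictTrial` is a hypothetical stand-in, not Optuna's class: it assumes that a parameter name is bound to the first set of choices it is suggested with and that a later call with different choices is rejected. Modelling the rejection as a `ValueError` is also an assumption, so check the behaviour of your Optuna version before relying on it.

```python
import random


class StrictTrial:
    """Hypothetical stand-in for an Optuna trial (not Optuna's real class).

    Assumption: the first call fixes the value space for a parameter name,
    and a later call with a different value space is rejected.
    """

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._choices = {}  # parameter name -> tuple of allowed choices
        self.params = {}    # parameter name -> sampled value

    def suggest_categorical(self, name, choices):
        choices = tuple(choices)
        if name in self._choices:
            if self._choices[name] != choices:
                raise ValueError(
                    f"{name!r} was already suggested with {self._choices[name]}"
                )
            return self.params[name]
        self._choices[name] = choices
        self.params[name] = self._rng.choice(choices)
        return self.params[name]


def sample_ppo_params(trial):
    # Simplified: a real sampler draws many more hyperparameters.
    return {"net_arch": trial.suggest_categorical("net_arch", ["medium", "large"])}


def sample_ppo_lstm_params(trial):
    params = sample_ppo_params(trial)
    # Reusing the name "net_arch" with a wider value space triggers the conflict.
    params["net_arch"] = trial.suggest_categorical(
        "net_arch", ["tiny", "medium", "large"]
    )
    return params


try:
    sample_ppo_lstm_params(StrictTrial())
    conflict = False
except ValueError:
    conflict = True
```

Under these assumptions the second call fails, so the wrapper cannot widen `net_arch` after the PPO sampler has already claimed that name.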

Using a new name for the LSTM's net_arch (e.g. net_arch_lstm) wastes search space and is an inelegant solution.

I have three possible solutions to this:

1. Not including "tiny" in sample_ppo_lstm_params():
- does not impact the implementation of PPO
- however, some environments benefit from a smaller neural network within the LSTM and do not require bigger ones
2. Including "tiny" in sample_ppo_params():
- simplest, most straightforward implementation
- increases the search space for PPO, although if performance is not adequate it will not be explored
3. Passing a flag to sample_ppo_params():
- retains all old functionality and does not impact PPO's search space
- possibly confusing to read and understand in one go
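The flag-based option could look roughly like the sketch below. The `include_tiny` parameter name, the `trial`-only signature, and the `RecordingTrial` stand-in are all hypothetical; the stand-in only records which choices were requested so the effect of the flag is visible.

```python
import random


class RecordingTrial:
    """Hypothetical stand-in for an Optuna trial that records each request."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.calls = []  # (name, choices) pairs, in call order

    def suggest_categorical(self, name, choices):
        self.calls.append((name, tuple(choices)))
        return self._rng.choice(list(choices))


def sample_ppo_params(trial, include_tiny=False):
    """Simplified PPO sampler; the flag widens the net_arch search space."""
    choices = ["tiny", "medium", "large"] if include_tiny else ["medium", "large"]
    # "net_arch" is suggested exactly once per trial, so no name is reused.
    net_arch = trial.suggest_categorical("net_arch", choices)
    return {"policy_kwargs": {"net_arch": net_arch}}


def sample_ppo_lstm_params(trial):
    """LSTM variant: same sampler, with the smaller architecture enabled."""
    return sample_ppo_params(trial, include_tiny=True)


ppo_trial = RecordingTrial()
lstm_trial = RecordingTrial()
ppo_params = sample_ppo_params(ppo_trial)
lstm_params = sample_ppo_lstm_params(lstm_trial)
```

With the flag left at its default, vanilla PPO keeps its original two choices. Making the three-element list the unconditional default instead would correspond to including "tiny" for PPO as well.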

All three assume that I am extending the function and updating the "policy_kwargs" entry in the returned dictionary. If I just write a new function, these issues simply do not exist.

Which solution do you think is ideal, @araffin? And do you have any suggestions of your own? I can look into implementing them.

araffin commented on June 19, 2024

Solution 1 or 2 are fine for me.
