Comments (3)

araffin commented on June 19, 2024

I'd be more than happy to send a pull request!

Please do =) It is indeed missing.

I think you just need to extend the PPO sampler (add a sample_ppo_lstm() that calls sample_ppo_params()).

Even "ppo_lstm": sample_ppo_params, is already a quick and good solution.
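For readers following along, the suggestion above could be sketched roughly as follows. This is a minimal illustration, not the zoo's actual code: StubTrial is a hypothetical stand-in for an Optuna trial so the snippet runs on its own, the body of sample_ppo_params() is a simplified placeholder, the registry name HYPERPARAMS_SAMPLER and the choice lists are assumptions, and the thread's sample_ppo_lstm() is written here as sample_ppo_lstm_params().

```python
import random
from typing import Any, Callable, Dict, Sequence


class StubTrial:
    """Hypothetical stand-in for an Optuna trial (only suggest_categorical)."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self.params: Dict[str, Any] = {}

    def suggest_categorical(self, name: str, choices: Sequence[Any]) -> Any:
        # Sample once per name, then return the stored value on later calls.
        if name not in self.params:
            self.params[name] = self._rng.choice(list(choices))
        return self.params[name]


def sample_ppo_params(trial: StubTrial) -> Dict[str, Any]:
    # Simplified placeholder for the existing PPO sampler (assumed contents).
    batch_size = trial.suggest_categorical("batch_size", [64, 128, 256])
    net_arch = trial.suggest_categorical("net_arch", ["medium", "large"])
    return {"batch_size": batch_size, "policy_kwargs": {"net_arch": net_arch}}


def sample_ppo_lstm_params(trial: StubTrial) -> Dict[str, Any]:
    # Reuse the PPO search space, then add LSTM-specific policy arguments.
    hyperparams = sample_ppo_params(trial)
    hyperparams["policy_kwargs"].update(
        {
            "lstm_hidden_size": trial.suggest_categorical(
                "lstm_hidden_size", [64, 128, 256]
            ),
            "enable_critic_lstm": trial.suggest_categorical(
                "enable_critic_lstm", [False, True]
            ),
        }
    )
    return hyperparams


# Registry mapping algorithm names to samplers (name assumed for illustration).
HYPERPARAMS_SAMPLER: Dict[str, Callable[[StubTrial], Dict[str, Any]]] = {
    "ppo": sample_ppo_params,
    "ppo_lstm": sample_ppo_lstm_params,
}

if __name__ == "__main__":
    print(HYPERPARAMS_SAMPLER["ppo_lstm"](StubTrial(seed=0)))
```

Mapping "ppo_lstm" straight to sample_ppo_params would be the quicker variant mentioned above; the wrapper only matters once LSTM-specific arguments need to be searched as well.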

from rl-baselines3-zoo.

technocrat13 commented on June 19, 2024

In my implementation, where sample_ppo_lstm_params() calls sample_ppo_params(), I am running into a limitation of Optuna's trial.suggest_categorical().

I am sampling net_arch from ["tiny", "medium", "large"] for the LSTM, but vanilla PPO samples it only from ["medium", "large"].

Optuna cannot suggest the categorical variable in this case: it does not support multiple parameters with the same name but different value spaces, and the second definition does not even override the first. There are some discussions about supporting this, but they are old.

Using a new name for the LSTM's net_arch (e.g. net_arch_lstm) wastes search space and is an inelegant solution.

I have three possible solutions to this:

  1. Not including "tiny" in sample_ppo_lstm_params():

    • does not impact implementation of PPO
    • some environments benefit from a smaller neural network within the lstm and do not require bigger neural networks
  2. Including "tiny" in sample_ppo_params():

    • simplest, most straightforward implementation
    • increases the search space for PPO, although if performance is not adequate it will not be explored
  3. Passing a flag to sample_ppo_params():

    • retains all old functionality and does not impact PPO's search space
    • possibly confusing to read and understand in one go

All three assume that I am extending the function and updating "policy_kwargs" in the returned dictionary. If I just write a new function, these issues do not exist.
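To make solution 3 concrete, here is a rough sketch under stated assumptions. StubTrial is a hypothetical stand-in that rejects a changed choice list for an already-used name, which mimics the limitation described above rather than reproducing Optuna's code. The allow_tiny flag, the NET_ARCHS layer sizes and the lstm_hidden_size choices are made up for illustration.

```python
from typing import Any, Dict, List, Sequence


class StubTrial:
    """Hypothetical stand-in for an Optuna trial: one choice list per name."""

    def __init__(self) -> None:
        self.choices: Dict[str, List[Any]] = {}

    def suggest_categorical(self, name: str, choices: Sequence[Any]) -> Any:
        known = self.choices.setdefault(name, list(choices))
        if known != list(choices):
            # Models the conflict: same name, different value space.
            raise ValueError(f"{name!r} already has a different value space")
        return known[0]  # always the first choice, to stay deterministic


# Illustrative layer sizes only; not taken from the zoo.
NET_ARCHS = {"tiny": [64], "medium": [256, 256], "large": [400, 300]}


def sample_ppo_params(trial: StubTrial, allow_tiny: bool = False) -> Dict[str, Any]:
    # The flag fixes the choice list before the single suggest call,
    # so "net_arch" is only ever declared with one value space per caller.
    choices = ["tiny", "medium", "large"] if allow_tiny else ["medium", "large"]
    net_arch_type = trial.suggest_categorical("net_arch", choices)
    return {"policy_kwargs": {"net_arch": NET_ARCHS[net_arch_type]}}


def sample_ppo_lstm_params(trial: StubTrial) -> Dict[str, Any]:
    # Extend the PPO sampler and update "policy_kwargs" in the returned dict.
    hyperparams = sample_ppo_params(trial, allow_tiny=True)
    hyperparams["policy_kwargs"]["lstm_hidden_size"] = trial.suggest_categorical(
        "lstm_hidden_size", [64, 128, 256]
    )
    return hyperparams


if __name__ == "__main__":
    print(sample_ppo_params(StubTrial()))
    print(sample_ppo_lstm_params(StubTrial()))
```

Solutions 1 and 2 amount to hard-coding one of the two choice lists instead of passing the flag, which keeps the signature of sample_ppo_params() unchanged.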

Which solution do you think is ideal, @araffin? Do you have any suggestions of your own? I can look into implementing them.

araffin commented on June 19, 2024

Solution 1 or 2 are fine for me.
