
maddpg-rllib's Introduction

MADDPG in Ray/RLlib

This implementation of MADDPG is recommended for research purposes only. If you want to actually learn something, use parameter sharing.
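Parameter sharing here means training every agent with one shared policy instead of one policy per agent. A minimal sketch of what that could look like, assuming the dict-based `multiagent` config (`policies` plus `policy_mapping_fn`) accepted by older RLlib versions. The helper name `shared_policy_multiagent_config` and the policy id `shared_policy` are hypothetical, and the `None` spaces are placeholders for a real environment's observation and action spaces.

```python
def shared_policy_multiagent_config(obs_space, act_space, policy_id="shared_policy"):
    """Build a `multiagent` config section in which all agents share one policy.

    Hypothetical helper: the layout follows the older dict-based RLlib config,
    where each policy is a (policy_cls, obs_space, act_space, config) tuple.
    """
    return {
        # A single policy entry; None as policy_cls means "use the trainer's default".
        "policies": {policy_id: (None, obs_space, act_space, {})},
        # Every agent id maps to the same policy, so all agents share its weights.
        # Extra positional/keyword arguments are accepted and ignored.
        "policy_mapping_fn": lambda agent_id, *args, **kwargs: policy_id,
    }


# Placeholder spaces; pass the environment's real spaces in practice.
multiagent = shared_policy_multiagent_config(obs_space=None, act_space=None)
mapping = multiagent["policy_mapping_fn"]
print(mapping("agent_0"), mapping("agent_1"))
```

Sharing one policy only makes sense when the agents have identical observation and action spaces.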

Notes

  • This was forked from wsjeon's original repo because the original is no longer maintained.

  • The code in OpenAI/MADDPG was refactored for RLlib, and test results are given in ./plots.

    • It was tested on 7 scenarios of OpenAI/Multi-Agent Particle Environment (MPE):
      • simple, simple_adversary, simple_crypto, simple_push, simple_speaker_listener, simple_spread, simple_tag
      • RLlib MADDPG shows performance similar to OpenAI MADDPG on all 7 scenarios except simple_crypto.
    • Hyperparameters follow the original settings in OpenAI/MADDPG.
  • Empirically, removing lz4 makes running much faster. I guess this is due to the small observations in MPE.
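One plausible reading of the lz4 note: lz4 is a compression library, and compressing very small payloads can cost more than it saves. The sketch below illustrates that size effect only. It uses the standard-library `zlib` as a stand-in for lz4 (an assumption made so the example has no extra dependencies), and the observation values are made up.

```python
import struct
import zlib


def compression_report(values):
    """Pack floats as little-endian float32 and return (raw_bytes, compressed_bytes)."""
    raw = struct.pack(f"<{len(values)}f", *values)
    compressed = zlib.compress(raw)
    # Sanity check: decompression must restore the original bytes.
    assert zlib.decompress(compressed) == raw
    return len(raw), len(compressed)


# A tiny MPE-like observation (values are illustrative, not from the repo).
small_raw, small_compressed = compression_report([0.1, -0.25, 0.7, 0.33])

# A large, highly redundant payload for contrast.
large_raw, large_compressed = compression_report([0.0] * 4096)

print("small:", small_raw, "->", small_compressed)
print("large:", large_raw, "->", large_compressed)
```

For the tiny observation the compressed form is larger than the raw bytes, so the compression step is pure overhead. Timing behavior with lz4 itself may differ.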


maddpg-rllib's People

Contributors

jkterry1, wsjeon

maddpg-rllib's Issues

setproct package in requirements.txt

Hello,
I am working on getting MADDPG-RLlib to work with the Google Football environment. There is a package that I commented out during installation, setproct, which I cannot find any information on. Is this a typo that should be removed?

Gradient issue

When I set

"actor_feature_reg": None,

and run the simple env, it reports

ValueError: No gradients provided for any variable: ["<tf.Variable 'policy_0/actor/policy_0/actor_feature/dense/kernel:0' shape=(4, 64) dtype=float32_ref>", "<tf.Variable 'policy_0/actor/policy_0/actor_feature/dense/bias:0' shape=(64,) dtype=float32_ref>", "<tf.Variable 'policy_0/actor/policy_0/actor_feature/dense_1/kernel:0' shape=(64, 64) dtype=float32_ref>", "<tf.Variable 'policy_0/actor/policy_0/actor_feature/dense_1/bias:0' shape=(64,) dtype=float32_ref>", "<tf.Variable 'policy_0/actor/dense/kernel:0' shape=(64, 5) dtype=float32_ref>", "<tf.Variable 'policy_0/actor/dense/bias:0' shape=(5,) dtype=float32_ref>"].

I find this quite confusing, given the following code:

        act_n = act_ph_n.copy()
        act_n[agent_id] = act_sampler
        critic, _, _, _ = self._build_critic_network(
            obs_ph_n,
            act_n,
            obs_space_n,
            act_space_n,
            config["use_state_preprocessor"],
            config["critic_hiddens"],
            getattr(tf.nn, config["critic_hidden_activation"]),
            scope="critic")

Since act_sampler is passed into the critic here, I would expect the gradient to be computed.

Saving training videos with monitor not working

Hello! I am training RLlib's MADDPG on MPE and have monitor = True set in the config. However, no training videos seem to be saved. Is there something else I need to do to save training videos?

Is the singularity file fully up to date?

  1. According to the PR from the old repo, wsjeon#10, we might need to specify some additional apt installs, though it's not clear to me why.
  2. If possible, we should pin package versions in the singularity file, as we do in the requirements.txt file.

Versions in requirements.txt

Hi there,
I found this repo as a reference implementation for MADDPG with RLlib via their documentation.
Since the last change to this repo, many of the dependencies have changed significantly, so the versions that install from the requirements.txt file are far too new and cause a lot of issues.
Could you maybe share the exact versions that you used or is there an updated implementation somewhere that I could refer to?
Thank you in advance!
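For anyone who does have a working environment, a small sketch of how the exact installed versions could be recorded in requirements.txt style, using only the standard library. The package names passed in at the bottom are assumptions for illustration, not the repo's actual requirements list, and the helper names are hypothetical.

```python
from importlib import metadata


def installed_versions(package_names):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def as_requirements(versions):
    """Format installed packages as sorted 'name==version' lines, skipping missing ones."""
    return [
        f"{name}=={version}"
        for name, version in sorted(versions.items())
        if version is not None
    ]


if __name__ == "__main__":
    # Illustrative package names only; substitute the ones from requirements.txt.
    names = ["ray", "tensorflow", "gym", "numpy"]
    print("\n".join(as_requirements(installed_versions(names))))
```

Running `pip freeze` in the working environment gives similar output for every installed package.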

What is lz4

Hi,

removing lz4 makes running much faster

What is lz4 here?
