jkterry1 / maddpg-rllib

This project forked from wsjeon/maddpg-rllib

MADDPG in Ray/RLlib

Python 100.00%

maddpg-rllib's Introduction

MADDPG in Ray/RLlib

This implementation of MADDPG is recommended for research purposes only. If you want to actually learn something, use parameter sharing.
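
As a rough illustration of what parameter sharing means in an RLlib-style multi-agent setup, the sketch below maps every agent to a single policy so that all agents train one set of weights. The agent ids, the policy name, and the placeholder policy spec are assumptions made for illustration and are not taken from this repo. Sharing also assumes the agents have identical observation and action spaces.

```python
# Hypothetical agent ids; actual MPE scenarios may name agents differently.
AGENT_IDS = ["agent_0", "agent_1", "agent_2"]

# Assumed name for the single shared policy.
SHARED_POLICY_ID = "shared_policy"


def policy_mapping_fn(agent_id, *args, **kwargs):
    """Map every agent to the same policy id, so all agents share weights."""
    return SHARED_POLICY_ID


# Fragment shaped like the "multiagent" section of an RLlib config.
# The (None, None, None, {}) spec is a placeholder: a real run needs the
# policy class, the observation space, and the action space filled in.
multiagent_config = {
    "policies": {SHARED_POLICY_ID: (None, None, None, {})},
    "policy_mapping_fn": policy_mapping_fn,
}

if __name__ == "__main__":
    for agent_id in AGENT_IDS:
        print(agent_id, "->", policy_mapping_fn(agent_id))
```

With one entry in "policies", experience from every agent updates the same network, which is the main difference from MADDPG's one-policy-per-agent layout.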

Notes

- This was forked from wsjeon's original repo due to lack of maintenance.

- The code in OpenAI/MADDPG was refactored for RLlib, and test results are given in ./plots.
- It was tested on 7 scenarios of OpenAI/Multi-Agent Particle Environment (MPE).
  - simple, simple_adversary, simple_crypto, simple_push, simple_speaker_listener, simple_spread, simple_tag
    - RLlib MADDPG shows performance similar to OpenAI MADDPG on all of these scenarios except simple_crypto.
- Hyperparameters were set to follow the original hyperparameter setting in OpenAI/MADDPG.
- Empirically, removing lz4 makes training run much faster. I guess this is because observations in MPE are small.
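
lz4 is a fast lossless compression library. The guess above, that compression does not pay off for tiny observations, can be sketched with the standard library alone. The snippet below uses zlib as a stand-in for lz4 (an assumption, so that nothing extra needs installing) and a made-up 4-dimensional float32 observation. It only compares sizes and says nothing about how RLlib itself calls lz4.

```python
import struct
import zlib

# A hypothetical 4-dimensional float32 observation, similar in size to
# the low-dimensional vectors that MPE scenarios return.
observation = (0.1, -0.2, 0.3, 0.05)
raw = struct.pack("<4f", *observation)

# zlib stands in for lz4 here; both add fixed framing overhead per call.
compressed = zlib.compress(raw)

# The round trip is lossless, but the payload is too small to shrink.
assert zlib.decompress(compressed) == raw

print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes")
```

For a payload this small, the compressed form comes out larger than the input, so each compress and decompress call costs time without saving memory or bandwidth.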

References

- OpenAI/MADDPG
- OpenAI/Multi-Agent Particle Environment
  - wsjeon/Multi-Agent Particle Environment
    - It includes a minor change so that MPE works with recent OpenAI Gym.

maddpg-rllib's People

Forkers

wnight963 nikunj-gupta pavelcz lalithkishore31 bcs-mli nuaasgq wkkdhj gselim

maddpg-rllib's Issues

setproct package in requirements.txt

Hello,
I am working on getting MADDPG-RLlib to work with the Google Football environment. There is a package that I commented out during installation, setproct, which I cannot find any information on. Is this a typo that should be removed?

Gradient issue

When setting

"actor_feature_reg": None,

and running the simple env, it reports

ValueError: No gradients provided for any variable: ["<tf.Variable 'policy_0/actor/policy_0/actor_feature/dense/kernel:0' shape=(4, 64) dtype=float32_ref>", "<tf.Variable 'policy_0/actor/policy_0/actor_feature/dense/bias:0' shape=(64,) dtype=float32_ref>", "<tf.Variable 'policy_0/actor/policy_0/actor_feature/dense_1/kernel:0' shape=(64, 64) dtype=float32_ref>", "<tf.Variable 'policy_0/actor/policy_0/actor_feature/dense_1/bias:0' shape=(64,) dtype=float32_ref>", "<tf.Variable 'policy_0/actor/dense/kernel:0' shape=(64, 5) dtype=float32_ref>", "<tf.Variable 'policy_0/actor/dense/bias:0' shape=(5,) dtype=float32_ref>"].

I find this quite confusing, given the following code:

        act_n = act_ph_n.copy()
        act_n[agent_id] = act_sampler
        critic, _, _, _ = self._build_critic_network(
            obs_ph_n,
            act_n,
            obs_space_n,
            act_space_n,
            config["use_state_preprocessor"],
            config["critic_hiddens"],
            getattr(tf.nn, config["critic_hidden_activation"]),
            scope="critic")

Since act_sampler is substituted into act_n here, I would expect the gradient to be computed.
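
The expectation in this report can be illustrated with a toy example, independent of TensorFlow and of this repo's actual networks. In the sketch below, the linear actor, the linear critic, and all numbers are made up. It estimates the gradient of the actor loss with respect to an actor weight by finite differences, once with the agent's action slot replaced by the actor output (as in the snippet above) and once with only placeholder actions.

```python
def actor(weight, obs):
    """Toy deterministic actor: a scalar action, linear in the observation."""
    return weight * obs


def critic(obs_n, act_n):
    """Toy centralized critic: a fixed linear function of all inputs."""
    return sum(0.5 * o for o in obs_n) + sum(2.0 * a for a in act_n)


def actor_loss(weight, obs_n, act_ph_n, agent_id, substitute):
    """Negative critic value, optionally routing the actor into the critic."""
    act_n = list(act_ph_n)  # copy, mirroring act_ph_n.copy()
    if substitute:
        # Mirrors act_n[agent_id] = act_sampler in the snippet above.
        act_n[agent_id] = actor(weight, obs_n[agent_id])
    return -critic(obs_n, act_n)


def finite_diff(f, x, eps=1e-6):
    """Central finite-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


obs_n = [1.0, -0.5]     # made-up observations for two agents
act_ph_n = [0.3, 0.7]   # made-up placeholder (replay buffer) actions

grad_with = finite_diff(
    lambda w: actor_loss(w, obs_n, act_ph_n, 0, substitute=True), 0.1)
grad_without = finite_diff(
    lambda w: actor_loss(w, obs_n, act_ph_n, 0, substitute=False), 0.1)

print("gradient with substitution:   ", grad_with)
print("gradient without substitution:", grad_without)
```

In this toy setting the gradient is nonzero only when the actor output is fed into the critic. A "No gradients provided" error therefore suggests that, in the failing configuration, the loss being differentiated is not connected to the actor variables.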

Runtime parameters of the experiments

Hi, could you please provide the configurations (for example, lr, sample_batch_size, num_workers, etc.) used for the different envs in your rllib-maddpg?

Saving training videos with monitor not working

Hello! I am training RLlib's MADDPG on MPE and have monitor = True set in the config. However, no training videos seem to be saved. Is there something else I need to do to save training videos?

Is the singularity file fully up to date?

According to the PR from the old repo, wsjeon#10, we might need to specify some additional apt installs, though it's not clear to me why.
If possible, we should also specify package versions in the singularity file, as we do in the requirements.txt file.

Render or perform rollout

Will a rollout function that allows rendering and observing the behavior of the policy be developed?

Versions in requirements.txt

Hi there,
I found this repo as a reference implementation for MADDPG with RLlib via their documentation.
Since the last change to this repo, many of the dependencies have changed significantly, so the versions installed from the requirements.txt file are far too new and cause a lot of issues.
Could you maybe share the exact versions that you used or is there an updated implementation somewhere that I could refer to?
Thank you in advance!
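
One way to capture exact versions from a working environment, besides running pip freeze, is sketched below with the standard library's importlib.metadata. The helper name and the package list are hypothetical; the list should be replaced with the names in requirements.txt. The sketch does not know which versions this repo was tested with and only reports what is installed where it runs.

```python
from importlib import metadata


def pinned_requirements(package_names):
    """Return one 'name==version' line per installed package.

    Packages that are not installed become comment lines, so the output
    can still be saved as a requirements file.
    """
    lines = []
    for name in package_names:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"# {name} is not installed")
    return lines


if __name__ == "__main__":
    # Hypothetical package list; substitute the names from requirements.txt.
    for line in pinned_requirements(["ray", "tensorflow", "gym"]):
        print(line)
```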

What is lz4

Hi,

removing lz4 makes running much faster

What is lz4 here?
