gt-star-lab / marbler

Realistic Benchmarks for Collaborative Heterogeneous Multi-Robot Systems

Home Page: https://shubhlohiya.github.io/MARBLER/

License: MIT License

Python 100.00%
multi-agent-reinforcement-learning reinforcement-learning robotarium

marbler's Introduction

MARBLER: Multi-Agent RL Benchmark and Learning Environment for the Robotarium

Team: Reza Torbati, Shubham Lohiya, Shivika Singh, Meher Nigam

Installation Instructions

  1. Create a new Conda environment: conda create -n MARBLER python=3.8 && conda activate MARBLER
  • Note that Python 3.8 is chosen only to ensure compatibility with EPyMARL.
  2. Download and install the Robotarium Python Simulator
  • As of now, the most recent commit our code works with is 6bb184e. The code will run with the most recent push to the Robotarium, but it will crash during training.
  3. Install our environment by running pip install -e . in this directory
  4. To test for a successful installation, run python3 -m robotarium_gym.main to run a pretrained model
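
Before running the pretrained model, a quick sanity check of the environment can save time. The sketch below is not part of MARBLER. It assumes the Robotarium simulator is importable as rps (an assumption to verify against your checkout) and that this package is importable as robotarium_gym, as the command above suggests.

```python
import importlib.util
import sys


def check_install(modules=("rps", "robotarium_gym"), python=(3, 8)):
    """Return a dict mapping each check name to True or False.

    `modules` holds assumed top-level import names: "rps" for the
    Robotarium simulator (assumption) and "robotarium_gym" for this package.
    """
    # Compare only major.minor, since the README pins python=3.8.
    report = {"python": sys.version_info[:2] == tuple(python)}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for check, ok in check_install().items():
        print(f"{check}: {'ok' if ok else 'check failed'}")
```

If any line reports a failed check, revisit the corresponding installation step before launching robotarium_gym.main.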

Usage

  • To look at current scenarios or create new ones or to evaluate trained models, look at the README in robotarium_gym
  • To upload the agents to the Robotarium, look at the README in robotarium_eval

Training with EPyMARL

  1. Download and install EPyMARL. On Ubuntu 22.04, to successfully install EPyMARL, I had to:
    • Change line 15 in requirements.txt from protobuf==3.6.1 to protobuf
    • Downgrade wheel to 0.38.4
    • Downgrade setuptools to 65.5.0
    • Install einops and torchscatter
  2. Train agents normally using our gym keys
  • For example: python3 src/main.py --config=qmix --env-config=gymma with env_args.time_limit=1000 env_args.key="robotarium_gym:PredatorCapturePrey-v0"
  • To train faster, ensure robotarium is False, real_time is False, and show_figure_frequency is large or -1 in the environment's config.yaml
  • Known error: if env_args.time_limit < max_episode_steps, EPyMARL will crash after the first episode
  3. Copy the trained weights to the models folder for the scenario that was trained
  • Requires the agent.th file (its location should be printed in the output of the terminal the model was trained in, typically in EPyMARL/results/models/...)
  • Requires the config.json file (typically in EPyMARL/results/algorithm_name/gym:scenario/...)
  4. Update the scenario's config.yaml to use the newly trained agents
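
The training notes above (fast-training flags and the time_limit crash) can be checked before launching a long run. The following is a minimal sketch, not part of MARBLER or EPyMARL. The helper names check_training_config and build_command are hypothetical, the config is assumed to be a flat dict loaded from the scenario's config.yaml, max_episode_steps is assumed to be available in that dict, and the threshold for a "large" show_figure_frequency is arbitrary.

```python
def check_training_config(env_config, time_limit, large_frequency=1000):
    """Return a list of warnings for settings flagged in the README.

    `env_config` is assumed to be the scenario's config.yaml as a flat dict.
    `large_frequency` is an arbitrary cutoff for "large"; adjust as needed.
    """
    warnings = []
    if env_config.get("robotarium", False):
        warnings.append("robotarium should be False for fast training")
    if env_config.get("real_time", False):
        warnings.append("real_time should be False for fast training")
    freq = env_config.get("show_figure_frequency", -1)
    if freq != -1 and freq < large_frequency:
        warnings.append("show_figure_frequency should be large or -1")
    max_steps = env_config.get("max_episode_steps")
    if max_steps is not None and time_limit < max_steps:
        # Known error from the README: EPyMARL crashes after the first episode.
        warnings.append("time_limit is smaller than max_episode_steps")
    return warnings


def build_command(algo, key, time_limit):
    """Build the EPyMARL command from the README as an argument list."""
    return [
        "python3", "src/main.py",
        f"--config={algo}",
        "--env-config=gymma",
        "with",
        f"env_args.time_limit={time_limit}",
        f"env_args.key={key}",
    ]


if __name__ == "__main__":
    # Illustrative values only; load the real config.yaml in practice.
    config = {"robotarium": False, "real_time": False,
              "show_figure_frequency": -1, "max_episode_steps": 1000}
    for message in check_training_config(config, time_limit=1000):
        print("warning:", message)
    print(" ".join(build_command("qmix", "robotarium_gym:PredatorCapturePrey-v0", 1000)))
```

The argument list can be passed to subprocess.run from the EPyMARL directory. No shell quoting is needed around the key in that form.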

Citing

If you use this in your work please cite:

  • Our work:

R. J. Torbati, S. Lohiya, S. Singh, M. S. Nigam and H. Ravichandar, "MARBLER: An Open Platform for Standardized Evaluation of Multi-Robot Reinforcement Learning Algorithms," 2023 International Symposium on Multi-Robot and Multi-Agent Systems (MRS), Boston, MA, USA, 2023, pp. 57-63, doi: 10.1109/MRS60187.2023.10416792.

  • The Robotarium:

S. Wilson, P. Glotfelter, L. Wang, S. Mayya, G. Notomista, M. Mote, and M. Egerstedt. The robotarium: Globally impactful opportunities, challenges, and lessons learned in remote-access, distributed control of multirobot systems. IEEE Control Systems Magazine, 40(1):26–44, 2020.

  • Additionally, if you use the default agents in this repo, also cite EPyMARL:

Papoudakis, Georgios, et al. "Benchmarking multi-agent deep reinforcement learning algorithms in cooperative tasks." arXiv preprint arXiv:2006.07869 (2020).

marbler's People

Contributors

rezatorbati, shashwatnigam99, shivika275, shubhlohiya

marbler's Issues

Reproducibility: VDN Results

Hi,

Thanks for your contribution to the communities of reinforcement learning and robotics.

Unfortunately, I am having problems reproducing the VDN results for the tasks Arctic Transport, Material Transport, and Predator Capture Prey in Table II of the article. Oddly enough, Warehouse seems okay. Perhaps you could confirm that your method of aggregating runs follows Papoudakis et al. 2021?

Maximum returns: For each algorithm, we identify the evaluation timestep during training in which
the algorithm achieves the highest average evaluation returns across five random seeds. We report the
average returns and the 95% confidence interval across five seeds from this evaluation timestep

Moreover, the configuration files for each experiment are consistent with #13. The undiscounted returns and their respective 95% (normal) confidence intervals for each task are as follows:

Arctic Transport: -28.315 +/- 0.89
Material Transport: 21.895 +/- 0.74
Predator Capture Prey: 125.094 +/- 2.45
Warehouse: 28.572 +/- 0.44

While the ones in the paper are:

Arctic Transport: -6.98 +/- 1.75
Material Transport: 5.15 +/- 1.3
Predator Capture Prey: 33.25 +/- 0.46
Warehouse: 28.7 +/- 1.49
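
One way to look at the gap between the two lists is to compute, per task, the ratio of the reproduced mean to the published mean. The sketch below uses only the mean values quoted above; the function name ratios is hypothetical. It does not establish a cause.

```python
# Mean returns as quoted in this issue (confidence intervals omitted).
reproduced = {
    "Arctic Transport": -28.315,
    "Material Transport": 21.895,
    "Predator Capture Prey": 125.094,
    "Warehouse": 28.572,
}
paper = {
    "Arctic Transport": -6.98,
    "Material Transport": 5.15,
    "Predator Capture Prey": 33.25,
    "Warehouse": 28.7,
}


def ratios(reproduced, paper):
    """Return reproduced / published mean return for every task."""
    return {task: reproduced[task] / paper[task] for task in paper}


if __name__ == "__main__":
    for task, ratio in ratios(reproduced, paper).items():
        print(f"{task}: {ratio:.2f}")
```

A ratio that is roughly constant across several tasks would be compatible with a scaling difference in how returns are reported, such as summed versus per-agent rewards, but only the authors can confirm this.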

Additionally, I send the plots obtained for each task, and the pattern for the algorithm is consistent with the published versions (Figure 3):

[Plots attached: ArcticTransport, MaterialTransport, PredatorCapturePrey, Warehouse]

What am I missing? Should I normalize for the number of agents? EPyMARL is for cooperative MARL, so perhaps the reward signals are being aggregated into joint rewards? Could you please clarify?

Regards,
Guilherme Varela
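
The maximum-returns protocol quoted in this issue can be sketched as follows. This is an illustration under stated assumptions, not the authors' evaluation code. The function name max_returns is hypothetical, the input layout (one list of evaluation returns per seed) is assumed, and the interval uses the normal approximation mentioned in the issue.

```python
import math
import statistics


def max_returns(eval_returns, z=1.96):
    """Aggregate runs using the maximum-returns protocol.

    eval_returns[s][t] is the average evaluation return of seed s at
    evaluation timestep t (assumed layout). Returns a tuple
    (best_timestep, mean_return, ci_half_width).
    """
    n_seeds = len(eval_returns)
    n_steps = len(eval_returns[0])
    # Average across seeds at every evaluation timestep.
    means = [statistics.fmean(run[t] for run in eval_returns)
             for t in range(n_steps)]
    # Pick the timestep with the highest across-seed average.
    best_t = max(range(n_steps), key=means.__getitem__)
    at_best = [run[best_t] for run in eval_returns]
    # Normal-approximation 95% interval across seeds at that timestep.
    half_width = z * statistics.stdev(at_best) / math.sqrt(n_seeds)
    return best_t, means[best_t], half_width


if __name__ == "__main__":
    # Toy numbers for illustration only, not results from any experiment.
    toy_runs = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0], [0.0, 2.0, 2.0]]
    best_t, mean, half_width = max_returns(toy_runs)
    print(f"timestep {best_t}: {mean:.2f} +/- {half_width:.2f}")
```

With only five seeds, a Student-t interval would be wider than this normal approximation, so the choice of interval is worth stating when comparing against published tables.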

Reproduce the result in paper

Hi! Thank you for the work!

I was trying to reproduce TABLE II from the paper, especially the Predator Capture Prey scenario in TABLE III. However, I failed to obtain similar results, perhaps because of the hyperparameter settings.

Could you please provide the full script of

  • environment config.yaml
  • epymarl default.yaml
  • gymma.yaml
  • especially VDN.yaml (since it gives the best result)

Thank you, and I look forward to hearing back from you!

Best regards
