mistyr0se / eoi_on_smac

This project forked from jiechuanjiang/eoi_on_smac

An improved version of EOI on Starcraft II task so_many_baneling. (The Emergence of Individuality)

License: MIT License

Python 100.00%

eoi_on_smac's Introduction

EOI_on_SMAC

This is an improved version of EOI on sparse-reward so_many_baneling (The Emergence of Individuality). The codebase for more general SMAC tasks is contained in this repo.

EOI

EOI learns a probabilistic classifier that predicts a probability distribution over agents given an agent's observation, and gives each agent an intrinsic reward for being correctly predicted by the classifier. The intrinsic reward encourages agents to visit their own familiar observations, and training the classifier on such observations makes the intrinsic reward signal stronger, which in turn makes the agents more identifiable. To further enhance the intrinsic reward and promote the emergence of individuality, two regularizers are proposed to increase the discriminability of the classifier.
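The classifier-based reward described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the repository's implementation: the function names (`softmax`, `intrinsic_rewards`, `classifier_loss`) and the choice of using the predicted probability itself as the reward are assumptions, and the two regularizers are omitted.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def intrinsic_rewards(classifier_logits):
    """classifier_logits[i] holds the classifier's logits over all agents,
    computed from agent i's observation (hypothetical layout).
    Agent i's reward is the probability assigned to its own identity."""
    rewards = []
    for agent_id, logits in enumerate(classifier_logits):
        probs = softmax(logits)
        rewards.append(probs[agent_id])
    return rewards

def classifier_loss(classifier_logits):
    """Mean cross-entropy with the agent's own index as the label.
    Minimizing it raises the probabilities returned above."""
    rewards = intrinsic_rewards(classifier_logits)
    return -sum(math.log(p) for p in rewards) / len(rewards)

# Three agents: agent 0 is easy to identify, agent 1 is not.
example_logits = [
    [2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 3.0],
]
example_rewards = intrinsic_rewards(example_logits)
```

An agent whose observation looks like everyone else's (agent 1 here) receives a reward near 1/N, while an agent with a distinctive observation receives a reward closer to 1.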

In the original version, EOI applies an auxiliary gradient and updates the individual value function by both minimizing the TD error of the joint action-value function and maximizing the cumulative intrinsic rewards. However, in some tasks the gradients of the two objectives can conflict with each other, which harms learning and leads to low performance on challenging tasks. In the new version, during interaction, agent i acts according to the policy

a_i = argmax(Q_i^a) if episode_num%10 < n else argmax(Q_i^p).

During the update, the agents are trained with vanilla QMIX, without the auxiliary gradient. In other words, the intrinsic value function Q^p is used only during exploration, where it generates individualized experiences, so it does not negatively impact performance.
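The episode-level switch between the two value functions can be sketched as follows. This is an illustrative reading of the rule above under stated assumptions: `select_action` is a hypothetical helper, Q-values are plain lists indexed by action, and details a real agent would need (epsilon-greedy noise, masking of unavailable actions) are left out.

```python
def select_action(q_a, q_p, episode_num, n):
    """Greedy action for one agent.

    q_a: action values from the QMIX-trained individual value function (Q_i^a).
    q_p: action values from the intrinsic value function (Q_i^p).
    In every block of 10 episodes, the first n use q_a and the rest use q_p.
    """
    use_task_values = episode_num % 10 < n
    q = q_a if use_task_values else q_p
    return max(range(len(q)), key=lambda action: q[action])

# With n = 7, 70 out of every 100 episodes follow Q^a.
task_episodes = sum(1 for ep in range(100) if ep % 10 < 7)
```

Because the choice depends only on `episode_num`, each episode is collected entirely under one of the two value functions, and `n` sets the fraction (n/10) of episodes that follow Q^a.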

So_many_baneling

So_many_baneling is a SMAC task in which 7 Zealots fight against 32 Banelings. The key to winning is for the Zealots to cooperatively spread out across the map, far from each other, so that the Banelings' damage is distributed as thinly as possible. This individualized strategy closely matches the motivation of EOI. However, the original so_many_baneling is too easy, so we set reward_sparse=True, sight_range=4.0, shoot_range=0.0, move_amount=0.3, and simplify the observed information. The modified version (see starcraft2.py) is much more difficult, and vanilla QMIX struggles to discover winning experiences.
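For reference, the modified settings listed above can be collected in one place. The dictionary name is hypothetical and the snippet only records the values stated in this README. How they are wired into the environment is defined in the repo's starcraft2.py and is not reproduced here.

```python
# Values taken from the description above; the container itself is illustrative.
SO_MANY_BANELING_OVERRIDES = {
    "reward_sparse": True,  # sparse reward instead of dense shaped reward
    "sight_range": 4.0,     # reduced observation radius
    "shoot_range": 0.0,
    "move_amount": 0.3,     # smaller movement step per action
}

for name, value in SO_MANY_BANELING_OVERRIDES.items():
    print(f"{name}={value}")
```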

Performance

[Figure: win_rate]

Citation

If you use this code, please cite our paper.

@inproceedings{jiang2021emergence,
  title={The Emergence of Individuality},
  author={Jiang, Jiechuan and Lu, Zongqing},
  booktitle={International Conference on Machine Learning},
  pages={4992--5001},
  year={2021},
  organization={PMLR}
}

Contributors

jiechuanjiang
