

Coagent Networks Revisited

This repo contains the code for the paper Coagent Networks Revisited.


Dependencies

To install dependencies for experiments, run the following commands:

conda create -n hoc python=3.8
conda activate hoc
pip install -r requirements.txt

Usage

The main code is contained in hoc.py. The file hppoc.py contains an implementation of Hierarchical Proximal Policy Option-Critic (HPPOC), a generalization of PPOC. The list of arguments may be seen in the main functions implemented in these files. Most of the arguments are self-explanatory. Please refer to the paper for more details.

Three arguments need some explanation. The argument graph describes the graph of options as a list of lists of integers. The i-th element of graph corresponds to option i and lists the indices of its children. Any option with -1 among its children can also call the primitive options. A primitive option is an option that executes its corresponding primitive action in the environment. Every other child index must be larger than the index of its parent and smaller than len(graph). The following code checks whether graph is valid.

for i, v in enumerate(graph):
    # every option needs at least one child
    assert len(v) > 0
    for j in v:
        # option i is a parent of option j
        # j == -1: option i can call all primitive options, in addition to the other children in v
        assert j == -1 or (i < j < len(graph))
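As an illustration, the check above can be wrapped in a small function and tried on a concrete graph. This is a minimal sketch: the name validate_graph and the example graph are hypothetical and are not taken from hoc.py. The example graph has a root (option 0) with two children that share two further options, which in turn call the primitive options.

```python
from typing import List

PRIMITIVE = -1  # marker from the README: the option may call primitive options


def validate_graph(graph: List[List[int]]) -> None:
    """Raise ValueError if `graph` violates the constraints checked above.

    Hypothetical helper, not part of hoc.py.
    """
    for i, children in enumerate(graph):
        if len(children) == 0:
            raise ValueError(f"option {i} has no children")
        for j in children:
            if j != PRIMITIVE and not (i < j < len(graph)):
                raise ValueError(f"option {i} has invalid child {j}")


# Hypothetical graph: root 0 calls options 1 and 2, which share options 3 and 4;
# options 3 and 4 call primitive options only.
shared_graph = [[1, 2], [3, 4], [3, 4], [-1], [-1]]
validate_graph(shared_graph)  # passes silently

# A child index must be larger than its parent's index, so a cycle is rejected.
try:
    validate_graph([[1], [0]])
    cycle_rejected = False
except ValueError:
    cycle_rejected = True
```

Because every child index must exceed its parent's index, any valid graph is acyclic and option 0 is never the child of another option.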

If graph is None, then the graph is generated from noptions and shared. The argument noptions is a list of positive integers giving, level by level, the number of children of each option. If shared is True, the options at each level share their children, which gives a Feedforward Options Network (FON). Otherwise, each option has its own children, which gives a Hierarchical Option Critic (HOC).

For example, the following command runs the <1, 1, 1> FON shown in Figure 5 of the paper.

python hoc.py --noptions=[1,1] --seed 1 --nruns 5 --nepisodes 50000

This command runs the <1, 2, 2> FON from the same figure.

python hoc.py --noptions=[2,2] --shared True --seed 1 --nruns 5 --nepisodes 50000

Here the root option has two children, and each of them has two children, which they share. Hence the root has 2 grandchildren and the model has 1 + 2 + 2 = 5 non-primitive options. If we set --shared False in the above command, we run an HOC with 1 + 2 + 4 = 7 non-primitive options.
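The option counts above can be reproduced with a short sketch of how a graph might be generated from noptions and shared. The function build_graph is hypothetical, and the actual generation code in hoc.py may differ. In particular, the sketch assumes that only the options at the deepest level call primitive options.

```python
from typing import List


def build_graph(noptions: List[int], shared: bool) -> List[List[int]]:
    """Hypothetical sketch: build the option graph level by level.

    Level 0 is the single root. With shared=True, all options of a level
    call the same noptions[l] children. With shared=False, every option
    gets noptions[l] children of its own.
    """
    graph: List[List[int]] = [[]]  # entry 0 is the root
    level = [0]                    # option indices at the current depth
    next_id = 1
    for n in noptions:
        new_level: List[int] = []
        if shared:
            kids = list(range(next_id, next_id + n))
            next_id += n
            new_level = kids
            for parent in level:
                graph[parent] = list(kids)
        else:
            for parent in level:
                kids = list(range(next_id, next_id + n))
                next_id += n
                new_level.extend(kids)
                graph[parent] = kids
        graph.extend([] for _ in new_level)  # placeholders for the new options
        level = new_level
    # Assumption: only the deepest options call primitive options.
    for leaf in level:
        graph[leaf] = [-1]
    return graph


fon = build_graph([2, 2], shared=True)   # the <1, 2, 2> FON
hoc = build_graph([2, 2], shared=False)  # the corresponding HOC
print(len(fon), len(hoc))
```

Under these assumptions, len(graph) is the number of non-primitive options, so the two calls yield graphs with 5 and 7 entries respectively.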

In our experiments, we tend to use seeds from 1 to 5 with 10 runs each, or seeds from 1 to 10 with 5 runs each, for a total of 50 runs of 50000 episodes.

Performance and Visualizations

An object of type Callback is called at the end of each episode. At the end of each run, it saves both a copy of all the weights in the models and a history array. This is a 4-dimensional array where history[run, episode, option_uid, k] is equal to:

  • if k == 0: average duration;
  • if k == 1: total discounted reward;
  • if k == 2: number of times this option has been used.

We consider all of the options, even primitive ones. Using this we may analyse the options and compute several statistics such as the average option length. For the root, this stores the duration and the total discounted reward of the entire episode, which is the performance of the model.
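The statistics mentioned above could be computed along the following lines. This is a sketch under stated assumptions: the helper names are hypothetical, the root is assumed to have option_uid 0, the k == 0 entry is assumed to be the per-episode average duration of an option's calls, and the array here is a small synthetic stand-in for a saved history.

```python
import numpy as np


def root_performance(history, root_uid=0):
    """Mean and std over runs of the root's total discounted reward per episode.

    Assumes the root option has option_uid == root_uid (hypothetical default 0).
    """
    returns = np.asarray(history, dtype=float)[:, :, root_uid, 1]  # (runs, episodes)
    return returns.mean(axis=0), returns.std(axis=0)


def average_option_length(history):
    """Usage-weighted mean duration of each option over all runs and episodes."""
    history = np.asarray(history, dtype=float)
    durations = history[..., 0]  # average duration within each episode
    counts = history[..., 2]     # number of times the option was used
    total_calls = counts.sum(axis=(0, 1))
    weighted = (durations * counts).sum(axis=(0, 1))
    # Options that were never used get NaN instead of a division warning.
    return np.divide(
        weighted,
        total_calls,
        out=np.full_like(weighted, np.nan),
        where=total_calls > 0,
    )


# Synthetic data: 2 runs, 2 episodes, 3 options (0 = assumed root), 3 statistics.
demo = np.zeros((2, 2, 3, 3))
demo[:, :, 0, 0] = 10.0                      # root lasts the whole episode
demo[:, :, 0, 1] = [[1.0, 3.0], [3.0, 5.0]]  # discounted reward per run and episode
demo[:, :, 0, 2] = 1.0                       # root is used once per episode
demo[0, 0, 1] = [2.0, 0.5, 1.0]              # option 1: one call of length 2
demo[0, 1, 1] = [4.0, 0.5, 3.0]              # option 1: three calls of average length 4
# Option 2 is never used in this synthetic example.

mean_return, std_return = root_performance(demo)
lengths = average_option_length(demo)
```

Weighting each episode's average duration by the usage count avoids giving rarely used options the same influence as frequently used ones when pooling over episodes.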

To see an example, check out the notebook main.ipynb.


