deep-active-inference-mc

Deep active inference agents using Monte-Carlo methods

License: GNU General Public License v3.0

Language: Python (100%)

Topics: active-inference, cognitive-architectures, cognitive-science, machine-learning-models, variational-autoencoders, variational-inference, model-based-reinforcement-learning, dsprites, animal-ai, disentangled-representations


Deep active inference agents using Monte-Carlo methods

This source code release accompanies the manuscript:

Z. Fountas, N. Sajid, P. A. M. Mediano and K. Friston, "Deep active inference agents using Monte-Carlo methods", Advances in Neural Information Processing Systems 33 (NeurIPS 2020).

If you use this model or the dynamic dSprites environment in your work, please cite our paper.


Description

For a quick overview see this video. In this work, we propose the deep neural architecture illustrated below, which can be used to train scaled-up active inference agents for complex continuous environments. The architecture combines amortized inference, Monte-Carlo tree search, Monte-Carlo dropout and top-down transition precision, which encourages disentangled latent representations.

We test this architecture on two tasks from the Animal-AI Olympics and a new simple object-sorting task based on DeepMind's dSprites dataset.

Demo behavior

  • Agent trained in the Dynamic dSprites environment
  • Agent trained in the Animal-AI environment

Requirements

  • Programming language: Python 3
  • Libraries: tensorflow >= 2.0.0, numpy, matplotlib, scipy, opencv-python
  • The dSprites dataset (download instructions below)

Instructions

Installation
  • First, make sure the required libraries are installed on your computer. Open a terminal and type
pip install -r requirements.txt
  • Then, clone this repository, navigate to the project directory and download the dSprites dataset by typing
wget https://github.com/deepmind/dsprites-dataset/raw/master/dsprites_ndarray_co1sh3sc6or40x32y32_64x64.npz

or by manually visiting the above URL.
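Once downloaded, the archive can be inspected with NumPy. A minimal sketch: the key names (`imgs`, `latents_classes`) follow DeepMind's dsprites-dataset documentation, and the in-memory stand-in archive below is only there to keep the snippet self-contained without the real download.

```python
import io
import numpy as np

# Stand-in for the real archive: a tiny .npz round-tripped through memory,
# using the same key names as DeepMind's dsprites-dataset release.
buf = io.BytesIO()
np.savez(buf,
         imgs=np.zeros((4, 64, 64), dtype=np.uint8),
         latents_classes=np.zeros((4, 6), dtype=np.int64))
buf.seek(0)

# For the real file, pass the filename instead of the buffer:
# np.load('dsprites_ndarray_co1sh3sc6or40x32y32_64x64.npz',
#         allow_pickle=True, encoding='latin1')
data = np.load(buf, allow_pickle=True, encoding='latin1')
print(data['imgs'].shape)
```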

Training
  • To train an active inference agent to solve the dynamic dSprites task, type
python train.py

This script automatically generates checkpoints with the optimized parameters of the agent and stores these checkpoints in a new sub-folder every 25 training iterations. By default, all sub-folders are created under figs_final_model_0.01_30_1.0_50_10_5. The script also generates a number of performance figures, stored in the same folder. You can stop the process at any point by pressing Ctrl+C.
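The default folder name appears to encode the run's hyperparameter values. A small illustrative sketch (the values are simply read off the default folder name; their individual meanings are not documented here, so treat this as illustration only):

```python
# Illustrative only: reconstruct the default results folder name and the
# iterations at which checkpoints are written (every 25 iterations).
defaults = ['0.01', '30', '1.0', '50', '10', '5']
results_folder = 'figs_final_model_' + '_'.join(defaults)

checkpoint_iters = [it for it in range(1, 101) if it % 25 == 0]

print(results_folder)   # figs_final_model_0.01_30_1.0_50_10_5
print(checkpoint_iters) # [25, 50, 75, 100]
```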

Testing
  • Finally, once training has been completed, the performance of the newly trained agent can be demonstrated in real time by typing
python test_demo.py -n figs_final_model_0.01_30_1.0_50_10_5/checkpoints/ -m

This command will open a graphical interface which can be controlled by a number of keyboard shortcuts. In particular, press:

  • q or esc to exit the simulation at any point.
  • 1 to enable the MCTS-based full-scale active inference agent (enabled by default).
  • 2 to enable the active inference agent that minimizes the expected free energy calculated only a single time-step into the future.
  • 3 to hand control of the agent entirely to the habitual network (see the manuscript for details).
  • 4 to activate manual mode, in which the agents are disabled and the environment can be manipulated by the user. Use the keys w, s, a or d to move the current object up, down, left or right, respectively.
  • 5 to enable an agent that minimizes the terms (a) and (b) of Equation 8 in the manuscript.
  • 6 to enable an agent that minimizes only the term (a) of the same equation (a reward-seeking agent).
  • m to toggle the use of sampling when calculating future transitions.
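The shortcuts above can be summarized as a key-to-mode dispatch table. This is an illustrative sketch, not the repository's actual implementation; the mode names are invented labels for the agents described in the list:

```python
# Hypothetical key-to-mode mapping mirroring the README's shortcut list.
MODES = {
    ord('1'): 'mcts_full',       # full-scale MCTS active inference (default)
    ord('2'): 'one_step_efe',    # single-step expected free energy
    ord('3'): 'habitual',        # habitual network only
    ord('4'): 'manual',          # user moves the object with w/s/a/d
    ord('5'): 'terms_a_b',       # minimize terms (a) and (b) of Eq. 8
    ord('6'): 'reward_seeking',  # minimize only term (a)
}

def select_mode(key, current='mcts_full'):
    """Return the new control mode for a key press, or keep the current one."""
    return MODES.get(key, current)

print(select_mode(ord('3')))  # habitual
```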

Bibtex

@inproceedings{fountas2020daimc,
author = {Fountas, Zafeirios and Sajid, Noor and Mediano, Pedro and Friston, Karl},
booktitle = {Advances in Neural Information Processing Systems},
editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
pages = {11662--11675},
publisher = {Curran Associates, Inc.},
title = {Deep active inference agents using Monte-Carlo methods},
url = {https://proceedings.neurips.cc/paper/2020/file/865dfbde8a344b44095495f3591f7407-Paper.pdf},
volume = {33},
year = {2020}
}

deep-active-inference-mc's People

Contributors: l3str4nge, zfountas

deep-active-inference-mc's Issues

Calculation of Term 1 in G

I am having trouble understanding why for term 1, the sum is taken between the two computed entropy terms:

# E [ log Q(s|pi) - log Q(s|o,pi) ]
term1 = - tf.reduce_sum(entropy_normal_from_logvar(ps1_logvar) + entropy_normal_from_logvar(qs1_logvar), axis=1)

I understand this gives the sum of two entropy terms (where ps1 corresponds to $p_\tau = Q(s_\tau \mid \pi)$ and qs1 to $q_\tau = Q(s_\tau \mid o_\tau, \pi)$):

$$- \sum_\tau \left( \tfrac{1}{2} \log(2 \pi e \, \sigma^2_{p_\tau}) + \tfrac{1}{2} \log(2 \pi e \, \sigma^2_{q_\tau}) \right) \xrightarrow{\text{simplify}} - \sum_\tau \left( H_{p_\tau} + H_{q_\tau} \right)$$
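As a quick numerical check of the per-dimension Gaussian entropy used here, $H = \tfrac{1}{2}\log(2\pi e \sigma^2)$. The function below re-implements what a routine like entropy_normal_from_logvar presumably computes from logvar $= \log \sigma^2$; the name comes from the quoted code, but this version is a sketch, not the repository's source:

```python
import numpy as np

def entropy_normal_from_logvar(logvar):
    # Per-dimension entropy of a diagonal Gaussian: 0.5 * log(2*pi*e*sigma^2)
    return 0.5 * (np.log(2.0 * np.pi * np.e) + logvar)

logvar = np.log(np.array([1.0, 4.0]))   # sigma^2 = 1 and 4
H = entropy_normal_from_logvar(logvar)
print(H)  # entropy grows with variance: H[1] = H[0] + 0.5*log(4)
```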

But in the paper, we see that term 1 is given by:

$$\sum_\tau E_{Q(\theta | \pi)}[E_{Q(o_\tau | \theta, \pi)}[H(s_\tau | o_\tau, \pi)] - H(s_\tau | \pi)] \xrightarrow{\text{simplify}} + \sum_\tau ( H_{q_\tau} - H_{p_\tau} ) $$

Why is there the discrepancy between the "+" and the "-"? Or where is my understanding breaking down? Am I simplifying the equations incorrectly? If so, can you explain how to correctly transform between the two?

Problems when trying to follow the README to train an active inference agent

I am using

  1. Geforce 3060 Ti
  2. tensorflow-gpu 2.4.0
  3. python3.7
  4. CUDA 11.0
  5. cudnn 8.0

When trying to run train.py, the error messages show a problem with CUBLAS:

failed to create cublasLt handle: CUBLAS_STATUS_INTERNAL_ERROR

I googled this message and it looks like it is caused by limited GPU memory, but I couldn't solve it with the solutions provided online. Can I ask which setup you used for your tests, i.e. which GPU and which version of tensorflow?
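A commonly suggested workaround for this error on RTX 30-series cards (a hedged suggestion from general TensorFlow usage, not from this repository) is to let TensorFlow allocate GPU memory on demand instead of pre-allocating the whole card:

```python
import tensorflow as tf

# Hedged workaround (not from this repo): enable on-demand GPU memory growth,
# which often avoids CUBLAS_STATUS_INTERNAL_ERROR caused by the default
# whole-card pre-allocation. Must run before any tensors touch the GPU.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)
```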

How to train model in Animal-AI environment

The paper describes an agent trained in the Animal-AI environment. How can I reproduce this? I tried installing the animal-ai environment, but it didn't work.
