Code for our AAMAS 2020 paper: "A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry".

mentalRL

Code for our AAMAS 2020 paper:

"A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry"

by Baihan Lin (Columbia)*, Guillermo Cecchi (IBM Research), Djallel Bouneffouf (IBM Research), Jenna Reinen (IBM Research) and Irina Rish (Mila, UdeM).

*Corresponding author

For the latest full paper: https://arxiv.org/abs/1906.11286

For my oral talk at AAMAS 2020: https://youtu.be/CQBdQz1bmls

All the experimental results can be reproduced using the code in this repository. Feel free to contact me at [email protected] if you have any questions about our work.

Abstract

Drawing inspiration from behavioral studies of human decision making, we propose here a more general and flexible parametric framework for reinforcement learning that extends standard Q-learning to a two-stream model for processing positive and negative rewards, and allows us to incorporate a wide range of reward-processing biases. Such biases are an important component of human decision making, and modeling them can help us better understand a wide spectrum of multi-agent interactions in complex real-world socioeconomic systems, as well as various neuropsychiatric conditions associated with disruptions in normal reward processing. From the computational perspective, we observe that the proposed Split-QL model and its clinically inspired variants consistently outperform standard Q-Learning and SARSA methods, as well as recently proposed Double Q-Learning approaches, on simulated tasks with particular reward distributions, a real-world dataset capturing human decision-making in gambling tasks, and the Pac-Man game in a lifelong learning setting across different reward stationarities.
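To make the two-stream idea in the abstract concrete, here is a minimal tabular sketch. It is an illustration under stated assumptions, not the code shipped in this repository: the class name `SplitQLearner`, the parameter names (`decay_pos`, `decay_neg`, `weight_pos`, `weight_neg`), and the exact form of the update are hypothetical. The sketch assumes that each reward is split into its positive and negative part, that each part updates its own Q-table, and that actions are chosen greedily on the sum of the two tables.

```python
import numpy as np


class SplitQLearner:
    """Illustrative two-stream tabular Q-learner; not the repository's code."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95,
                 decay_pos=1.0, decay_neg=1.0, weight_pos=1.0, weight_neg=1.0):
        # One Q-table per stream: positive rewards and negative rewards.
        self.q_pos = np.zeros((n_states, n_actions))
        self.q_neg = np.zeros((n_states, n_actions))
        self.alpha, self.gamma = alpha, gamma
        # Hypothetical bias parameters: per-stream memory decay and reward scaling.
        self.decay_pos, self.decay_neg = decay_pos, decay_neg
        self.weight_pos, self.weight_neg = weight_pos, weight_neg

    def act(self, state):
        # Greedy choice on the sum of the two streams.
        return int(np.argmax(self.q_pos[state] + self.q_neg[state]))

    def update(self, state, action, reward, next_state):
        # Split the scalar reward into its positive and negative parts.
        r_pos, r_neg = max(reward, 0.0), min(reward, 0.0)
        for q, r, decay, weight in (
            (self.q_pos, r_pos, self.decay_pos, self.weight_pos),
            (self.q_neg, r_neg, self.decay_neg, self.weight_neg),
        ):
            # Temporal-difference error of this stream, using its scaled reward.
            td_error = weight * r + self.gamma * q[next_state].max() - q[state, action]
            # Decay the old estimate, then move it along the TD error.
            q[state, action] = decay * q[state, action] + self.alpha * td_error


agent = SplitQLearner(n_states=2, n_actions=2, alpha=0.5, gamma=0.0)
agent.update(state=0, action=1, reward=2.0, next_state=1)   # gain lands in q_pos
agent.update(state=0, action=0, reward=-4.0, next_state=1)  # loss lands in q_neg
print(agent.q_pos[0], agent.q_neg[0], agent.act(0))
```

In this sketch, different settings of the four bias parameters stand in for different reward-processing biases; for example, a small `weight_neg` makes the agent less sensitive to losses. The parameter values used for the clinically inspired variants are given in the paper and are not reproduced here.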

Info

Language: Python3, Python2, bash

Platform: MacOS, Linux, Windows

by Baihan Lin, Sep 2018

Citation

If you find this work helpful, please try the models out and cite our work. Thanks!

Reinforcement Learning case (main paper):

@inproceedings{lin2020astory,
  title     = {A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry},
  author    = {Lin, Baihan and Cecchi, Guillermo and Bouneffouf, Djallel and Reinen, Jenna and Rish, Irina},
  booktitle = {Proceedings of the Nineteenth International Conference on Autonomous Agents and Multi-Agent Systems, {AAMAS-20}},
  publisher = {International Foundation for Autonomous Agents and Multiagent Systems},
  pages     = {744--752},
  year      = {2020},
  month     = {5},
  doi       = {},
  url       = {},
}


@inproceedings{lin2019split,
  title     = {Split Q Learning: Reinforcement Learning with Two-Stream Rewards},
  author    = {Lin, Baihan and Bouneffouf, Djallel and Cecchi, Guillermo},
  booktitle = {Proceedings of the Twenty-Eighth International Joint Conference on
               Artificial Intelligence, {IJCAI-19}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},             
  pages     = {6448--6449},
  year      = {2019},
  month     = {7},
}

Contextual Bandit case:

@article{lin2020unified,
  title={Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits, and RL},
  author={Lin, Baihan and Cecchi, Guillermo and Bouneffouf, Djallel and Reinen, Jenna and Rish, Irina},
  journal={arXiv preprint arXiv:2005.04544},
  year={2020}
}

Tasks

  • Markov Decision Process (MDP) example with multi-modal reward distributions
  • Multi-Armed Bandits (MAB) example with multi-modal reward distributions
  • Iowa Gambling Task (IGT) example scheme 1 and 2
  • PacMan RL game with different stationarities
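As a rough illustration of what a "multi-modal reward distribution" means in the MDP and MAB tasks, the sketch below draws rewards from a mixture of two Gaussians. The function name `sample_bimodal_reward` and all numeric values (means, standard deviations, mixing probability) are hypothetical and chosen only for demonstration; the distributions actually used in the experiments are defined in the task code and the paper.

```python
import numpy as np


def sample_bimodal_reward(rng, means=(-5.0, 5.0), stds=(1.0, 1.0),
                          p_first=0.5, size=None):
    """Draw rewards from a two-component Gaussian mixture (illustrative values)."""
    # Decide, per sample, which mixture component generates the reward.
    pick_first = rng.random(size) < p_first
    first = rng.normal(means[0], stds[0], size)
    second = rng.normal(means[1], stds[1], size)
    return np.where(pick_first, first, second)


rng = np.random.default_rng(0)
rewards = sample_bimodal_reward(rng, size=10_000)
# The mean is close to zero even though almost no single reward is near zero.
print(rewards.mean(), np.mean(np.abs(rewards) < 1.0))
```

With a distribution like this, an action's average reward hides the fact that it yields either a clear gain or a clear loss, which is the kind of setting where handling positive and negative rewards in separate streams can matter.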

Requirements

  • Python 3 for the MDP and IGT tasks, and Python 2.7 for the PacMan task.
  • PyTorch
  • numpy and scikit-learn

Videos of mental agents playing PacMan

  • AD ("Alzheimer's Disease")
  • ADD ("addiction")
  • ADHD ("attention-deficit/hyperactivity disorder")
  • bvFTD (the behavioral variant of frontotemporal dementia)
  • CP ("Chronic Pain")
  • PD ("Parkinson's Disease")
  • M ("moderate")
  • SQL ("Split Q-Learning")
  • PQL ("Positive Q-Learning")
  • NQL ("Negative Q-Learning")
  • QL ("Q-Learning")
  • DQL ("Double Q-Learning")

Acknowledgements

The PacMan game is built upon the Berkeley AI Pac-Man projects (http://ai.berkeley.edu/project_overview.html). We modified many of the original files and included our comparison.

mentalrl's Issues

Error while running IGT

Hi, I was testing the source code. It's very interesting research and good work, but while running your code I am facing some errors.

The output of console is the following.

C:\Anaconda3\envs\mentalRLENV\lib\site-packages\seaborn\distributions.py:369: UserWarning: Default bandwidth for data is 0; skipping density estimation.
  warnings.warn(msg, UserWarning)
Traceback (most recent call last):
  File "run_IGT.py", line 38, in <module>
    fig, scores, rewards, actions = plotAgents(reports,names,nTrials,reward_functions,labels,fig_name+'_1.png',isIGT=True,plotShortTerm=True)
  File "E:\Projects\Projects_Chitkara\Simulators\mentalRL\IGT\utils\vis.py", line 255, in plotAgents
    fig.savefig(fig_name)
  File "C:\Anaconda3\envs\mentalRLENV\lib\site-packages\matplotlib\figure.py", line 2311, in savefig
    self.canvas.print_figure(fname, **kwargs)
  File "C:\Anaconda3\envs\mentalRLENV\lib\site-packages\matplotlib\backends\backend_qt5agg.py", line 81, in print_figure
    super().print_figure(*args, **kwargs)
  File "C:\Anaconda3\envs\mentalRLENV\lib\site-packages\matplotlib\backend_bases.py", line 2217, in print_figure
    **kwargs)
  File "C:\Anaconda3\envs\mentalRLENV\lib\site-packages\matplotlib\backend_bases.py", line 1639, in wrapper
    return func(*args, **kwargs)
  File "C:\Anaconda3\envs\mentalRLENV\lib\site-packages\matplotlib\backends\backend_agg.py", line 512, in print_png
    dpi=self.figure.dpi, metadata=metadata, pil_kwargs=pil_kwargs)
  File "C:\Anaconda3\envs\mentalRLENV\lib\site-packages\matplotlib\image.py", line 1601, in imsave
    image.save(fname, **pil_kwargs)
  File "C:\Anaconda3\envs\mentalRLENV\lib\site-packages\PIL\Image.py", line 2155, in save
    fp = builtins.open(filename, "w+b")
FileNotFoundError: [Errno 2] No such file or directory: './figures/IGT_1_1.png'

Please take a look and let me know if something is wrong on my side.
Thanks
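A likely cause, judging only from the traceback above: the script saves to the relative path `./figures/IGT_1_1.png`, and `open(..., "w+b")` does not create missing parent directories, so the save fails when no `figures` folder exists in the working directory. Creating that folder by hand before running `run_IGT.py` should be enough. Alternatively, the sketch below creates the parent directory just before saving. The helper name `ensure_parent_dir` is hypothetical and not part of the repository, and the demo uses a temporary directory and a plain file write in place of `fig.savefig` so that it runs on its own.

```python
import os
import tempfile


def ensure_parent_dir(path):
    """Create the directory that will contain `path` if it does not exist yet."""
    parent = os.path.dirname(path)
    if parent:
        # exist_ok=True makes this safe to call on every run.
        os.makedirs(parent, exist_ok=True)
    return path


# Demo: reproduce the failing pattern inside a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    fig_name = os.path.join(workdir, "figures", "IGT_1_1.png")
    ensure_parent_dir(fig_name)
    # Stand-in for fig.savefig(fig_name): opening now succeeds.
    with open(fig_name, "w+b") as fp:
        fp.write(b"")
    print(os.path.exists(fig_name))
```

In the plotting code, the same call would go immediately before `fig.savefig(fig_name)`, assuming `fig_name` is the path shown in the error message.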
