jwarmenhoven / dbda-python

Doing Bayesian Data Analysis, 2nd Edition (Kruschke, 2015): Python/PyMC3 code

License: MIT License

Jupyter Notebook 100.00%
bayesian-data-analysis bayesian-inference pymc3 mcmc hierarchical-models kruschke probabilistic-programming

dbda-python's Introduction

Doing Bayesian Data Analysis - Python/PyMC3

This repository contains Python/PyMC3 code for a selection of models and figures from the book 'Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan', Second Edition, by John Kruschke (2015). The datasets used in this repository have been retrieved from the book's website. Note that, in its current form, this repository is not a standalone tutorial and that you probably should have a copy of the book to follow along. Suggestions for improvement and help with unsolved issues are welcome!

Note that the code is in Jupyter Notebook format and requires modification to use with other datasets.

Some of the general concepts from the book are discussed in papers by Kruschke & Liddell. See references below.

2018-08-16:
Updating the notebooks with PyMC3 v3.5 and general code clean-up. Inserting plots of the PyMC models in plate notation (v3.5 feature). Fixing some deprecation warnings.

Chapter 9 - Hierarchical Models
Chapter 10 - Model Comparison and Hierarchical Modelling
Chapter 12 - Bayesian Approaches to Testing a Point ("Null") Hypothesis
Chapter 16 - Metric-Predicted Variable on One or Two Groups
Chapter 17 - Metric-Predicted Variable with One Metric Predictor
Chapter 18 - Metric Predicted Variable with Multiple Metric Predictors
Chapter 19 - Metric Predicted Variable with One Nominal Predictor
Chapter 20 - Metric Predicted Variable with Multiple Nominal Predictors
Chapter 21 - Dichotomous Predicted Variable
Chapter 22 - Nominal Predicted Variable
Chapter 23 - Ordinal Predicted Variable
Chapter 24 - Count Predicted Variable

Extra:
Bayesian Linear Regression example (Bishop, 2006)
Example on modelling Ordinal Data (Liddell & Kruschke, 2018)

Libraries used:

  • pymc3
  • theano
  • pandas
  • numpy
  • scipy
  • matplotlib
  • seaborn

References:

Bishop, C.M. (2006), Pattern Recognition and Machine Learning, Springer Science+Business Media, New York. https://www.microsoft.com/en-us/research/people/cmbishop/

Kruschke, J.K. (2015), Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan, Second Edition, Academic Press / Elsevier, https://sites.google.com/site/doingbayesiandataanalysis/

Kruschke, J.K. & Liddell, T.M. (2017), The Bayesian New Statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective, Psychonomic Bulletin & Review, http://dx.doi.org/10.3758/s13423-016-1221-4

Kruschke, J.K. & Liddell, T.M. (2017), Bayesian data analysis for newcomers, Psychonomic Bulletin & Review, http://dx.doi.org/10.3758/s13423-017-1272-1

Liddell, T., & Kruschke, J. K. (2018, April 5). Analyzing ordinal data with metric models: What could possibly go wrong? Retrieved from http://osf.io/3tkz4

Salvatier J, Wiecki TV, Fonnesbeck C. (2016), Probabilistic programming in Python using PyMC3, PeerJ Computer Science 2:e55, https://doi.org/10.7717/peerj-cs.55
PyMC3 - http://pymc-devs.github.io/pymc3/

Note:

The repository below contains python code for the first edition of the book. The code in that repository is a much more direct implementation of the R/JAGS code from the book than you will find here.
https://github.com/aloctavodia/Doing_bayesian_data_analysis

dbda-python's People

Contributors

jwarmenhoven

dbda-python's Issues

PyMCon 2020

Hi!

As you may have already seen on Twitter or on PyMC Discourse, we are planning a virtual conference for the PyMC community. All the information is available in the Discourse post.

We are currently looking for conference chairs and volunteers and would be very grateful if you could spread the word! If you are interested and available, we also encourage you to apply to be a conference chair.

Chapter 10: Wrong number of coin flips for Model 2?

Hey, I was just quickly skimming over the Chapter 10 code and after it says

Model 2 - Two theta variables without pseudo priors

Coin is flipped nine times, resulting in six heads.

y2 = pm.Bernoulli('y2', theta, observed=[1,1,1,1,1,0,0,0])
I can only count 8 flips resulting in 5 heads. It also says so on the plate of the graphical model.
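
The count reported above can be checked directly. A minimal sketch, assuming only the observed list from the quoted `pm.Bernoulli` call (the variable names `observed`, `n_flips` and `n_heads` are illustrative and not from the notebook):

```python
# Observed data copied from the pm.Bernoulli call quoted in this issue
observed = [1, 1, 1, 1, 1, 0, 0, 0]

n_flips = len(observed)  # total number of flips
n_heads = sum(observed)  # heads are coded as 1, tails as 0

print(n_flips, n_heads)  # prints "8 5"
```

This gives 8 flips and 5 heads, not the nine flips and six heads stated in the notebook text.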

Wrong array dimensions in Chapter 24?

Getting the following error when trying to run code for Chapter 24:

trace1['a1a2'][:,j1,j2])
ValueError: could not broadcast input array from shape (20000) into shape (5000)

Which happens in this part of the code:

# Transforming the trace data to sum-to-zero values
m = np.zeros((Nx1Lvl, Nx2Lvl, n_samples))
b1b2 = m.copy()

for (j1, j2) in np.ndindex(Nx1Lvl, Nx2Lvl):
    m[j1, j2, :] = (trace1['a0'] +
                    trace1['a1'][:, j1] +
                    trace1['a2'][:, j2] +
                    trace1['a1a2'][:, j1, j2])

I think the issue boils down to the size of the variable m, which is set to have dimensions (Nx1Lvl, Nx2Lvl, 5000). However, the trace1 object ends up producing arrays of length 20,000 because it concatenates the results of all 4 MCMC chains:

In [78]: len(trace1['a0'])
Out[78]: 20000

If I change the dimensions of m to be

m = np.zeros((Nx1Lvl,Nx2Lvl, n_samples*4))

the code seems to run just fine.

I'm new to PyMC3, so perhaps this is due to how trace objects work in different versions of PyMC3. I'm using PyMC3 version 3.4.1.
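
One way to avoid hard-coding the number of chains is to size the array from the trace itself. The sketch below is an assumption-laden illustration, not the notebook's code: `trace1` is replaced by a plain dictionary of random NumPy arrays shaped as a concatenated 4-chain trace would be, and the level counts `Nx1Lvl = 3` and `Nx2Lvl = 2` are made up.

```python
import numpy as np

# Hypothetical stand-in for the PyMC3 trace: 4 chains of 5000 draws each,
# concatenated along the first axis (values are random, for illustration only).
n_samples, n_chains = 5000, 4
Nx1Lvl, Nx2Lvl = 3, 2
total = n_samples * n_chains
rng = np.random.default_rng(0)
trace1 = {
    'a0': rng.normal(size=total),
    'a1': rng.normal(size=(total, Nx1Lvl)),
    'a2': rng.normal(size=(total, Nx2Lvl)),
    'a1a2': rng.normal(size=(total, Nx1Lvl, Nx2Lvl)),
}

# Take the draw count from the trace instead of assuming n_samples or n_samples*4.
n_draws = len(trace1['a0'])
m = np.zeros((Nx1Lvl, Nx2Lvl, n_draws))

for j1, j2 in np.ndindex(Nx1Lvl, Nx2Lvl):
    m[j1, j2, :] = (trace1['a0'] +
                    trace1['a1'][:, j1] +
                    trace1['a2'][:, j2] +
                    trace1['a1a2'][:, j1, j2])

print(m.shape)
```

Because the last dimension follows `len(trace1['a0'])`, the assignment should keep working if the number of chains or draws changes.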
