
fmcmc: A friendly MCMC framework


What

The fmcmc R package provides a lightweight, general framework for implementing Markov Chain Monte Carlo (MCMC) methods based on the Metropolis-Hastings algorithm. Its primary strength is flexibility: users can incorporate the following:

  1. Automatic convergence checking: The algorithm splits the MCMC run into bulks according to the frequency with which convergence needs to be checked. Users can use one of the included functions (convergence_gelman, convergence_geweke, etc.) or provide their own.

  2. Multiple chains run in parallel: Using either a PSOCK cluster (the default) or a user-supplied cluster object such as those created with the parallel R package (see the sketch after this list).

  3. User-defined transition kernels: Besides the canonical Gaussian kernel, users can specify their own or use one of those included in the package, for example kernel_adapt, kernel_ram, kernel_normal_reflective, kernel_unif, kernel_mirror, or kernel_unif_reflective.

All of the above without requiring compiled code. For the latest on fmcmc, check out the NEWS.md section.
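As a quick illustration of item 2, here is a minimal sketch of running two chains on a user-supplied PSOCK cluster. It assumes the ll, X, and y objects defined in the linear regression example below; the nchains, multicore, and cl arguments shown here follow the package's documented interface (see ?MCMC):

library(fmcmc)
library(parallel)

# Two chains on a user-supplied PSOCK cluster; if `cl` is omitted and
# `multicore = TRUE`, MCMC() creates a PSOCK cluster on its own.
cl  <- makePSOCKcluster(2)
ans <- MCMC(
  ll,                       # log-likelihood from the example below
  initial   = c(0, 0, 1),
  nsteps    = 2000,
  nchains   = 2,
  multicore = TRUE,
  cl        = cl,
  X.        = X,
  y.        = y
  )
stopCluster(cl)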

Who is this for?

While many users rely on MCMC tools such as Stan (via the rstan package) or WinBUGS (via R2WinBUGS), in several settings these tools are either not enough or provide far more than the problem requires. So, this tool is for you if:

  • You have a simple model to estimate with Metropolis-Hastings.

  • You want to run multiple chains of your model using out-of-the-box parallel computing.

  • You don’t want to (or cannot) rely on external tools (so you need good old base R only for your models).

  • You want to implement a model whose parameters are bounded (a standard error, for example) or not continuous (e.g., the size parameter of a Binomial distribution), so you need a tailored transition kernel.

In any other case, you may want to take a look at the previously mentioned R packages, or check out the mcmc R package, which also implements the Metropolis-Hastings algorithm (although without all the features this package has), the adaptMCMC R package, or the MCMCpack R package.

Installing

If you want to get the latest bleeding-edge version from GitHub, you can use devtools:

devtools::install_github("USCbiostats/fmcmc")

The latest (stable) release is also available on CRAN:

install.packages("fmcmc")

Citation

To cite fmcmc in publications use:

  Vega Yon et al., (2019). fmcmc: A friendly MCMC framework. Journal of
  Open Source Software, 4(39), 1427,
  https://doi.org/10.21105/joss.01427

And the actual R package:

Vega Yon G, Marjoram P (2021). _fmcmc: A friendly MCMC framework_. doi:
10.5281/zenodo.3378987 (URL: https://doi.org/10.5281/zenodo.3378987), R
package version 0.5-0, <URL: https://github.com/USCbiostats/fmcmc>.

To see these entries in BibTeX format, use 'print(<citation>,
bibtex=TRUE)', 'toBibtex(.)', or set
'options(citation.bibtex.max=999)'.
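For example, to print both entries in BibTeX format:

print(citation("fmcmc"), bibtex = TRUE)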

Example: Linear regression model

First run

In the following we show how to use the package for estimating parameters in a linear regression model. First, let’s simulate some data to use:

set.seed(78845)
n <- 1000
X <- rnorm(n)
y <- 3.0 + 2.0*X + rnorm(n, sd = 4)

In this case, we have three parameters to estimate: the constant (3.0), the β coefficient (2.0), and the standard deviation of the error (4.0).

To estimate this model, we can either maximize the log-likelihood function (which is what is usually done) or use MCMC. In either case, we need to specify the (sometimes unnormalized) log-likelihood function:

ll <- function(p, X., y.) {
  
  joint_ll <- dnorm(y. - (p[1] + X.*p[2]), sd = p[3], log = TRUE)
  joint_ll <- sum(joint_ll)
  
  # If the log-likelihood is undefined, we explicitly return -Inf (instead of
  # NaN, for example)
  if (!is.finite(joint_ll))
    return(-Inf)
  
  joint_ll
}

Notice that the function has more than one argument: p, the vector of parameters, and X. and y., the data for our model. MCMC forwards any extra named arguments to the log-likelihood function.

Let’s do a first run of the MCMC algorithm using the function of the same name (first, load the package, of course):

library(fmcmc)

# Running the MCMC (we set the seed first)
set.seed(1215)
ans <- MCMC(
  ll,
  initial = c(0, 0, sd(y)),
  nsteps  = 5000,
  X.      = X,
  y.      = y
  )

As the output is an object of class mcmc from the coda R package, we can use all of coda's functions on it:

library(coda)
plot(ans)

summary(ans)
## 
## Iterations = 1:5000
## Thinning interval = 1 
## Number of chains = 1 
## Sample size per chain = 5000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##       Mean      SD Naive SE Time-series SE
## par1 3.113 0.17593 0.002488       0.024341
## par2 1.975 0.10647 0.001506       0.022105
## par3 4.093 0.07843 0.001109       0.005951
## 
## 2. Quantiles for each variable:
## 
##       2.5%   25%   50%   75% 97.5%
## par1 2.975 3.029 3.068 3.255 3.354
## par2 1.749 1.907 1.980 2.020 2.145
## par3 3.978 4.070 4.101 4.102 4.226

While the summary statistics look very good (we got very close to the original parameters), the trace of the parameters looks very bad (poor mixing). We can re-run the algorithm, changing the scale parameter of the kernel_normal function. To do so, we can pass ans as the initial argument so that the function starts from the last point of that chain:

ans <- MCMC(
  ll,
  initial = ans,
  nsteps  = 5000,
  X.      = X,
  y.      = y,
  kernel  = kernel_normal(scale = .05) # We can set the scale parameter like this
  )
plot(ans)

Much better! Now, what if we use Vihola's (2012) Robust Adaptive Metropolis (which is also implemented in the adaptMCMC R package)?

ans_RAM <- MCMC(
  ll,
  initial = ans,
  nsteps  = 5000,
  X.      = X,
  y.      = y,
  kernel  = kernel_ram() 
  )
plot(ans_RAM)

1 - rejectionRate(ans_RAM)
##      par1      par2      par3 
## 0.3522705 0.3522705 0.3522705

We can also try Haario et al.'s (2001) Adaptive Metropolis:

ans_AM <- MCMC(
  ll,
  initial = ans,
  nsteps  = 5000,
  X.      = X,
  y.      = y,
  kernel  = kernel_adapt() 
  )
plot(ans_AM)

1 - rejectionRate(ans_AM)
##      par1      par2      par3 
## 0.5457091 0.5457091 0.5457091

Finally, if needed, we can also access information about the last run using MCMC_OUTPUT. For example, if we wanted to look at the trace of the logposterior function, we could use the get_logpost() function:

plot(get_logpost(), type = "l")

The set of proposed values is also available using the get_draws() function:

boxplot(get_draws(), type = "l")

If the previous run featured multiple chains, get_logpost() would instead return a list of length get_nchains(), one element per chain.
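For instance, a minimal sketch (assuming the last run used multiple chains, as in the next section) that plots one log-posterior trace per chain:

lp <- get_logpost()                    # a list of length get_nchains()
op <- par(mfrow = c(1, get_nchains()))
for (i in seq_along(lp))
  plot(lp[[i]], type = "l", main = paste("Chain", i))
par(op)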

Automatic stop

Now, suppose that the algorithm takes a long time to reach the stationary distribution; it would then be nice to start sampling from the posterior only once convergence has been reached. In the following example, we use multiple chains and the Gelman-Rubin diagnostic to check for convergence:

set.seed(1215) # Same seed as before
ans <- MCMC(
  ll,
  initial = c(0, 0, sd(y)),
  nsteps  = 5000,
  X.      = X,
  y.      = y,
  kernel  = kernel_normal(scale = .05),
  nchains = 2,                           # Multiple chains
  conv_checker = convergence_gelman(200) # Checking for conv. every 200 steps
  )
## No convergence yet (steps count: 200). Gelman-Rubin's R: 4.5843. Trying with the next bulk.
## No convergence yet (steps count: 400). Gelman-Rubin's R: 1.1877. Trying with the next bulk.
## No convergence yet (steps count: 600). Gelman-Rubin's R: 1.4297. Trying with the next bulk.
## No convergence yet (steps count: 800). Gelman-Rubin's R: 1.1582. Trying with the next bulk.
## No convergence yet (steps count: 1000). Gelman-Rubin's R: 1.3414. Trying with the next bulk.
## No convergence yet (steps count: 1200). Gelman-Rubin's R: 1.2727. Trying with the next bulk.
## No convergence yet (steps count: 1400). Gelman-Rubin's R: 1.4456. Trying with the next bulk.
## No convergence yet (steps count: 1600). Gelman-Rubin's R: 1.3792. Trying with the next bulk.
## No convergence yet (steps count: 1800). Gelman-Rubin's R: 1.2069. Trying with the next bulk.
## No convergence yet (steps count: 2000). Gelman-Rubin's R: 1.1789. Trying with the next bulk.
## No convergence yet (steps count: 2200). Gelman-Rubin's R: 1.1208. Trying with the next bulk.
## No convergence yet (steps count: 2400). Gelman-Rubin's R: 1.1196. Trying with the next bulk.
## Convergence has been reached with 2600 steps. Gelman-Rubin's R: 1.0792. (2600 final count of samples).

Unlike the previous case, we did not have to wait for all 5,000 steps to complete; the algorithm stopped for us, allowing us to start generating the desired sample much sooner.
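A sketch of that final step: once convergence is reached, we can continue from the converged chains (passing ans as initial, just as before; this assumes MCMC accepts the multi-chain object the same way it accepted the single-chain one earlier) and draw the actual sample without a convergence checker:

ans_final <- MCMC(
  ll,
  initial = ans,   # resumes from the last state of each chain
  nsteps  = 5000,
  X.      = X,
  y.      = y,
  kernel  = kernel_normal(scale = .05),
  nchains = 2
  )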

Kernels: Making sure that we get positive values

For this final example, we will use the kernel argument to provide a transition kernel that makes proposals within certain boundaries. In particular, we want the algorithm to propose only positive values for the sd parameter, which we know must be positive.

Moreover, since we know that we will only get positive values, we can go further and simplify ll by skipping the check for finite values:

ll <- function(p, X., y.) {
  
  sum(dnorm(y. - (p[1] + X.*p[2]), sd = p[3], log = TRUE))

}

A much simpler function! Let’s call the MCMC function specifying the right transition kernel to increase the acceptance rate. In this example, we set the upper bound of all parameters to 5.0, and the lower bound to -5.0 for the constant and 0 for the β coefficient and the standard deviation, all using the kernel_normal_reflective function (which implements a normal kernel with boundaries):

set.seed(1215) # Same seed as before
ans <- MCMC(
  ll,
  initial = c(0, 0, sd(y)),
  nsteps  = 5000,
  X.      = X,
  y.      = y,
  kernel  = kernel_normal_reflective(
    ub    = 5.0,               # All parameters have the same upper bound
    lb    = c(-5.0, 0.0, 0.0), # But lower bound is specified per parameter
    scale = 0.05               # This is the same scale as before
    ),
  nchains = 2,                           
  conv_checker = convergence_gelman(200)
  )
## No convergence yet (steps count: 200). Gelman-Rubin's R: 3.7891. Trying with the next bulk.
## No convergence yet (steps count: 400). Gelman-Rubin's R: 1.1257. Trying with the next bulk.
## No convergence yet (steps count: 600). Gelman-Rubin's R: 1.4696. Trying with the next bulk.
## No convergence yet (steps count: 800). Gelman-Rubin's R: 1.1313. Trying with the next bulk.
## No convergence yet (steps count: 1000). Gelman-Rubin's R: 1.4384. Trying with the next bulk.
## No convergence yet (steps count: 1200). Gelman-Rubin's R: 1.3696. Trying with the next bulk.
## No convergence yet (steps count: 1400). Gelman-Rubin's R: 1.5243. Trying with the next bulk.
## No convergence yet (steps count: 1600). Gelman-Rubin's R: 1.3720. Trying with the next bulk.
## No convergence yet (steps count: 1800). Gelman-Rubin's R: 1.1722. Trying with the next bulk.
## No convergence yet (steps count: 2000). Gelman-Rubin's R: 1.1492. Trying with the next bulk.
## No convergence yet (steps count: 2200). Gelman-Rubin's R: 1.1004. Trying with the next bulk.
## No convergence yet (steps count: 2400). Gelman-Rubin's R: 1.1161. Trying with the next bulk.
## Convergence has been reached with 2600 steps. Gelman-Rubin's R: 1.0815. (2600 final count of samples).

Again, since the proposal kernel has lower and upper bounds, we are guaranteed that all proposed states lie within the support of the parameter space.
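The bounded and adaptive approaches can also be combined. A hedged sketch, assuming kernel_adapt accepts the same lb and ub arguments as the other kernels (check ?kernel_adapt):

ans_bounded <- MCMC(
  ll,
  initial = c(0, 0, sd(y)),
  nsteps  = 5000,
  X.      = X,
  y.      = y,
  # Adaptive proposals restricted to the same bounds as above
  kernel  = kernel_adapt(lb = c(-5.0, 0.0, 0.0), ub = 5.0)
  )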

Other tools

fmcmc is just one way to work with Markov Chain Monte Carlo in R. Besides Stan and WinBUGS, there are other options: mcmc, HybridMC, adaptMCMC, and elhmc, among others (take a look at the CRAN Task View on Bayesian Inference).

Contributing to fmcmc

We welcome contributions to fmcmc. Whether you are reporting a bug, starting a discussion by asking a question, or proposing/requesting a new feature, please start by creating a new issue here so that we can talk about it.

Please note that the ‘fmcmc’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Funding

Supported by National Cancer Institute Grant #1P01CA196596.


fmcmc's Issues

Convergence check fails when a dataset is named `dat`

This is a consequence of the convergence checker function not restricting its search path when looking up objects bound to LAST_MCMC. This is easy to fix: just set inherits = FALSE in the get(), assign(), mget(), and exists() calls. Note that the assign() call already uses inherits = FALSE.
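For illustration (plain base R, not fmcmc internals), this is what inherits = FALSE changes: lookups are confined to the given environment instead of searching enclosing frames, so a user's object named dat cannot be picked up by accident:

e   <- new.env()
dat <- "user's data"  # lives in the global environment

exists("dat", envir = e, inherits = TRUE)   # TRUE: found via enclosing frames
exists("dat", envir = e, inherits = FALSE)  # FALSE: looks only inside `e`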

Return more things

Awesome package. I switched to using fmcmc and love it! Really streamlined interface, burgeoning functionality, stellar performance.

There's one feature I miss from adaptMCMC, and that is the ability to return multiple things from the logdensity function. Here's a quote describing the relevant function argument in adaptMCMC::MCMC():

function that returns a value proportional to the log probability density to sample from. Alternatively it can be a function that returns a list with at least one element named log.density.
In some cases the function p may not only calculate the log density but return a list containing other values as well. For example, if p is a log posterior, one may also be interested in storing the corresponding prior and likelihood values. The function must either always return a scalar or always a list; however, the length of the list may vary.

I understand that parsing the returned list will cost performance, but it could enable more interesting output from MCMC and, in particular, allow users to transform parameters (similar to the transformed parameters block in Stan).
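In the meantime, I use a workaround along these lines (a hypothetical sketch; none of this is part of fmcmc's API): return a scalar as required, but stash the extra quantities in a side environment on each call:

extras <- new.env()

ll_wrapped <- function(p, X., y.) {
  prior <- dnorm(p[3], sd = 10, log = TRUE)  # an illustrative prior
  lik   <- sum(dnorm(y. - (p[1] + X.*p[2]), sd = p[3], log = TRUE))
  extras$last <- c(prior = prior, lik = lik) # side effect: store the pieces
  prior + lik                                # scalar, as MCMC() expects
}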

Automatic stop

When running multiple chains, we should be able to check automatically whether the MCMC converged. This would imply that, when using parallel, child processes communicate with the master process every fixed number of iterations. The functions that can be used for this are parallel::children, parallel::readChild, etc. Some observations:

  • Data should be passed to the master process to calculate, say, the Gelman statistic.
  • Chains should pause during this process to avoid messing with the pseudo-RNG sequence (we don't know whether the function passed by the user draws random numbers as well).
  • The problem with the communication functions between parent and child processes is that they only work on Unix-based systems, i.e., they won't work on Windows.

Error in ...names() : could not find function "...names"

Hello developers,

Thank you for the wonderful work! I am trying to use your package but ran into some issues. I installed the package and then ran the demo code below:

library(fmcmc)
set.seed(78845)
n <- 1000
X <- rnorm(n)
y <- 3.0 + 2.0*X + rnorm(n, sd = 4)

ll <- function(p, X., y.) {
  
  joint_ll <- dnorm(y. - (p[1] + X.*p[2]), sd = p[3], log = TRUE)
  joint_ll <- sum(joint_ll)
  
  if (!is.finite(joint_ll))
    return(-Inf)
  
  joint_ll
}

set.seed(1215)
ans <- MCMC(
  ll,
  initial = c(0, 0, sd(y)),
  nsteps  = 5000,
  X.      = X,
  y.      = y
)

However, it gave me an error saying "Error in ...names() : could not find function "...names"". I am using R 4.0.2 with RStudio to run the code. Can you help me with this? Thank you very much.

Easier specification of bounds for parameters

I understand that model kernels have lb and ub arguments for specifying the parameter bounds.

I have a mix of bounded and unbounded parameters. Some parameters are on [0, 1], others on [0, ∞). I understand that for unbounded parameters I should provide .Machine$double.xmax, but that is somewhat inconvenient to remember.

Ideally, I would love to specify the bounds per parameter in a named vector like this:

kernel_ram(..., lb = c(alpha = NA, beta = 0, gamma = 0), ub = c(alpha = NA, beta = 1, gamma = NA))

This should be interpreted as: alpha is unbounded, beta is bounded on [0, 1], and gamma takes positive values only.

Alternatively, if you don't want to rewrite lb and ub, I think including bounds = list(alpha = NA, beta = c(0, 1), gamma = c(0, NA)) in the main function fmcmc::MCMC might be an OK solution. Then one would only need to look up this argument in the main function, and if the bounds are defined, the kernel would inherit them.

Does this make any sense? I am trying to mimic the parameter specification in Stan, where for each parameter I can specify the lower and upper bounds right in the parameter declaration block.
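Something like this hypothetical helper would do it (make_bounds is not part of fmcmc; it just translates NA-coded bounds into the lb/ub vectors the kernels already expect):

make_bounds <- function(x, default) {
  x[is.na(x)] <- default  # NA means "unbounded" on that side
  x
}

lb <- make_bounds(c(alpha = NA, beta = 0, gamma = 0), -.Machine$double.xmax)
ub <- make_bounds(c(alpha = NA, beta = 1, gamma = NA), .Machine$double.xmax)
kernel_ram(lb = lb, ub = ub)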
