stan-dev / stan

Stan development repository. The master branch contains the current release. The develop branch contains the latest stable development. See the Developer Process Wiki for details.

Home Page: https://mc-stan.org

License: BSD 3-Clause "New" or "Revised" License

C++ 98.51% HTML 0.01% Python 0.16% Makefile 0.17% Stan 0.34% C 0.81%


stan's Introduction

Stan is a C++ package providing

full Bayesian inference using the No-U-Turn sampler (NUTS), a variant of Hamiltonian Monte Carlo (HMC),
approximate Bayesian inference using automatic differentiation variational inference (ADVI), and
penalized maximum likelihood estimation (MLE) using L-BFGS optimization.

It is built on top of the Stan Math library, which provides

a full first- and higher-order automatic differentiation library based on C++ template overloads, and
a supporting fully-templated matrix, linear algebra, and probability special function library.

There are interfaces available in R, Python, MATLAB, Julia, Stata, Mathematica, and for the command line.

Home Page

Stan's home page, with links to everything you'll need to use Stan, is:

http://mc-stan.org/

Interfaces

There are separate repositories in the stan-dev GitHub organization for the interfaces, higher-level libraries and lower-level libraries.

Source Repository

Stan's source-code repository is hosted here on GitHub.

Licensing

The Stan math library, core Stan code, and CmdStan are licensed under new BSD. RStan and PyStan are licensed under GPLv3, with other interfaces having other open-source licenses.

Note that the Stan math library depends on the Intel TBB library, which is licensed under the Apache 2.0 license. This dependency implies an additional restriction compared to the new BSD license alone. The Apache 2.0 license is incompatible with GPL-2 licensed code if distributed as a unitary binary. You may refer to the Licensing page on the Stan wiki.

stan's Issues

Parser warning messages when compiled with clang++ 3.3

Not high priority, but since we've been going after these:

clang++ -I src -I lib/eigen_3.1.3 -I lib/boost_1.53.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -Wno-unused-function -Wno-tautological-compare   -c -O0 -o bin/stan/gm/grammars/statement_2_grammar_inst.o src/stan/gm/grammars/statement_2_grammar_inst.cpp
In file included from src/stan/gm/grammars/statement_2_grammar_inst.cpp:1:
src/stan/gm/grammars/statement_2_grammar_def.hpp:120:18: warning: multiple unsequenced modifications to '_pass' [-Wunsequenced]
          [_pass = add_conditional_condition_f(_val,_1,
                 ^
1 warning generated.
clang++ -I src -I lib/eigen_3.1.3 -I lib/boost_1.53.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -Wno-unused-function -Wno-tautological-compare   -c -O0 -o bin/stan/gm/grammars/statement_grammar_inst.o src/stan/gm/grammars/statement_grammar_inst.cpp
In file included from src/stan/gm/grammars/statement_grammar_inst.cpp:1:
src/stan/gm/grammars/statement_grammar_def.hpp:406:13: warning: multiple unsequenced modifications to '_pass' [-Wunsequenced]
            = validate_assignment_f(_1,_r2,boost::phoenix::ref(var_map_),
            ^
src/stan/gm/grammars/statement_grammar_def.hpp:483:16: warning: multiple unsequenced modifications to '_pass' [-Wunsequenced]
        [_pass = validate_int_expr2_f(_1,boost::phoenix::ref(error_msgs_))]
               ^
2 warnings generated.
clang++ -I src -I lib/eigen_3.1.3 -I lib/boost_1.53.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -Wno-unused-function -Wno-tautological-compare   -c -O0 -o bin/stan/gm/grammars/term_grammar_inst.o src/stan/gm/grammars/term_grammar_inst.cpp
In file included from src/stan/gm/grammars/term_grammar_inst.cpp:1:
src/stan/gm/grammars/term_grammar_def.hpp:435:38: warning: multiple unsequenced modifications to '_val' [-Wunsequenced]
                               [_val = multiplication(_val,_1,
                                     ^
src/stan/gm/grammars/term_grammar_def.hpp:455:29: warning: multiple unsequenced modifications to '_val' [-Wunsequenced]
                      [_val = negate_expr_f(_1,boost::phoenix::ref(error_msgs_))]
                            ^
src/stan/gm/grammars/term_grammar_def.hpp:467:22: warning: multiple unsequenced modifications to '_val' [-Wunsequenced]
               [_val = add_expression_dimss_f(_val, _1, _pass,
                     ^
src/stan/gm/grammars/term_grammar_def.hpp:480:37: warning: multiple unsequenced modifications to '_val' [-Wunsequenced]
        | fun_r(_r1)          [_val = set_fun_type_named_f(_1,_r1,_pass,boost::phoenix::ref(error_msgs_))]
                                    ^
4 warnings generated.
clang++ -I src -I lib/eigen_3.1.3 -I lib/boost_1.53.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -Wno-unused-function -Wno-tautological-compare   -c -O0 -o bin/stan/gm/grammars/var_decls_grammar_inst.o src/stan/gm/grammars/var_decls_grammar_inst.cpp
In file included from src/stan/gm/grammars/var_decls_grammar_inst.cpp:1:
src/stan/gm/grammars/var_decls_grammar_def.hpp:582:19: warning: multiple unsequenced modifications to '_val' [-Wunsequenced]
            [_val = add_var_f(_1,boost::phoenix::ref(var_map_),_a,_r2,
                  ^
src/stan/gm/grammars/var_decls_grammar_def.hpp:670:18: warning: multiple unsequenced modifications to '_pass' [-Wunsequenced]
          [_pass = validate_int_expr_f(_1,boost::phoenix::ref(error_msgs_))]
                 ^
src/stan/gm/grammars/var_decls_grammar_def.hpp:767:23: warning: multiple unsequenced modifications to '_pass' [-Wunsequenced]
              [ _pass = set_int_range_lower_f(_val,_1,
                      ^
src/stan/gm/grammars/var_decls_grammar_def.hpp:791:23: warning: multiple unsequenced modifications to '_pass' [-Wunsequenced]
              [ _pass = set_double_range_lower_f(_val,_1,
                      ^
src/stan/gm/grammars/var_decls_grammar_def.hpp:824:16: warning: multiple unsequenced modifications to '_pass' [-Wunsequenced]
        [_pass = validate_int_expr_f(_1,boost::phoenix::ref(error_msgs_))]
               ^
5 warnings generated.

multiple on.exit in stan_model require add = TRUE

Hi,
I am reading the rstan code to better understand the mechanism, which looks great so far. I found multiple calls to on.exit in stan_model. The add argument of on.exit defaults to FALSE, so each call overwrites any previously registered expression. To run all of the expressions, use add = TRUE.
Peter

Latent variable model

Consider the following small model (a latent variable model) written in Stan:

# simulate a data set ---
library(mvtnorm)  # provides rmvnorm()

N=10
n1=2
n2=3
d=2
set.seed(20004)
A=matrix(rnorm(n1*d),byrow=TRUE,ncol=n1)

A=matrix(c(.8,0.5,0.95,.9),byrow=TRUE,ncol=2)

set.seed(20013)
B=matrix(rnorm(n2*d),ncol=n2,byrow=TRUE)

B=matrix(c(.7,.4,.8,.95,.8,.5),ncol=3,byrow=TRUE)

d=nrow(A)
al0=.1
bet0=10
sig_te=1
sig_lb=1
set.seed(2001)
Z=rmvnorm(N, mean=rep(1,d), sigma= diag(rep(1,d)), method=c("svd")) ## latent variable
set.seed(2003)
teta=Z%*%A + rmvnorm(nrow(Z), mean=rep(0,ncol(A)), sigma= diag(rep(sig_te,ncol(A))),method=c("svd")) ## construct the mean of the response X (counts)
set.seed(2004)
lambd=Z%*%B + rmvnorm(nrow(Z), mean=rep(0,ncol(B)), sigma= diag(rep(sig_lb,ncol(B))),method=c("svd")) ## construct the mean of the response Y (counts)
X=apply(exp(teta),c(1,2),rpois,n=1) ## Simulate a matrix of counts
Y=apply(exp(lambd),c(1,2),rpois,n=1) ## Simulate a matrix of counts

stan code is

Bcca_mod <- '
data {
int<lower=1> N; // sample size
int<lower=1> n1;
int<lower=1> n2;
int<lower=1> d;
int X[n1,N];
int Y[n2,N];
}
parameters {
real<lower=0> bet[d];
real<lower=0> sig_te;
real<lower=0> sig_lb;
matrix[d,N] Z;
matrix[n1,d] A;
matrix[n2,d] B;
matrix<lower=0>[n1,N] teta;
matrix<lower=0>[n2,N] lambd;
}
transformed parameters {
matrix[n1,N] mu_te;
matrix[n2,N] mu_lb;
real<lower=0> sigte_sr;
real<lower=0> siglb_sr;
real<lower=0> beta_sqt[d];

mu_te <- A * Z;
mu_lb <- B * Z;
siglb_sr <- sqrt(sig_lb);
sigte_sr <- sqrt(sig_te);
for(i in 1:d)
beta_sqt[i] <- sqrt(bet[i]);
}
model {
bet ~ inv_gamma(1, 10);
sig_lb ~ scaled_inv_chi_square(1, .1) ;
sig_te ~ scaled_inv_chi_square(1, .1) ;
for(i in 1:n1)
for(j in 1:d)
A[i,j] ~ normal(0, beta_sqt[j]);

for(i in 1:n2)
for(j in 1:d)
B[i,j] ~ normal(0, beta_sqt[j]);

for(j in 1:N)
{
for(i in 1:n1)
{
log(teta[i,j]) ~ normal(mu_te[i,j], sigte_sr);
lp__ <- lp__ - log(fabs(teta[i,j]));
X[i,j] ~ poisson(teta[i,j]);
}
for(k in 1:n2)
{
log(lambd[k,j]) ~ normal(mu_lb[k,j], siglb_sr);
lp__ <- lp__ - log(fabs(lambd[k,j]));
Y[k,j] ~ poisson(lambd[k,j]);
}
}
}
'
bcca_dat<-list("X"=t(X_nmct),"Y"=t(Y_nmct),"n1"=n1,"n2"=n2,"N"=N,"d"=d)
fit <- stan(model_code = Bcca_mod, data = bcca_dat, thin=3, iter = 1000, chains =3 ,pars=c("A","B","Z","sig_te","sig_lb","bet"))
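
The two lp__ statements in the model block are change-of-variables adjustments for putting a normal distribution on log(teta) and log(lambd) rather than on the parameters themselves. As a short derivation (standard calculus, not taken from the original post, and writing theta for one element teta[i,j]):

```latex
p(\theta \mid \mu, \sigma)
  = \mathrm{Normal}(\log\theta \mid \mu, \sigma)\,
    \left|\frac{d}{d\theta}\log\theta\right|
  = \frac{1}{\theta}\,\mathrm{Normal}(\log\theta \mid \mu, \sigma),
\qquad
\log p(\theta \mid \mu, \sigma)
  = \log\mathrm{Normal}(\log\theta \mid \mu, \sigma) - \log\theta .
```

The trailing term is the - log(fabs(teta[i,j])) added to lp__; the same reasoning applies to lambd.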

I am fitting this hierarchical latent variable model, and all the elements of the matrices A and B are estimated to be zero, while the elements of Z are all estimated to be very large (see the printout below).

Inference for Stan model: Bcca_mod.
3 chains: each with iter=20000; warmup=10000; thin=3; 6667 iterations saved.

           mean    se_mean         sd         2.5%         25%         50%        75%      97.5% n_eff Rhat

A[1,1] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2 1.6
A[1,2] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2 2.0
A[2,1] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3 1.6
A[2,2] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 4 2.1
B[1,1] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 8 1.5
B[1,2] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2 2.9
B[2,1] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3 2.4
B[2,2] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3 2.3
B[3,1] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2 1.6
B[3,2] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3 1.5
Z[1,1] -3245858.6 3027342.4 4694869.0 -8296523.8 -8287517.5 -2200344.4 318705.7 6124306.6 2 2.0
Z[1,2] 4504070.7 3684314.0 7250106.8 -2874141.0 -657176.8 4265641.0 4280775.2 26880865.4 4 1.6
Z[1,3] 19626308.7 22365727.0 31572636.9 -1064155.1 -557280.3 350743.2 31023577.3 97536345.9 2 2.0
Z[1,4] 11655709.4 17601811.2 24830371.8 -8987598.7 -8958613.4 3195439.8 22584286.8 73142855.9 2 2.4
Z[1,5] -1122785.7 1150233.5 5007368.7 -13019383.0 -2837632.1 23611.4 617058.3 9846651.0 19 1.2
Z[1,6] 13407266.6 16947374.1 22742390.5 -7224670.5 -7181384.1 7740008.7 22675889.7 69516936.7 2 2.3
Z[1,7] 10338674.6 9285371.6 15815139.6 -730800.5 101530.3 3995719.8 12034942.7 55220472.6 3 1.9
Z[1,8] 7637098.9 7952136.4 13203756.9 -1226119.8 38935.5 1973112.7 9378422.6 45014451.8 3 1.5
Z[1,9] 1210542.8 1061282.0 1974234.5 -1194412.8 -164422.0 1245553.9 1254125.8 6725986.0 3 1.8
Z[1,10] -2693903.5 2686573.4 3841212.4 -7148124.2 -7139701.0 -685440.4 141490.1 4113091.7 2 2.0
Z[2,1] -5801621.5 9253713.8 14810936.4 -46321837.9 -9843362.2 -1341058.0 5800328.4 5816089.0 3 1.6
Z[2,2] 10096596.5 10222673.9 13586522.8 -6755993.9 -6735483.5 13484506.7 17905026.5 36518931.4 2 2.9
Z[2,3] 2800418.4 16855696.0 25850461.6 -26269696.8 -26218933.4 7560253.9 23756635.8 48619880.5 2 2.8
Z[2,4] 4772436.9 24295180.2 32893391.8 -48369262.5 -11745741.8 -9802489.5 39142586.9 60667291.8 2 2.3
Z[2,5] -18865730.6 21355658.6 27715893.3 -75868998.5 -40927244.9 -1598305.9 1489068.9 1516422.9 2 6.3
Z[2,6] 2154677.3 8229186.5 23962144.5 -42000505.6 -11155852.2 -11115477.7 18637537.0 57604470.1 8 1.4
Z[2,7] 10948755.3 21065524.9 28735783.6 -16035994.5 -16006463.7 5552546.1 22052034.6 89614348.3 2 2.1
Z[2,8] -11040127.8 9346621.3 34092517.8 -100445107.4 -9607665.2 -9582981.7 3254107.2 67018467.2 13 1.1
Z[2,9] 1951657.3 1878997.1 5318720.3 -9089229.1 -1078605.5 292002.1 4742178.3 14580104.7 8 1.6
Z[2,10] -2725101.6 8803446.4 11911511.2 -29859881.7 -10526394.2 -917495.2 9189455.9 9206906.8 2 2.6
sig_te 1.8 1.5 2.7 0.3 0.5 0.7 1.9 8.6 3 1.4
sig_lb 2.6 1.3 1.9 0.4 1.3 2.1 3.5 7.3 2 1.7
bet[1] 3.4 0.3 2.2 1.4 2.6 2.8 3.4 9.2 45 1.0
bet[2] 3.5 0.6 2.8 1.3 2.4 2.5 4.1 9.4 21 1.1
lp__ 336.9 5.5 11.7 312.5 328.5 339.3 345.5 358.0 5 1.3

Is there something wrong with my model or code?

Thank you

Remove sign comparison warning in validate_multiplicable.

Need to change this comparison:
if (x1.cols() == x2.rows())

to
if (x1.cols() == static_cast<typename T1::size_type>(x2.rows()))

Blocked on Issue #46. Can't implement this until LDLT_type exposes the size_type.
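
A minimal sketch of the cast described above. UnsignedDims and SignedDims are hypothetical stand-ins, not Stan's matrix classes, and is_multiplicable is an illustrative name rather than the real validate_multiplicable signature. The sketch assumes the left-hand type exposes a size_type typedef, which is what the blocking issue asks for:

```cpp
#include <cstddef>

// Hypothetical stand-in for a matrix type whose dimensions are unsigned.
struct UnsignedDims {
  typedef std::size_t size_type;
  size_type rows_, cols_;
  UnsignedDims(size_type r, size_type c) : rows_(r), cols_(c) {}
  size_type rows() const { return rows_; }
  size_type cols() const { return cols_; }
};

// Hypothetical stand-in for a type that reports signed dimensions.
struct SignedDims {
  int rows_, cols_;
  SignedDims(int r, int c) : rows_(r), cols_(c) {}
  int rows() const { return rows_; }
  int cols() const { return cols_; }
};

// Compare x1.cols() with x2.rows() in x1's own size type, so both operands
// have the same type and the signed/unsigned comparison warning goes away.
template <typename T1, typename T2>
bool is_multiplicable(const T1& x1, const T2& x2) {
  return x1.cols() == static_cast<typename T1::size_type>(x2.rows());
}
```

Without a size_type on every type that can appear as T1, the static_cast has nothing to name, which is why the change is blocked.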

Line out of sequence in src/models/misc/gaussian-process/gp-sim.R

The line

y <- fit_sim_ss$y[1,]; # any sample will do

needs to be moved in front of the definition of df, where y is used.

I'm using Rstan 1.3 on Linux for amd64.

stanc generating incorrect C++ code for this model

data {
  vector[4] zeros;
  matrix[120,4] EllipseEstimate;
  matrix[120,4] EllipseVar;
  int n;
  int J;
  int K;
  int L;
  int cow[120];
  int realday[120];
  int trial[120];
  int treatment[120];
  vector[120] HLI;
}

parameters {
  vector<lower=0,upper=100>[4] int2sd;
  vector<lower=0,upper=100>[4] taucowsd;
  vector<lower=0,upper=100>[4] taudaysd;
  vector<lower=0,upper=100>[4] tautrialsd;
  corr_matrix[4] int2corr;
  corr_matrix[4] taucowcorr;
  corr_matrix[4] taudaycorr;
  corr_matrix[4] tautrialcorr;

  vector[4] trthli;
  vector[4] trt;
  vector[4] hli;
  vector[4] booterror;
  vector[4] int1;

  vector[4] cowrandom[J];
  vector[4] dayrandom[K];
  vector[4] trialrandom[L];

}

transformed parameters {
  cov_matrix[4] int2;
  cov_matrix[4] taucow;
  cov_matrix[4] tauday;
  cov_matrix[4] tautrial;
  cov_matrix[4] tauhat[n];
  vector[4] Esthat[n];

  for (i in 1:n) {
    Esthat[i] <- int1+trthli*treatment[i]*HLI[i]+trt*treatment[i]+hli*HLI[i]+cowrandom[cow[i]] +dayrandom[realday[i]]+trialrandom[trial[i]];
    tauhat[i] <- int2+diag_matrix(EllipseVar[i]' .* booterror);
  }

  int2 <- diag_matrix(int2sd)*int2corr*diag_matrix(int2sd);
  taucow <- diag_matrix(taucowsd)*taucowcorr*diag_matrix(taucowsd);
  tauday <- diag_matrix(taudaysd)*taudaycorr*diag_matrix(taudaysd);
  tautrial <- diag_matrix(tautrialsd)*tautrialcorr*diag_matrix(tautrialsd);
  //for (z in 1:4) {
  //int2[z,z] <-int2sd[z]*int2sd[z];
  //taucow[z,z] <- taucowsd[z]*taucowsd[z];
  //tauday[z,z] <- taudaysd[z]*taudaysd[z];
  //tautrial[z,z] <- tautrialsd[z]*tautrialsd[z];
  //for (v in (z+1):4) {
  //int2[z,v] <-int2sd[z]*int2sd[v]*int2corr[z,v];
  //taucow[z,v] <- taucowsd[z]*taucowsd[v]*taucowcorr[z,v];
  //tauday[z,v] <- taudaysd[z]*taudaysd[v]*taudaycorr[z,v];
  //tautrial[z,v] <- tautrialsd[z]*tautrialsd[v]*tautrialcorr[z,v];
  //int2[v,z] <- int2[z,v];
  //taucow[v,z] <- taucow[z,v];
  //tauday[v,z] <- tauday[z,v];
  //tautrial[v,z] <- tautrial[z,v];
  //}
  //}
}

model {
  for (i in 1:n){
    EllipseEstimate[i]'~multi_normal(Esthat[i],tauhat[i]);
  }
  trthli~normal(0,10);
  trt~normal(0,10);
  hli~normal(0,10);
  booterror~gamma(1,1);
  int1~normal(0,100);

  //Covariance matrix priors Standard deviations uniformly distributed from 0 to 100, correlations uniformly distributed from -1 to 1 All independent

  int2corr ~ lkj_corr(1);
  taucowcorr ~ lkj_corr(1);
  taudaycorr ~ lkj_corr(1);
  tautrialcorr ~ lkj_corr(1);

  for (j in 1:J){
    cowrandom[j]~multi_normal(zeros,taucow);
  }
  for (k in 1:K){
    dayrandom[k]~multi_normal(zeros,tauday);
  }
  for (l in 1:L){
    trialrandom[l]~multi_normal(zeros,tautrial);
  }
}

matrix to vector

Apologies if this function is already supported, but I couldn't find it in the manual. I'm looking for the following functionality (copied from Matlab):

A =

A(:)

ans =

Intended use:

A(:) ~ multi_normal_prec( zeros , kronecker_product(X,Y) );

Best regards,
Marcel

User Manual Typo? Page 95

I'm working on defining a prior for a cov matrix. On page 95 of the Stan user manual 1.3.0, there is a description for defining a cov matrix from a correlation matrix and a scaling vector.

The line assigning Sigma[m,m] below (bolded in the original post) shows where I think there might be a typo: the n in Omega[m,n] should be an m.

... data block as before, but without alpha ...
parameters {
vector[K] mu; // topic mean
corr_matrix[K] Omega; // correlation matrix
vector<lower=0>[K] sigma; // scales
vector[K] eta[M]; // logit topic dist for doc m
simplex[V] phi[K]; // word dist for topic k
}
transformed parameters {
... eta as above ...
cov_matrix[K] Sigma; // covariance matrix
for (m in 1:K) {
Sigma[m,m] <- sigma[m] * sigma[m] * Omega[m,n];
for (n in (m+1):K) {
Sigma[m,n] <- sigma[m] * sigma[n] * Omega[m,n];
Sigma[n,m] <- Sigma[m,n];
}
}
}
model {
mu ~ normal(0,5); // vectorized, diffuse
Omega ~ lkj_corr(2.0); // regularize to unit correlation
sigma ~ cauchy(0,5); // half-Cauchy due to constraint
...

stan_rdump patch

Reported by jeffrey.arnold, Feb 2 (5 days ago)

Added option vectors to write some variables as vectors even if length 1 (issue #55).
Added option quiet to suppress warning messages.
Added warning message if a variable was not numeric.

Parallel Tempering in multiple threads?

I left a comment on Gelman's blog about this but I realized this is the best place to keep track of feature requests. Here's a copy of that request:

Speaking of mixing, Geyer’s parallel tempering seems like a pretty useful technique. Any chance of including it into Stan so that Stan will run NUTS on 2 or 3 additional tempered posterior densities, with the tempering coefficients given as an array or something? If you do switching with lowish probability then the chains may be mostly able to run in parallel threads on multiple cores without a lot of synchronization overhead and this could be a fantastic way to improve exploration of the space at basically no wall-clock-time cost.

Specifically, I imagine something like the following method:

You set up your N temperatures, maybe something like

stan_parallel_temps <- [1.25,1.5,3]; ## you always need 1 to be first, so perhaps it's better to just declare the additional temps and let stan put 1 at the front of this list on its own.

During warmup, Stan internally estimates the average length between U-turns in the untempered distribution. It then draws a number N at random, exponentially distributed (or maybe gamma distributed, or with a distribution the user specifies in the model file), with mean equal to some constant times this average inter-U-turn length (rounded up to the nearest integer).

All the chains then run N steps and synchronize on a thread semaphore, so that when all the high-temperature threads are done with N steps the temperature-1 thread can proceed. It proceeds by choosing a random adjacent pair of temperatures and attempting an exchange between those states, then generating a new N and setting all the threads back to work on N more HMC timesteps.

In this scheme you usually complete several HMC trajectories before trying an exchange, so you don't disrupt the NUTS sampler too much, yet you have several tempered distributions running in parallel and feeding your untempered simulation new regions of the space, so you may be able to explore it more readily. And as I said, with several cores your wall-clock overhead is only due to the synchronization, which shouldn't be too bad.
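
The exchange step in this proposal can be sketched with the standard replica-exchange acceptance rule. This is an illustration only: swap_accept_prob is a hypothetical helper, not part of Stan, and it assumes the chain at temperature T targets exp(lp(x) / T), where lp is the untempered log density.

```cpp
#include <algorithm>
#include <cmath>

// Probability of accepting a swap of states between two tempered chains.
//   lp_i, lp_j     : untempered log density at the current state of each chain
//   temp_i, temp_j : temperatures of the two chains (1 = untempered)
// Derived from the Metropolis ratio
//   pi_i(x_j) * pi_j(x_i) / (pi_i(x_i) * pi_j(x_j))
// with pi_T(x) proportional to exp(lp(x) / T).
double swap_accept_prob(double lp_i, double temp_i,
                        double lp_j, double temp_j) {
  const double log_ratio = (1.0 / temp_i - 1.0 / temp_j) * (lp_j - lp_i);
  return std::min(1.0, std::exp(log_ratio));
}
```

A swap that hands the colder chain the higher-density state is always accepted; the reverse swap is accepted with probability below one.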

Mislabeling of Matrix Data/Parameters in RStan Samples when Written to File

RStan (1.3.0) in R (3.0.0) appears to be mislabeling parameters stored in matrices when writing to sample file using (sample_file=).

The parameter values appear to be "correct", and the model appears to be fitting the data quite well, however, the labeled output is incorrect.

The sample columns are correctly labeled for parameter arrays of type [1,1], and for vectors/arrays of type [1,], but not for matrices.

In my model, Beta is a parameter matrix (Beta.x.y) in which the first value, x, indicates the cluster and the second value, y, indicates the covariate.

The csv column heads from the sample file read as follows:
"Beta.1.1, Beta.1.2, Beta.1.3, Beta.2.1, Beta.2.2, Beta.2.3, etc..."

But the samples appear to be populated as follows:
"Beta.1.1,Beta.2.1, Beta.3.1, Beta.2.1, Beta.2.2, Beta.2.3...etc..."

Initializing length 1 array parameters

Reported by jeffrey.arnold, Aug 21, 2012

If there is an array parameter with a length of 1, e.g.

real mu[N];

where N had been set to 1, then there are errors if the dump file used for initialization does not use c(). E.g.

mu <- 0

gives the error,

Exception: require 1 dimensions for variable mu
Diagnostic information:
Dynamic exception type: std::runtime_error
std::exception::what: require 1 dimensions for variable mu

Since R dump files don't distinguish between scalars and vectors of length 1, this should probably be handled by the dump parser. Ensuring all entries in the dump are wrapped in c() doesn't work because then scalar parameters complain that they require 0 dimensions. Oddly, this error message does not occur when data variables declared to be an array with a length of 1 are passed a scalar from the dump file.

I attached a model, data file, and init file that produce this problem. The model isn't a real one, it is just here to produce the error.

$ g++ --version
g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3

$ uname -a
Linux Jane 3.2.0-24-generic #37-Ubuntu SMP Wed Apr 25 08:43:22 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

$ git rev-parse HEAD
9678f2f

stan output csv column-major

The Stan develop branch now has the same problem as RStan 1.3.0 in its output CSV files: the header for an array parameter is in row-major order, but the values in each iteration are in column-major order. Take the following model as an example.

parameters {
  real y;
}
model {
  y ~ normal(0, 1);
}

generated quantities {
  real z[2,2];
  z[1,1] <- 1;
  z[1,2] <- 2;
  z[2,1] <- 3;
  z[2,2] <- 4;
}

The csv file:

# Samples Generated by Stan
#
# stan_version_major=1
# .....
# algorithm=NUTS with a diagonal Euclidean metric
#
lp__,accept_stat__,stepsize__,treedepth__,y,z.1.1,z.1.2,z.2.1,z.2.2
# Adaptation terminated
# Step size = 2.24325
# Diagonal elements of inverse mass matrix:
#0.846105
-0.00494919,0.0539255,2.24325,0,0.0994906,1,3,2,4
-0.00494919,0.263769,2.24325,0,0.0994906,1,3,2,4
-0.00494919,0.00583527,2.24325,0,0.0994906,1,3,2,4
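
The mismatch is easy to reproduce outside Stan. The sketch below uses hypothetical helper names and is not Stan's writer code; it flattens the same 2x2 array both ways. Row-major order matches the header (z.1.1, z.1.2, z.2.1, z.2.2), while column-major order gives the 1,3,2,4 seen in the sample rows:

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<int> > Array2D;

// Row-major: the last index varies fastest (z[1,1], z[1,2], z[2,1], z[2,2]).
std::vector<int> flatten_row_major(const Array2D& z) {
  std::vector<int> out;
  for (std::size_t i = 0; i < z.size(); ++i)
    for (std::size_t j = 0; j < z[i].size(); ++j)
      out.push_back(z[i][j]);
  return out;
}

// Column-major: the first index varies fastest (z[1,1], z[2,1], z[1,2], z[2,2]).
// Assumes a rectangular array (every row has the same length).
std::vector<int> flatten_col_major(const Array2D& z) {
  std::vector<int> out;
  if (z.empty()) return out;
  for (std::size_t j = 0; j < z[0].size(); ++j)
    for (std::size_t i = 0; i < z.size(); ++i)
      out.push_back(z[i][j]);
  return out;
}
```

Whichever order is chosen, the header and the values have to be generated by the same loop nesting.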

Truncated multi_normal

Is there a way to specify a prior that follows a truncated multivariate normal distribution?

[ FAILED ] prob_transform.ordered_j (0 ms)

Reported by normvcr, Oct 5, 2012

What steps will reproduce the problem?

make test-unit

What is the expected output? What do you see instead?
src/test/prob/transform_test.cpp:240: Failure
Value of: y[1]
Actual: 1.13534
Expected: 1.0 + exp(-2.0)
Which is: 1.13534
src/test/prob/transform_test.cpp:241: Failure
Value of: y[2]
Actual: 1.14207
Expected: 1.0 + exp(-2.0) + exp(-5.0)
Which is: 1.14207
[ FAILED ] prob_transform.ordered_j (0 ms)

What version of the product are you using? On what operating system?
STAN: V 1.0.2 Release Notes
Fedora17:
Linux 3.5.4-2.fc17.i686.PAE #1 SMP Wed Sep 26 22:10:23 UTC 2012 i686 i686 i386 GNU/Linux

Please provide any additional information below.
The Actual and Expected appear identical.
Likely a "number of significant digits" issue in gtest.
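
To illustrate how two values can print identically and still compare unequal, here is a small sketch. The helper names are hypothetical, and the 1e-12 offset used in the usage note is an assumed perturbation, not the actual discrepancy on the reporter's machine:

```cpp
#include <algorithm>
#include <cmath>
#include <sstream>
#include <string>

// Format a double the way a default-configured stream does
// (6 significant digits).
std::string default_format(double x) {
  std::ostringstream s;
  s << x;
  return s.str();
}

// True when a and b agree to within a relative tolerance.
bool nearly_equal(double a, double b, double rel_tol) {
  return std::fabs(a - b) <=
         rel_tol * std::max(std::fabs(a), std::fabs(b));
}
```

With expected = 1.0 + exp(-2.0) and actual = expected + 1e-12, both format as 1.13534, yet actual == expected is false while nearly_equal(actual, expected, 1e-8) is true. Google Test's EXPECT_NEAR and EXPECT_DOUBLE_EQ macros exist for this kind of tolerance-based check.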

Calculation of the dimensions of tdata vector variables

Reported by astukalov, Feb 1, 2013

It's already mentioned on top of TODO file that using tdata block variables for defining vector dimensions in the same tdata block causes a segfault, but here's another use case to consider for a workaround:

...

transformed data {
int N;
vector[N] x;

<code to calculate N, doesn't use x>
...
}


In my particular use case "x" is an array of input data filtered for missing values.
I can think of 4 workarounds:

1. Delay the initialization of x until its first use, though I guess the resulting C++ code for the model would look ugly.
2. Allow variable declarations in the middle of the block, so that x could be declared once N is already defined.
3. Allow multiple tdata blocks, e.g. transformed data(1) {}, transformed data(2) {}, so that block 2 can use variables from block 1.
4. In my particular use case I could avoid calculating N if there were support for boolean vectors and R-like constructs such as y > 0, where y is a real vector, so that sum(y > 0) would give me the number of non-zero entries in y.

"var" is allowed as a variable name in the modeling language

"var" should be a keyword that shouldn't be allowed as a variable name.

Command line does not check that --init=0 provides a valid starting point

If the model fails with --init=0, Stan will run and appear to be working fine, but will fail to do anything. We should check that all inits are valid and raise an error if a user-specified init is broken.

replace Boost is NaN test

The current use of Boost's isNaN test is taking up to 5% of total run time.
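
Two possible replacements, sketched with hypothetical function names. Whether either is actually faster than the Boost call in Stan's hot paths would need to be measured:

```cpp
#include <cmath>

// Under IEEE 754, NaN is the only value that compares unequal to itself.
// Caveat: flags such as -ffast-math let the compiler assume NaN never
// occurs, in which case this test may be optimized away.
inline bool is_nan_self_compare(double x) { return x != x; }

// Standard-library alternative, available since C++11.
inline bool is_nan_std(double x) { return std::isnan(x); }
```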

workaround for R print trimming

Original bug report to stan-users mailing list:

On 5/24/13 4:49 AM, Dieter Menne wrote:
When using rstan 1.3 in RStudio, Windows 7, R 3.0.1, the print statement seems to trim whitespace from each partial statement, so to get readable output I have to use some delimiter other than a space:

print("n= ",n,"  -  -Minute= ",Minute[n]);
                ^^ < Note that the trailing spaces are removed

n= 10-  -Minute= 137

Jiqiang followed up with an example:

> code <- '
+ transformed data {
+   real a;
+   a <- 3;
+   print("a=", " [", a, "]");
+ }
+ parameters {
+   real y;
+ }
+ 
+ model {
+   y ~ normal(0, 1);
+ } 
+ '
> 
> stan(model_code = code, chains = 1, iter = 2)

which prints

TRANSLATING MODEL 'a' FROM Stan CODE TO C++ CODE NOW.
COMPILING THE C++ CODE FOR MODEL 'a' NOW.
a=[3]
SAMPLING FOR MODEL 'a' NOW (CHAIN 1).
Iteration: 2 / 2 [100%]  (Sampling)
Elapsed Time: 0 seconds (Warm Up)
              0 seconds (Sampling)
              0 seconds (Total)

and the space is gone.

The fix is to buffer a whole line into a string in Stan and then dump it out all at once.
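
A minimal sketch of that fix, assuming the pieces of one print statement are already available as strings. print_buffered is a hypothetical helper, not the function Stan generates:

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Concatenate every piece of one print statement into a local buffer,
// then hand the finished line to the output stream in a single write,
// so the sink never sees (and never trims) the individual pieces.
void print_buffered(std::ostream& out,
                    const std::vector<std::string>& parts) {
  std::ostringstream line;
  for (std::size_t i = 0; i < parts.size(); ++i)
    line << parts[i];
  line << '\n';
  out << line.str();
}
```

For the pieces "a=", " [", "3", "]" this writes the single string "a= [3]" followed by a newline, with the interior space intact.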

Chebyshev and other orthogonal polynomials

It would be great to be able to fit the coefficients of a series expansion using Chebyshev polynomials (optimal in some sense for fitting 1D functions on an interval) and other common basis functions (other than trig, which you already have). Writing out the Chebyshev polynomials in a + b*x + c*x^2 form is numerically unstable, so it would be nice to have something like cheby(i,x) and have you guys do the evaluation using numerically stable methods. I'm sure you can find a library for doing this easily.

Other orthogonal polynomials, and perhaps some common radial basis functions (such as http://en.wikipedia.org/wiki/Radial_basis_function), would be good to have as well. This would give Stan users a convenient way to fit curves and hyper-surfaces.

distributions not vectorized for matrices

Ideally, I'd like to be able to write

parameters {
matrix[M, N] Z;
}
model {
Z ~ normal(0, 1);
}

LDLT_factor needs unit tests

src/stan/math/matrix/ldlt.hpp doesn't have any associated unit tests.

make clean-all does not remove .d and .o files within directories

On Linux,

goodrich@CYBERPOWERPC:/opt/stan$ make -n clean
rm -f  
rm -f

while in the makefile

clean:
        $(RM) $(wildcard *.dSYM) $(wildcard *.d.*)
        $(RM) $(wildcard $(MODEL_SPECS:%.stan=%.cpp) $(MODEL_SPECS:%.stan=%$(EXE)) $(MODEL_SPECS:%.stan=%.o))

So, for example, all the .d and .o files persist.

parser error in assignment returns message about logical_lt

Not exactly incorrect, but the suggestion returned is incorrect.

When there is an error with assignment, the suggestions offered relate to logical_lt.

When using version 1.3.0, the following program produces the stanc error message below it.

data {
  int n;
}
parameters {
  vector[n] foo;
}
transformed parameters {
}
model {
  vector[n] bar;
  for (i in 1:n) {
    bar <- foo[i] + 1;
  }
  foo ~ normal(0, 1);
}

Model name=foo
Input file=foo.stan
Output file=foo.cpp

EXPECTATION FAILURE LOCATION: file=foo.stan; line=12, column=5

    bar <- foo[i] + 1;
    ^-- here


DIAGNOSTIC(S) FROM PARSER:
base type mismatch in assignment; left variable=bar; left base type=vector; right base type=real
binary infix operator <= with functional interpretation logical_lt requires arguments or primitive type (int or real), found left type=vector, right arg type=real; no matches for function name="logical_lt"
    arg 0 type=vector
    arg 1 type=real
available function signatures for logical_lt:
0.  logical_lt(int, int) : int
1.  logical_lt(int, real) : int
2.  logical_lt(real, int) : int
3.  logical_lt(real, real) : int
Parser expecting: "}"

replace lp__ manipulation with increment_log_prob(...)

Replace the current direct access to lp__ with an increment function.

Deprecate use of lp__ with warning in the parser output (eventually remove it altogether).

Replace lp__ doc with indication of deprecation and advice on how to replace it. Maybe an entire deprecation chapter?

Update all of the sample models to use increment.

Save the increments in a list, then return their sum using the more efficient sum() implementation.
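
The last item could look roughly like the sketch below. LogProbAccumulator is a hypothetical class written for illustration, using plain doubles; the real implementation would have to work with Stan's autodiff variables and its own sum():

```cpp
#include <numeric>
#include <vector>

// Collects log-probability increments and defers the addition until the
// total is requested, instead of updating a running lp__ on every call.
class LogProbAccumulator {
 public:
  // Record one increment, as increment_log_prob(term) would.
  void add(double term) { terms_.push_back(term); }

  // Sum all recorded increments in one pass.
  double sum() const {
    return std::accumulate(terms_.begin(), terms_.end(), 0.0);
  }

 private:
  std::vector<double> terms_;
};
```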

LDLT_factor needs typedef size_type

For consistency and to be able to use LDLT_factor as a matrix type, it needs to define a size_type.

More informative error messages for data initialization errors

Reported by jeffrey.arnold, Feb 5 (3 days ago)

Sorry for yet another request for the dump loader.

For errors occurring when loading data, could you add the name of the variable for which there was a problem to the error message? It would make tracking down errors much easier. Right now, error messages like the following are cryptic:

Error : mismatch in number dimensions declared and found in context; processing stage=data initialization; dims declared=(1); dims found=()

Simple Poisson model

All,

I have the following code that fits a Poisson model to data:

library(rstan)
Bcca_mod <- '
data {
int<lower=1> N; // sample size
real<lower=0> Y[N];
real X[N];
}
parameters {
real<lower=0> a;
real<lower=0> b;
}
transformed parameters {

}
model {
a ~ gamma(1,1);
b~ gamma(1,1);
for(i in 1:N)
Y[i] ~ poisson(a + b*X[i]);
}
'
N=20

X = seq(0,20,length.out=N)/20
a=1.5
b=3
Y = apply(matrix(a+b*X,ncol=1),1,rpois,n=1)

bcca_dat<-list("Y"=Y,"N"=N,"X"=X)
fit <- stan(model_code = Bcca_mod, data = bcca_dat, iter = 1000, chains = 2)

Note that I have the right constraints on a and b and Y even though they don't show.
When I run the above code I get this error message:

Error in stanc(file = file, model_code = model_code, model_name = model_name, :
failed to parse Stan model 'Bcca_mod' with error message:
EXPECTATION FAILURE LOCATION: file=input; line=18, column=1

Y[i] ~ poisson(a + b*X[i]);
^-- here

DIAGNOSTIC(S) FROM PARSER:
no matches for function name="poisson_log"
arg 0 type=real
arg 1 type=real
unknown distribution=poisson
Parser expecting:

Does it mean that stan does not yet support the poisson distribution??

Any help will be appreciated.
Thanks

change in the current csv output of samples

In current develop branch, the csv files output looks like

# Samples Generated by Stan
#
# stan_version_major=1
# stan_version_minor=3
# stan_version_patch=0
# model=dogs
# data=dogs.data.R
# init=random initialization
# append_samples=0
# save_warmup=0
# seed=4102613135
# chain_id=1
# iter=20
# warmup=10
# thin=1
# nondiag_mass=0
# equal_step_sizes=0
# leapfrog_steps=-1
# max_treedepth=10
# epsilon=-1
# epsilon_pm=0
# delta=0.5
# gamma=0.05
# algorithm=NUTS with a diagonal Euclidean metric
#
log_post,accept_stat,stepsize__,depth__,alpha,beta,A,B
# Adaptation terminated
# Step size = 1
# Diagonal elements of inverse mass matrix:
#0.389775, 0.379961
-277.943,0,1,-1,-0.220179,-0.0861009,0.802375,0.917502
-277.943,0,1,-1,-0.220179,-0.0861009,0.802375,0.917502
-277.943,0,1,-1,-0.220179,-0.0861009,0.802375,0.917502
-277.943,0,1,-1,-0.220179,-0.0861009,0.802375,0.917502
-277.943,0,1,-1,-0.220179,-0.0861009,0.802375,0.917502
-277.943,0,1,-1,-0.220179,-0.0861009,0.802375,0.917502
-277.943,0,1,-1,-0.220179,-0.0861009,0.802375,0.917502
-277.943,0,1,-1,-0.220179,-0.0861009,0.802375,0.917502
-277.943,0,1,-1,-0.220179,-0.0861009,0.802375,0.917502
-277.943,0,1,-1,-0.220179,-0.0861009,0.802375,0.917502

Elapsed Time: 0.008461 seconds (Warm Up)
              0.002675 seconds (Sampling)
              0.011136 seconds (Total)

I propose the following changes to make it easier for rstan to read CSV files created by Stan from the command line:

log_post back to lp__. lp__ has been used a lot and appears in our doc and the rstan getting started, https://code.google.com/p/stan/wiki/RStanGettingStarted
depth__ back to treedepth__
use two underscores for accept_stat
comment the last lines about timing using #

by the way, is treedepth -1 here?

Create separate subject index in the manual

The index of the Reference Manual should have entries for vector and row_vector.

Command-line quick start page needs editing to reflect example model data files being moved to data.R

The datafiles in src/models/basic_estimators were moved from .Rdata files to data.R files. This needs to be reflected in the webpage
http://mc-stan.org/command-quickstart.html

in the bernoulli examples

% ./bernoulli --data=bernoulli.Rdata

Should be

% ./bernoulli --data=bernoulli.data.R

initialize real locals to NaN in transformed parameters and model blocks

Use of an unassigned real local in these blocks currently causes a crash.

structure .Dimnames causes Exception: variable does not exist error

Dear Stan developers,
Stan is very impressive. I greatly appreciate the work of the developers.

A brief note about a problem I have discovered.

The short version: the .Dimnames statement in the structure defining an array in the data dump file causes problems with Stan. (Stan 1.3.0 command line)

I had been running models using RStan without any problems. I then switched to command line (Linux) and was stumped by the following error:

Exception: variable does not exist; processing stage=data initialization; variable name=Zstep; base type=double

The dump file clearly contained the data for Zstep, which I had dumped from R, as follows:

Zstep <-
structure(c(-2.64031248865649, -3.88059991653595, -4.13578157792282,
-4.32400477993687, -2.81446740864351, -3.24117323354198, -2.22087514588352,
... more lines of data ...
-4.14601647724783, -4.31811908703931, -3.8596138974236), .Dim = c(891L,
2L), .Dimnames = list(NULL, c("Zminus", "Zplus")))

I eventually discovered that removing the names from the end of the structure statement:
i.e. removing: , .Dimnames = list(NULL, c("Zminus", "Zplus"))
resolved the problem, indicating that the superfluous column-name data was problematic for Stan. This was the only change I made, and it resolved my problem.

Unfortunately, setting the column names to NULL before dumping from R:
colnames(Zstep) <- NULL
still results in a .Dimnames statement being written to the dump file, triggering the aforementioned Stan error. But again, removing the .Dimnames statement from the dump file using a text editor resolves the problem and Stan runs.

I have been unable to find a way of making dump write matrix data without the .Dimnames information.

This may indicate a problem with the data-reading code in Stan.

Or perhaps you may view this as the user's responsibility. But I fear that many other users will encounter this same problem.

Working example:
(this model is nonsense, but it illustrates the point).

Contents of errordata.dump file:
Zstep <-
structure(c(-2.64031248865649, -3.88059991653595, -4.13578157792282,
-4.32400477993687, -2.81446740864351, -3.24117323354198, -2.22087514588352,
-4.14601647724783, -4.31811908703931, -3.8596138974236), .Dim = c(5L,
2L), .Dimnames = list(NULL, c("Zminus", "Zplus")))

Contents of errordata2.dump file (the .Dimnames component deleted):
Zstep <-
structure(c(-2.64031248865649, -3.88059991653595, -4.13578157792282,
-4.32400477993687, -2.81446740864351, -3.24117323354198, -2.22087514588352,
-4.14601647724783, -4.31811908703931, -3.8596138974236), .Dim = c(5L,
2L))

Model (errordata.stan):
data{
real Zstep[5, 2];
}
parameters {
real<lower=0> sigma;
}
model {
for (i in 1:5) {
Zstep[i,1] ~ normal(0, sigma);
}
}

Command line commands:
bin/stanc --name=errordemo --o=models/errordemo.cpp models/errordata.stan
g++ -O3 -Lbin -Isrc -Ilib/boost_1.53.0 -Ilib/eigen_3.1.2 models/errordemo.cpp -o models/errordemo -lstan

Result of running model on first data file:
./errordemo --data=errordata.dump
Exception: variable does not exist; processing stage=data initialization; variable name=Zstep; base type=double

Result of running model on second data file:
./errordemo --data=errordata2.dump
STAN SAMPLING COMMAND
data = errordata2.dump
init = random initialization
(etc... it does the sampling)

Best wishes,
Hawthorne Beyer
University of Brisbane

add exceptional return function to models

Add a throw_exception() function that takes arguments like print() and uses them to construct the message for an exception. Allow it to be called anywhere.

Add to doc, ideally with advice that it can be used in user code to validate transforms, etc.
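
A rough C++ sketch of how the message could be assembled from print()-style arguments. The names `append_args` and the C++ `throw_exception` below are hypothetical, and the choice of `std::domain_error` is an assumption based on the exception type Stan already reports for validation failures.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Base case: no arguments left to append.
inline void append_args(std::ostringstream&) {}

// Streams each argument into the message in order, as print() would.
template <typename T, typename... Rest>
void append_args(std::ostringstream& msg, const T& first, const Rest&... rest) {
  msg << first;
  append_args(msg, rest...);
}

// Hypothetical counterpart of the proposed throw_exception():
// concatenates its arguments and throws them as the exception message.
template <typename... Args>
[[noreturn]] void throw_exception(const Args&... args) {
  std::ostringstream msg;
  append_args(msg, args...);
  throw std::domain_error(msg.str());
}
```

A user-side validation could then read `if (sigma <= 0) throw_exception("sigma must be positive, found sigma=", sigma);`.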

template auto-dif comparison ops

Reported Dec 1, 2011

For generality, we should template the comparison operators for stan::agrad::var.

There's an example in:

http://www.boost.org/doc/libs/1_48_0/boost/math/bindings/rr.hpp

Migrated from google code: http://code.google.com/p/stan/issues/detail?id=7&colspec=ID%20Type%20Status%20Priority%20Owner%20Summary
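
A sketch of what templated comparison operators could look like. The `var` class here is a minimal stand-in that holds only a value, not the real stan::agrad::var, and restricting the template to arithmetic types via `std::enable_if` is an assumption about the intended generality.

```cpp
#include <type_traits>

// Minimal stand-in for stan::agrad::var; it stores only the value.
class var {
 public:
  explicit var(double v) : val_(v) {}
  double val() const { return val_; }
 private:
  double val_;
};

// var < var
inline bool operator<(const var& a, const var& b) { return a.val() < b.val(); }

// var < any arithmetic type (int, long, float, double, ...)
template <typename T,
          typename = typename std::enable_if<std::is_arithmetic<T>::value>::type>
bool operator<(const var& a, T b) {
  return a.val() < b;
}

// any arithmetic type < var
template <typename T,
          typename = typename std::enable_if<std::is_arithmetic<T>::value>::type>
bool operator<(T a, const var& b) {
  return a < b.val();
}
```

One pair of templates then covers every built-in numeric type, so no per-type overloads such as the `long long` ones are needed for comparisons. The other comparison operators would follow the same pattern.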

pos-def failure for corr matrix crashes rather than rejects

If the positive-definiteness test fails for a correlation matrix, there's a failure rather than a rejection.

Here's a minimal model to reproduce the problem:

parameters {
  corr_matrix[100] Omega;
}
transformed parameters {
  corr_matrix[100] Omega3;
  Omega3 <- Omega;
}
model { 
}

Here's the output:

~/stan(master)$ make ../temp/pos-def-bug
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function   -c -O3 -o bin/stan/command/stanc.o src/stan/command/stanc.cpp
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function   -c -O0 -o bin/stan/gm/grammars/expression_grammar_inst.o src/stan/gm/grammars/expression_grammar_inst.cpp
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function   -c -O0 -o bin/stan/gm/grammars/program_grammar_inst.o src/stan/gm/grammars/program_grammar_inst.cpp
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function   -c -O0 -o bin/stan/gm/grammars/statement_2_grammar_inst.o src/stan/gm/grammars/statement_2_grammar_inst.cpp
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function   -c -O0 -o bin/stan/gm/grammars/statement_grammar_inst.o src/stan/gm/grammars/statement_grammar_inst.cpp
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function   -c -O0 -o bin/stan/gm/grammars/term_grammar_inst.o src/stan/gm/grammars/term_grammar_inst.cpp
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function   -c -O0 -o bin/stan/gm/grammars/var_decls_grammar_inst.o src/stan/gm/grammars/var_decls_grammar_inst.cpp
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function   -c -O0 -o bin/stan/gm/grammars/whitespace_grammar_inst.o src/stan/gm/grammars/whitespace_grammar_inst.cpp
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function   -c -O0 -o bin/stan/gm/ast_def.o src/stan/gm/ast_def.cpp
ar -rs bin/libstanc.a bin/stan/gm/grammars/expression_grammar_inst.o bin/stan/gm/grammars/program_grammar_inst.o bin/stan/gm/grammars/statement_2_grammar_inst.o bin/stan/gm/grammars/statement_grammar_inst.o bin/stan/gm/grammars/term_grammar_inst.o bin/stan/gm/grammars/var_decls_grammar_inst.o bin/stan/gm/grammars/whitespace_grammar_inst.o bin/stan/gm/ast_def.o
ar: creating archive bin/libstanc.a
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function    -O0 -o bin/stanc bin/stan/command/stanc.o -Lbin -lstanc

--- Translating Stan graphical model to C++ code ---
bin/stanc ../temp/pos-def-bug.stan --o=../temp/pos-def-bug.cpp
Model name=pos_def_bug_model
Input file=../temp/pos-def-bug.stan
Output file=../temp/pos-def-bug.cpp
i686-apple-darwin11-llvm-g++-4.2: src/stan/model/model_header.hpp: linker input file unused because linking not done
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function   -c -O3 -o ../temp/pos-def-bug.o ../temp/pos-def-bug.cpp
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function   -c -O3 -o bin/stan/agrad/agrad.o src/stan/agrad/agrad.cpp
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function   -c -O3 -o bin/stan/math/matrix.o src/stan/math/matrix.cpp
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function   -c -O3 -o bin/stan/agrad/matrix.o src/stan/agrad/matrix.cpp
ar -rs bin/libstan.a bin/stan/agrad/agrad.o bin/stan/math/matrix.o bin/stan/agrad/matrix.o
ar: creating archive bin/libstan.a
g++ -I src -I lib/eigen_3.1.2 -I lib/boost_1.52.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -Wno-unused-function    -O3 -o ../temp/pos-def-bug ../temp/pos-def-bug.o -Lbin -lstan
~/stan(master)$ ../temp/pos-def-bug 

Exception: Invalid value of Omega3: Error in function validate transformed params d: y is not positive definite. y(0,0) is 1.
Diagnostic information: 
Dynamic exception type: std::domain_error
std::exception::what: Invalid value of Omega3: Error in function validate transformed params d: y is not positive definite. y(0,0) is 1.

~/stan(master)$
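
One way the rejection behavior could be sketched: wrap the log-probability evaluation and map a validation failure to negative infinity, so the proposal is rejected instead of terminating the program. The wrapper name `log_prob_or_reject` is hypothetical, and treating every `std::domain_error` as a rejection is an assumption.

```cpp
#include <limits>
#include <stdexcept>

// Hypothetical wrapper: evaluates a log-density functor and turns a
// validation failure (std::domain_error) into -infinity, which a
// Metropolis-style accept step would always reject.
template <typename F>
double log_prob_or_reject(const F& log_prob) {
  try {
    return log_prob();
  } catch (const std::domain_error&) {
    return -std::numeric_limits<double>::infinity();
  }
}
```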

Error in assignment of row vector to matrix row

Not entirely sure what is going on, but it appears that there may be issues when a row vector is assigned to a row in a matrix. Or the bug may be something completely unrelated. In any case, either something is going wrong in agrad or the parser isn't catching a problem.

I attached a minimal model that generates the behavior. Using Stan 1.1.1, gcc 4.7.2, on Ubuntu 12.10.

Model and output are here: https://gist.github.com/jrnold/4760388

Autocorr and ESS calculation error?

I was trying to diagnose pathologies with my warmup algorithm when I couldn't resolve differences I was seeing between samples.csv and the bin/print output. The raw text output looks fine, but bin/print shows huge Rhats and completely wrong means, standard deviations, quantiles, etc. The obvious problem would be overflow, but samples.csv often shows values of O(1) and there are only 1000 samples.

Can anyone verify?

src/models/bugs_examples/vol1/magnesium/magnesium --data=src/models/bugs_examples/vol1/magnesium/magnesium.data.R --init=src/models/bugs_examples/vol1/magnesium/magnesium.init.R; bin/print samples.csv --seed=489629017

seems to be problematic, but the failure appears somewhat random, which would indicate floating-point issues in the summary calculations.

Obviously we should take some of the outliers in the acid tests with a grain of salt at this point.

long long is a C++11 extension, produces warning messages

When compiling with clang++ 3.3 we get:

src/stan/agrad/rev/var.hpp:161:20: warning: 'long long' is a C++11 extension [-Wc++11-long-long]
      var(unsigned long long n) :
                   ^
src/stan/agrad/rev/var.hpp:171:11: warning: 'long long' is a C++11 extension [-Wc++11-long-long]
      var(long long n) :

It would be great to silence warnings like this. Unfortunately, this warning also pops up in gtest:

lib/gtest_1.6.0/include/gtest/internal/gtest-port.h:1501:9: warning: 'long long' is a C++11 extension [-Wc++11-long-long]
typedef long long BiggestInt;  // NOLINT
        ^
lib/gtest_1.6.0/include/gtest/internal/gtest-port.h:1726:11: warning: 'long long' is a C++11 extension [-Wc++11-long-long]
  typedef long long Int;  // NOLINT
          ^
lib/gtest_1.6.0/include/gtest/internal/gtest-port.h:1727:20: warning: 'long long' is a C++11 extension [-Wc++11-long-long]
  typedef unsigned long long UInt;  // NOLINT

I suggest adding -Wno-c++11-long-long when compiling with clang since this is mostly harmless. We could also remove the var constructors if we want to be completely free of C++11isms.

add error messages for periods in identifiers from parser

They're not getting caught everywhere (or anywhere) yet.

So far, there's almost no testing in place for this kind of thing. Just a test of whether a model parses or not.

multi_normal memory problem

I compiled and ran the example Gaussian process model, gp-predict.stan. The memory usage keeps increasing until the program stops running. When I use a Cholesky decomposition of the covariance matrix and multi_normal_cholesky, this doesn't happen. It always happens when using the function multi_normal: the program needs increasing memory and eventually terminates.

STAN version: 1.1.1
Compiler: g++ 4.6.3
OS: Ubuntu

Add Bessel Functions and Pochhammer Symbols

Bessel functions of first and second kind.

And the Pochhammer symbols:

falling_factorial(x,n) = x! / (x - n)!

rising_factorial(x,n) = (x + n - 1)! / (x - 1)!
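
A small sketch of the two functions as plain products, using the standard definitions: rising is x(x+1)...(x+n-1) = (x + n - 1)! / (x - 1)!, and falling is x(x-1)...(x-n+1) = x! / (x - n)!. The signatures (real x, non-negative integer n) are an assumption, and a library version would more likely work on the log scale to avoid overflow.

```cpp
#include <cassert>

// Rising factorial: x (x+1) ... (x+n-1); the empty product (n = 0) is 1.
inline double rising_factorial(double x, int n) {
  assert(n >= 0);
  double result = 1.0;
  for (int k = 0; k < n; ++k) result *= x + k;
  return result;
}

// Falling factorial: x (x-1) ... (x-n+1); the empty product (n = 0) is 1.
inline double falling_factorial(double x, int n) {
  assert(n >= 0);
  double result = 1.0;
  for (int k = 0; k < n; ++k) result *= x - k;
  return result;
}
```

For example, rising_factorial(3, 4) is 3·4·5·6 = 360 = 6!/2!, and falling_factorial(5, 2) is 5·4 = 20 = 5!/3!.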

RStan 1.3 hanging during iteration

Jiqiang submitted the following:

I have a wrong model as follows:

parameters { 
   real y;
} 
model {
   y ~ exponential(1);
}

In current rstan, it hangs somewhere and I cannot interrupt so I have to kill the process. rstan checks interruption every iteration, so here it hangs inside an iteration.

Loading required package: stats4
rstan (Version 1.3.0, packaged: 2013-05-24 14:48:48 UTC, GitRev: f39ad4070a14)
>
>
> a <- 'parameters{real y;} model{y ~ exponential(1);}'
> stan(model_code = a, chains=1)

TRANSLATING MODEL 'a' FROM Stan CODE TO C++ CODE NOW.
COMPILING THE C++ CODE FOR MODEL 'a' NOW.
SAMPLING FOR MODEL 'a' NOW (CHAIN 1).
Iteration:  800 / 2000 [ 40%]  (Warmup)
^C
^C

But rstan 1.3.0 seems to handle this problem better. This is the output:

> a <- 'parameters{real y;} model{y ~ exponential(1);}'
> stan(model_code = a, chains=1)

TRANSLATING MODEL 'a' FROM Stan CODE TO C++ CODE NOW.
COMPILING THE C++ CODE FOR MODEL 'a' NOW.
SAMPLING FOR MODEL 'a' NOW (CHAIN 1).
Error : Posterior is improper. Please check your model.
error occurred during calling the sampler; sampling not done
> 
> 
> stan(model_code = a, chains=1)

TRANSLATING MODEL 'a' FROM Stan CODE TO C++ CODE NOW.
COMPILING THE C++ CODE FOR MODEL 'a' NOW.
Iteration: 2000 / 2000 [100%]  (Sampling)

> stan(model_code = a, chains=1)

TRANSLATING MODEL 'a' FROM Stan CODE TO C++ CODE NOW.
COMPILING THE C++ CODE FOR MODEL 'a' NOW.
SAMPLING FOR MODEL 'a' NOW (CHAIN 1).
Error : Posterior is improper. Please check your model.
error occurred during calling the sampler; sampling not done

vari destructed automatically when it shouldn't

I have only seen this issue on Windows (only tried with g++ 4.6.3, O=3). I tried reproducing the error on Mac, but don't see it with either clang++ or g++ (4.6.3) at O=3.

To reproduce, type these two lines:

make models/bugs_examples/vol2/mvn_orange/mvn_orange.exe

models\bugs_examples\vol2\mvn_orange\mvn_orange --seed=343351804 --chain_id=1 --iter=10000 --data=models\bugs_examples\vol2\mvn_orange\mvn_orange.data.R --init=models\bugs_examples\vol2\mvn_orange\mvn_orange.init.R

The error:

STAN SAMPLING COMMAND
data = models\bugs_examples\vol2\mvn_orange\mvn_orange.data.R
init = models\bugs_examples\vol2\mvn_orange\mvn_orange.init.R
samples = samples.csv
append_samples = 0
save_warmup = 0
seed = 343351804 (user specified)
chain_id = 1 (user specified)
iter = 10000
warmup = 5000
thin = 5 (default)
equal_step_sizes = 0
leapfrog_steps = -1
max_treedepth = 10
epsilon = -1
epsilon_pm = 0
delta = 0.5
gamma = 0.05

Iteration:    1 / 10000 [  0%]  (Adapting)
Iteration:   50 / 10000 [  0%]  (Adapting)
...
Iteration: 1750 / 10000 [ 17%]  (Adapting)
terminate called after throwing an instance of 'std::logic_error'
  what():  vari destruction handled automatically

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

Message "INVALID COMMAND-LINE ARGUMENT" printed for parse errors

Reported by jeffrey.arnold, Feb 5 (3 days ago)

Using commit e1a9380.

The command-line stanc prints the invalid command-line argument error even when the failure is a parser error. The attached model produces the following output.

$ bin/stanc foo.stan
Model name=foo_model
Input file=foo.stan
Output file=foo_model.cpp

INVALID COMMAND-LINE ARGUMENT
EXPECTATION FAILURE LOCATION: file=foo.stan; line=2, column=10

vector y;
^-- here

DIAGNOSTIC(S) FROM PARSER:
Parser expecting: "["

echo $?
253

make test/models/basic_estimators/normal_mixture fails

make clean-all && make CC=clang++ O=3 -j8 test/models/basic_estimators/normal_mixture

goodrich@CYBERPOWERPC:/tmp$ cat normal_mixture_issue.txt
--- Translating Stan graphical model to C++ code ---
bin/stanc models/basic_estimators/normal_mixture.stan --o=models/basic_estimators/normal_mixture.cpp
Model name=normal_mixture_model
Input file=models/basic_estimators/normal_mixture.stan
Output file=models/basic_estimators/normal_mixture.cpp
clang++ -I src -I lib/eigen_3.1.3 -I lib/boost_1.53.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -Wno-unused-function -Wno-uninitialized -c -O3 -o models/basic_estimators/normal_mixture.o models/basic_estimators/normal_mixture.cpp

clang++ -I src -I lib/eigen_3.1.3 -I lib/boost_1.53.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -Wno-unused-function -Wno-uninitialized -O3 -o models/basic_estimators/normal_mixture models/basic_estimators/normal_mixture.o -Lbin -lstan

clang++ -I src -I lib/eigen_3.1.3 -I lib/boost_1.53.0 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -Wno-unused-function -Wno-uninitialized -O3 lib/gtest_1.6.0/src/gtest_main.cc test/models/basic_estimators/normal_mixture.o -DGTEST_HAS_PTHREAD=0 -I lib/gtest_1.6.0/include -I lib/gtest_1.6.0 -o test/models/basic_estimators/normal_mixture test/libgtest.a -Lbin -lstan -Lbin -lstanc
test/models/basic_estimators/normal_mixture --gtest_output="xml:test/models/basic_estimators/normal_mixture.xml"
Running main() from gtest_main.cc
[==========] Running 4 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 4 tests from Models_BasicEstimators_NormalMixture/Model_Test_Fixture/0, where TypeParam = Models_BasicEstimators_NormalMixture
make[1]: warning: jobserver unavailable: using -j1.  Add `+' to parent make rule.
make[1]: warning: jobserver unavailable: using -j1.  Add `+' to parent make rule.
Warning: non-fatal error reading samples
[ RUN ] Models_BasicEstimators_NormalMixture/Model_Test_Fixture/0.TestGradient
[ OK ] Models_BasicEstimators_NormalMixture/Model_Test_Fixture/0.TestGradient (12 ms)
[ RUN ] Models_BasicEstimators_NormalMixture/Model_Test_Fixture/0.RunModel
Warning: non-fatal error reading metadata
Error: error reading header
unknown file: Failure
C++ exception with description "Error with header of input file in parse" thrown in the test body.
[ FAILED ] Models_BasicEstimators_NormalMixture/Model_Test_Fixture/0.RunModel, where TypeParam = Models_BasicEstimators_NormalMixture (1861 ms)
[ RUN ] Models_BasicEstimators_NormalMixture/Model_Test_Fixture/0.ChainsTest
normal_mixture: lib/eigen_3.1.3/Eigen/src/Core/DenseCoeffsBase.h:173: CoeffReturnType Eigen::DenseCoeffsBase<Eigen::Matrix<Eigen::Matrix<double, -1, 1, 0, -1, 1>, -1, 1, 0, -1, 1>, 0>::operator()(Index) const [Derived = Eigen::Matrix<Eigen::Matrix<double, -1, 1, 0, -1, 1>, -1, 1, 0, -1, 1>, Level = 0]: Assertion `index >= 0 && index < size()' failed.
Aborted
make: [test/models/basic_estimators/normal_mixture] Error 134 (ignored)

Kronecker product

It's been mentioned a few times on the users' list, but perhaps a Kronecker product could be added?

Best regards,
Marcel
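
For reference, a sketch of the operation being requested. For A of size m x n and B of size p x q, the Kronecker product is the (m·p) x (n·q) matrix whose block (i, j) is A[i][j]·B. The function name `kronecker` and the nested `std::vector` matrix type are stand-ins; a Stan version would presumably use Eigen matrices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<double> > matrix_t;

// Kronecker product of two non-empty rectangular matrices.
matrix_t kronecker(const matrix_t& A, const matrix_t& B) {
  assert(!A.empty() && !A[0].empty() && !B.empty() && !B[0].empty());
  const std::size_t m = A.size(), n = A[0].size();
  const std::size_t p = B.size(), q = B[0].size();
  matrix_t C(m * p, std::vector<double>(n * q, 0.0));
  for (std::size_t i = 0; i < m; ++i)
    for (std::size_t j = 0; j < n; ++j)
      for (std::size_t k = 0; k < p; ++k)
        for (std::size_t l = 0; l < q; ++l)
          // Entry (k, l) of the block scaled by A[i][j].
          C[i * p + k][j * q + l] = A[i][j] * B[k][l];
  return C;
}
```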

assertion failure from diag_pre_multiply in rstan 1.3.0

This is a minimal version of a problem that has caught me in a real model. This model:

data {
     int<lower=0> num;
}

parameters {
    vector<lower=0, upper=1>[num] scale;
    matrix[num, num] fred;
}

transformed parameters {
    matrix[num, num] out;

    out <- diag_pre_multiply(scale, fred);
}

model {
    for (indx in 1:num) {
        for (indy in 1:num) {
            fred[indy, indx] ~ normal(0, 1);
        }
    }
}

dies with the error

/home/pdm/.R/x86_64-unknown-linux-gnu/2.15/rstan/include//stanlib/eigen_3.1.2/Eigen/src/Core/DenseCoeffsBase.h:337: typename Eigen::internal::traits<T>::Scalar& Eigen::DenseCoeffsBase<Derived, 1>::operator()(typename Eigen::internal::traits<T>::Index, typename Eigen::internal::traits<T>::Index) [with Derived = Eigen::Matrix<stan::agrad::var, -0x00000000000000001, -0x00000000000000001, 0, -0x00000000000000001, -0x00000000000000001>]: Assertion `row >= 0 && row < rows() && col >= 0 && col < cols()' failed.
Aborted (core dumped)

when run with

library(rstan)

set_cppo("debug")

results <-
  stan(file="dummy.txt",
       data=list(num=100),
       iter=1000,
       chains=4)

The R version is 2.15.1, Rcpp is 0.10.3, gcc is

Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla 
--enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile
--enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC)

and rstan is

rstan (Version 1.3.0, packaged: 2013-04-12 21:12:02 UTC, GitRev: f57455593d14)

The OS is linux amd64.

allow int x in vector <- rep_vector(x,N)

A program with this block fails to compile:

transformed data {
  vector[J] gamma;
  gamma <- rep_vector(1,J);
}

The problem is that rep_vector(1,J) creates a vector with an int scalar type, which then can't be assigned to gamma, a vector with a double scalar type.

From the command-line (current 2.0 develop branch):

...
lib/eigen_3.1.3/Eigen/src/Core/Assign.h:493:32: error: no member named
      'YOU_MIXED_DIFFERENT_NUMERIC_TYPES__YOU_NEED_TO_USE_THE_CAST_METHOD_OF_MATRIXBASE_TO_CAST_NUMERIC_TYPES_EXPLICITLY'
      in 'Eigen::internal::static_assertion<false>'
  ...YOU_MIXED_DIFFERENT_NUMERIC_TYPES__YOU_NEED_TO_USE_THE_CAST_METHOD_OF_MATRIXBASE_TO_CAST_NUMERIC_TYPES_EXPLICITLY)
     ^
lib/eigen_3.1.3/Eigen/src/Core/util/StaticAssert.h:111:65: note: expanded from macro
      'EIGEN_STATIC_ASSERT'
        if (Eigen::internal::static_assertion<bool(CONDITION)>::MSG) {}
                                                                ^
lib/eigen_3.1.3/Eigen/src/Core/PlainObjectBase.h:393:20: note: in instantiation of function
      template specialization 'Eigen::DenseBase<Eigen::Matrix<double, -1, 1, 0, -1, 1>
      >::lazyAssign<Eigen::Matrix<int, -1, 1, 0, -1, 1> >' requested here
      return Base::lazyAssign(other.derived());
                   ^
lib/eigen_3.1.3/Eigen/src/Core/Assign.h:522:97: note: in instantiation of function template
      specialization 'Eigen::PlainObjectBase<Eigen::Matrix<double, -1, 1, 0, -1, 1>
      >::lazyAssign<Eigen::Matrix<int, -1, 1, 0, -1, 1> >' requested here
  ...dst, const OtherDerived& other) { return dst.lazyAssign(other.derived()); }
                                                  ^
lib/eigen_3.1.3/Eigen/src/Core/PlainObjectBase.h:600:69: note: in instantiation of member
      function 'Eigen::internal::assign_selector<Eigen::Matrix<double, -1, 1, 0, -1, 1>,
      Eigen::Matrix<int, -1, 1, 0, -1, 1>, false, false>::run' requested here
      return internal::assign_selector<Derived,OtherDerived,false>::run(...
                                                                    ^
lib/eigen_3.1.3/Eigen/src/Core/PlainObjectBase.h:585:102: note: in instantiation of function
      template specialization 'Eigen::PlainObjectBase<Eigen::Matrix<double, -1, 1, 0, -1, 1>
      >::_set_noalias<Eigen::Matrix<int, -1, 1, 0, -1, 1> >' requested here
  ...OtherDerived& other, const internal::false_type&) { _set_noalias(other); }
                                                         ^
lib/eigen_3.1.3/Eigen/src/Core/PlainObjectBase.h:577:7: note: in instantiation of function
      template specialization 'Eigen::PlainObjectBase<Eigen::Matrix<double, -1, 1, 0, -1, 1>
      >::_set_selector<Eigen::Matrix<int, -1, 1, 0, -1, 1> >' requested here
      _set_selector(other.derived(), typename ...
      ^
lib/eigen_3.1.3/Eigen/src/Core/Matrix.h:172:20: note: in instantiation of function template
      specialization 'Eigen::PlainObjectBase<Eigen::Matrix<double, -1, 1, 0, -1, 1>
      >::_set<Eigen::Matrix<int, -1, 1, 0, -1, 1> >' requested here
      return Base::_set(other);
                   ^
src/stan/agrad/rev/matrix/assign.hpp:34:15: note: in instantiation of function template
      specialization 'Eigen::Matrix<double, -1, 1, 0, -1, 1>::operator=<Eigen::Matrix<int, -1,
      1, 0, -1, 1> >' requested here
          var = val; // no promotion of RHS
              ^
src/stan/agrad/rev/matrix/assign.hpp:53:60: note: in instantiation of member function
      'stan::agrad::<anonymous namespace>::assigner<false, Eigen::Matrix<double, -1, 1, 0, -1,
      1>, Eigen::Matrix<int, -1, 1, 0, -1, 1> >::assign' requested here
      assigner<needs_promotion<LHS,RHS>::value, LHS, RHS>::assign(var,val);
                                                           ^
../devmitz/carp/projs/ideal-tomatoes/ideal.cpp:116:9: note: in instantiation of function
      template specialization 'stan::agrad::assign<Eigen::Matrix<double, -1, 1, 0, -1, 1>,
      Eigen::Matrix<int, -1, 1, 0, -1, 1> >' requested here
        assign(gamma, rep_vector(3,J));
        ^
1 error generated.

Current release (1.3.0) also fails to compile it for the same reason.

There's an easy fix involving specialization for int types that creates a double value.
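
A sketch of that kind of fix, with `std::vector` standing in for the Eigen vector type. The trait name `real_scalar` is hypothetical, and this is not the actual Stan implementation: an int argument is mapped to a double element type, so the result can be assigned to a double vector.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical trait: the element type rep_vector should produce for T.
template <typename T>
struct real_scalar { typedef T type; };

// Specialization for int: promote to double.
template <>
struct real_scalar<int> { typedef double type; };

// Returns a vector of n copies of x, promoting an int x to double.
template <typename T>
std::vector<typename real_scalar<T>::type> rep_vector(const T& x, std::size_t n) {
  typedef typename real_scalar<T>::type scalar_t;
  return std::vector<scalar_t>(n, static_cast<scalar_t>(x));
}
```

With this, `std::vector<double> gamma = rep_vector(1, J);` compiles, because the int literal yields a vector of double.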