suyusung / arm

Data Analysis Using Regression and Multilevel/Hierarchical Models

arm's Issues

`sim` method for `coxph` objects

Can we have a sim() method for coxph objects?

I could submit a PR, but I am not sure what exactly the statistical theory is. In particular, I am not sure whether the example from the ARM book generalizes directly to Cox regression. Assuming that object is of class coxph, does it boil down to simply sampling from a multivariate normal with

means equal to object$coefficients, and
covariance matrix from vcov(object)?
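
The proposal in this issue can be sketched as follows. This is only an illustration of the large-sample normal approximation the question describes, not a confirmed design for arm: `sim_coef_normal` is a hypothetical helper name, and the `lung` data from the survival package is used purely as an example. A Cox model has no residual standard deviation, so only coefficients are drawn.

```r
library(survival)  # provides coxph(), Surv() and the lung example data

# Hypothetical helper (not part of arm): draws coefficient vectors from
# N(coef(object), vcov(object)), i.e. the normal approximation asked about.
sim_coef_normal <- function(object, n.sims = 100) {
  draws <- MASS::mvrnorm(n.sims, mu = coef(object), Sigma = vcov(object))
  # mvrnorm() returns a plain vector when n.sims is 1, so force a matrix
  matrix(draws, nrow = n.sims, dimnames = list(NULL, names(coef(object))))
}

set.seed(1)
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)
beta_draws <- sim_coef_normal(fit, n.sims = 1000)

# Approximate 95% intervals for the hazard ratios, one column per coefficient
hr_interval <- exp(apply(beta_draws, 2, quantile, probs = c(0.025, 0.975)))
```

Whether this approximation is adequate for a given Cox model (for example with few events) is a separate question that the sketch does not settle.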

standardize.R: Error in thedata[!is.na(thedata)] : object of type 'closure' is not subsettable

The line

arm/R/standardize.R

Line 22 in 3e7f29f

num.categories <- length (unique(thedata[!is.na(thedata)]))

produces the following error:

Error in thedata[!is.na(thedata)] : object of type 'closure' is not subsettable

This is due to the

arm/R/standardize.R

Line 16 in 3e7f29f

thedata <- get(v)

call returning function (f, levelsToKeep) for a binary variable (vector of type factor).

Am I missing something here?

The dummy I have in my regression formula:
dummy [1] 0 0 0 1 ...
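
The behaviour reported above can be reproduced without arm. The sketch below assumes, without confirmation from the thread, that the variable name collides with a function visible on the search path; the column name `mean` is chosen only because base R is guaranteed to have a function of that name.

```r
# Toy data with a column whose name collides with a base R function.
dat <- data.frame(y = c(1, 0, 1, 1), mean = c(0, 0, 1, 1))
v <- "mean"

# Without an explicit environment, get() searches the calling frame and then
# the search path, so here it finds the function base::mean, not the column.
found <- get(v)
is.function(found)

# Looking the name up in the data itself returns the column, and the
# subsetting from line 22 of standardize.R then works as intended.
column <- dat[[v]]
num.categories <- length(unique(column[!is.na(column)]))
```

Looking variables up in the model frame or the data, rather than by bare name, is one possible way to avoid the collision; whether it fits the rest of standardize() is not checked here.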

Estimation Result depending on order of covariates?

Dear Yu-Sung Su,

I noticed recently a rather strange behaviour of the bayesglm-function. More specifically, it seems that the results of the bayesglm-function depend on the order of covariates in the formula argument. To give an example in R:

library(arm)

data(lalonde)

m1 <- bayesglm(married ~ age + educ, data = lalonde, family = binomial("logit"))
summary(m1)
m2 <- bayesglm(married ~ educ + age, data = lalonde, family = binomial("logit"))
summary(m2)

In this example, the point estimates and standard errors differ only slightly. However, for other datasets I use in my daily work, I noticed that the differences can become quite large. Could there be something wrong with the function?

Best,
André

How to compare nested models created by the bayesglm function?

Is it possible to use a likelihood ratio test (anova in R) to compare nested models?

arm::sim speed

Hello arm staff,

I've been doing some analysis using arm::bayesglm, and to obtain confidence intervals for predictions I use arm::sim with 1E5 simulations, but it ran too slowly.
Checking the source code, I found the following for loop in sim.glm.

for (s in 1:n.sims){
      beta[s,] <- MASS::mvrnorm (1, beta.hat, V.beta)
 }

I don't understand why you chose to run this as a for loop instead of:

beta <- MASS::mvrnorm(n.sims, beta.hat, V.beta)

I've run some tests, and while the second one is almost instantaneous, the first takes several seconds.

> system.time(x <- MASS::mvrnorm (n, beta.hat, V.beta))
   user  system elapsed 
  0.184   0.038   0.204 
> system.time(for (i in 1:n) y[i,] <-  MASS::mvrnorm (1, beta.hat, V.beta))
   user  system elapsed 
 27.824   0.681  29.698 
> system.time(replicate(n, MASS::mvrnorm (1, beta.hat, V.beta)))
   user  system elapsed 
 29.766   0.510  31.146

Is there a statistical reason for the choice? I would love to understand it. If so, would you consider a blocked alternative, such as:

blocks <- 100
blocksize <- ceiling(n.sims / blocks)   # round up so that no draws are dropped
for (s in seq_len(blocks)) {
    from <- (s - 1) * blocksize + 1     # rows are 1-indexed
    if (from > n.sims) break            # all rows already filled
    to <- min(s * blocksize, n.sims)
    # draw exactly as many rows as this block holds
    beta[from:to, ] <- MASS::mvrnorm(to - from + 1, beta.hat, V.beta)
}
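
The comparison in this issue can be reproduced in a self-contained form. The values of `beta.hat` and `V.beta` below are illustrative assumptions rather than output from a fitted model, and the number of draws is reduced so the loop finishes quickly.

```r
set.seed(42)
n.sims   <- 2000
beta.hat <- c(0.5, -1, 2)              # illustrative values only
V.beta   <- diag(c(0.1, 0.2, 0.3))     # illustrative covariance matrix

# One draw per iteration, as in the loop quoted from sim.glm
beta.loop <- matrix(NA_real_, n.sims, length(beta.hat))
t.loop <- system.time(
  for (s in seq_len(n.sims)) beta.loop[s, ] <- MASS::mvrnorm(1, beta.hat, V.beta)
)

# All draws in a single call
t.vec <- system.time(beta.vec <- MASS::mvrnorm(n.sims, beta.hat, V.beta))

# Both matrices hold n.sims independent draws from the same distribution,
# so their column means should both be close to beta.hat
round(colMeans(beta.loop), 2)
round(colMeans(beta.vec), 2)
```

Because each row is an independent draw from the same multivariate normal in both versions, the two approaches differ in run time and in the exact random numbers produced, not in the distribution being sampled.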

Regards

Test

'sim' method for 'plm' objects

Dear Yu-Sung Su,

I recently wrote some lines of code to use the sim function with plm objects. I didn't test the code in different scenarios, but it might help to implement the sim function for class plm objects (if this is an option).

sim.plm<-function(object, n.sims=100)
{
  object.class <- class(object)[[1]]
  summ <- summary (object)
  coef <- summ$coef[,1:2,drop=FALSE]
  dimnames(coef)[[2]] <- c("coef.est","coef.sd")
  # sigma.hat <- summ$sigma 
  # TR: define sigma by hand
  NN <- nrow(object$model)
  PP <- nrow(coef)
  sigma.hat <- sqrt(deviance(object) / (NN-PP))
  # TR: end              
  beta.hat <- coef[,1,drop = FALSE]
  # V.beta <- summ$cov.unscaled
  V.beta <- vcov(summ)/sigma.hat^2 # TR: unscale scaled vcov
  # n <- summ$df[1] + summ$df[2]
  # k <- summ$df[1]
  n <- nrow(summ$model) # TR: define n
  k <- nrow(summ$coefficients) # TR: define k
  sigma <- rep (NA, n.sims)
  beta <- array (NA, c(n.sims,k))
  dimnames(beta) <- list (NULL, rownames(beta.hat))
  for (s in 1:n.sims){
    sigma[s] <- sigma.hat*sqrt((n-k)/rchisq(1,n-k))
    beta[s,] <- MASS::mvrnorm(1, beta.hat, V.beta * sigma[s]^2)
  }
  
  ans <- new("sim",
             coef = beta,
             sigma = sigma)
  return (ans)
}


#### Example ####

library(arm)
library(plm)

data(Cigar)

plm.mod<-plm(sales~ price + pop + pimin + price*pimin,
             model="within", effect="individual",
             index=c("state", "year"), data=Cigar)

summary(plm.mod)

plm.mod.sim<-sim.plm(plm.mod, n.sims=500)
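
Read off the loop in sim.plm above, each simulation $s$ appears to perform the following two draws, where $\hat\sigma$ is sigma.hat, $\hat\beta$ is beta.hat, $V_\beta$ is the unscaled covariance matrix V.beta, and $n$ and $k$ are as defined in the function. The notation is introduced here for illustration and is not taken from the post.

$$
\sigma^{(s)} = \hat\sigma \sqrt{\frac{n-k}{X^{(s)}}}, \qquad X^{(s)} \sim \chi^2_{n-k},
$$

$$
\beta^{(s)} \mid \sigma^{(s)} \sim \mathrm{N}\!\left(\hat\beta,\; V_\beta \,\bigl(\sigma^{(s)}\bigr)^2\right).
$$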

Best,
Tobias

Horseshoe prior?

Yu Sung,

Any chance we could get a horseshoe prior added to bayesglm? Vincent Dorie graciously did something similar for blme, and it would be great to have arm offer this option too. Thanks much.

Here is Vincent's addition: vdorie/blme#3

arm::rescale(a_binary_factor, binary.inputs = "-0.5,0.5") returns 0.5/1.5 instead of -0.5/0.5

I expected that rescale(a_binary_factor, binary.inputs = "-0.5,0.5") would return -0.5 and 0.5, but it actually returns 0.5 and 1.5. According to the R code of rescale(), it converts a binary factor to numbers using as.numeric(), which yields 1 and 2, and then subtracts 0.5, giving 0.5 and 1.5 under a recent R version.

It seems that the author assumed R converts a binary factor to 0/1 (I'm not sure whether earlier R versions behaved that way), but R actually returns 1/2. Please consider fixing this bug. Many thanks.

To reproduce this bug: arm::rescale(gl(2,1), binary.inputs = "-0.5,0.5") # returns c(0.5, 1.5) (R version 4.2.0; Package arm version 1.12-2).
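
The conversion described in this report can be checked in base R without arm. The `centered` line is only one possible way to obtain the expected coding and is an assumption, not the package's actual fix.

```r
f <- gl(2, 1)            # two-level factor with levels "1" and "2"
codes <- as.numeric(f)   # internal integer codes: 1 and 2, not 0 and 1

shifted_as_reported <- codes - 0.5   # reproduces the 0.5 / 1.5 from the report
centered <- codes - 1.5              # one possible way to obtain -0.5 / 0.5
```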

family=quasipoisson issue

When I use family=quasipoisson in bayesglm(...), and then summary(...), it gives an estimate of $\sigma^2$ (the dispersion parameter) that is larger than the square of the estimate of $\sigma$ obtained using sim(...)@sigma, by a factor of $n/(n-p)$.

It appears that the former uses the divisor $n-p$, and the latter uses $n$, where $n$ = # observations and $p$ = # parameters.

This is also seen by comparing the SEs of the regression coefficients given by summary(...), which are larger than the SDs of the posteriors produced by sim(...), by a factor of $\sqrt{n/(n-p)}$.

I assume that a similar issue arises with family=quasibinomial, but I haven't checked that.

Is there a reason for this discrepancy? It certainly leads to worse frequentist coverage for credible intervals based on the posteriors.
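
The effect of the two divisors can be illustrated with base R. This sketch uses glm() on simulated data as a stand-in and does not call bayesglm() or sim(), so it shows only the arithmetic of the two divisors, not what arm does internally.

```r
set.seed(123)
n <- 60
x <- rnorm(n)
y <- rpois(n, lambda = exp(0.5 + 0.3 * x))   # simulated counts, illustration only

fit <- glm(y ~ x, family = quasipoisson)
p   <- length(coef(fit))

pearson.ss     <- sum(residuals(fit, type = "pearson")^2)
disp.n.minus.p <- pearson.ss / (n - p)   # divisor n - p
disp.n         <- pearson.ss / n         # divisor n

summary(fit)$dispersion    # agrees with disp.n.minus.p up to the fitting tolerance
disp.n.minus.p / disp.n    # equals n / (n - p)
```

Standard errors scale with the square root of the dispersion, so the corresponding ratio for them is $\sqrt{n/(n-p)}$.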

How to get R squared in bayesglm?

Question 1. After obtaining estimates of the dependent variable (y) from the Bayesian multiple regression fit (postf), I calculate R squared using the cumulative sum function (cumsum). Is there any problem with computing SSR (Sum of Squares Regression) and SST (Sum of Squares Total) with the following code? I have confirmed that SSR and SST are produced.

postf <- bayesglm.fit(xx,y, family = gaussian(), prior.mean = c(8663265, 325.9304, 1435.758, -4388.022, -1973.862, 2686.825, 1830.709, -442.933, 762.1138), prior.df = Inf)
summary(postf)
postf$coeff
postf$fit
postf$y
postf$res

SSE <- cumsum((postf$res)^2) #SSE (Sum of Squares Error)
SST <- cumsum((postf$y-mean(postf$y))^2) #SST (Sum of Squares Total)
SSR <- cumsum((postf$fit-mean(postf$y))^2) #SSR (Sum of Squares Regression)

print(SST)
print(SSR)
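
For comparison, here is a minimal sketch of the same quantities computed as single totals with sum(). It uses lm() on simulated data as a stand-in for the bayesglm.fit result, purely to keep the example self-contained, and takes 1 - SSE/SST as one common definition of R squared. cumsum() returns a running total of the same length as its input, and only its last element equals the total.

```r
set.seed(7)
x <- rnorm(40)
y <- 1 + 2 * x + rnorm(40)
fit <- lm(y ~ x)             # stand-in model, not a bayesglm fit

res        <- residuals(fit)
fit.values <- fitted(fit)

SSE <- sum(res^2)                        # Sum of Squares Error
SST <- sum((y - mean(y))^2)              # Sum of Squares Total
SSR <- sum((fit.values - mean(y))^2)     # Sum of Squares Regression

r.squared <- 1 - SSE / SST
tail(cumsum(res^2), 1)   # last element of the running total equals SSE
```

With an informative prior the fitted values are not least-squares fits, so SSR + SSE need not equal SST, and the two usual formulas for R squared can then disagree.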

Question 2. The model I intend to fit has nine variables, and I supplied the prior means for the nine variables as a vector (c()). However, when I pass them as prior.mean, the error "invalid length for prior.mean" appears. Is it impossible to proceed with Bayesian estimation in this case? Or is there another way?

> postf <- bayesglm.fit(xx,y, family = gaussian(), prior.mean = c(8663265, 325.9304, 1435.758, -4388.022, -1973.862, 2686.825, 1830.709, -442.933, 762.1138))
Error in bayesglm.fit(xx, y, family = gaussian(), prior.mean = c(8663265,  : 
  invalid length for prior.mean

Variance prior for bayesglm

Is it possible to also specify an inverse-gamma prior in bayesglm() for the variance of a Gaussian-family model?

package "arm" cannot be downloaded, therefore sjPlot cannot be used

install.packages('arm')
Installing package into ‘\CNAS.RU.NL/U759233/Documents/R/win-library/3.4’
(as ‘lib’ is unspecified)
trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.4/arm_1.9-3.zip'
Warning in install.packages :
cannot open URL 'https://cran.rstudio.com/bin/windows/contrib/3.4/arm_1.9-3.zip': HTTP status was '404 Not Found'
Error in download.file(url, destfile, method, mode = "wb", ...) :
cannot open URL 'https://cran.rstudio.com/bin/windows/contrib/3.4/arm_1.9-3.zip'
Warning in install.packages :
download of package ‘arm’ failed
library(sjPlot) # table functions
Error: package or namespace load failed for ‘sjPlot’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
there is no package called ‘arm’

Missing importFrom(): Error in setClass("balance", ... : could not find function "setClass"

When running reverse package checks on {arm}, I'm getting:

    Error in setClass("balance", representation(rawdata = "data.frame", matched = "data.frame",  :
      could not find function "setClass"

Please add "setClass" to the list of imports from {methods} in the NAMESPACE file:

arm/NAMESPACE

Lines 23 to 29 in 7bbc274

 importFrom(methods,
 "as",
 "getMethod",
 "new",
 "setOldClass",
 "show",
 "signature")
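
The requested change would look roughly like the directive below. The existing entries are copied from the lines quoted above; the position of "setClass" in the list is an assumption, and whether other functions from {methods} also need importing is not checked here.

```r
importFrom(methods,
  "as",
  "getMethod",
  "new",
  "setClass",
  "setOldClass",
  "show",
  "signature")
```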

rescale returning different results each time.

When I use arm::rescale on each of the variables in my plm model, I get almost exactly the same result (same F, R2 etc.) as with my raw data.

However, when I run the same code again after restarting Rstudio, I get a totally different result because arm::rescale produces a very different result next time. After some tries, I sometimes get the right results. Sometimes the wrong one. I have no idea how I get the right one (since I don't change anything in between the wrong and the right results).

What am I missing here?

FYI, my data is panel data.

small typo in standardize example

When creating the M1.2 objects in the example code would it make more sense to use the standardize function on the original model M1 rather than the model that has already used rescale (M1.1)? i.e.
M1.2 <- standardize(M1.1)
should be
M1.2 <- standardize(M1)
in both examples

arm not available for version?

I am attempting to load the arm package. In the end I get this specific message: package 'arm' is not available (for R version 3.3.2). I could explore a newer version of R but I would be risking backward compatibility issues. Any suggestions? thx

Issue with ranef function

License: (GPL >2) - What is the actual(s) GPL license version here?

Hi Team, first of all, thanks for the great work. I need a clarification: could you state which GPL version you are currently licensing your software under? In the DESCRIPTION I can see "(> 2)". If I read this correctly, it maps to "GPL 3.0 or later". See the current SPDX license list at https://spdx.org/licenses/

Thanks!
