jslefche / piecewisesem Goto Github PK

Piecewise Structural Equation Modeling in R

piecewisesem's Introduction

piecewiseSEM: Piecewise Structural Equation Modeling in R

Version 2.3.01

Last updated: 01 June 2023

To install

Run the following code to install the latest version from CRAN:

install.packages("piecewiseSEM")

Run the following code to install the development version:

devtools::install_github("jslefche/piecewiseSEM@devel")

Note: the development version may be unstable and lead to unanticipated bugs. Contact the package developer with any bugs or issues.

Getting Help

See our website at piecewiseSEM

An online resource covering SEM, including piecewiseSEM and lavaan, is available at https://jslefche.github.io/sem_book/

Version 2 is a major update to the piecewiseSEM package that uses a completely revised syntax that better reproduces the base R syntax and output. It is highly recommended that you consult the resource above even if you have used the package before, as it documents the many changes.

Currently supported model classes: lm, glm, gls, Sarlm, lme, glmmPQL, lmerMod, merModLmerTest, glmerMod, glmmTMB, gam

Example

# Load library
library(piecewiseSEM)

# Create fake data
set.seed(1)

data <- data.frame(
  x = runif(100),
  y1 = runif(100),
  y2 = rpois(100, 1),
  y3 = runif(100)
)

# Create SEM using `psem`
modelList <- psem(
  lm(y1 ~ x, data),
  glm(y2 ~ x, "poisson", data),
  lm(y3 ~ y1 + y2, data),
  data
)

# Run summary
summary(modelList)

# Address conflict using conserve = TRUE
summary(modelList, conserve = TRUE)

# Address conflict using direction = c()
summary(modelList, direction = c("y2 <- y1"))

# Address conflict using correlated errors
modelList2 <- update(modelList, y2 %~~% y1)

summary(modelList2)

piecewisesem's Issues

Add support for glm.nb and glmer.nb

Odd missing paths output

Using the Day 1 example from http://byrneslab.net/teaching/sem/ produces some odd output from piecewiseSEM's missing paths function. Load the Keeley data at the above URL and the piecewiseSEM library, and then run the following:

modList <- list(
  lm(abiotic ~ distance, data=keeley),
  lm(hetero ~ distance, data=keeley),
  lm(rich ~ abiotic + hetero, data=keeley))

#what paths are missing?
sem.missing.paths(modList, data=keeley)

This produces output where distance is sometimes included twice:

                            missing.path  estimate std.error DF crit.value      p.value
1     rich ~ distance + hetero + abiotic 0.6404318 0.1564575 86   4.093329 9.564005e-05
2 abiotic ~ hetero + distance + distance 8.9334404 6.7189669 87   1.329585 1.871306e-01

What's up with distance being in twice in row 2?

Fix model.control for get.missing.paths()

Currently very messy, does not include glm.control()

Order of terms in model screws up basis set with interactions

If a formula has both main effects and interactions (with different main effects), the placement of the interaction in the formula can cause certain paths that should be retained to be dropped from the basis set, yielding incorrect estimates of Fisher's C in the d-sep test.

data = data.frame(
  y = runif(100, 0, 1),
  x1 = runif(100, 0, 1),
  x2 = runif(100, 0, 1),
  x3 = runif(100, 0, 1),
  x4 = runif(100, 0, 1),
  x5 = runif(100, 0, 1),
  random = letters[1:2]
)

library(nlme)  # provides lme()
modelList = list(
  lme(y ~ x1 + x2, random = ~ 1 | random, data),
  lme(x2 ~ x5 + x3 + x4 + x4 : x3, random = ~ 1 | random, data),
  lme(x4 ~ x3, random = ~ 1 | random, data)
)

modelList2 = list(
  lme(y ~ x1 + x2, random = ~ 1 | random, data),
  lme(x2 ~ x3 + x4 + x4 : x3 + x5, random = ~ 1 | random, data),
  lme(x4 ~ x3, random = ~ 1 | random, data)
)

library(piecewiseSEM)

sem.missing.paths(modelList, data)
sem.missing.paths(modelList2, data)
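
As a rough sketch of why term position should not matter, the two right-hand sides above can be compared in an order-independent way using only base R. `normalize_terms` is a hypothetical helper written for this illustration; it is not a piecewiseSEM function and is not how the package builds its basis set.

```r
# Hypothetical helper (not part of piecewiseSEM): sort the components of each
# term so that "x4:x3" and "x3:x4" count as the same interaction, then sort
# the terms themselves so that position in the formula is ignored
normalize_terms <- function(formula) {
  labels <- attr(terms(formula), "term.labels")
  parts <- strsplit(labels, ":", fixed = TRUE)
  sort(vapply(parts, function(p) paste(sort(p), collapse = ":"), character(1)))
}

# The two formulas from the example above, differing only in term order
f1 <- x2 ~ x5 + x3 + x4 + x4:x3
f2 <- x2 ~ x3 + x4 + x4:x3 + x5

terms1 <- normalize_terms(f1)
terms2 <- normalize_terms(f2)

# Both formulas describe the same set of terms
print(identical(terms1, terms2))
```

If the two model lists give different basis sets even though this comparison reports the same terms, the difference comes from how the terms are processed rather than from the models themselves.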

Interactions in add.vars

sem.fit returns an error when interactions are included in the add.vars argument.

library(piecewiseSEM)


dat = data.frame(
  x1 = runif(100),
  x2 = runif(100),
  x3 = runif(100),
  x4 = runif(100)
)

# Model with interactions
model.x.list = list(

  lm(x1 ~ x2, dat),
  lm(x3 ~ x1 * x2, dat),
  lm(x4 ~ x3, dat)

)

sem.fit(model.x.list, dat)

# Reduced model

model.red.list = list(

  lm(x1 ~ x2, dat),
  lm(x3 ~ x1, dat),
  lm(x4 ~ x3, dat)

)

sem.fit(model.red.list, dat)

sem.fit(model.red.list, dat, add.vars = c("x1:x2", "x2"))
sem.fit(model.red.list, dat, add.vars = c("x1 * x2")) # this throws an error

corr.errors not working in get.sem.coefs

Returns NA for r and p

Package support wishlist

Current requests:

-metafor
-MRM() in ecodist

Does not handle interactions with :

get.basis.set does not currently handle interactions with a : vs *.

Example:

data = data.frame(
  a = runif(1000,0,5),
  b = runif(1000,10,20),
  c = runif(1000,0,1),
  d = runif(1000,100,500)
)
modelList = list(
  lm(c~a+b+a:b,data),
  lm(d~c,data))

filter.exogenous(modelList)

#Missing: "a:b" "d"   "c"
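
For reference, base R treats the `*` and explicit `:` spellings as the same model, and an interaction label can be split back into its component variables. The sketch below uses only `terms()` and `strsplit()`; it illustrates the expected equivalence and is not the package's internal logic.

```r
# The same model written with * and with an explicit : term
f_star  <- c ~ a * b
f_colon <- c ~ a + b + a:b

labels_star  <- attr(terms(f_star), "term.labels")
labels_colon <- attr(terms(f_colon), "term.labels")

# Both spellings expand to the same term labels
print(identical(labels_star, labels_colon))

# Split interaction labels into their component variables, so that "a:b"
# is not mistaken for a variable in its own right
components <- unique(unlist(strsplit(labels_colon, ":", fixed = TRUE)))
print(components)
```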

offsets in glm.nb() models treated as predictors in sem.fit()

When offsets are included within glm.nb() models in the model list, sem.fit treats the offsets as predictors and includes them in the list of missing paths in the output, which also affects the Fisher's C and AIC values. In glm.nb, offsets must be included within the model formula. Example below:

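library(MASS)  # provides glm.nb()

# Data is the reporter's own data set and is not shown here
model_list = list(
  response1 = glm(response1 ~ predictor1, data = Data),
  response2 = glm.nb(response2 ~ response1 + predictor2 + offset(log(variable)), data = Data)
)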

sem.fit(model_list, Data)
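
One way an offset could be told apart from a true predictor is through the `terms` object, which records offsets separately from the term labels. This is a base R sketch under that assumption and does not describe how sem.fit currently parses formulas.

```r
# Formula from the example above
f <- response2 ~ response1 + predictor2 + offset(log(variable))
tt <- terms(f)

# Term labels exclude the offset, so these are the actual predictors
predictors <- attr(tt, "term.labels")
print(predictors)

# The "offset" attribute gives the position of the offset among the variables
offset_idx <- attr(tt, "offset")
variable_names <- as.character(attr(tt, "variables"))[-1]  # drop the leading "list"
offset_terms <- variable_names[offset_idx]
print(offset_terms)
```

Note that `all.vars(f)` would still return `variable`, so any code that collects variable names that way would pick up the offset.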

Dummy variables yield lower standardized estimates

Scaling dummy variables for factors (e.g., level 1 = 0, level 2 = 1, etc.) in get.sem.coefs(standardized=T) results in a standardized estimate that is ~0.5x what it should be.
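
One possible reading of the ~0.5x factor, which the issue does not confirm: a balanced 0/1 dummy has a standard deviation close to 0.5, so scaling it multiplies the slope by roughly that amount. The simulated data below are made up for illustration, and only the predictor is scaled.

```r
set.seed(1)

# Balanced two-level dummy variable and a response that depends on it
group <- rep(c(0, 1), each = 50)
y <- 2 * group + rnorm(100)

# Slope on the raw 0/1 coding
b_raw <- coef(lm(y ~ group))[["group"]]

# Slope after centering and scaling the dummy
group_scaled <- as.numeric(scale(group))
b_scaled <- coef(lm(y ~ group_scaled))[["group_scaled"]]

# The standard deviation of a balanced dummy is close to 0.5 ...
print(sd(group))

# ... and the scaled slope is the raw slope times that standard deviation
print(all.equal(b_scaled, b_raw * sd(group)))
```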

sem.coefs produces -Inf when standardize = "scale" and variables are transformed in the formula

Example:

data = data.frame(
  y = runif(100, 0, 1),
  x1 = runif(100, 0, 1),
  x2 = runif(100, 0, 1),
  x3 = runif(100, 0, 1)
)

modelList = list(
  lm(y ~ x1 + x2, data),
  lm(x2 ~ x3, data)
)

sem.coefs(modelList, data, standardize = "scale")

modelList2 = list(
  lm(y ~ log10(x1) + x2, data),
  lm(x2 ~ x3, data)
)

sem.coefs(modelList2, data, standardize = "scale")

Dealing with transformed vars when standardize = "scale" in sem.coefs

Scaling the variables in the data.frame will cause them to be transformed again in the update argument. Need to strip scaling from model formulae before updating for standardized coefs
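
A minimal base R sketch of the problem and of the workaround suggested by the package's own warning ("Store transformations as new variables, and re-specify the models"). The data and the column name `log_x1` are invented for this illustration.

```r
set.seed(1)
dat <- data.frame(y = runif(100), x1 = runif(100), x2 = runif(100))

# Scaling first centers x1 on zero, so applying log10() afterwards is undefined
# for the negative values
scaled <- data.frame(scale(dat))
log_after_scaling <- suppressWarnings(log10(scaled$x1))
any_nan <- any(is.nan(log_after_scaling))
print(any_nan)

# Workaround: store the transformation as a new variable, then scale
dat$log_x1 <- log10(dat$x1)
scaled2 <- data.frame(scale(dat))
mod <- lm(y ~ log_x1 + x2, data = scaled2)

# All coefficients are now finite
print(all(is.finite(coef(mod))))
```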

sem.coefs does not standardize coefficients when offsets are included in the models

I am trying to extract the standardized coefficients from the model set, but when I set standardize="scale" I receive the warning "Transformations detected in the formula! This may produce invalid scaled values Store transformations as new variables, and re-specify the models." Do you think that this warning is due to the offsets within the model?

Error in basis set

Looks like topSort from ggm may be generating a weird adjacency matrix, but the basis set is definitely wrong. Need to look into this!

library(piecewiseSEM)


data = data.frame(
  rep = runif(100),
  acc = runif(100),
  cap = runif(100),
  comm = runif(100),
  infl = runif(100),
  know = runif(100),
  out = runif(100)
)


modelList = list(
  lm(rep ~ acc, data),
  lm(out ~ know + cap + comm + infl + rep + acc, data),
  lm(cap ~ know + comm, data)
)

sem.fit(modelList, data)

sem.coefs and "Poisson" distribution

For some reason, sem.coefs throws an error when the distribution is given as "Poisson" with a capital P...

Transformations in basis set

Should consider removing or somehow handling in a way that does not duplicate if some vars are transformed (predictors) and others are not (responses)

get.sem.fit returns wrong statistics when variables are transformed/untransformed

When variables are alternately transformed (as responses) and not (as predictors), get.missing.paths treats them as two separate variables
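
A small base R sketch of the mismatch: the response as written and the underlying variable name differ once a transformation is applied, so matching on the written form treats them as two variables. The formulas are illustrative, and `all.vars()` is only one way the raw names could be recovered.

```r
# x3 appears transformed as a response and untransformed as a predictor
f_response  <- log10(x3) ~ y
f_predictor <- x5 ~ x3 + x4

# The response as written in the formula
response_as_written <- deparse(f_response[[2]])
print(response_as_written)

# The underlying variable name with the transformation stripped
response_raw <- all.vars(f_response)[1]
print(response_raw)

# Only the raw name matches the predictor in the other model
print(response_as_written %in% all.vars(f_predictor))
print(response_raw %in% all.vars(f_predictor))
```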

+ 1 Transformed corr.errors results in failure to detect in get.basis.set()

Dependencies error for igraph (?)

Interactions in sem.lavaan

Interactions need to be hand calculated since lavaan does not play nice with interactions

sem.fisher.c reports df = 2k instead of k

Need to change column header to reflect "df" instead of "k"
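
For reference, Fisher's C is computed as C = -2 * sum(log(p)) over the k independence claims and is compared to a chi-squared distribution with 2k degrees of freedom. The function below is a standalone sketch with a hypothetical name and column labels; it is not the package's sem.fisher.c.

```r
# Hypothetical standalone helper, not the package's sem.fisher.c
fisher_c <- function(p_values) {
  k <- length(p_values)               # number of independence claims
  c_stat <- -2 * sum(log(p_values))   # Fisher's C statistic
  df <- 2 * k                         # degrees of freedom are 2k, not k
  data.frame(
    fisher.c = c_stat,
    k = k,
    df = df,
    p.value = pchisq(c_stat, df, lower.tail = FALSE)
  )
}

# Two independence claims, each with p = 0.5
res <- fisher_c(c(0.5, 0.5))
print(res)
```

Reporting both `k` and `df` as separate columns avoids the ambiguity described above.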

get.model.control fails in lme4 1.1-11

Need to see if this got fixed in version 1.0.2

Interaction terms switched in model output

If interactions are present in the basis set, get.missing.paths() may return an error because the order of the interaction components is switched in the model output

Does pgls behave as gls?

In terms of how coefs are stored, etc. Need to check

get.sem.fit borks when multiple different interactions

When more than one pair of variables interacts, get.basis.set and filter.exogenous do not correctly remove interactions appearing as responses

What to do with standardized interactions?

Standard errors and resulting P-values are incorrect when interactions have already been standardized.

See: http://quantpsy.org/interact/interactions.htm

data = data.frame(
  y = runif(100),
  x1 = runif(100),
  x2 = runif(100)
)

mod.noint = lm(y ~ x1 + x2, data)

sem.coefs(mod.noint, data)
sem.coefs(mod.noint, data, standardize = "scale")

mod.int = lm(y ~ x1 * x2, data)

sem.coefs(mod.int, data)
sem.coefs(mod.int, data, standardize = "scale")

data.scaled = data.frame(apply(data, 2, scale))

mod.int.scaled = lm(y ~ x1 * x2, data = data.scaled)
sem.coefs(mod.int.scaled, data.scaled)

Standardized coefficients results in error

Occurs when there are categorical predictors in the dataset

Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric
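
The same error can be reproduced in base R by calling `scale()` on a data frame that contains a factor. Scaling only the numeric columns avoids it. The data below are invented, and this is a sketch of one workaround rather than the fix adopted in the package.

```r
dat <- data.frame(
  y = c(1.2, 3.4, 2.2, 5.1),
  x = c(10, 20, 30, 40),
  g = factor(c("a", "b", "a", "b"))
)

# scale() converts the data frame to a character matrix because of the factor,
# and colMeans() then fails; capture the error message instead of stopping
err <- tryCatch(scale(dat), error = function(e) conditionMessage(e))
print(err)

# Scale only the numeric columns and leave the factor untouched
is_num <- vapply(dat, is.numeric, logical(1))
dat_scaled <- dat
dat_scaled[is_num] <- lapply(dat[is_num], function(col) as.numeric(scale(col)))
print(dat_scaled)
```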

Check df for sem.fisher.c

shipley.test() from ggm gives double the df:

library(piecewiseSEM)
library(ggm)

data(marks)
dag <- DAG(mechanics ~ vectors+algebra, vectors ~ algebra, 
           statistics ~ algebra+analysis, analysis ~ algebra)
shipley.test(dag, cov(marks), n=88)

modelList = list(
  lm(mechanics ~ vectors+algebra, data = marks),
  lm( vectors ~ algebra, data = marks),
  lm(statistics ~ algebra+analysis, data = marks),
  lm(analysis ~ algebra, data = marks)
)

sem.fit(modelList, marks)

lmerTest returns error / 'lmerMod'

If model does not converge correctly, lmerTest returns:

"summary from lme4 is returned
some computational error has occurred in lmerTest"

without p-values in the summary output. get.missing.paths() therefore returns an error when p.adjust = F because it cannot access the correct column

simulated.data=read.table("http://www.esapubs.org/archive/ecol/E094/047/simulated.data.txt")

modelList = list(
  model1.1 = lmer(x2 ~ x1 + (x1 | species), data = simulated.data),
  model1.2 = lmer(x3 ~ x2 + (x2 | species), data = simulated.data),
  model1.3 = lmer(x4 ~ x2 + (x2 | species), data = simulated.data),
  model1.4 = lmer(x5 ~ x3 + x4 + (x3 + x4 | species), data = simulated.data)
)

get.sem.fit(modelList)

`get.random.formula` does not like uncorrelated intercepts

Example:

model = lme(DD ~ lat, random = ~ -1 + lat | site/tree, na.action = na.omit, data = shipley2009)
rhs = "lat"
(random.formula = get.random.formula(basis.mod, rhs, modelList = list(model))) # should not have lat as varying slope

get.sem.coefs

Output should be organized by response variable, then by p-value

Cannot construct basis.model when filter.exog = F

get.missing.paths does not know which model in the model list to select as the basis model when leaving exogenous variables in the basis set, since the exogenous variables do not appear as responses.

library(lmerTest)
library(piecewiseSEM)

set.seed(3423)

data = data.frame(
  y = sample(0:1, 100, replace = T),
  Y = rpois(100, 10),
  X1 = runif(100, 0, 100),
  X2 = runif(100, 0, 100),
  X3 = runif(100, 0, 100),
  random = letters[1:10]
)

modelList = list(
  glmer(Y ~ 1 + (1 | random), na.action = na.omit, family = poisson, data = data)
)

sem.fit(modelList, data, add.vars = c("X1", "X2"), filter.exog = F)
Error in if (as.character(ans[[1L]])[1L] == "~") { :
missing value where TRUE/FALSE needed

How to declare correlations, and problems during installation

Hi, thanks for this nice contribution.

I would like to know if this package allows the inclusion of a correlation between two variables (A, B) or if all paths must be causal. I notice that the examples explored in the manuscript contain only causal relationships.

I am working on a phylogenetic path analysis where I want to declare simple correlations between some pairs of variables (usually a two-side arrow in path diagrams) in the model. Could you please give me some advice on that?

Further I am getting the following error while installing your package on R v3.2:

Warning: replacing previous import by ‘ggplot2::unit’ when loading ‘Hmisc’
Warning: replacing previous import by ‘ggplot2::arrow’ when loading ‘Hmisc’
Warning: replacing previous import by ‘scales::alpha’ when loading ‘Hmisc’
Error in eval(expr, envir, enclos) :
could not find function "eval"
Error : unable to load R code in package ‘Hmisc’
ERROR: lazy loading failed for package ‘Hmisc’

removing ‘/home/paternogbc/R/x86_64-pc-linux-gnu-library/3.2/Hmisc’
Warning in install.packages :
installation of package ‘Hmisc’ had non-zero exit status
ERROR: dependency ‘Hmisc’ is not available for package ‘lmerTest’

removing ‘/home/paternogbc/R/x86_64-pc-linux-gnu-library/3.2/lmerTest’
Warning in install.packages :
installation of package ‘lmerTest’ had non-zero exit status
ERROR: dependency ‘lmerTest’ is not available for package ‘piecewiseSEM’

removing ‘/home/paternogbc/R/x86_64-pc-linux-gnu-library/3.2/piecewiseSEM’
Warning in install.packages :
installation of package ‘piecewiseSEM’ had non-zero exit status

Thanks in advance!

Add model controls

Sometimes models fail to converge during d-sep tests, need to integrate optional model controls to tweak convergence criterion for get.sem.fit and get.partial.resids

Rounding in get.missing.paths

Can this be made an option? 3 can often be too rounded.

get.sem.coefs return table instead of list

get.partial.resid chokes with NAs

Add na.action to arguments

Including corr.errors with sem.coefs gives an error

> sem.coefs(sem_mod_nlme, data=exp_data_func, corr.errors="belowbiomass_g~~abovebiomass_g")
Error in match.names(clabs, names(xi)) : 
  names do not match previous names

Although corr.errors is in the function help file as a possible argument.

Include more details of tests in get.missing.paths

Can more details about the test used to produce a p-value be included in get.missing.paths? What kind of test is it? What's the coefficient estimate and SE (if applicable)?

Basis set for interactions includes missing path from interaction to one of its factors

When running a model with an interaction, a path from the interaction variable to one of its factors is included. This seems like bad behavior.

 set.seed(2002)
 d <- data.frame(x =rnorm(100))

 d <- within(d, {
   x1 <- rnorm(100, 3*x)
   x2 <- rnorm(100, 9)
   y <- rnorm(100, x1*x2, 20)
   y1 <- rnorm(100, y*4, 40)
  })

 #make some models
 modlist <- list(
   lm(x1 ~ x, data=d),
   lm(y ~ x1*x2, data=d),
   lm(y1 ~ y, data=d)
 )

 #fit and see  x1 <- x1:x2!
 get.sem.fit(modlist, data=d)

 get.basis.set(modlist)

> sem.coefs(sem_mod_nlme, data=exp_data_func, standardize="scale")
Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric
In addition: Warning message:
In if (class(data) == "comparative.data") newdata = data$data else newdata = data :
  the condition has length > 1 and only the first element will be used

Hrm........

Feature request: range standardization

Have switch to allow range standardization for get.sem.coefs() instead of centering and scaling predictors.
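
A minimal sketch of one common definition of range standardization, in which the raw slope is multiplied by the range of the predictor and divided by the range of the response. Whether the package would implement it exactly this way is an assumption. The data and the helper `rescale01` are invented for illustration.

```r
set.seed(1)
dat <- data.frame(x = runif(100, 0, 10))
dat$y <- 3 + 0.5 * dat$x + rnorm(100)

# Raw slope
b_raw <- coef(lm(y ~ x, data = dat))[["x"]]

# Range-standardized slope: raw slope * range(x) / range(y)
b_range <- b_raw * diff(range(dat$x)) / diff(range(dat$y))

# Check: refitting on variables rescaled to [0, 1] gives the same slope
rescale01 <- function(v) (v - min(v)) / diff(range(v))
b_refit <- coef(lm(rescale01(y) ~ rescale01(x), data = dat))[[2]]

print(c(range_standardized = b_range, refit = b_refit))
```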

End up with backwards predictions in basis set with multiple sets of interactions

 data = data.frame(
   y = runif(100, 0, 10),
   x1 = runif(100, 0, 10),
   x2 = runif(100, 0, 10),
   x3 = runif(100, 0, 10),
   x4 = runif(100, 0, 10),
   x5 = runif(100, 0, 10) )

 modelList = list(
    lm(y ~ x1 * x2, data),
    lm(log10(x3) ~ y, data),
    lm(x5 ~ log10(x3) * x4, data) )

 get.sem.fit(modelList, data)
