(I'm putting this here more as "Hey, I want to remember to do this later," rather than

Tidy multiple models at once about broom HOT 8 CLOSED

tidymodels commented on July 19, 2024

Tidy multiple models at once

from broom.

Comments (8)

dgrtwo commented on July 19, 2024

I frequently have the same need or similar needs. A simple version of this function would be straightforward enough to write as:

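straighten = function(...) {
    # Models must be passed as named arguments, e.g. straighten(m1 = fit1, m2 = fit2),
    # because the argument names become the `model` column.
    straight <- lapply(list(...), tidy)
    straight <- lapply(names(straight), function(n) cbind(model=n, straight[[n]]))
    # rbind_all() was later deprecated in dplyr in favor of bind_rows()
    rbind_all(straight)
}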

(For now this gives a warning for turning strings into factors; I should consider turning all tidy output into character vectors so they can be easily recombined).

However, I'd be interested in seeing whether this can be done using dplyr's tools rather than adding a new utility to broom. I have a common pattern wherein I write a function that performs one model fit or simulation, and makes its decisions based on simple character/numeric arguments. Then I create a table of my parameter combinations, using either data.frame or expand.grid, and use group_by and do to perform the model:

fitmod <- function(..., model="linear") {
    if (model == "linear") lm(...)
    else if (model == "logit") glm(..., family="binomial")
}

library(dplyr)
results <- data.frame(model=c("linear", "logit")) %>% group_by(model) %>%
    do(tidy(fitmod(y ~ x, dat, model=.$model)))

On its own, this looks much more complicated than the use of straighten. However, consider that you could add any number of parameters, or factorial combinations of parameters, and it would perform and label all combinations (straighten would need to give each model a unique name). If there are ways to further simplify this kind of pattern I'd be interested in supporting them.
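As a rough illustration of the factorial pattern described above, here is a minimal sketch that fits one model per row of a parameter grid and labels every tidied result with its parameters. The helper `fit_one`, the use of `mtcars`, and the choice of `am` as the response are assumptions made for the example, not part of the original comment, and the loop uses base R rather than `group_by` and `do`.

```r
library(broom)

# Hypothetical helper: fits one model for a given parameter combination.
fit_one <- function(model, predictor) {
  f <- reformulate(predictor, response = "am")
  if (model == "linear") {
    lm(f, data = mtcars)
  } else {
    glm(f, data = mtcars, family = binomial)
  }
}

# All combinations of model type and predictor.
grid <- expand.grid(model = c("linear", "logit"),
                    predictor = c("wt", "hp"),
                    stringsAsFactors = FALSE)

# Fit each combination and prepend its parameters to the tidied output.
results <- do.call(rbind, lapply(seq_len(nrow(grid)), function(i) {
  fit <- fit_one(grid$model[i], grid$predictor[i])
  data.frame(model = grid$model[i],
             predictor = grid$predictor[i],
             tidy(fit),
             stringsAsFactors = FALSE)
}))

results
```

Adding another parameter only requires another column in `grid` and another argument to the helper; the labelling of each combination comes for free.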

markdanese commented on July 19, 2024

One other area where this comes into play is with multinomial regression (package nnet). In this case there are essentially n-1 models, where n is the number of levels in the outcome variable. For example, one could model preferences for ice cream as a function of covariates, with responses of vanilla, chocolate, and strawberry. In this case, the model essentially outputs coefficients for the covariates for two models: strawberry vs. chocolate and vanilla vs. chocolate (it uses the first level of the factor that defines the outcome as the reference group in the model). So, in this case, it is essentially the same as running the separate logistic regression models. However, it is all in one model, so you could save the output to a list of data.frames.

A list of data frames, one per model, would generalize to running multiple models. In the simple case of two models, one could then merge the two data frames on the coefficient column and create an expanded data frame with all of the coefficients from both models, and all of the output for all of the models. (Not my idea, but someone in my office suggested this.) One would have to use all = TRUE in merge and it would return NA for coefficients that are in one model but not another. Also, one would have to have a way of differentiating the columns that apply to each model (perhaps using .1, .2, etc. as a suffix). There is no question it gets complicated.

The package texreg does a nice job with aligning table output for side-by-side models. Might be worth a look.

Anyway, these are just random thoughts in case they are helpful.

mbojan commented on July 19, 2024

This is an old issue, but I ran into this myself.

Should the long-term solution to this be an outer join of multiple tidied models on the term column, with adjusted column names (e.g. estimate1, estimate2 if the estimates come from models 1, 2, and so on)?
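The outer join discussed in this thread could look something like the following minimal sketch, which uses base R's `merge` with `all = TRUE` and per-model suffixes. The two `mtcars` models and the choice of columns are assumptions picked for illustration; this is not an existing broom function.

```r
library(broom)

fit1 <- lm(mpg ~ wt + disp, data = mtcars)
fit2 <- lm(mpg ~ wt + disp + factor(gear), data = mtcars)

# Keep only the columns to compare side by side.
keep <- c("term", "estimate", "std.error")
t1 <- as.data.frame(tidy(fit1))[, keep]
t2 <- as.data.frame(tidy(fit2))[, keep]

# Full outer join on `term`; terms missing from one model become NA.
wide <- merge(t1, t2, by = "term", all = TRUE, suffixes = c(".1", ".2"))

wide
```

Terms that appear only in the second model (the `factor(gear)` levels here) get `NA` in the `.1` columns. With more than two models the suffix bookkeeping grows quickly, which is the complication noted earlier in the thread.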

nutterb commented on July 19, 2024

Something along these lines?

library(broom)
library(survival)
library(dplyr)  # provides %>%, which straighten() uses below
fit1 <- lm(mpg ~ qsec + wt + am, data = mtcars)
fit2 <- glm(am ~ mpg + qsec + factor(gear), data = mtcars, family = binomial)
fit3 <- coxph(Surv(futime, fustat)~ age + resid.ds, data = ovarian)


straighten <- function(..., fn = tidy){
  fits <- list(...)
  if (is.null(names(fits))) names(fits) <- character(length(fits))
  
  # If a fit isn't named, use the object name
  dots <- match.call(expand.dots = FALSE)$...
  obj_nms <- vapply(dots, deparse, character(1))
  names(fits)[names(fits) == ""] <- obj_nms[names(fits) == ""]
  
  purrr::map2(.x = fits,
              .y = names(fits),
              .f = function(x, n){
                data.frame(model = n, 
                           fn(x),
                           stringsAsFactors = FALSE)
              }) %>%
    dplyr::bind_rows()
}

straighten(x = fit1, fit2, ovarian = fit3)



fit1 <- lm(mpg ~ wt + disp, data = mtcars)
fit2 <- lm(mpg ~ wt + disp + factor(gear), data = mtcars)

library(dplyr)
library(reshape2)
straighten(fit1, fit2) %>% 
  select(model, term, estimate) %>% 
  dcast(term ~ model, 
        value.var = "estimate")  


straighten(fit1, fit2, fn = glance)

alexpghayes commented on July 19, 2024

Okay, I've been thinking about this for a while and finally have some thoughts. One feature request that keeps popping up again and again is a tidying method that works on multiple models at once (#202, #206, several other places I don't recall off the top of my head).

I'm increasingly of the opinion that broom verbs should work on a single model, mostly because working with a single model plays really well with purrr::map in a very explicit way. For example, the code examples above become:

library(survival)

fit1 <- lm(mpg ~ qsec + wt + am, data = mtcars)
fit2 <- glm(am ~ mpg + qsec + factor(gear), data = mtcars, family = binomial)
fit3 <- coxph(Surv(futime, fustat)~ age + resid.ds, data = ovarian)

models <- list(fit1 = fit1, fit2 = fit2, fit3 = fit3)
purrr::map_df(models, broom::tidy, .id = "model")
#>    model          term     estimate    std.error    statistic      p.value
#> 1   fit1   (Intercept)    9.6177805 6.959593e+00  1.381945832 1.779152e-01
#> 2   fit1          qsec    1.2258860 2.886696e-01  4.246675671 2.161737e-04
#> 3   fit1            wt   -3.9165037 7.112016e-01 -5.506882344 6.952711e-06
#> 4   fit1            am    2.9358372 1.410905e+00  2.080819191 4.671551e-02
#> 5   fit2   (Intercept) 1050.1649470 7.025103e+05  0.001494875 9.988073e-01
#> 6   fit2           mpg   32.1663723 1.796845e+04  0.001790158 9.985717e-01
#> 7   fit2          qsec  -99.1600110 5.862305e+04 -0.001691485 9.986504e-01
#> 8   fit2 factor(gear)4  126.1569921 8.824948e+04  0.001429549 9.988594e-01
#> 9   fit2 factor(gear)5  -60.5404408 1.555674e+05 -0.000389159 9.996895e-01
#> 10  fit3           age    0.1445394 5.141628e-02  2.811161152 4.936306e-03
#> 11  fit3      resid.ds    0.6141357 7.335776e-01  0.837178889 4.024920e-01
#>       conf.low conf.high
#> 1           NA        NA
#> 2           NA        NA
#> 3           NA        NA
#> 4           NA        NA
#> 5           NA        NA
#> 6           NA        NA
#> 7           NA        NA
#> 8           NA        NA
#> 9           NA        NA
#> 10  0.04376539 0.2453135
#> 11 -0.82365000 2.0519214

The output will be even nicer once we return tibbles. Since this is the workflow that I'd like to promote, I'm hesitant to also have a straighten that does the same thing.

More broadly, I think there's a big need at the moment to document the purrr::map workflow for modelling as pertains to broom (#353).
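The same single-model approach should carry over to model-level summaries. The sketch below is one way to do it, assuming two small `mtcars` models chosen only for illustration: it maps `broom::glance` over a named list and stacks the results with `dplyr::bind_rows`, filling columns that one model type lacks with `NA`.

```r
library(broom)

# Illustrative models; any named list of fitted models should work the same way.
models <- list(
  linear = lm(mpg ~ wt, data = mtcars),
  logit  = glm(am ~ wt, data = mtcars, family = binomial)
)

# One row of summary statistics per model, labelled by the list names.
summaries <- dplyr::bind_rows(purrr::map(models, glance), .id = "model")

summaries
```

Because each `glance` call sees a single model, swapping in `tidy` or `augment` only changes the function passed to `purrr::map`.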

nutterb commented on July 19, 2024

I just cut straighten from my branch. If you're able to comment on #115, I'll be able to commit and update my pull request.

nutterb commented on July 19, 2024

In that case, I recommend closing this in favor of #353 (and then you'll be under 90 open issues!!! :) )

github-actions commented on July 19, 2024

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
