larmarange / ggstats

Extension to ggplot2 for plotting stats

Home Page: https://larmarange.github.io/ggstats/

License: GNU General Public License v3.0

R 100.00%

ggstats's Introduction

`ggstats`: extension to `ggplot2` for plotting stats

The ggstats package provides new statistics, new geometries and new positions for ggplot2 and a suite of functions to facilitate the creation of statistical plots.

Installation & Documentation

To install the stable version:

install.packages("ggstats")

Documentation of stable version: https://larmarange.github.io/ggstats/

To install the development version:

remotes::install_github("larmarange/ggstats")

Documentation of development version: https://larmarange.github.io/ggstats/dev/

Plot model coefficients

library(ggstats)

mod1 <- lm(Fertility ~ ., data = swiss)
ggcoef_model(mod1)

ggcoef_table(mod1)

Comparing several models

mod2 <- step(mod1, trace = 0)
mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss)
models <- list(
  "Full model" = mod1,
  "Simplified model" = mod2,
  "With interaction" = mod3
)

ggcoef_compare(models, type = "faceted")

Compute custom proportions

library(ggplot2)
ggplot(as.data.frame(Titanic)) +
  aes(x = Class, fill = Survived, weight = Freq, by = Class) +
  geom_bar(position = "fill") +
  geom_text(stat = "prop", position = position_fill(.5)) +
  facet_grid(~Sex)
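
As a rough cross-check of what the `by` aesthetic does in this plot, the same within-class proportions can be tabulated in base R. This is only a sketch of the arithmetic (weighted counts divided by the total of each Class within each Sex panel), not the internals of the "prop" stat; the object names `tab` and `props` are illustrative.

```r
# Weighted counts of Survived by Class and Sex (Freq holds the weights)
d <- as.data.frame(Titanic)
tab <- xtabs(Freq ~ Survived + Class + Sex, data = d)

# Divide each cell by the total of its Class x Sex combination,
# so that proportions sum to 1 within each bar of each facet
props <- prop.table(tab, margin = c(2, 3))

# Proportion of survivors by Class (rows) and Sex (columns)
round(props["Yes", , ], 3)
```

Changing `margin` changes the denominator, which is the role the `by` aesthetic plays in the plot.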

Compute weighted mean

data(tips, package = "reshape")
ggplot(tips) +
  aes(x = day, y = total_bill, fill = sex) +
  stat_weighted_mean(geom = "bar", position = "dodge") +
  ylab("Mean total bill per day and sex")
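
The example above is unweighted, so each bar is a plain mean. Assuming a `weight` aesthetic is mapped, the plotted quantity would be the usual weighted mean within each group. A minimal base R sketch of that computation on the Titanic data used elsewhere in this README (the names `survived01` and `wm` are illustrative, and this mirrors the arithmetic, not the package internals):

```r
d <- as.data.frame(Titanic)

# Code survival as 0/1 so that its weighted mean is a proportion
d$survived01 <- as.integer(d$Survived == "Yes")

# Weighted mean of the 0/1 indicator within each passenger class,
# using the cell counts (Freq) as weights
wm <- vapply(
  split(d, d$Class),
  function(g) weighted.mean(g$survived01, w = g$Freq),
  numeric(1)
)
round(wm, 3)
```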

Compute cross-tabulation statistics

ggplot(as.data.frame(Titanic)) +
  aes(
    x = Class, y = Survived, weight = Freq,
    size = after_stat(observed), fill = after_stat(std.resid)
  ) +
  stat_cross(shape = 22) +
  scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) +
  scale_size_area(max_size = 20)
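
The `observed` and `std.resid` values mapped above are standard chi-squared test quantities. A hedged base R sketch of the same numbers, assuming they correspond to the observed counts and standardized residuals returned by `chisq.test()` on the weighted cross-tabulation:

```r
d <- as.data.frame(Titanic)

# Weighted cross-tabulation of the two plotted variables
tab <- xtabs(Freq ~ Survived + Class, data = d)

# chisq.test() stores observed counts and standardized residuals
test <- chisq.test(tab)
test$observed
round(test$stdres, 2)
```

Cells with a standardized residual beyond roughly 2 or 3 in absolute value are the ones the `scale_fill_steps2()` breaks above single out.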

Plot survey objects taking into account weights

library(survey, quietly = TRUE)
#>
#> Attaching package: 'survey'
#> The following object is masked from 'package:graphics':
#>
#>     dotchart
dw <- svydesign(
  ids = ~1,
  weights = ~Freq,
  data = as.data.frame(Titanic)
)
ggsurvey(dw) +
  aes(x = Class, fill = Survived) +
  geom_bar(position = "fill") +
  ylab("Weighted proportion of survivors")

Plot Likert-type items

library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
likert_levels <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree"
)
set.seed(42)
df <-
  tibble(
    q1 = sample(likert_levels, 150, replace = TRUE),
    q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1),
    q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
    q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
    q5 = sample(c(likert_levels, NA), 150, replace = TRUE),
    q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0))
  ) %>%
  mutate(across(everything(), ~ factor(.x, levels = likert_levels)))

gglikert(df)
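
The quantities behind such a plot are per-item answer proportions. The base R sketch below tabulates them for two simulated items; it is an illustration of the arithmetic only, under the assumption that each item is a factor sharing the same levels, and the name `answers` is illustrative.

```r
likert_levels <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree"
)
set.seed(42)
answers <- data.frame(
  q1 = sample(likert_levels, 150, replace = TRUE),
  q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1)
)

# Convert every item to a factor with the full set of ordered levels
answers[] <- lapply(answers, factor, levels = likert_levels)

# One column per item, one row per answer category
props <- sapply(answers, function(x) prop.table(table(x)))
round(props, 2)
```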

ggstats's Issues

Feature request or workaround: Extending `gglikert` with `ggiraph`?

Dear @larmarange, your gglikert and gglikert_stacked are exactly the ready-made solutions for what I have largely been implementing manually. However, I also need some ggiraph interactivity. Do you know of a way to use ggstats as the foundation and then add ggiraph on top? This is for a new package in development, so a dirty hack hidden inside a wrapper function would be fine, though I guess it would require replacing many of the ggplot2 geoms, scales, etc. with the corresponding ones from ggiraph. The specific behaviour I need is that hovering over a category (a cell) highlights the count. Further information, such as showing the variable name when clicking on the variable label, would also be useful.

I expect you would not want to pull in another dependency such as ggiraph. If the wrapper approach seems tricky, would a conditional check of whether ggiraph is installed, together with a new argument defaulting to interactive = FALSE, be something to consider?

Best
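
The conditional-dependency idea raised in this request can be sketched as follows. This is a hypothetical helper, not part of ggstats: `needs_optional_pkg()` and its `interactive` argument are illustrative names, and only `requireNamespace()` is an actual base R call.

```r
# Hypothetical helper: check an optional package only when it is needed
needs_optional_pkg <- function(pkg, interactive = FALSE) {
  # Nothing to check when the interactive output is not requested
  if (!interactive) {
    return(invisible(FALSE))
  }
  # Fail with an informative message if the optional package is missing
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      "Package '", pkg, "' must be installed when interactive = TRUE.",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Default call: no dependency on ggiraph is triggered
needs_optional_pkg("ggiraph")
```

With this pattern the optional package would be listed under Suggests, not Imports, so users who never set `interactive = TRUE` do not need it.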

Release ggstats 0.5.0

Prepare for release:

Submit to CRAN:

usethis::use_version('minor')
devtools::submit_cran()
Approve email

Wait for CRAN...

Accepted 🎉
usethis::use_github_release()
usethis::use_dev_version(push = TRUE)

Fix

Here

https://github.com/larmarange/ggstats/blob/main/R/ggcoef_model.R#L396

Change the font family for the predictor modality names

Hello,
Thanks for your work, this package is really great!

I wonder if there is a way to change the font family for the predictor modality names?
It's possible to change the font for everything else, except the predictor modality font.

For example :
mod <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)
ggcoef_model(mod) + theme(text = element_text(family = "Roboto"))

Thanks !

Add pairwise contrasts

waiting for the release of broom.helpers 1.11.0
update DESCRIPTION and require min version of broom.helpers

Test compatibility with next version of ggplot2

You can install the release candidate of ggplot2 using devtools::install_github('tidyverse/[email protected]') to test this out.

Release ggstats 0.5.1

fix #49

Prepare for release:

Submit to CRAN:

usethis::use_version('patch')
devtools::submit_cran()
Approve email

Wait for CRAN...

Accepted 🎉
usethis::use_github_release()
usethis::use_dev_version(push = TRUE)

How to change the colour of the labels when using gglikert

Hi,

I am using gglikert() and find it a great tool. Thanks for developing it.

I now need to use some dark colours for the bars, so I want to change the colour of the percentage labels from black to white. Is there an easy way to achieve this? Thank you.

Release ggstats 0.4.0

Prepare for release:

Submit to CRAN:

usethis::use_version('minor')
devtools::submit_cran()
Approve email

Wait for CRAN...

Accepted 🎉
usethis::use_github_release()
usethis::use_dev_version(push = TRUE)

Add a zero-inflated model example in the vignette

Using y.level argument of ggcoef_multinom()
Once new version of broom.helpers on CRAN with support of zero-inflated models

Release ggstats 0.3.0

Prepare for release:

Submit to CRAN:

usethis::use_version('minor')
devtools::submit_cran()
Approve email

Wait for CRAN...

Add LICENCE file

Package must be loaded to namespace for gglikert() and cousins to work

Tried the examples for the gglikert() help page, by only adding the ggstats::-prefix, but then a weird error arises. Not having to load entire packages into namespaces is rather useful for package development and Quarto use (where memory management is an issue).

library(ggplot2)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

likert_levels <- c(
    "Strongly disagree",
    "Disagree",
    "Neither agree nor disagree",
    "Agree",
    "Strongly agree"
)
set.seed(42)
df <-
    tibble(
        q1 = sample(likert_levels, 150, replace = TRUE),
        q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1),
        q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
        q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
        q5 = sample(c(likert_levels, NA), 150, replace = TRUE),
        q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0))
    ) %>%
    mutate(across(everything(), ~ factor(.x, levels = likert_levels)))

ggstats::gglikert(df)
#> Error in `geom_bar()`:
#> ! Can't find stat called "prop"
#> Backtrace:
#>     ▆
#>  1. └─ggstats::gglikert(df)
#>  2.   └─ggplot2::geom_bar(...)
#>  3.     └─ggplot2::layer(...)
#>  4.       └─ggplot2:::check_subclass(...)
#>  5.         └─cli::cli_abort("Can't find {argname} called {.val {x}}", call = call)
#>  6.           └─rlang::abort(...)

Created on 2023-09-27 with reprex v2.0.2
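
The error is consistent with the distinction between a loaded namespace and an attached package: calling a function with the `ggstats::` prefix loads the namespace but does not put the package's objects on the search path, so anything resolved by name (here, presumably the stat registered as "prop") may not be found. The base R sketch below illustrates that distinction with the `tools` package, chosen only because it ships with R; it makes no claim about how ggplot2 performs its lookup.

```r
# Load a namespace without attaching the package
loadNamespace("tools")

# The namespace is loaded ...
ns_loaded <- isNamespaceLoaded("tools")

# ... but the package is on the search path only if library(tools) was called
on_search_path <- "package:tools" %in% search()

# Objects are always reachable through the namespace itself
in_namespace <- exists("file_ext", envir = asNamespace("tools"), inherits = FALSE)
ext <- tools::file_ext("plot.png")

c(loaded = ns_loaded, attached = on_search_path, in_namespace = in_namespace)
```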

Release ggstats version 0.2.1

Prepare for release:

Verify CRAN checks
devtools::check(remote = TRUE, manual = TRUE)
spelling::spell_check_package()
urlchecker::url_check() and urlchecker::url_update()
revdepcheck::revdep_reset()
revdepcheck::revdep_check(num_workers = 4)
Polish NEWS
Polish pkgdown reference index

Submit to CRAN:

usethis::use_version()
Update cran-comments.md
devtools::submit_cran()
Approve email

Wait for CRAN...

Use dontrun to have shorter examples

But execute such examples in pkgdown

Add a prop_lower option to sort gglikert()

See https://stackoverflow.com/questions/78346665/sort-the-gglikert-chart-based-on-the-combination-percentages-in-r/78347886#78347886

gglikert() : align total proportions when faceting

ggstats has been archived

The package was archived last week (2023-11-14). CRAN is asking for another update in GGally which depends on {ggstats}.

Are there plans to get {ggstats} unarchived? The package seemed to only fail a test on R-devel, so I don't know why it was actually archived.

Check Details
Version: 0.5.0
Check: examples
Result: ERROR
    Running examples in ‘ggstats-Ex.R’ failed
    The error most likely occurred in:
    
    > base::assign(".ptime", proc.time(), pos = "CheckExEnv")
    > ### Name: ggcoef_model
    > ### Title: Plot model coefficients
    > ### Aliases: ggcoef_model ggcoef_table ggcoef_compare ggcoef_multinom
    > ### ggcoef_multicomponents ggcoef_plot
    >
    > ### ** Examples
    >
    > mod <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)
    > ggcoef_model(mod)
    Error in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, :
     conversion failure on 'p ≤ 0.05' in 'mbcsToSbcs': for ≤
    Calls: <Anonymous> ... <Anonymous> -> widthDetails -> widthDetails.text -> grid.Call
    Execution halted
Flavor: [r-devel-linux-x86_64-debian-clang](https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-debian-clang/ggstats-00check.html)

Thank you!
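
The failing check comes from a non-ASCII character ("≤") being drawn in a session that cannot convert it. One defensive pattern, sketched here as an assumption and not as the fix actually applied in ggstats, is to fall back to an ASCII label when the session is not UTF-8. `signif_label()` is a hypothetical helper.

```r
# Hypothetical helper: choose a legend label the current locale can render
signif_label <- function(utf8 = isTRUE(l10n_info()[["UTF-8"]])) {
  if (utf8) {
    "p \u2264 0.05" # uses the "less than or equal to" sign
  } else {
    "p <= 0.05"     # pure ASCII fallback
  }
}

signif_label()
```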

Release ggstats version 0.1.0

Prepare for release:

devtools::check(remote = TRUE, manual = TRUE)
spelling::spell_check_package()
urlchecker::url_check() and urlchecker::url_update()
revdepcheck::revdep_reset()
revdepcheck::revdep_check(num_workers = 4)
Polish NEWS
Polish pkgdown reference index

Submit to CRAN:

usethis::use_version()
Update cran-comments.md
devtools::submit_cran()
Approve email

Wait for CRAN...

Add vignettes

Feature request: Default inversed text colour based on fill background

Having darker shades in the palette with black text is not the best.

The code for inverting the font colour is quite simple and fast:

hex_bw <- function(hex_code) {

  rgb_conv <-
    lapply(grDevices::col2rgb(hex_code), FUN = function(.x) {
      ifelse(.x / 255 <= 0.04045,
             .x / 255 / 12.92,
             ((.x / 255 + 0.055) / 1.055) ^ 2.4)
    }) %>%
    unlist() %>%
    matrix(ncol = length(hex_code), byrow = FALSE) %>%
    sweep(., MARGIN = 1, STATS = c(0.2126, 0.7152, 0.0722), FUN = `*`) %>%
    apply(., MARGIN = 2, FUN = sum)

  hex <- ifelse(rgb_conv > 0.179,
                "#000000",
                "#ffffff")

  hex[is.na(hex_code)] <- "#ffffff"
  hex

}
hex_bw("#123456")
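
A pipe-free variant of the same idea, given as a sketch. `hex_bw_base()` is a hypothetical name; the 0.179 threshold and the luminance weights are taken from the function above, and the channel conversion uses the standard sRGB linearisation (division by 12.92 for small values).

```r
hex_bw_base <- function(colour) {
  # 3 x n matrix of red, green and blue channels scaled to [0, 1]
  rgb <- grDevices::col2rgb(colour) / 255

  # sRGB to linear light, applied element-wise
  lin <- ifelse(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055)^2.4)

  # Relative luminance: weighted sum of the three channels per colour
  lum <- colSums(lin * c(0.2126, 0.7152, 0.0722))

  # Black text on light fills, white text on dark fills
  out <- ifelse(lum > 0.179, "#000000", "#ffffff")
  out[is.na(colour)] <- "#ffffff"
  unname(out)
}

hex_bw_base(c("#123456", "#ffffff", NA))
```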

Feature Request: Change order of reference level in ggcoef_compare()

I find your ggcoef_compare() such a useful tool for comparing models. Is there a way to retain the factor ordering for the reference levels?

d <- as.data.frame(Titanic)

d$Sex <- factor(d$Sex, levels=c("Male", "Female"))
d$Sex=relevel(as.factor(d$Sex),ref="Male")

m1 <- glm(Survived ~ Sex + Age, family = binomial, data = d)
m2 <- glm(Survived ~ Sex + Age + Class, family = binomial, data = d)

models <- list("Model 1" = m1, "Model 2" = m2)

ggcoef_compare(models, exponentiate = TRUE)

Which produces:

This differs from ggcoef_model(), which retains the specified order:

d <- as.data.frame(Titanic)

d$Sex <- factor(d$Sex, levels=c("Male", "Female"))
d$Sex=relevel(as.factor(d$Sex),ref="Male")

m1 <- glm(Survived ~ Sex + Age, family = binomial, data = d)    

ggcoef_model(m1, exponentiate = TRUE)

There is probably a straightforward way to do this, but I'm struggling to find it. Additionally, if I try to remove the reference row, I receive an error:

d <- as.data.frame(Titanic)

d$Sex <- factor(d$Sex, levels=c("Male", "Female"))
d$Sex=relevel(as.factor(d$Sex),ref="Male")

m1 <- glm(Survived ~ Sex + Age, family = binomial, data = d)
m2 <- glm(Survived ~ Sex + Age + Class, family = binomial, data = d)

models <- list("Model 1" = m1, "Model 2" = m2)

ggcoef_compare(models, exponentiate = TRUE, add_reference_rows = FALSE)

Error in eval_tidy(dot, data = mask): object 'reference_row' not found

Thanks!

Add a coefficient table to the ggcoef_model() output

The output plot of ggcoef_* functions are good. I would also love to see an option that allows users to specify whether or not to include a table which shows coefficient estimates (e.g. OR) and confidence intervals corresponding to each term in the model.

Here is an example of what I mean (although the code could use a significant improvement for more flexible customizability):

# attach trial data from gtsummary package
data(trial, package = "gtsummary")

# fit a logistic regression model
glm_model <- glm(
  response ~ rms::rcs(age, 3) + trt * grade + marker + stage,
  data = trial,
  family = binomial
)

# define a function to attach a coefficients table to ggcoef_model output

ggcoef_table <- function(model) {
  #library(tidyverse)
  coef_data <- ggstats::ggcoef_model(
    model = model, 
    exponentiate = TRUE,
    return_data = TRUE
  )
  
  coef_plot <- ggstats::ggcoef_model(
    model = model,
    exponentiate = TRUE,
    show_p_values = FALSE,
    signif_stars = FALSE
  )
  
  coef_data <- coef_data %>% 
    dplyr::mutate(dplyr::across(c(estimate, conf.low, conf.high), \(x) round(x, 1))) 
  
  coef_data_table <- coef_data %>% 
    dplyr::mutate(
      estimate = as.character(estimate),
      conf_interval = paste0(conf.low, ", ", conf.high)
    ) %>% 
    tidyr::pivot_longer(
      c(estimate, conf_interval), 
      names_to = "stat",
      names_transform = list(stat = as.factor)
    ) %>% 
    dplyr::mutate(
      stat = forcats::fct_relevel(stat, c("estimate", "conf_interval"))
    )
  
  coef_data_table_plot <- coef_data_table %>% 
    ggplot2::ggplot(
      ggplot2::aes(
        x = stat, 
        y = term,
        label = value
      )
    ) +
    ggplot2::geom_text(hjust = 1, size = 3) +
    ggplot2::scale_x_discrete(position = "top", labels = c("OR", "95% CI")) +
    ggplot2::scale_y_discrete(limits = rev) + # reverse factor levels to align estimates with corresponding terms
    ggplot2::labs(
      y = NULL, 
      x = NULL
    ) +
    cowplot::theme_cowplot() +
    ggplot2::theme(
      # customize theme to have a clear background for the text
      panel.grid.major = ggplot2::element_blank(),
      axis.line = ggplot2::element_blank(),
      axis.ticks = ggplot2::element_blank(),
      axis.text.y = ggplot2::element_blank(),
      axis.text.x = ggplot2::element_text(face = "bold", hjust = 1),
      strip.text = ggplot2::element_blank()
    )
  
  # join the plots
  coef_plot + patchwork::plot_spacer() + coef_data_table_plot + patchwork::plot_layout(widths = c(4, -0.5, 3))
}

# use custom function on fitted model
ggcoef_table(model = glm_model)

plyr::round_any() not imported

Release ggstats 0.6.0

Prepare for release:

Submit to CRAN:

usethis::use_version('minor')
devtools::submit_cran()
Approve email

Wait for CRAN...

Accepted 🎉
usethis::use_github_release()
usethis::use_dev_version(push = TRUE)

New argument `data_fun` to `gglikert()`

This is not a huge priority for me or my colleagues, but one might want to sort not based on the proportion of categories above (or including the middle), but simply the top category.
Maybe something for 0.7 or 0.8? :)

gglikert: Is it possible to tell which items are "bad" and which are "good" if there are even number of them?

In the likert package I can tell where to set the "center" by specifying x.5 (halves) options.

For example, having 6 options {0=no pain, 1=mild, 2=moderate, 3=severe, 4=very severe, 5=horrible pain} I can ask the procedure to draw ONLY the first 2 options (0 and 1) as shades of green (meaning success), and all >= 2 - as shades of yellow/red (meaning problem).

In the likert package the center value would be = 2.5 resulting in:

This cut-off is very important in this case. Is it possible to tell gglikert() where to split the colors in case of even number of items?

Sorting in `gglikert` could be more flexible

Hi! I was hoping I could sort in the case of a single question split on a multicategory y variable, but this is ignored. Is there a workaround not involving faceting? Same applies to gglikert_stacked.

library(ggstats)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
likert_levels <- c(
    "Strongly disagree",
    "Disagree",
    "Neither agree nor disagree",
    "Agree",
    "Strongly agree"
)
df <-
    tibble(
        q1 = sample(likert_levels, 150, replace = TRUE),
        q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1),
        q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
        indep = sample(letters[1:5], 150, replace = TRUE)
    ) %>%
    mutate(across(-indep, ~ factor(.x, levels = likert_levels)))
gglikert(df, 
         include=q1,
         y=indep, 
         sort = "ascending")

Created on 2024-03-20 with reprex v2.1.0

Add a weight option to gglikert()

Italic on ggcoef_table

Hey everyone,

I've been using the ggcoef_table function, and I've been asked to italicize partially variable names within it. I've tried a couple of methods like HTML and expressions, but I haven't had any luck. Could someone please assist me? I've provided a reprex below. My aim is to italicize just the word "Variable."

library(ggstats)
dat<-data.frame(x=rnorm(100),y=rnorm(100),w=runif(100),z=runif(100))
mod<-lm(x~y+w+z)
ggcoef_table(mod,colour = NULL,add_reference_rows = F,
variable_labels = list(y= "Variable Y",
w = "Variable W",
z = "Variable Z"))

Thanks!

Use cli

Is it possible to extend ggstats::ggcoef_table for MICE-imputed GEE-GLM (geeglm) model pooled coefficients?

It partially works, but fails to recognize everything.
It doesn't recognize odds ratios, doesn't format the coefficients nicely like in the examples.

The model is the GEE estimated logistic regression (geepack::geeglm).
There exists a basic tidier:

> tidy(mod)
                                         term estimate std.error statistic p.value        b    df dfcom    fmi lambda  m    riv     ubar
1                                 (Intercept)  -0.2990    0.4300   -0.6953  0.4868 5.74e-03 17876   Inf 0.0327 0.0326 20 0.0337 0.178911
2                                      ArmHD    0.3809    0.6201    0.6142  0.5391 2.18e-02  5378   Inf 0.0598 0.0594 20 0.0632 0.361710
3                          Visit_nOrdMonth 12  -0.0829    0.3678   -0.2253  0.8218 2.10e-02   713   Inf 0.1656 0.1633 20 0.1951 0.113180
4                          Visit_nOrdMonth 20  -0.3074    0.3837   -0.8012  0.4237 3.61e-02   287   Inf 0.2623 0.2572 20 0.3462 0.109375
5                            CurrentSmokerYes  -0.7840    0.7094   -1.1052  0.2691 1.07e-02 38132   Inf 0.0224 0.0223 20 0.0228 0.491951
...

But the results:

> ggstats::ggcoef_table(mod, exponentiate = TRUE)
✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Pooled model data:

mod <- structure(list(call = pool(object = m_longit), m = 20L, pooled = structure(list(
    term = structure(1:30, levels = c("(Intercept)", "ArmHD ", 
    "Visit_nOrdMonth 12", "Visit_nOrdMonth 20", "CurrentSmokerYes", 
    "BMI_centered", "GenderMale", "Age_centered", "ArmHD :Visit_nOrdMonth 12", 
    "ArmHD :Visit_nOrdMonth 20", "ArmHD :CurrentSmokerYes", 
    "ArmHD :BMI_centered", "ArmHD :GenderMale", 
    "ArmHD :Age_centered", "Visit_nOrdMonth 12:CurrentSmokerYes", 
    "Visit_nOrdMonth 20:CurrentSmokerYes", "Visit_nOrdMonth 12:BMI_centered", 
    "Visit_nOrdMonth 20:BMI_centered", "Visit_nOrdMonth 12:GenderMale", 
    "Visit_nOrdMonth 20:GenderMale", "Visit_nOrdMonth 12:Age_centered", 
    "Visit_nOrdMonth 20:Age_centered", "ArmHD :Visit_nOrdMonth 12:CurrentSmokerYes", 
    "ArmHD :Visit_nOrdMonth 20:CurrentSmokerYes", "ArmHD :Visit_nOrdMonth 12:BMI_centered", 
    "ArmHD :Visit_nOrdMonth 20:BMI_centered", "ArmHD :Visit_nOrdMonth 12:GenderMale", 
    "ArmHD :Visit_nOrdMonth 20:GenderMale", "ArmHD :Visit_nOrdMonth 12:Age_centered", 
    "ArmHD :Visit_nOrdMonth 20:Age_centered"), class = "factor"), 
    m = c(20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 
    20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 
    20L, 20L, 20L, 20L, 20L, 20L, 20L), estimate = c(-0.29903187595548, 
    0.380872027401296, -0.0828676502765822, -0.307444671001533, 
    -0.783980495389177, 0.035212462157362, 0.250206944263597, 
    -0.017234017916225, 0.0873716935040566, 0.501669357945944, 
    0.629976239010189, 0.0103309021118499, -0.710440238180912, 
    0.0284496288428124, 0.187255608262129, 0.365859459306186, 
    -0.113801927460189, -0.0214189834022172, 0.639079773673284, 
    0.467942998794559, 0.0336432513729626, -0.0032003933188581, 
    -0.0390287655013661, -0.367468618749712, 0.110564691330047, 
    0.0195080672542059, -0.609947802591084, -0.514663582186983, 
    -0.0191275913422143, 0.0229444015254167), ubar = c(0.178910542612104, 
    0.361709639160786, 0.11318010075047, 0.109374942177015, 0.491950601812907, 
    0.00359752616836495, 0.37258391707241, 0.000779654085582352, 
    0.248436936335551, 0.253871527293263, 0.821531331040922, 
    0.00762841036178101, 0.695736347093203, 0.00144871817311475, 
    0.400419307876188, 0.582910240807128, 0.00341763960905729, 
    0.00448395840623119, 0.203718949538353, 0.322473550400124, 
    0.000569997779978852, 0.000608363225145927, 0.646926965308594, 
    0.899047011197042, 0.00617998444713391, 0.00874507037666418, 
    0.440344746613085, 0.615333409679517, 0.00154075485784863, 
    0.00123053826533071), b = c(0.00574228980863424, 0.0217701125420553, 
    0.0210353227321423, 0.0360599329922958, 0.0106970974728955, 
    0.000609255139901839, 0.0231763298949901, 2.57010039837727e-05, 
    0.0301639128892515, 0.036985935195778, 0.0370945204494253, 
    0.000944934906536697, 0.0306037473509527, 0.000249506763463724, 
    0.202339930508713, 0.094294780937389, 0.000974055403472398, 
    0.0026671211819612, 0.0692512364870469, 0.0402692080443424, 
    8.49167907626829e-05, 0.000182968384479355, 0.250361623612087, 
    0.123376260799054, 0.00190559100079968, 0.00279970449631357, 
    0.133737715221484, 0.0696958456346859, 0.000579398967755914, 
    0.000380137562633506), t = c(0.18493994691117, 0.384568257329944, 
    0.135267189619219, 0.147237871818926, 0.503182554159447, 
    0.00423724406526188, 0.39691906346215, 0.000806640139765313, 
    0.280109044869265, 0.29270675924883, 0.860480577512819, 0.00862059201364454, 
    0.727870281811703, 0.00171070027475166, 0.612876234910337, 
    0.681919760791386, 0.00444039778270331, 0.00728443564729045, 
    0.276432747849752, 0.364756218846684, 0.000659160410279669, 
    0.00080048002884925, 0.909806670101286, 1.02859208503605, 
    0.00818085499797357, 0.0116847600977934, 0.580769347595643, 
    0.688514047595937, 0.00214912377399234, 0.00162968270609589
    ), dfcom = c(Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, 
    Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, 
    Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf), df = c(17875.8312273481, 
    5377.74893584068, 712.625859410522, 287.318926486729, 38132.409398344, 
    833.572847238697, 5054.63256049474, 16975.9654941346, 1486.1201142506, 
    1079.36242884061, 9273.37078501162, 1434.31784421067, 9748.40946606314, 
    810.13651553986, 158.109284183625, 901.293506980286, 358.138852078407, 
    128.552737041216, 274.598980261076, 1413.95059720754, 1038.41293270817, 
    329.855627850284, 227.581899074746, 1197.83817922747, 317.623777540153, 
    300.185847842698, 324.993426903818, 1681.84833695141, 237.105935251678, 
    316.737544169371), riv = c(0.033700665209754, 0.0631960437167034, 
    0.195149931147748, 0.346175539737723, 0.0228314637793895, 
    0.177821610450628, 0.0653145379461191, 0.0346128554726992, 
    0.127485505983443, 0.152971986932216, 0.0474105429704621, 
    0.130064011348216, 0.0461869425864503, 0.180837174889335, 
    0.530586120237343, 0.169853800899371, 0.299258637726326, 
    0.624554687476034, 0.356931932332146, 0.131119803140739, 
    0.156426276439402, 0.315792927255325, 0.406351441336649, 
    0.144091546076687, 0.323766276105696, 0.336153923812171, 
    0.318896959853922, 0.118928432562332, 0.394851207539387, 
    0.324365728405779), lambda = c(0.0326019575530752, 0.0594396904410815, 
    0.163284895109635, 0.25715482826643, 0.0223218238662962, 
    0.150974993897925, 0.0613100972714056, 0.0334548863273935, 
    0.113070638431174, 0.132676239029223, 0.0452645271604824, 
    0.115094375223085, 0.04414788667909, 0.153143192588159, 0.346655515310087, 
    0.145192331527914, 0.230330304557391, 0.384446699326795, 
    0.263043358201978, 0.115920349707134, 0.135266968267993, 
    0.24000199477744, 0.288940181943737, 0.125944070272003, 0.244579637621653, 
    0.251583232905602, 0.241790655040438, 0.106287792052962, 
    0.283077654021558, 0.244921566187188), fmi = c(0.0327101746988321, 
    0.0597892924465919, 0.165623310061771, 0.262272270527071, 
    0.0223730979022412, 0.153004763352921, 0.0616812946217472, 
    0.0335687383891115, 0.114261851084983, 0.134278888452115, 
    0.045470369616539, 0.116325704296666, 0.0443439305672371, 
    0.155226131408954, 0.354766090538579, 0.147082885118919, 
    0.234592763115993, 0.393804973651112, 0.26835286359789, 0.11716821205609, 
    0.1369276602079, 0.244568527181479, 0.295107706982863, 0.127399813343547, 
    0.249291829388427, 0.256520249408316, 0.24641397540461, 0.10734867354401, 
    0.289049371007454, 0.249644680176034)), class = "data.frame", row.names = c(NA, 
-30L)), glanced = NULL), class = c("mipo", "data.frame"))
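
Until such pooled objects are handled directly, the odds ratios can be derived by hand from the tidied estimates. The sketch below uses the first rows of the `tidy(mod)` output shown above and assumes a t-based 95% interval with the pooled degrees of freedom; `pooled` and the added column names are illustrative, and this is a manual workaround, not an extension of ggcoef_table().

```r
# First three rows of the tidied pooled model shown above
pooled <- data.frame(
  term = c("(Intercept)", "ArmHD", "Visit_nOrdMonth 12"),
  estimate = c(-0.2990, 0.3809, -0.0829),
  std.error = c(0.4300, 0.6201, 0.3678),
  df = c(17876, 5378, 713)
)

# Critical value of the t distribution for each term's degrees of freedom
crit <- qt(0.975, df = pooled$df)

# Exponentiate the log-odds estimate and both interval bounds
pooled$odds_ratio <- exp(pooled$estimate)
pooled$conf.low <- exp(pooled$estimate - crit * pooled$std.error)
pooled$conf.high <- exp(pooled$estimate + crit * pooled$std.error)

pooled[, c("term", "odds_ratio", "conf.low", "conf.high")]
```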

Release ggstats version 0.1.1

Prepare for release:

Verify CRAN checks
devtools::check(remote = TRUE, manual = TRUE)
spelling::spell_check_package()
urlchecker::url_check() and urlchecker::url_update()
revdepcheck::revdep_reset()
revdepcheck::revdep_check(num_workers = 4)
Polish NEWS
Polish pkgdown reference index

Submit to CRAN:

usethis::use_version()
Update cran-comments.md
devtools::submit_cran()
Approve email

Wait for CRAN...

Release ggstats version 0.2.0

Prepare for release:

Merge #15
Verify CRAN checks
devtools::check(remote = TRUE, manual = TRUE)
spelling::spell_check_package()
urlchecker::url_check() and urlchecker::url_update()
revdepcheck::revdep_reset()
revdepcheck::revdep_check(num_workers = 4)
Polish NEWS
Polish pkgdown reference index

Submit to CRAN:

usethis::use_version()
Update cran-comments.md
devtools::submit_cran()
Approve email

Wait for CRAN...

Correct before 2022-11-26 to safely retain your package on CRAN.

cf. https://cran.r-project.org/web/checks/check_results_ggstats.html

checking Rd cross-references ... NOTE
Undeclared package ‘glue’ in Rd xrefs

'Packages which use Internet resources should fail gracefully with an informative message
if the resource is not available or has changed (and not give a check warning nor error).'

Term order

Thank you for your efforts on this package.

I am using ggcoef_table to visualize coefficients from a survival::coxph model. It works great. However, I've noticed that categorical terms are sorted based on their string values. For example, when I specify a factor variable with levels = c("0", "1", ">=2"), the terms are displayed in this order: "1", "0", ">=2". Is there a way to enforce the same order as the factor levels?

I found a temporary workaround. Specifying categorical_terms_pattern="level={level_rank}; {level}" puts them in correct order. However, it does not seem like a clean solution.

Thank you.

Update vignette with ggcoef_table()

Question: geom_stripped_cols with dates on the x-axis.

Hi,

Thanks for your excellent package, which has helped me a lot in achieving what I wanted to plot. However, I am now stuck: I cannot get columns when using a date as the x-axis with monthly data. What am I doing wrong? I assumed that, with monthly data, a width of 12 would shade each year.

library(openair)
library(ggstats)
library(ggplot2)

mydata <- openair::mydata

mydata |>
  openair::timeAverage(avg.time = "month") |> 
  ggplot2::ggplot(aes(date, nox)) +   
  ggplot2::geom_point() +   
  ggplot2::geom_line() +
  ggplot2::theme_minimal() +
  ggstats::geom_stripped_cols(width = 12)

Created on 2023-12-20 with reprex v2.0.2

Documentation: Modify gglikert_* sort argument description

The sort argument of gglikert_* currently reads "should variables be sorted?", which implies a logical flag. I suggest changing it to:
'Sorting order of include variables. One of "none" (default), "ascending" or "descending".'
Depends a bit on whether you in #55 will allow sorting the y categories.

Add a `tidy_args` option in `ggcoef_model()` to pass additional argument to `tidy_plus_plus()`

Adding phi coefficients to stat_cross() function

Copied from ggobi/ggally#437

The stat_cross() function is very useful! Would you consider adding phi measures of local association, on top of Pearson's residuals? Phi coefficients have the advantage of being bounded between -1 and 1, just like Pearson's correlation, so their value is easily interpretable. In practice, I think it would require very few changes in the code. This worked for me:

                       # compute cross statistics
                       panel <- broom::augment(chisq.test(xtabs(weight ~ y + x, data = data)))
                       panel$.phi <- with(data, GDAtools::phi.table(y, x, weight)) %>% as.data.frame() %>% dplyr::pull(Freq)

                       panel_names <- names(panel)
                       for (to_name in c(
                         "observed",
                         "prop",
                         "row.prop",
                         "col.prop",
                         "expected",
                         "resid",
                         "std.resid",
                         "phi"
                       )) {
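
For reference, a cell-wise phi coefficient can also be computed in base R without an extra dependency. The sketch below assumes phi is defined as the correlation between the row-membership and column-membership indicators of each cell, which may or may not match the helper used in the snippet above. Under that definition phi equals the standardized residual divided by the square root of the total count, which the last line checks.

```r
d <- as.data.frame(Titanic)
tab <- xtabs(Freq ~ Survived + Class, data = d)

n <- sum(tab)
row_tot <- rowSums(tab)
col_tot <- colSums(tab)

# Cell-wise phi: correlation between "is in row i" and "is in column j"
phi <- (n * tab - outer(row_tot, col_tot)) /
  sqrt(outer(row_tot * (n - row_tot), col_tot * (n - col_tot)))

# Standardized residuals from the chi-squared test, rescaled by sqrt(n)
std_resid <- chisq.test(tab)$stdres
all.equal(as.vector(phi), as.vector(std_resid / sqrt(n)))
```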
