erblast / easyalluvial

create alluvial plots with a single line of code

Home Page: https://erblast.github.io/easyalluvial/

R 99.43% Dockerfile 0.57%

easyalluvial's Issues

not compatible with dplyr 8.0

require(tidyverse)

# causes memory overflow with dplyr dev version (8GB RAM)
df = ggplot2::diamonds %>%
  mutate_if( is.numeric, cut, 5) %>%
  group_by_all()

sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] forcats_0.3.0     stringr_1.3.1     dplyr_0.7.99.9000 purrr_0.2.5       readr_1.3.1       tidyr_0.8.2       tibble_1.4.2      ggplot2_3.1.0    
[9] tidyverse_1.2.1  

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0       cellranger_1.1.0 pillar_1.3.1     compiler_3.5.2   plyr_1.8.4       tools_3.5.2      jsonlite_1.6     lubridate_1.7.4 
 [9] gtable_0.2.0     nlme_3.1-137     lattice_0.20-38  pkgconfig_2.0.2  rlang_0.3.0.1    cli_1.0.1        rstudioapi_0.8   yaml_2.2.0      
[17] haven_2.0.0      withr_2.1.2      xml2_1.2.0       httr_1.4.0       generics_0.0.2   hms_0.4.2        grid_3.5.2       tidyselect_0.2.5
[25] glue_1.3.0       R6_2.3.0         readxl_1.2.0     modelr_0.1.2     magrittr_1.5     backports_1.1.3  scales_1.0.0     rvest_0.3.2     
[33] assertthat_0.2.0 colorspace_1.3-2 stringi_1.2.4    lazyeval_0.2.1   munsell_0.5.0    broom_0.5.1      crayon_1.3.4
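
One possible explanation, offered as an assumption rather than a confirmed diagnosis: if grouping keeps every combination of factor levels, including combinations that never occur in the data, the number of groups grows as the product of the level counts. For `diamonds`, seven numeric columns cut into 5 bins plus `cut` (5 levels), `color` (7) and `clarity` (8) give 5^7 * 5 * 7 * 8 = 21,875,000 potential groups. The base-R sketch below uses a small synthetic data frame (made up for illustration, not taken from the issue) to compare potential and observed combinations.

```r
# Synthetic data: four numeric columns, 200 rows (hypothetical, for illustration only).
set.seed(1)
df <- data.frame(a = rnorm(200), b = rnorm(200), c = rnorm(200), d = rnorm(200))

# Cut every column into 5 bins, mirroring mutate_if(is.numeric, cut, 5).
binned <- as.data.frame(lapply(df, cut, breaks = 5))

# Number of groups if every combination of levels were kept.
potential <- prod(vapply(binned, nlevels, integer(1)))

# Number of combinations that actually occur in the data.
observed <- nrow(unique(binned))

cat("potential groups:", potential, "\n")
cat("observed groups: ", observed, "\n")
```

If this is indeed the cause, counting only observed combinations should avoid the blow-up.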

alluvial_wide() does not work for character columns only

mtcars2 %>%
  select_if(is.factor) %>%
  alluvial_wide()

mtcars2 %>%
  select_if(is.factor) %>%
  mutate_all(as.character) %>%
  alluvial_wide()

Error: This tidyselect interface doesn't support predicates yet.
ℹ Contact the package author and suggest using eval_select().
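
A possible workaround until this is fixed is to convert character columns to factors before plotting. The sketch below is base R only; `chr_to_factor()` is a hypothetical helper, not part of easyalluvial, and the example data frame is made up.

```r
# Hypothetical helper (not part of easyalluvial): convert character columns to factors.
chr_to_factor <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], factor)
  df
}

# Small made-up example with one character and one numeric column.
example_df <- data.frame(
  gear = c("three", "four", "five", "four"),
  mpg = c(21.0, 22.8, 18.7, 24.4),
  stringsAsFactors = FALSE
)

converted <- chr_to_factor(example_df)
str(converted)
```

The converted data frame could then be passed to `alluvial_wide()`. This may sidestep the error, since the factor-only call above is the case reported as working.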

easyalluvial and parcats are not compatible with the latest R?

Hi, there,

Thanks a lot for these two great packages, the outputs are awesome! However, when I recently needed to re-install the packages on a new machine, I encountered the following error:

> if (!require(easyalluvial)) {install.packages("easyalluvial",repos = "http://cran.us.r-project.org"); 
+         require(easyalluvial)}
Loading required package: easyalluvial
Warning in install.packages :
  package ‘easyalluvial’ is not available for this version of R

A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages

> if (!require(parcats)) {install.packages("parcats",repos = "http://cran.us.r-project.org"); 
+         require(parcats)}
Loading required package: parcats
Warning in install.packages :
  package ‘parcats’ is not available for this version of R

A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages

I am wondering whether these two packages have not been updated recently.

Best
Chuan-Peng
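
When a package is not available from CRAN for the running R version, installing the development version from GitHub is a common fallback. The sketch below wraps that in a hypothetical helper, `install_from_github_if_missing()`. It assumes the `remotes` package can be installed from the configured mirror and that the repository builds on the local R version, neither of which is confirmed here.

```r
# Hypothetical helper: install a package from GitHub only if it is not already available.
# Assumes the 'remotes' package can be installed from the configured CRAN mirror.
install_from_github_if_missing <- function(pkg, repo) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already installed, nothing to do
  }
  if (!requireNamespace("remotes", quietly = TRUE)) {
    install.packages("remotes")
  }
  remotes::install_github(repo)
  invisible(TRUE)
}

# Usage (not run here because it needs network access):
# install_from_github_if_missing("easyalluvial", "erblast/easyalluvial")
```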

easyalluvial is a great tool. I was wondering if it is planned to create an option for having the legend for the strata outside of the plot. My problem is that my strata labels are long and partly cover the plot. Or is there already a solution for this that I have missed?

Many thanks!

TODO for 0.2.0

easyalluvial::manip_bin_numerics(c(Inf, 1, 2, 3))

easyalluvial::manip_bin_numerics(c(Inf, 1, 2, 3))
Inf is not getting handled properly
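
One way to guard against this is to set non-finite values to `NA` before binning. The sketch below uses base R `cut()` as a stand-in for the package's binning logic. `bin_finite()` is a hypothetical helper and does not reflect how `manip_bin_numerics()` is implemented.

```r
# Hypothetical helper: replace non-finite values (Inf, -Inf, NaN) with NA, then bin.
bin_finite <- function(x, bins = 5) {
  x[!is.finite(x)] <- NA
  cut(x, breaks = bins)
}

res <- bin_finite(c(Inf, 1, 2, 3))
print(res)
```

The infinite value ends up as `NA`, while the finite values are binned over their own range.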

alluvial functions fail when input dataframe is grouped by dplyr::group_by()
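
Until grouped input is handled inside the package, dropping the grouping before plotting is a straightforward workaround. A minimal sketch with dplyr, using `mtcars` as stand-in data:

```r
library(dplyr)

# Grouped input, as produced by group_by().
grouped <- mtcars %>% group_by(cyl)

# Drop the grouping before plotting; ungroup() leaves the data itself unchanged.
plain <- ungroup(grouped)

is_grouped_df(grouped)  # TRUE
is_grouped_df(plain)    # FALSE
```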

alluvial_wide with size from column

I have data in wide format, with another column that determines the size of the flow in each row. I'm able to plot it with ggalluvial, but I can't figure out how to map the size column to the geom flow in easyalluvial:

flow_data
#> # A tibble: 10 x 7
#>       id group station1   station2  station3    station4    size
#>    <int> <chr> <chr>      <chr>     <chr>       <chr>      <dbl>
#>  1     1 Men   Eligible   Tried Psy Student     Has degree 51229
#>  2     2 Men   Never went No Psy    Not Student No degree  40091
#>  3     3 Men   Ineligible No Psy    Not Student No degree  35106
#>  4     4 Men   Eligible   No Psy    Not Student No degree  16181
#>  5     5 Men   Eligible   Tried Psy Student     No degree  12791
#>  6     6 Women Never went No Psy    Not Student No degree  56452
#>  7     7 Women Ineligible No Psy    Not Student No degree   3042
#>  8     8 Women Never went No Psy    Student     No degree   2849
#>  9     9 Women Never went No Psy    Student     Has degree  1950
#> 10    10 Women Eligible   No Psy    Not Student No degree    944

ggplot(flow_data, aes(axis1 = station1, axis2 = station2, axis3 = station3, axis4 = station4, y=size)) +
  geom_alluvium(aes(fill = group)) +
  geom_stratum() +
  geom_text(stat = "stratum", infer.label = TRUE)

Created on 2019-12-06 by the reprex package (v0.3.0)
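
If the plotting function derives flow width from the number of rows (an assumption made here, not something the issue confirms), one workaround is to repeat each row according to its size. The sketch below uses a made-up miniature of the table above with small weights.

```r
# Made-up miniature of the wide table; 'size' holds the flow weight per row.
flows <- data.frame(
  group    = c("Men", "Men", "Women"),
  station1 = c("Eligible", "Never went", "Eligible"),
  station2 = c("Tried Psy", "No Psy", "No Psy"),
  size     = c(3, 2, 1),
  stringsAsFactors = FALSE
)

# Repeat each row 'size' times so that row counts stand in for weights.
keep <- c("group", "station1", "station2")
expanded <- flows[rep(seq_len(nrow(flows)), times = flows$size), keep]
rownames(expanded) <- NULL

nrow(expanded)  # 6, the sum of 'size'
```

For sizes in the tens of thousands, dividing by a common factor and rounding first keeps the row count manageable, at the cost of some precision.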

data_key dataframe returned by plot functions has empty levels

Forthcoming release of ggplot2 and easyalluvial

We are contacting you because you are the maintainer of easyalluvial, which imports ggplot2 and uses vdiffr to manage visual test cases. The upcoming release of ggplot2 includes several improvements to plot rendering, including the ability to specify lineend and linejoin in geom_rect() and geom_tile(), and improved rendering of text. These improvements will result in subtle changes to your vdiffr doppelgangers when the new version is released.

Because vdiffr test cases do not run on CRAN by default, your CRAN checks will still pass. However, we suggest updating your visual test cases with the new version of ggplot2 as soon as possible to avoid confusion. You can install the development version of ggplot2 using remotes::install_github("tidyverse/ggplot2").

If you have any questions, let me know!

Current version not on CRAN (9/11/2023)

It looks like the current version was removed from CRAN. I really appreciate this package and along with visualizing complex data sets, I also use it to teach about design matrices in linear models in my classes, so keeping it easy to get for student use would be great. Thanks!

faulty plots without error message for ggplot2 < 3.1.0 and ggalluvial < 0.9.1

The aesthetic for the y-axis input switched from weight to y.
The minimum versions will be added to the dependencies in the next version.
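
Until the minimum versions are declared as dependencies, the installed versions can be checked manually. The sketch below uses `utils::packageVersion()`. `has_min_version()` is a hypothetical helper, and the thresholds are the ones named in the issue title.

```r
# Hypothetical helper: TRUE if 'pkg' is installed at version 'min_version' or later.
has_min_version <- function(pkg, min_version) {
  requireNamespace(pkg, quietly = TRUE) &&
    utils::packageVersion(pkg) >= min_version
}

# Minimum versions taken from the issue title.
required <- c(ggplot2 = "3.1.0", ggalluvial = "0.9.1")

status <- vapply(
  names(required),
  function(p) has_min_version(p, required[[p]]),
  logical(1)
)
print(status)
```

A `FALSE` entry means the package is either missing or older than the threshold.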

manip_bin_numerics() optionally return range or median/mean as bin label
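
As a rough illustration of what median-based bin labels could look like, the base-R sketch below bins a vector with `cut()` and labels each bin by the median of its members. The data are made up, and this is not how `manip_bin_numerics()` works internally.

```r
# Made-up values forming three clear clusters.
x <- c(1, 2, 3, 10, 11, 12, 20, 21, 22)

# Equal-width bins, then the median of the values falling into each bin.
bins <- cut(x, breaks = 3)
bin_medians <- tapply(x, bins, median)

# Relabel every observation with the median of its bin.
labelled <- factor(bin_medians[as.character(bins)])

print(bin_medians)
print(labelled)
```

Swapping `median` for `mean`, or for a function that formats `range()`, would give the other label variants mentioned in the title.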

reverse dependency error from ggalluvial 0.12.5

Hello,

I'm preparing to release {ggalluvial} version 0.12.5, a patch that corrects a previous patch to fix a bug resulting from the new {dplyr} version 1.1.0. I got the email below from the CRAN team suggesting an issue with the reverse dependency {easyalluvial}.

I tried installing {easyalluvial} on multiple machines with different versions of {ggalluvial} and encountered no such error myself, though I only have Macs at my disposal whereas the issue seems to occur on Windows, and in any case the error message provides no details. So I wonder if you have any insight into it. {ggalluvial} 0.12.4 is currently on CRAN, while 0.12.5 is the current development version.

No worries if not; I've asked the maintainers for guidance on next steps. Thanks in advance!

Cory

Dear maintainer,
 
package ggalluvial_0.12.5.tar.gz has been auto-processed. The auto-check found problems when checking the first order strong reverse dependencies.
Please reply-all and explain: Is this expected or do you need to fix anything in your package? If expected, have all maintainers of affected packages been informed well in advance? Are there false positives in our results?
 
*** Changes to worse in reverse dependencies ***
Debian: https://win-builder.r-project.org/incoming_pretest/ggalluvial_0.12.5_20230213_172426/reverseDependencies/summary.txt

Log dir: https://win-builder.r-project.org/incoming_pretest/ggalluvial_0.12.5_20230213_172426/
The files will be removed after roughly 7 days.

Pretests:
Windows: https://win-builder.r-project.org/incoming_pretest/ggalluvial_0.12.5_20230213_172426/Windows/00check.log
Debian: https://win-builder.r-project.org/incoming_pretest/ggalluvial_0.12.5_20230213_172426/Debian/00check.log

Last published version on CRAN:

CRAN Web: https://cran.r-project.org/package=ggalluvial
 
Best regards,
CRAN teams' auto-check service


Package check result: OK

Changes to worse in reverse depends:

Package: easyalluvial
Check: whether package can be installed
New result: ERROR
  Installation failed.

install.packages("titanic") # only data in package
data("titanic_train",package="titanic")
library(tidyverse)
str(titanic_train)

d <- titanic_train %>% as_tibble %>%
  mutate(title=str_replace_all(string = Name, # extract title as general feature
                               pattern = "^[[:alpha:][:space:]'-]+,\\s+(the\\s)?(\\w+)\\..+",
                               replacement = "\\2")) %>%
  mutate(title=str_trim(title),
         title=case_when(title %in% c('Mlle','Ms')~'Miss', # normalize some titles
                         title=='Mme'~ 'Mrs',
                         title %in% c('Capt','Don','Major','Sir','Jonkheer', 'Col')~'Sir',
                         title %in% c('Dona', 'Lady', 'Countess')~'Lady',
                         TRUE~title)) %>%
  mutate(title=as_factor(title),
         Survived=factor(Survived,levels = c(0,1),labels=c("no","yes")),
         Sex=as_factor(Sex),
         Pclass=factor(Pclass,ordered = T)) %>%
  group_by(title) %>% # impute Age by median in current title
  mutate(Age=replace_na(Age,replace = median(Age,na.rm = T))) %>% ungroup
table(d$title,d$Sex) # inspect the title distribution
caret::nearZeroVar(x = d,saveMetrics = T) # search for and drop some useless features (PassengerId, Name, Ticket)
d <- d %>% select_at(vars(-c(PassengerId,Name,Ticket)))
d %>% summarise_all(~sum(is.na(.))) # check for NAs

library(ranger)
m <- ranger(formula = Survived~.,data = d,mtry = 6,min.node.size = 5, num.trees = 600,
            importance = "permutation")

library(easyalluvial)
imp <- importance(m) %>% as.data.frame %>% tidy_imp(imp = .,df=d)
alluvial_wide(data = select(d,Survived,title,Pclass,Sex,Fare),fill_by = "first_variable") # OK, this works, but I want to describe the model, not the data

gds <- get_data_space(df = d,imp,degree = 4) # Error in Summary.factor(c(1L, 2L, 3L, 2L, 1L, 1L, 1L, 4L, 2L, 2L, 3L,  : ‘max’ not meaningful for factors

# OK, don't give up; try caret
library(caret)
trc <- trainControl(method = "none")
m <- train(Survived~.,data = d,method="rf",trControl=trc,importance=T)
alluvial_model_response_caret(train = m,degree = 4,bins=5,stratum_label_size = 2.8) # Error in tidy_imp(imp, df) : not all listed important variables found in input data

Interactive alluvial plot

Dear erblast,

Thanks for this fine package!

It would be great to have an interactive plot with these features:

Highlight a flow when mouse pointer is placed on it
Display flow info and some summary stats when mouse pointer is placed on it
Remove or temporarily disable columns

Some javascript and D3 magic should help. Let me know if this aligns with your aims.

caret based graphs look different for caret 6.0-84

topepo/caret#1048

recipe breaking changes

I'm doing reverse dependencies for recipes and saw an error for easyalluvial:

   > 
   > data = as_tibble(mtcars)
   > categoricals = c('cyl', 'vs', 'am', 'gear', 'carb')
   > numericals = c('mpg', 'cyl', 'disp', 'hp', 'drat', 'wt', 'qsec')
   > max_variables = 5
   > 
   > data = data %>%
   +   mutate_at( vars(categoricals), as.factor )
   > 
   > 
   > alluvial_wide( data = data
   +                 , max_variables = max_variables
   +                 , fill_by = 'first_variable' )
   Error: No role currently exists for column(s): 'easyalluvialid'. Please use `update_role()` instead.
   Execution halted

We changed the role system in 0.1.4, which broke this (and updated it a little more in 0.1.5).

Let me know if you need help. You can test against the current master in GH for recipes.

things to refactor

leaner functions with reduced but precise roxygen documentation
use rlang::.data for non-standard evaluation https://resources.rstudio.com/rstudio-developed/tidyeval-2
use stopifnot() to check variable types in all functions
use the @inheritParams roxygen tag
in a test file, make smaller test_that() blocks and run code shared by many tests outside of them
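
The `stopifnot()` item could look roughly like the sketch below. `check_input()` and its arguments are hypothetical and only illustrate the pattern; they are not taken from the package.

```r
# Hypothetical input check illustrating the stopifnot() pattern.
check_input <- function(data, max_variables = 20) {
  stopifnot(
    is.data.frame(data),
    is.numeric(max_variables),
    length(max_variables) == 1,
    max_variables >= 1
  )
  invisible(TRUE)
}

# Valid input passes silently.
ok <- check_input(mtcars, max_variables = 5)

# Invalid input raises an error, captured here so the script keeps running.
failed <- inherits(try(check_input("not a data frame"), silent = TRUE), "try-error")
```

Putting the checks at the top of each function keeps failures close to their cause, with error messages that name the violated condition.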

jamovi integration

Dear @erblast
Thank you for the package.
I am collecting the code that I use as jamovi modules (https://github.com/sbalci/ClinicoPathJamoviModule).

I have prepared a function using the easyalluvial package.

I think it will be a good method to visualize both data and models. I will try to add model response codes as well.

Please let me know your comments.
