erblast / easyalluvial

create alluvial plots with a single line of code

Home Page: https://erblast.github.io/easyalluvial/

R 99.43% Dockerfile 0.57%

easyalluvial's Issues

not compatible with dplyr 8.0

require(tidyverse)

# causes memory overflow with dplyr dev version (8GB RAM)
df = ggplot2::diamonds %>%
  mutate_if( is.numeric, cut, 5) %>%
  group_by_all()

sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] forcats_0.3.0     stringr_1.3.1     dplyr_0.7.99.9000 purrr_0.2.5       readr_1.3.1       tidyr_0.8.2       tibble_1.4.2      ggplot2_3.1.0    
[9] tidyverse_1.2.1  

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0       cellranger_1.1.0 pillar_1.3.1     compiler_3.5.2   plyr_1.8.4       tools_3.5.2      jsonlite_1.6     lubridate_1.7.4 
 [9] gtable_0.2.0     nlme_3.1-137     lattice_0.20-38  pkgconfig_2.0.2  rlang_0.3.0.1    cli_1.0.1        rstudioapi_0.8   yaml_2.2.0      
[17] haven_2.0.0      withr_2.1.2      xml2_1.2.0       httr_1.4.0       generics_0.0.2   hms_0.4.2        grid_3.5.2       tidyselect_0.2.5
[25] glue_1.3.0       R6_2.3.0         readxl_1.2.0     modelr_0.1.2     magrittr_1.5     backports_1.1.3  scales_1.0.0     rvest_0.3.2     
[33] assertthat_0.2.0 colorspace_1.3-2 stringi_1.2.4    lazyeval_0.2.1   munsell_0.5.0    broom_0.5.1      crayon_1.3.4 

alluvial_wide() does not work for character columns only

mtcars2 %>%
  select_if(is.factor) %>%
  alluvial_wide()

mtcars2 %>%
  select_if(is.factor) %>%
  mutate_all(as.character) %>%
  alluvial_wide()

Error: This tidyselect interface doesn't support predicates yet.
ℹ Contact the package author and suggest using eval_select().
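
One possible workaround, assuming the failure is specific to character columns (as the two calls above suggest), is to convert character columns to factors before plotting. The sketch below uses only base R; `chars_to_factors()` is a hypothetical helper, not part of easyalluvial, and the small data frame is a stand-in for the reporter's data.

```r
# Hypothetical helper: convert every character column of a data frame to a factor.
chars_to_factors <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], factor)
  df
}

# Small stand-in data set that contains character columns only
d <- data.frame(
  cyl = c("4", "6", "8", "4"),
  am  = c("manual", "auto", "auto", "manual"),
  stringsAsFactors = FALSE
)

d_fct <- chars_to_factors(d)
str(d_fct)

# The converted data could then be passed on, e.g.:
# easyalluvial::alluvial_wide(d_fct)
```

This only sidesteps the error on the caller's side; it does not address the underlying tidyselect call inside the package.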

easyalluvial and parcats are not compatible with the latest R?

Hi, there,

Thanks a lot for these two great packages, the outputs are awesome! However, when I recently needed to re-install the packages on a new machine, I encountered the following error:

> if (!require(easyalluvial)) {install.packages("easyalluvial",repos = "http://cran.us.r-project.org"); 
+         require(easyalluvial)}
Loading required package: easyalluvial
Warning in install.packages :
  package ‘easyalluvial’ is not available for this version of R

A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages

> if (!require(parcats)) {install.packages("parcats",repos = "http://cran.us.r-project.org"); 
+         require(parcats)}
Loading required package: parcats
Warning in install.packages :
  package ‘parcats’ is not available for this version of R

A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages

I am wondering whether these two packages have not been updated recently?

Best
Chuan-Peng
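
When a package is not available from CRAN for the running R version, one commonly used fallback is to install it from its source repository. The sketch below is only an illustration: `install_if_missing()` is a hypothetical helper, and whether the GitHub version of easyalluvial builds on a given R version is an assumption that is not confirmed in this thread.

```r
# Hypothetical helper: run `installer` only when `pkg` cannot be loaded.
install_if_missing <- function(pkg, installer) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    installer(pkg)
  }
  # TRUE when the package is available after the attempt
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so the installer is never called here
ok <- install_if_missing("stats", function(pkg) stop("installer should not run"))
print(ok)

# Assumed usage for this issue (needs network access and the remotes package):
# install_if_missing(
#   "easyalluvial",
#   function(pkg) remotes::install_github("erblast/easyalluvial")
# )
```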

external legends for strata

Hey,

easyalluvial is a great tool. I was wondering if it is planned to create an option for having the legend for the strata outside of the plot. My problem is that my strata labels are long and partly cover the plot. Or is there already a solution for this that I have missed?

Many thanks!

TODO for 0.2.0

  • set seed in other tests
  • marginal histograms
  • categoric response
  • keep labels
  • optimize pdp performance, estimate
  • add training predictions
  • marginal histograms for long
  • add importance plot
  • parameter to change label size
  • refactor check_imp()
  • #7

alluvial_wide with size from column

I have data in wide format, with another column that determines the size of the flow in each row. I'm able to plot it with ggalluvial, but I can't figure out how to map the size column to the geom flow in easyalluvial:

flow_data
#> # A tibble: 10 x 7
#>       id group station1   station2  station3    station4    size
#>    <int> <chr> <chr>      <chr>     <chr>       <chr>      <dbl>
#>  1     1 Men   Eligible   Tried Psy Student     Has degree 51229
#>  2     2 Men   Never went No Psy    Not Student No degree  40091
#>  3     3 Men   Ineligible No Psy    Not Student No degree  35106
#>  4     4 Men   Eligible   No Psy    Not Student No degree  16181
#>  5     5 Men   Eligible   Tried Psy Student     No degree  12791
#>  6     6 Women Never went No Psy    Not Student No degree  56452
#>  7     7 Women Ineligible No Psy    Not Student No degree   3042
#>  8     8 Women Never went No Psy    Student     No degree   2849
#>  9     9 Women Never went No Psy    Student     Has degree  1950
#> 10    10 Women Eligible   No Psy    Not Student No degree    944

library(ggplot2)
library(ggalluvial)

ggplot(flow_data, aes(axis1 = station1, axis2 = station2, axis3 = station3, axis4 = station4, y=size)) +
  geom_alluvium(aes(fill = group)) +
  geom_stratum() +
  geom_text(stat = "stratum", infer.label = TRUE)

Created on 2019-12-06 by the reprex package (v0.3.0)
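
A possible workaround, assuming that `alluvial_wide()` sizes each flow by the number of rows that share the same combination of values (this is not confirmed in the thread), is to repeat each row according to its size column before plotting. The sketch uses base R and a small stand-in data frame with hand-picked sizes; for sizes as large as those above, dividing by a common factor and rounding first would keep the expanded data manageable.

```r
# Stand-in for a few rows of the wide data shown above (sizes chosen by hand)
flow_small <- data.frame(
  group    = c("Men", "Men", "Women"),
  station1 = c("Eligible", "Never went", "Never went"),
  station2 = c("Tried Psy", "No Psy", "No Psy"),
  size     = c(5, 4, 6),
  stringsAsFactors = FALSE
)

# Repeat each row `size` times, then drop the size column
expanded <- flow_small[rep(seq_len(nrow(flow_small)), times = flow_small$size), ]
expanded$size <- NULL
rownames(expanded) <- NULL

nrow(expanded)  # 15 rows: 5 + 4 + 6

# The expanded data could then be plotted, e.g.:
# easyalluvial::alluvial_wide(expanded, fill_by = "first_variable")
```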

Forthcoming release of ggplot2 and easyalluvial

We are contacting you because you are the maintainer of easyalluvial, which imports ggplot2 and uses vdiffr to manage visual test cases. The upcoming release of ggplot2 includes several improvements to plot rendering, including the ability to specify lineend and linejoin in geom_rect() and geom_tile(), and improved rendering of text. These improvements will result in subtle changes to your vdiffr doppelgangers when the new version is released.

Because vdiffr test cases do not run on CRAN by default, your CRAN checks will still pass. However, we suggest updating your visual test cases with the new version of ggplot2 as soon as possible to avoid confusion. You can install the development version of ggplot2 using remotes::install_github("tidyverse/ggplot2").

If you have any questions, let me know!

Current version not on CRAN (9/11/2023)

It looks like the current version was removed from CRAN. I really appreciate this package and along with visualizing complex data sets, I also use it to teach about design matrices in linear models in my classes, so keeping it easy to get for student use would be great. Thanks!

reverse dependency error from ggalluvial 0.12.5

Hello,

I'm preparing to release {ggalluvial} version 0.12.5, a patch that corrects a previous patch to fix a bug resulting from the new {dplyr} version 1.1.0. I got the email below from the CRAN team suggesting an issue with the reverse dependency {easyalluvial}.

I tried installing {easyalluvial} on multiple machines with different versions of {ggalluvial} and encountered no such error myself, though I only have Macs at my disposal whereas the issue seems to occur on Windows, and in any case the error message provides no details. So I wonder if you have any insight into it. {ggalluvial} 0.12.4 is currently on CRAN, while 0.12.5 is the current development version.

No worries if not; I've asked the maintainers for guidance on next steps. Thanks in advance!

Cory

Dear maintainer,
 
package ggalluvial_0.12.5.tar.gz has been auto-processed. The auto-check found problems when checking the first order strong reverse dependencies.
Please reply-all and explain: Is this expected or do you need to fix anything in your package? If expected, have all maintainers of affected packages been informed well in advance? Are there false positives in our results?
 
*** Changes to worse in reverse dependencies ***
Debian: https://win-builder.r-project.org/incoming_pretest/ggalluvial_0.12.5_20230213_172426/reverseDependencies/summary.txt

Log dir: https://win-builder.r-project.org/incoming_pretest/ggalluvial_0.12.5_20230213_172426/
The files will be removed after roughly 7 days.

Pretests:
Windows: https://win-builder.r-project.org/incoming_pretest/ggalluvial_0.12.5_20230213_172426/Windows/00check.log
Debian: https://win-builder.r-project.org/incoming_pretest/ggalluvial_0.12.5_20230213_172426/Debian/00check.log

Last published version on CRAN:

CRAN Web: https://cran.r-project.org/package=ggalluvial
 
Best regards,
CRAN teams' auto-check service


Package check result: OK

Changes to worse in reverse depends:

Package: easyalluvial
Check: whether package can be installed
New result: ERROR
  Installation failed.

not work on titanic dataset

I was impressed by your blog post about model interpretation and tried to test this package on the popular “titanic” dataset, but all my attempts have failed.

install.packages("titanic") # only data in package
data("titanic_train",package="titanic")
library(tidyverse)
str(titanic_train)

d <- titanic_train %>% as_tibble %>%
  mutate(title=str_replace_all(string = Name, # extract title as general feature
                               pattern = "^[[:alpha:][:space:]'-]+,\\s+(the\\s)?(\\w+)\\..+",
                               replacement = "\\2")) %>%
  mutate(title=str_trim(title),
         title=case_when(title %in% c('Mlle','Ms')~'Miss', # normalize some titles
                         title=='Mme'~ 'Mrs',
                         title %in% c('Capt','Don','Major','Sir','Jonkheer', 'Col')~'Sir',
                         title %in% c('Dona', 'Lady', 'Countess')~'Lady',
                         TRUE~title)) %>%
  mutate(title=as_factor(title),
         Survived=factor(Survived,levels = c(0,1),labels=c("no","yes")),
         Sex=as_factor(Sex),
         Pclass=factor(Pclass,ordered = T)) %>%
  group_by(title) %>% # impute Age with the median within each title
  mutate(Age=replace_na(Age,replace = median(Age,na.rm = T))) %>% ungroup
table(d$title,d$Sex) # look at the title distribution
caret::nearZeroVar(x = d,saveMetrics = T) # find and drop some uninformative features (PassengerId,Name,Ticket)
d <- d %>% select_at(vars(-c(PassengerId,Name,Ticket)))
d %>% summarise_all(~sum(is.na(.))) # check for NAs

library(ranger)
m <- ranger(formula = Survived~.,data = d,mtry = 6,min.node.size = 5, num.trees = 600,
            importance = "permutation")

library(easyalluvial)
imp <- importance(m) %>% as.data.frame %>% tidy_imp(imp = .,df=d)
alluvial_wide(data = select(d,Survived,title,Pclass,Sex,Fare),fill_by = "first_variable") # this works, but I want to describe the model, not the data

gds <- get_data_space(df = d,imp,degree = 4) # Error in Summary.factor(c(1L, 2L, 3L, 2L, 1L, 1L, 1L, 4L, 2L, 2L, 3L,  : ‘max’ not meaningful for factors

# OK, don't give up and try caret
library(caret)
trc <- trainControl(method = "none")
m <- train(Survived~.,data = d,method="rf",trControl=trc,importance=T)
alluvial_model_response_caret(train = m,degree = 4,bins=5,stratum_label_size = 2.8) # Error in tidy_imp(imp, df) : not all listed important variables found in input data


Interactive alluvial plot

Dear erblast,

Thanks for this fine package!

It would be great to have an interactive plot with these features:

  1. Highlight a flow when mouse pointer is placed on it
  2. Display flow info and some summary stats when mouse pointer is placed on it
  3. Remove or temporarily disable columns

Some javascript and D3 magic should help. Let me know if this aligns with your aims.

recipe breaking changes

I'm doing reverse dependencies for recipes and saw an error for easyalluvial:

   > 
   > data = as_tibble(mtcars)
   > categoricals = c('cyl', 'vs', 'am', 'gear', 'carb')
   > numericals = c('mpg', 'cyl', 'disp', 'hp', 'drat', 'wt', 'qsec')
   > max_variables = 5
   > 
   > data = data %>%
   +   mutate_at( vars(categoricals), as.factor )
   > 
   > 
   > alluvial_wide( data = data
   +                 , max_variables = max_variables
   +                 , fill_by = 'first_variable' )
   Error: No role currently exists for column(s): 'easyalluvialid'. Please use `update_role()` instead.
   Execution halted

We changed the role system in 0.1.4, which broke this (and updated it a little more in 0.1.5).

Let me know if you need help. You can test against the current master in GH for recipes.

things to refactor

  • leaner functions with reduced but precise roxygen documentation
  • use rlang::.data for non-standard evaluation https://resources.rstudio.com/rstudio-developed/tidyeval-2
  • use stopifnot() to check var types of all functions
  • use @inheritParams roxygen tag
  • in a test file make smaller test_that() functions and execute code used by many tests outside of the test_that() functions
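
As an illustration of the stopifnot() item above, a minimal sketch of argument type checks at the top of a function might look like this. `check_plot_args()` and its parameters are hypothetical; they only borrow argument names seen elsewhere on this page and are not the package's actual code.

```r
# Hypothetical argument check; only the stopifnot() pattern matters here.
check_plot_args <- function(data, max_variables, fill_by) {
  stopifnot(
    is.data.frame(data),
    is.numeric(max_variables), length(max_variables) == 1,
    is.character(fill_by), length(fill_by) == 1
  )
  invisible(TRUE)
}

# Valid arguments pass silently
check_plot_args(mtcars, max_variables = 5, fill_by = "first_variable")

# A wrong type stops with an error; capture the message instead of aborting
res <- tryCatch(
  check_plot_args(mtcars, max_variables = "5", fill_by = "first_variable"),
  error = function(e) conditionMessage(e)
)
print(res)
```

stopifnot() reports the first failing expression in its message, which keeps the checks short while still pointing at the offending argument.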

jamovi integration

Dear @erblast
Thank you for the package.
I am collecting the codes that I use as jamovi modules. (https://github.com/sbalci/ClinicoPathJamoviModule)

I have prepared a function using easyalluvial package. See gif below:

[GIF: jamovi-ClinicoPath-easyalluvial]

I think it will be a good method to visualize both data and models. I will try to add model response codes as well.

Please let me know your comments.
