biometryhub / biometryassist

A package to aid in teaching experimental design and analysis through easy access and documentation of helper functions. Renaming of previous BiometryTraining package.

Home Page: https://biometryhub.github.io/biometryassist

License: Other

R 100.00%

rstats rstats-package r package experimental-design teaching biometry

biometryassist's Issues

install_asreml() doesn't work on ARM Macs

Need to check for arm64 processor
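
A minimal sketch of such a check, using only base R. The helper name `is_arm_mac()` is hypothetical and not part of biometryassist; it assumes that `Sys.info()` reports `"Darwin"` as the system name on macOS and `"arm64"` as the machine on Apple Silicon, as in the output shown in this issue.

```r
# Hypothetical helper (not part of biometryassist): TRUE on an arm64 Mac.
# `info` defaults to Sys.info() but can be supplied directly for testing.
is_arm_mac <- function(info = Sys.info()) {
  identical(info[["sysname"]], "Darwin") &&
    identical(info[["machine"]], "arm64")
}

# On the current machine:
is_arm_mac()
```

install_asreml() could then branch on this value to choose the appropriate download.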

Sys.info()["machine"]
"arm64"

Template file for analyses

It might be helpful for people if there was a way they could create template analysis files, where they just "fill in the blanks". See usethis::use_test as an example.

Speed up resplot

Apparently resplot is quite slow compared to some other plots. Can we improve this?

Design plots don't save well with lots of treatments

Investigate automatic sizing

Extend variogram to work with other models

Especially Sommer but also lm/aov and lme4

Does not install on R 4.0

Does not install because lmeInfo (a dependency of predictmeans) requires R >= 4.1.0.
Can we investigate why, or perhaps switch away from predictmeans?

Move predict call for asreml inside multiple comparisons

We can just pass arguments to predict.asreml() via ...

multiple_comparisons fails if treatments are not factors

If treatments are not factors, the multiple_comparisons() function will fail, while the rest of an aov() analysis will work fine.
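
One possible pre-check is sketched below. The helper `as_factor_terms()` is hypothetical; it assumes the classify string separates terms with `:` and that converting with a warning is acceptable behaviour, which is a design choice rather than anything the package currently does.

```r
# Hypothetical pre-check: convert the columns named in `classify` to factors,
# warning about each column that had to be converted.
as_factor_terms <- function(data, classify) {
  vars <- unlist(strsplit(classify, ":", fixed = TRUE))
  stopifnot(all(vars %in% names(data)))
  for (v in vars) {
    if (!is.factor(data[[v]])) {
      warning("`", v, "` is not a factor; converting it with factor().",
              call. = FALSE)
      data[[v]] <- factor(data[[v]])
    }
  }
  data
}

# Illustrative data: `trt` is read in as character, not factor.
dat <- data.frame(
  trt = rep(c("A", "B", "C"), each = 4),
  y   = c(10.1, 9.8, 10.4, 10.0, 11.2, 11.5, 10.9, 11.1, 9.2, 9.5, 9.0, 9.4)
)
dat <- suppressWarnings(as_factor_terms(dat, "trt"))
dat.aov <- aov(y ~ trt, data = dat)
```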

Can we make this lighter?

This package installs 181 other packages on installation. Can we reduce that a little?

Enable output of table of p-values from multiple_comparisons

To get a p-value for every comparison between pairs.

Consider a way to flip the interaction order in multiple_comparisons

At present there is a default order for interactions. We could introduce an argument or option to swap the order, or perhaps to specify the order of factors.
Perhaps this is better left as DIY ggplot code?

Enable power transformations in multiple_comparisons

More general power transformations could be provided, if determined to be necessary (e.g. via Box-Cox).
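
For reference, the Box-Cox family and its back-transformation can be sketched in base R as follows. The function names are hypothetical and this is not how multiple_comparisons() currently handles transformations; the value of lambda would need to be chosen separately (for example with MASS::boxcox()).

```r
# Box-Cox transform for strictly positive y:
#   (y^lambda - 1) / lambda   when lambda != 0
#   log(y)                    when lambda == 0
box_cox <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Back-transform predictions to the original scale.
inv_box_cox <- function(z, lambda) {
  if (lambda == 0) exp(z) else (lambda * z + 1)^(1 / lambda)
}

y <- c(0.5, 1, 2, 10)
z <- box_cox(y, lambda = 0.5)
inv_box_cox(z, lambda = 0.5)  # recovers y up to floating point error
```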

install_asreml() errors on macOS Monterey

It does not install properly because it requires an additional folder to be created.

BiometryTraining

Add notice about rename of package
Merge BiometryTraining dev branch to Master
Build final binary versions

biometryassist

Add W2 code as test cases

We need to automatically test against W2 code, to check that updates don't introduce bugs or errors affecting the code.

Remove R.param and G.param from call on resplot in asreml models

Unless explicitly specified, they just add confusion.

Enable more powerful treatment specification in `design()`

Enable entering factorial treatments, with labels and levels, straight into treatments =
For example:

design("crossed:crd", treatments = list(PD = c(200, 260, 330), Irr = c(20, 100, 180)), reps = 4, nrows = 6, ncols = 6)

Also, allow subplot treatments to be specified in the same way in split plot if desired.
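
A named list like the one above can be expanded into the full set of factorial combinations with base R. This is only a sketch of the idea; the label format is an assumption, and nothing here reflects how design() is implemented.

```r
# Expand a named list of factor levels into all factorial combinations.
treatments <- list(PD = c(200, 260, 330), Irr = c(20, 100, 180))
combos <- expand.grid(treatments)

# Build one label per combination, e.g. "PD=200, Irr=20" (format is illustrative).
labels <- unname(apply(combos, 1, function(r) {
  paste(names(combos), r, sep = "=", collapse = ", ")
}))

nrow(combos)  # 3 x 3 = 9 combinations; with reps = 4 this fills a 6 x 6 layout
```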

label rotation doesn't work

autoplot(pred.out, rotation = 90, label_rotation = 45)

Should rotate the axes by 90 degrees and the labels by 45 degrees, but doesn't.

Enable new colour blind palettes

viridisLite has new palettes: rocket, turbo and mako

Switch to using rlang

This can replace ellipsis and we can also:

check for interactivity to enable testing of the package startup message
check required packages are installed
check arguments missing or required
Tidy evaluation of arguments?

Create a heatmap function

Useful for plotting response against spatial variables

Error in `levels<-`(`*tmp*`, value = as.character(levels)) : factor level [20] is duplicated

Hi, I have a weird issue that I'm not sure how to solve. I have a .csv data file with columns 'name', 'block', 'year', 'rep', and maturity date ('mat'). The same file has a column for larvae ('la') and another for stem breakage ('sb'), which are the response data. Everything works with the larvae response data, but I get an error when running the stem breakage data.

I read that this can sometimes be corrected by changing how the data is read in, but I'm not sure where to start, since the same file is used for the data that works and for the data that doesn't.

Code:
e<-read.csv2("720_raw_blocking.csv",header=TRUE,sep=",")

library(data.table)
library(ggplot2)
library(jsonlite)
library(biometryassist)
library(asreml)

e$year <- as.factor(e$year)
e$name <- as.factor(e$name)
e$Block <- as.factor(e$Block)
e$rep <- as.factor(e$rep)
e$mat <- as.numeric(e$mat)
e$la_pro <- as.numeric(e$la_pro)
e$sb_pro <- as.numeric(e$sb_pro)

current.asr <- asreml (fixed = la_pro ~ name + year + mat, random = ~Block + rep + Block:rep,
data=e,
family=asr_binomial(link = "logit", dispersion = 1, total = 5),
na.action = na.method(x="include")
)

## or with the response variable sb_pro if running for the sb data.

wald(current.asr)

pred.out <-multiple_comparisons(
current.asr,
classify = "name",
sig = 0.05,
int.type = "ci",
trans = NA,
offset = NA,
decimals = 2,
descending = TRUE,
plot = FALSE,
label_height = 0.1,
rotation = 0,
save = TRUE,
savename = "predicted_values_la2", #savename changes pending on the analysis
)
pred.out

## Output when the biometryassist code is run ('pred.out <- multiple_comparisons(...)') with the sb_pro response data (there is no error when the la_pro response data is used):

Binomial; Logit Mu=P=1/(1+exp(-XB)); V=Mu(1-Mu)/N
Note: The LogLik value is unsuitable for comparing GLM models

Deviance from GLM fit: 1812.25
Variance heterogenity factor (Deviance/df): 0.79
(assuming 2297 degrees of freedom)
Binomial; Logit Mu=P=1/(1+exp(-XB)); V=Mu(1-Mu)/N
Note: The LogLik value is unsuitable for comparing GLM models

Deviance from GLM fit: 1812.25
Variance heterogenity factor (Deviance/df): 0.79
(assuming 2297 degrees of freedom)
Calculating denominator DF
Binomial; Logit Mu=P=1/(1+exp(-XB)); V=Mu(1-Mu)/N
Note: The LogLik value is unsuitable for comparing GLM models

Deviance from GLM fit: 1812.25
Variance heterogenity factor (Deviance/df): 0.79
(assuming 2297 degrees of freedom)
Calculating denominator DF

Error in `levels<-`(`*tmp*`, value = as.character(levels)) :
factor level [20] is duplicated

##More info on the raw data.

'data.frame': 6144 obs. of 6 variables:
$ year : Factor w/ 2 levels "2020","2021": 1 1 1 1 1 1 1 1 1 1 ...
$ rep : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ Block : Factor w/ 5 levels "1","2","3","4",..: 2 2 2 3 3 3 3 3 3 3 ...
$ name : Factor w/ 724 levels "FC004002B","FC029333",..: 1 2 3 4 5 6 7 8 9 10 ...
$ mat : num 104 102 104 98 100 98 108 108 107 105 ...
$ la_pro : num 0.6 0.4 0.8 1 0.2 1 0.4 0.4 0.4 0.8 ...
$ sb_pro : num 0.2 0.0667 0.36 0.12 0.04 ...

Thank you in advance for looking this over!

Change error for missing offset to a warning

Set default value of offset when missing to 0 and warn.
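
A minimal sketch of that behaviour, using a hypothetical helper `resolve_offset()` and assuming that both a missing and an `NA` offset should fall back to 0:

```r
# Hypothetical helper: fall back to offset = 0 with a warning instead of an error.
resolve_offset <- function(offset = NULL) {
  if (is.null(offset) || is.na(offset)) {
    warning("No `offset` supplied; using offset = 0.", call. = FALSE)
    return(0)
  }
  offset
}

resolve_offset(1)  # a supplied offset is returned unchanged, with no warning
```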

Perhaps variogram colours go from green to red?

Green being low values and red being high? Although that gives the impression that high is "bad" which is not necessarily the case.

autoplot documentation not generated properly

The autoplot function doesn't show up in the R help pages (the pkgdown website is fine), and it conflicts with ggplot2 when searching.

Aliased levels in multiple_comparisons not printed properly

Aliased levels are added as a single string rather than a vector, which makes printing a little difficult.

Check if classify terms in multiple_comparisons are provided in wrong order

Currently gives an unhelpful error such as:

Error in names(diffs) <- m : 
  'names' attribute [45] must be the same length as the vector [0]
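
One way to catch this earlier is to compare the classify terms against the model's term labels while ignoring order. The sketch below uses a hypothetical helper `match_classify()` and only covers models that support `terms()` (such as aov and lm); an asreml model would need a different way of extracting its terms.

```r
# Hypothetical check: find the model term that matches `classify`, ignoring
# the order of the factors within an interaction.
match_classify <- function(model, classify) {
  labels <- attr(terms(model), "term.labels")
  split_sorted <- function(x) sort(unlist(strsplit(x, ":", fixed = TRUE)))
  wanted <- split_sorted(classify)
  hit <- vapply(labels, function(l) identical(split_sorted(l), wanted),
                logical(1))
  if (!any(hit)) {
    stop("`", classify, "` is not a term in the model.", call. = FALSE)
  }
  found <- labels[hit][1]
  if (found != classify) {
    warning("`", classify, "` is specified as `", found,
            "` in the model; using the model's order.", call. = FALSE)
  }
  found
}

# warpbreaks is a built-in dataset; the model term is "wool:tension".
fit <- aov(breaks ~ wool * tension, data = warpbreaks)
suppressWarnings(match_classify(fit, "tension:wool"))
```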

Vignette for installing asreml

It's a common question, so let's write a vignette to explain the steps.
I wonder if we can have the asreml key behind a login website like the guest wifi password?

Deprecate `resplt` in favour of `resplot`

This change was made a while ago, so it is time to start removing resplt().

Refactor install_asreml()

The install_asreml() function could be refactored into a download component and an install component, possibly even refactoring some of the other functions within it as well.
This may make testing easier and enable more robust checking across different operating systems.

Add exact methods

Check lmPerm for inspiration

Output coefficient of variation

A couple of people have suggested that coefficient of variation is sometimes requested, and they would like to be able to get it.
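
For a fixed-effects model, the usual definition is CV = 100 x (residual standard deviation) / (grand mean of the response). A sketch for aov or lm fits, with a hypothetical helper name, might look like this; mixed models would need a different residual variance estimate.

```r
# Hypothetical helper: coefficient of variation (%) for an aov/lm fit,
# computed as 100 * residual standard deviation / grand mean of the response.
cv_percent <- function(model) {
  y <- model.response(model.frame(model))
  100 * sigma(model) / mean(y)
}

# warpbreaks is a built-in dataset.
fit <- aov(breaks ~ tension, data = warpbreaks)
cv_percent(fit)
```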

logl.test works if random terms are given in resid

For example:

dat.asr <- asreml(response ~ Genotype, random = ~ Row + Block, 
                  residual = ~ id(Column):id(Row), data = dat)

logl.test(dat.asr, rand.terms = NULL, resid.terms = "Row")

should fail or warn, because Row is already id(Row) in resid.

Checking if model needs the present argument (sed)

Hey Sam, as suggested:

pred.out2 <- multiple_comparisons(
  model2,
  classify = 'liming_treatment:rate:year',
  present = c('year', 'control', 'liming_treatment', 'rate')
)

works.

But:

pred.out2 <- multiple_comparisons(
  model2,
  classify = 'liming_treatment:rate:year',
  present = c('year', 'control', 'liming_treatment', 'rate'),
  sed = TRUE
)

doesn't, and errors with this:

Error in asreml::predict.asreml(model.obj, classify = classify, sed = TRUE,  : 
  formal argument "sed" matched by multiple actual arguments
Error:
! Arguments in `...` must be used.
✖ Problematic arguments:
• present = c("year", "control", "liming_treatment", "rate")
• sed = TRUE
Run `rlang::last_error()` to see where the error occurred.

Looks like it's reading in the sed argument twice, so perhaps a check to make sure that arguments passed with ... are needed or not?
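
A sketch of such a check: drop any arguments from `...` that the wrapper already sets itself before forwarding the rest. `fake_predict()` below is a stand-in written for this example and is not the asreml API; the wrapper name and the list of reserved arguments are also assumptions.

```r
# Stand-in for a predict method, used only for illustration.
fake_predict <- function(model, classify, sed = FALSE, ...) {
  list(classify = classify, sed = sed, extra = list(...))
}

# Hypothetical wrapper: remove arguments it sets itself from `...`,
# so they are never matched twice.
call_predict <- function(model, classify, ...) {
  dots <- list(...)
  reserved <- c("classify", "sed")
  nms <- names(dots)
  if (!is.null(nms)) {
    dropped <- intersect(nms, reserved)
    if (length(dropped) > 0) {
      warning("Ignoring argument(s) set internally: ",
              paste(dropped, collapse = ", "), call. = FALSE)
    }
    dots <- dots[!(nms %in% reserved)]
  }
  do.call(fake_predict, c(list(model, classify = classify, sed = TRUE), dots))
}

res <- suppressWarnings(
  call_predict("model", "A:B", present = c("A", "B"), sed = TRUE)
)
names(res$extra)  # only `present` is forwarded through ...
```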

Issues with next version of ggplot2

We are preparing the next release of ggplot2, and our reverse dependency checks show that your package is failing with the new version. Looking into it, we see that this is due to changes in the warnings and errors thrown by ggplot2 that you test for in your package.

You can install the release candidate of ggplot2 using devtools::install_github('tidyverse/[email protected]') to test this out.

We plan to submit ggplot2 by the end of October and hope you can have a fix ready before then.

Kind regards
Thomas

Only show start up message once a day

Otherwise it will get annoying. Need to think about how to do this without adding loads more dependencies though
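
One dependency-free approach is to record the date of the last message in a small file. The sketch below uses a hypothetical helper and a file in tempdir() for illustration; a real implementation would need a location that persists between sessions (tools::R_user_dir() is one option on R >= 4.0).

```r
# Hypothetical helper: TRUE if no message has been shown yet today.
# Records today's date in `stamp_file` whenever it returns TRUE.
should_show_message <- function(stamp_file, today = Sys.Date()) {
  last <- if (file.exists(stamp_file)) {
    readLines(stamp_file, n = 1, warn = FALSE)
  } else {
    ""
  }
  if (identical(last, as.character(today))) {
    return(FALSE)
  }
  writeLines(as.character(today), stamp_file)
  TRUE
}

# Illustrative location only; tempdir() does not persist between sessions.
stamp <- file.path(tempdir(), "biometryassist_last_message")
if (should_show_message(stamp)) message("Welcome to biometryassist")
```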

Explore alternatives to Agricolae

Agricolae has very poor documentation which could confuse users - can we improve this somehow?

variogram bug - doesn't handle NA's

Shiny app for graphical design

We could wrap the design functions in shiny to enable people to use them graphically.

Check if variogram is missing ar1 components in asreml model

Currently, when they are not provided, it gives an unhelpful error:

Error in 0:(ncols - 1) : result would be too long a vector

Write some Vignettes

Slightly more in-depth examples than what's shown in the function examples.

Some ideas:

Generating a complex factorial design (e.g. crossed but with control treatment(s) that aren't crossed)
Analysis of a complex factorial design as above
Interpreting residual plots
Interpreting variograms
updates in the new version

Is there an NA action?

Hi, I have NAs in my data and am getting an error that 'NA is not allowed'. Is there a way to specify 'include' or 'omit'?

Here is my code if at all helpful:
asreml_Iblock_di <- asreml (fixed = cbind (la_pro, sb_pro) ~ name + year + mat, random = ~rep + Block + rep:Block,
residual = ~id(units):us(trait),
data=e,
family=asr_binomial(link = "logit", dispersion = 1, total = 5),
na.action = na.method(x="include")
)
wald(asreml_Iblock_di, denDF = "default", ssType = "conditional")
pred.asr <- predict(asreml_Iblock_di, classify = "name", sed = TRUE)
pred.out <- mct.out(model.obj = asreml_Iblock_di, pred.obj = pred.asr,
classify = "name", order = "descending", decimals = 5, save = TRUE, savename = "predicted_values_05_23.csv")

Mct still not ordering things correctly for "default" ordering.

E.g.

library(tidyverse)

dat <- read.csv("example1.csv")
dat <- dat %>% mutate(across(c(1:3), factor))
dat.aov <- aov(RL ~ trt, data = dat)            # fitting the model
pred.out <- mct.out(model.obj = dat.aov, classify = "trt")

example1.csv

Better defensive programming to check column names in multiple_comparisons

If a column name is provided that is too similar to the output columns, multiple_comparisons() will change the names of all the columns between the first and the maximum matching column to factors.

See example code below:

# Any of these names will break the function
c("predicted.value", "std.error", "Df",
  "groups", "PredictedValue", "ApproxSE", "ci", "low", "up")
library(biometryassist)
dat <- design("crd", LETTERS[1:4], 4, nrow = 4, ncols = 4)$design
names(dat)[5] <- "groups"
dat$response <- rnorm(16, 10)
dat.aov <- aov(response~groups, data = dat)
multiple_comparisons(dat.aov, classify = "groups", trans = "log", offset = 0)
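
A defensive check along these lines could be run before any renaming happens. The helper name is hypothetical, and the reserved names are the ones listed in this issue.

```r
# Names listed in this issue as clashing with the output columns.
reserved_names <- c("predicted.value", "std.error", "Df",
                    "groups", "PredictedValue", "ApproxSE", "ci", "low", "up")

# Hypothetical check: stop early with a clear message if any column name clashes.
check_reserved <- function(data_names) {
  clash <- intersect(data_names, reserved_names)
  if (length(clash) > 0) {
    stop("Column name(s) clash with output columns: ",
         paste(clash, collapse = ", "),
         ". Please rename them before calling multiple_comparisons().",
         call. = FALSE)
  }
  invisible(TRUE)
}

check_reserved(c("trt", "response"))  # passes silently
```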
