TRAjectory-based Differential Expression analysis for SEQuencing data

License: Other

tradeseq's Introduction

R package: tradeSeq

TRAjectory Differential Expression analysis for SEQuencing data

tradeSeq provides a flexible method for discovering genes that are differentially expressed along one or multiple lineages, using a variety of tests suited to answer many questions of interest.

Installation

To install the current release of tradeSeq from Bioconductor, run:

if(!requireNamespace("BiocManager", quietly = TRUE)) {
 install.packages("BiocManager") 
}
BiocManager::install("tradeSeq")

To install the development version from GitHub, run:

devtools::install_github("statOmics/tradeSeq")

The installation should only take a few seconds. The dependencies of the package are listed in the DESCRIPTION file of the package.

Changes

Major changes are reported in the NEWS file. Check it if you want to follow the latest developments.

Issues and bug reports

Please use https://github.com/statOmics/tradeSeq/issues to submit issues, bug reports, and comments.

Usage

Start with the vignette online.

Cheatsheet

You can also refer to this cheatsheet to understand a common workflow.

Contributing and requesting

A number of tests have been implemented in tradeSeq, but researchers may be interested in other hypotheses that the current implementations cannot address. We therefore welcome contributions on GitHub of novel tests based on the tradeSeq model. Similarly, you may request that the developers implement novel tests in tradeSeq, preferably by opening an issue on the GitHub repository. If we feel that the suggested test is widely applicable, we will implement it in tradeSeq.

tradeseq's Issues

Travis CI errors due to time limit

The build seems to exceed the time limit when running the tests.
I am not sure whether this is because the tests themselves take too long or because the upstream checks do.

https://travis-ci.com/github/statOmics/tradeSeq/builds/167128665

Add package tests

Write tests to see if the package still works as expected under new commits, using the testthat package.

Error parallelizing fitGAM

Hi, I'm running v1.1.16 in R3.6.1 in a CentOS environment (more info below). I get the following error running fitGAM in parallel mode:

sce <- fitGAM(counts = counts, sds = crv1, nknots = 15, verbose = TRUE, BPPARAM = BPPARAM, parallel = TRUE)
Adding 23989 jobs ...
Submitting 23989 jobs in 12 chunks using cluster functions 'Multicore' ...
Error in .reduceResultsList(ids, fun, ..., missing.val = missing.val,  :
  All jobs must be have been successfully computed
In addition: Warning messages:
1: In .findKnots(nknots, pseudotime, wSamp) :
  Impossible to place a knot at all endpoints.Increase the number of knots to avoid this issue.
2: In parallel::mccollect(jobs, wait = FALSE, timeout = timeout) :
  2 parallel jobs did not deliver results

The same command runs fine without parallel = TRUE but the cpu time is prohibitively long.
The error seems to come from batchtools but could be related to how tradeSeq prepares the jobs. Thanks in advance for looking into this!

Relevant upstream code:

register(BatchtoolsParam(workers=12))
BPPARAM <- BiocParallel::bpparam()
BPPARAM
class: BatchtoolsParam
  bpisup: FALSE; bpnworkers: 12; bptasks: 0; bpjobname: BPJOB
  bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
  bpRNGseed: NA; bptimeout: 2592000; bpprogressbar: FALSE
  bpexportglobals: TRUE
  bplogdir: NA
  bpresultdir: NA
  cluster type: multicore
  template: NA
  registryargs:
    file.dir: /gpfs/home/trebbiano/file67874bb6408
    work.dir: getwd()
    packages: character(0)
    namespaces: character(0)
    source: character(0)
    load: character(0)
    make.default: FALSE
  saveregistry: FALSE
  resources:

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.7 (Final)

Matrix products: default
BLAS:   /opt/applications/R/3.6.1/gnu/lib64/R/lib/libRblas.so
LAPACK: /opt/applications/R/3.6.1/gnu/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_US.utf8        LC_COLLATE=C
 [5] LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8
 [7] LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] BiocParallel_1.20.1 tradeSeq_1.1.16     bigmemory_4.5.36
[4] Biobase_2.46.0      BiocGenerics_0.32.0 slingshot_1.4.0
[7] princurve_2.1.4     Seurat_3.1.4

loaded via a namespace (and not attached):
  [1] reticulate_1.14             tidyselect_1.0.0
  [3] RSQLite_2.2.0               AnnotationDbi_1.48.0
  [5] htmlwidgets_1.5.1           grid_3.6.1
  [7] combinat_0.0-8              docopt_0.6.1
  [9] Rtsne_0.15                  RNeXML_2.4.3
 [11] base64url_1.4               munsell_0.5.0
 [13] codetools_0.2-16            mutoss_0.1-12
 [15] ica_1.0-2                   future_1.16.0
 [17] withr_2.1.2                 colorspace_1.4-1
 [19] fastICA_1.2-2               uuid_0.1-4
 [21] zinbwave_1.8.0              stats4_3.6.1
 [23] SingleCellExperiment_1.8.0  ROCR_1.0-7
 [25] gbRd_0.4-11                 listenv_0.8.0
 [27] NMF_0.22.0                  Rdpack_0.11-1
 [29] slam_0.1-47                 GenomeInfoDbData_1.2.2
 [31] mnormt_1.5-6                bit64_0.9-7
 [33] pheatmap_1.0.12             rhdf5_2.30.1
 [35] batchtools_0.9.12           vctrs_0.2.4
 [37] TH.data_1.0-10              R6_2.4.1
 [39] doParallel_1.0.15           GenomeInfoDb_1.22.0
 [41] rsvd_1.0.3                  VGAM_1.1-2
 [43] locfit_1.5-9.1              bitops_1.0-6
 [45] DelayedArray_0.12.2         assertthat_0.2.1
 [47] scales_1.1.0                multcomp_1.4-12
 [49] gtable_0.3.0                phylobase_0.8.10
 [51] npsurv_0.4-0                globals_0.12.5
 [53] sandwich_2.5-1              rlang_0.4.5
 [55] genefilter_1.68.0           splines_3.6.1
 [57] lazyeval_0.2.2              brew_1.0-6
 [59] checkmate_2.0.0             reshape2_1.4.3
 [61] backports_1.1.5             tools_3.6.1
 [63] gridBase_0.4-7              ggplot2_3.3.0
 [65] gplots_3.0.3                RColorBrewer_1.1-2
 [67] ggridges_0.5.2              TFisher_0.2.0
 [69] Rcpp_1.0.4                  plyr_1.8.6
 [71] progress_1.2.2              zlibbioc_1.32.0
 [73] purrr_0.3.3                 RCurl_1.98-1.1
 [75] densityClust_0.3            prettyunits_1.1.1
 [77] pbapply_1.4-2               viridis_0.5.1
 [79] cowplot_1.0.0               S4Vectors_0.24.3
 [81] zoo_1.8-7                   SummarizedExperiment_1.16.1
 [83] ggrepel_0.8.2               cluster_2.1.0
 [85] fs_1.3.2                    magrittr_1.5
 [87] data.table_1.12.8           RSpectra_0.16-0
 [89] lmtest_0.9-37               RANN_2.6.1
 [91] mvtnorm_1.1-0               fitdistrplus_1.0-14
 [93] matrixStats_0.56.0          hms_0.5.3
 [95] patchwork_1.0.0             lsei_1.2-0
 [97] xtable_1.8-4                XML_3.99-0.3
 [99] sparsesvd_0.2               IRanges_2.20.2
[101] gridExtra_2.3               HSMMSingleCell_1.6.0
[103] compiler_3.6.1              tibble_2.1.3
[105] KernSmooth_2.23-16          crayon_1.3.4
[107] htmltools_0.4.0             mgcv_1.8-31
[109] tidyr_1.0.2                 howmany_0.3-1
[111] DBI_1.1.0                   MASS_7.3-51.5
[113] rappdirs_0.3.1              Matrix_1.2-18
[115] ade4_1.7-15                 gdata_2.18.0
[117] metap_1.3                   igraph_1.2.4.2
[119] GenomicRanges_1.38.0        pkgconfig_2.0.3
[121] bigmemory.sri_0.1.3         rncl_0.8.4
[123] sn_1.5-5                    registry_0.5-1
[125] numDeriv_2016.8-1.1         locfdr_1.1-8
[127] plotly_4.9.2                xml2_1.2.5
[129] foreach_1.4.8               annotate_1.64.0
[131] rngtools_1.5                pkgmaker_0.31
[133] multtest_2.40.0             XVector_0.26.0
[135] bibtex_0.4.2.2              stringr_1.4.0
[137] digest_0.6.25               sctransform_0.2.1
[139] RcppAnnoy_0.0.16            tsne_0.1-3
[141] softImpute_1.4              DDRTree_0.1.5
[143] leiden_0.3.3                uwot_0.1.8
[145] edgeR_3.28.1                kernlab_0.9-29
[147] gtools_3.8.1                lifecycle_0.2.0
[149] monocle_2.14.0              nlme_3.1-145
[151] jsonlite_1.6.1              Rhdf5lib_1.8.0
[153] clusterExperiment_2.6.1     viridisLite_0.3.0
[155] limma_3.42.2                pillar_1.4.3
[157] lattice_0.20-40             httr_1.4.1
[159] plotrix_3.7-7               survival_3.1-11
[161] glue_1.3.2                  qlcMatrix_0.9.7
[163] FNN_1.1.3                   png_0.1-7
[165] iterators_1.0.12            bit_1.1-15.2
[167] HDF5Array_1.14.3            stringi_1.4.6
[169] blob_1.2.1                  memoise_1.1.0
[171] caTools_1.18.0              dplyr_0.8.5
[173] irlba_2.3.3                 future.apply_1.4.0
[175] ape_5.3

Differential test on two conditions within the same slingshot trajectory?

We have a trajectory of single cells generated by slingshot, but the cells are mixed from two conditions (mutant, wild-type)--in other words, we have one unified slingshot object and one trajectory of interest within it. We see some differences along pseudotime of that unified trajectory, suggestive of some kind of developmental shift due to genotype. What we'd like to do is run tradeSeq to call genes that are differentially expressed along this lineage, but it seems the vignette mainly describes calling differences across two independent lineages.

Is there a way to subset the slingshot object and/or read the data into tradeSeq such that we can compare expression of WT cells to mutant cells along the same pseudotime axis?

Thanks!

Error in associationTest - incorrect number of dimensions

I have encountered an error when running the associationTest function on my data:

# Import experiment data
sce <- readRDS("SingleCellExperiment.rds")
sds <- readRDS("SlingshotDataSet.rds")

# Extract counts matrix
dgc <- counts(sce)
mat <- as.matrix(dgc)

# Test on subset of genes
idx <- 1:100

# Fit the NB-GAM model
fit <- fitGAM(counts = mat, sds = sds, genes = idx)

# Run the association test
res <- associationTest(fit)
Error in rowData(models)$tradeSeq$beta[[1]][1, ] : 
  incorrect number of dimensions

I have uploaded the experiment data so you can have a reproducible example: https://www.dropbox.com/s/r9oxnjqveo8euyr/tradeSeq.zip?dl=0

Edit: I should also mention that I encounter the same error when using the data and commands from the Bioconductor vignette

I am using the latest master build from GitHub but the problem is also present in the latest Bioconductor release:

R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.3

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tradeSeq_1.1.16             bigmemory_4.5.36            slingshot_1.4.0             princurve_2.1.4            
 [5] hues_0.2.0                  EnsDb.Mmusculus.v79_2.99.0  ensembldb_2.10.2            AnnotationFilter_1.10.0    
 [9] GenomicFeatures_1.38.2      AnnotationDbi_1.48.0        MouseGastrulationData_1.0.0 BiocNeighbors_1.4.2        
[13] scater_1.14.6               ggplot2_3.3.0               scran_1.14.6                SingleCellExperiment_1.8.0 
[17] SummarizedExperiment_1.16.1 DelayedArray_0.12.2         BiocParallel_1.20.1         matrixStats_0.56.0         
[21] Biobase_2.46.0              GenomicRanges_1.38.0        GenomeInfoDb_1.22.1         IRanges_2.20.2             
[25] S4Vectors_0.24.3            BiocGenerics_0.32.0         workflowr_1.6.1            

loaded via a namespace (and not attached):
  [1] tidyselect_1.0.0              RSQLite_2.2.0                 grid_3.6.1                    combinat_0.0-8               
  [5] docopt_0.6.1                  Rtsne_0.15                    RNeXML_2.4.3                  munsell_0.5.0                
  [9] codetools_0.2-16              statmod_1.4.34                withr_2.1.2                   colorspace_1.4-1             
 [13] fastICA_1.2-2                 knitr_1.28                    uuid_0.1-4                    zinbwave_1.8.0               
 [17] rstudioapi_0.11               NMF_0.22.0                    labeling_0.3                  slam_0.1-47                  
 [21] GenomeInfoDbData_1.2.2        farver_2.0.3                  bit64_0.9-7                   pheatmap_1.0.12              
 [25] rhdf5_2.30.1                  vctrs_0.2.4                   xfun_0.12                     BiocFileCache_1.10.2         
 [29] R6_2.4.1                      doParallel_1.0.15             ggbeeswarm_0.6.0              rsvd_1.0.3                   
 [33] VGAM_1.1-2                    locfit_1.5-9.4                bitops_1.0-6                  assertthat_0.2.1             
 [37] promises_1.1.0                scales_1.1.0                  beeswarm_0.2.3                gtable_0.3.0                 
 [41] phylobase_0.8.10              rlang_0.4.5                   genefilter_1.68.0             splines_3.6.1                
 [45] rtracklayer_1.46.0            lazyeval_0.2.2                BiocManager_1.30.10           yaml_2.2.1                   
 [49] reshape2_1.4.3                httpuv_1.5.2                  tools_3.6.1                   gridBase_0.4-7               
 [53] ellipsis_0.3.0                RColorBrewer_1.1-2            Rcpp_1.0.4                    plyr_1.8.6                   
 [57] progress_1.2.2                zlibbioc_1.32.0               purrr_0.3.3                   RCurl_1.98-1.1               
 [61] densityClust_0.3              prettyunits_1.1.1             openssl_1.4.1                 pbapply_1.4-2                
 [65] viridis_0.5.1                 ggrepel_0.8.2                 cluster_2.1.0                 fs_1.3.2                     
 [69] magrittr_1.5                  RSpectra_0.16-0               RANN_2.6.1                    packrat_0.5.0                
 [73] ProtGenerics_1.18.0           hms_0.5.3                     mime_0.9                      evaluate_0.14.1              
 [77] xtable_1.8-4                  XML_3.99-0.3                  sparsesvd_0.2                 gridExtra_2.3                
 [81] HSMMSingleCell_1.6.0          compiler_3.6.1                biomaRt_2.42.1                tibble_3.0.0                 
 [85] crayon_1.3.4                  htmltools_0.4.0               mgcv_1.8-31                   later_1.0.0                  
 [89] tidyr_1.0.2                   howmany_0.3-1                 DBI_1.1.0                     ExperimentHub_1.12.0         
 [93] dbplyr_1.4.2                  MASS_7.3-51.5                 rappdirs_0.3.1                Matrix_1.2-18                
 [97] ade4_1.7-15                   cli_2.0.2                     igraph_1.2.5                  bigmemory.sri_0.1.3          
[101] pkgconfig_2.0.3               rncl_0.8.4                    GenomicAlignments_1.22.1      registry_0.5-1               
[105] locfdr_1.1-8                  xml2_1.2.5                    foreach_1.5.1                 annotate_1.64.0              
[109] vipor_0.4.5                   rngtools_1.5                  dqrng_0.2.1                   pkgmaker_0.31.1              
[113] XVector_0.26.0                bibtex_0.4.2.2                stringr_1.4.0                 digest_0.6.25                
[117] softImpute_1.4                DDRTree_0.1.5                 Biostrings_2.54.0             rmarkdown_2.1                
[121] uwot_0.1.8                    edgeR_3.28.1                  DelayedMatrixStats_1.8.0      kernlab_0.9-29               
[125] curl_4.3                      shiny_1.4.0.2                 Rsamtools_2.2.3               lifecycle_0.2.0              
[129] monocle_2.14.0                nlme_3.1-145                  Rhdf5lib_1.8.0                clusterExperiment_2.6.1      
[133] viridisLite_0.3.0             askpass_1.1                   limma_3.42.2                  fansi_0.4.1                  
[137] pillar_1.4.3                  lattice_0.20-40               survival_3.1-11               fastmap_1.0.1                
[141] httr_1.4.1                    interactiveDisplayBase_1.24.0 glue_1.3.2                    qlcMatrix_0.9.7              
[145] FNN_1.1.3                     iterators_1.0.12              BiocVersion_3.10.1            bit_1.1-15.2                 
[149] HDF5Array_1.14.3              stringi_1.4.6                 blob_1.2.1                    BiocSingular_1.2.2           
[153] AnnotationHub_2.18.0          memoise_1.1.0                 dplyr_0.8.5                   irlba_2.3.3                  
[157] ape_5.3

Running fitGAM on a subset of the data

Right now, if you want to run the fitGAM function on a subset of the genes but still use the full library size for normalization, you need to recompute the offset yourself and pass it to the offset parameter.
We could add a subset option (defaulting to "all") that does this automatically.
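
The manual workaround can be sketched as follows. This is only a minimal sketch, assuming a plain log library-size offset; the default offset used inside tradeSeq may rely on a different normalization, and the names `counts_full`, `genes_of_interest` and `full_offset` are hypothetical. The `fitGAM` call is left as a comment because it needs a trajectory object.

```r
# Simulated counts: 50 genes x 20 cells (stand-in for a real count matrix).
set.seed(1)
counts_full <- matrix(
  rpois(50 * 20, lambda = 5), nrow = 50,
  dimnames = list(paste0("gene", 1:50), paste0("cell", 1:20))
)

# Offset computed on ALL genes, so that it reflects the full library size.
# Assumption: a simple log library-size offset.
full_offset <- log(colSums(counts_full))

# Subset of genes to fit.
genes_of_interest <- rownames(counts_full)[1:10]
counts_sub <- counts_full[genes_of_interest, , drop = FALSE]

# For comparison: the offset that the subset alone would give.
sub_offset <- log(colSums(counts_sub))

# The full-library offset is then passed explicitly, for example:
# sce <- fitGAM(counts = counts_sub, sds = sds, offset = full_offset)
```

The offset stays one value per cell, so it can be reused unchanged for any gene subset drawn from the same cells.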

Can we implement more efficient plotting?

On large datasets, plotSmoothers takes relatively long and consumes a lot of memory. Can we do better?

too many clusters for gene expression pattern clustering

Hi, thanks for providing such a useful package. I have run tradeSeq on my data and got 60 clusters in the primary clustering results. How can I reduce this number by adjusting parameters? I would prefer a relatively small number of clusters for my downstream analysis. Thank you.

Xin

Transferring data from Seurat and general question about test to use

Hi,

Thank you for developing tradeSeq. I want to use it to calculate differential gene expression in time-series data. For example, I have data sets from four different time points. I integrated the data sets with Seurat to find common cells across the four time points. Now I want to identify gene changes across time in "cluster 1", for example. That is, I want to see genes that are uniquely expressed at the early, mid, or late time point for this particular cell type. Is it possible to use tradeSeq to accomplish this? If so, which test would be best?

Lastly, how can I transfer input data from Seurat to tradeSeq?

Thank you.

Interpretive power of reported p-values

Hi,
I tested your new package with great interest. Some questions arose regarding the p-values of the available statistical tests.

Just for context: as you know, probably the most-used analysis for RNA-seq data is DE calling. There are various methods to obtain interpretable p-values, which most people regard as statistically valid. Is there a large difference between these and the tradeSeq p-values for "DE calling along a latent time"?

I read in your publication that you view the reported "[...] p-values simply as useful numerical summaries for ranking the genes". Would you argue that a probabilistic interpretation of these p-values would be incorrect (or at least not advisable)? Are there any new assumptions for the tests that do not simply arise from the nature of scRNA-seq data?
Is it correct to assume that none of the reported p-values are corrected for multiple testing?
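
If the reported p-values are indeed unadjusted, as this question assumes, a standard correction can be applied afterwards in base R. A minimal sketch, using a made-up named vector `pvals` in place of the p-value column of a test result:

```r
# Hypothetical raw p-values, one per gene.
pvals <- c(geneA = 0.0002, geneB = 0.01, geneC = 0.04, geneD = 0.2, geneE = 0.8)

# Benjamini-Hochberg adjustment, which controls the false discovery rate.
padj <- p.adjust(pvals, method = "BH")

# Genes called significant at a 5% FDR.
sig_genes <- names(padj)[padj <= 0.05]
print(sig_genes)
```

Adjusted values are never smaller than the raw ones, so a gene ranking based on raw p-values is preserved while the significance calls become more conservative.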

Thanks in advance.

Best,
Tobi

predictSmooth

Hello!

When I try to run the "predictSmooth" function (either on my data or as in the example provided at https://statomics.github.io/tradeSeq/reference/predictSmooth.html), I get the error: could not find function "predictSmooth". Am I doing something wrong?
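
One possible cause, offered here as an assumption rather than a diagnosis, is that the installed release predates the function documented on the website. Whether the installed version of a package exports a given function can be checked from base R; `has_export` below is a hypothetical helper, not part of tradeSeq.

```r
# Hypothetical helper: TRUE if package `pkg` is installed and exports `fn`.
has_export <- function(pkg, fn) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)
  }
  fn %in% getNamespaceExports(pkg)
}

# Sanity check on a package that ships with R.
has_export("stats", "median")

# The check relevant to the question; FALSE suggests updating the package.
has_export("tradeSeq", "predictSmooth")

# The installed version can be compared with the documented one.
if (requireNamespace("tradeSeq", quietly = TRUE)) {
  print(packageVersion("tradeSeq"))
}
```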

Thank you.

Evgenii

Error with plotGeneCount()

I am trying to run the steps as shown in the vignette. When I reach the plotGeneCount() function in the "Discovering progenitor marker genes" section I get this error:

plotGeneCount(crv, counts=counts, gene = sigGeneStart)
Error in as.data.frame.default(x[[i]], optional = TRUE) :
cannot coerce class ‘structure("SlingshotDataSet", package = "slingshot")’ to a data.frame

The same thing happens in subsequent uses of this function too. Could you please help me figure out what is going wrong and how I can fix it?

Thanks.

Error running associationTest for individual lineages

Hi, I am running the Github version 1.1.16 of tradeSeq in R3.6.1. I have run fitGAM on a SlingshotDataSet and a subset of genes from the full matrix, then successfully ran startVsEndTest and diffEndTest. However, I get the following error when attempting to run associationTest:

sce2k <- fitGAM(counts = countsFullOrdered[1:2000,], sds = crv1pt1, nknots = 25, verbose = TRUE, BPPARAM = BPPARAM)
assoRes <- associationTest(sce2k, global = FALSE, lineages = TRUE)
Error in .getFoldChanges(betam, L) : object 'L' not found

The associationTest works in global mode (global = TRUE, lineages = FALSE).
What seems to be the problem? Shall I increase the number of genes used for fitGAM?
Session info below. Thank you in advance!

Jerry

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.7 (Final)

Matrix products: default
BLAS:   /opt/applications/R/3.6.1/gnu/lib64/R/lib/libRblas.so
LAPACK: /opt/applications/R/3.6.1/gnu/lib64/R/lib/libRlapack.so

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_US.utf8        LC_COLLATE=C
 [5] LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8
 [7] LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] tradeSeq_1.1.16     bigmemory_4.5.36    Biobase_2.46.0
[4] BiocGenerics_0.32.0 BiocParallel_1.20.1

loaded via a namespace (and not attached):
  [1] uuid_0.1-4                  backports_1.1.5
  [3] VGAM_1.1-2                  NMF_0.22.0
  [5] plyr_1.8.6                  igraph_1.2.4.2
  [7] lazyeval_0.2.2              splines_3.6.1
  [9] densityClust_0.3            usethis_1.5.1
 [11] rncl_0.8.4                  GenomeInfoDb_1.22.0
 [13] fastICA_1.2-2               ggplot2_3.3.0
 [15] gridBase_0.4-7              digest_0.6.25
 [17] foreach_1.4.8               viridis_0.5.1
 [19] fansi_0.4.1                 magrittr_1.5
 [21] checkmate_2.0.0             memoise_1.1.0
 [23] base64url_1.4               cluster_2.1.0
 [25] doParallel_1.0.15           remotes_2.1.1
 [27] limma_3.42.2                annotate_1.64.0
 [29] matrixStats_0.56.0          docopt_0.6.1
 [31] prettyunits_1.1.1           princurve_2.1.4
 [33] colorspace_1.4-1            blob_1.2.1
 [35] rappdirs_0.3.1              ggrepel_0.8.2
 [37] dplyr_0.8.5                 callr_3.4.2
 [39] sparsesvd_0.2               crayon_1.3.4
 [41] RCurl_1.98-1.1              bigmemory.sri_0.1.3
 [43] genefilter_1.68.0           phylobase_0.8.10
 [45] brew_1.0-6                  survival_3.1-11
 [47] iterators_1.0.12            ape_5.3
 [49] glue_1.3.2                  registry_0.5-1
 [51] gtable_0.3.0                zlibbioc_1.32.0
 [53] XVector_0.26.0              DelayedArray_0.12.2
 [55] pkgbuild_1.0.6              kernlab_0.9-29
 [57] Rhdf5lib_1.8.0              SingleCellExperiment_1.8.0
 [59] HDF5Array_1.14.3            scales_1.1.0
 [61] pheatmap_1.0.12             DBI_1.1.0
 [63] edgeR_3.28.1                rngtools_1.5
 [65] bibtex_0.4.2.2              Rcpp_1.0.4
 [67] viridisLite_0.3.0           xtable_1.8-4
 [69] progress_1.2.2              bit_1.1-15.2
 [71] stats4_3.6.1                httr_1.4.1
 [73] FNN_1.1.3                   RColorBrewer_1.1-2
 [75] ellipsis_0.3.0              pkgconfig_2.0.3
 [77] XML_3.99-0.3                locfit_1.5-9.1
 [79] howmany_0.3-1               tidyselect_1.0.0
 [81] rlang_0.4.5                 softImpute_1.4
 [83] reshape2_1.4.3              AnnotationDbi_1.48.0
 [85] munsell_0.5.0               tools_3.6.1
 [87] cli_2.0.2                   RSQLite_2.2.0
 [89] ade4_1.7-15                 devtools_2.2.2
 [91] stringr_1.4.0               fs_1.3.2
 [93] processx_3.4.2              bit64_0.9-7
 [95] DDRTree_0.1.5               purrr_0.3.3
 [97] RANN_2.6.1                  pbapply_1.4-2
 [99] nlme_3.1-145                monocle_2.14.0
[101] slam_0.1-47                 xml2_1.2.5
[103] compiler_3.6.1              curl_4.3
[105] testthat_2.3.2              tibble_2.1.3
[107] RNeXML_2.4.3                stringi_1.4.6
[109] ps_1.3.2                    desc_1.2.0
[111] RSpectra_0.16-0             lattice_0.20-40
[113] Matrix_1.2-18               HSMMSingleCell_1.6.0
[115] vctrs_0.2.4                 pillar_1.4.3
[117] lifecycle_0.2.0             combinat_0.0-8
[119] zinbwave_1.8.0              data.table_1.12.8
[121] bitops_1.0-6                irlba_2.3.3
[123] GenomicRanges_1.38.0        R6_2.4.1
[125] gridExtra_2.3               IRanges_2.20.2
[127] sessioninfo_1.1.1           codetools_0.2-16
[129] pkgload_1.0.2               MASS_7.3-51.5
[131] assertthat_0.2.1            rhdf5_2.30.1
[133] SummarizedExperiment_1.16.1 rprojroot_1.3-2
[135] pkgmaker_0.31               withr_2.1.2
[137] qlcMatrix_0.9.7             batchtools_0.9.12
[139] S4Vectors_0.24.3            GenomeInfoDbData_1.2.2
[141] locfdr_1.1-8                mgcv_1.8-31
[143] hms_0.5.3                   grid_3.6.1
[145] tidyr_1.0.2                 slingshot_1.4.0
[147] Rtsne_0.15                  clusterExperiment_2.6.1

evaluateK to estimate nknots necessary when using Monocle to generate pseudotime?

Hello,

I am using tradeSeq downstream of calculating pseudotime with Monocle. The Monocle vignette ends with the construction of sce

sce <- fitGAM(counts = counts,
       pseudotime = pseudotime,
       cellWeights = cellWeights)

and links back to the main vignette. However, in the main vignette the construction of sce by fitGAM follows the nknots estimation procedure with evaluateK.

set.seed(5)
icMat <- evaluateK(counts = counts, sds = crv, k = 3:10, 
                   nGenes = 200, verbose = T)
set.seed(7)
pseudotime <- slingPseudotime(crv, na = FALSE)
cellWeights <- slingCurveWeights(crv)
sce <- fitGAM(counts = counts, pseudotime = pseudotime, cellWeights = cellWeights,
                 nknots = 6, verbose = FALSE)

Does this mean that the evaluateK step is unnecessary when fitGAM is applied to the output from Monocle?

Thanks for your help,

Brian

Slingshot to tradeSeq from adata object?

Hi, and thank you for your tutorial! I've been following a workflow from the Theis lab (https://github.com/theislab/single-cell-tutorial/blob/master/latest_notebook/Case-study_Mouse-intestinal-epithelium_1906.ipynb) using my own data, where they do a lot of the work in scanpy and switch to R for trajectory analysis. I opted for using Slingshot as a trajectory method, and wanted to use the tradeSeq package downstream of this. However, I'm running into issues with my counts matrix (I believe). I'm using the Slingshot object generated via the Theis workflow, and the rpy2 interface which should convert my adata object into a SingleCellExperiment object, and from that I should be able to get the counts needed for tradeSeq - or at least that's my understanding.

sds <- SlingshotDataSet(adata)
counts <- as.matrix(assays(adata)$counts)

This works so that I get a nice plot that matches my Slingshot data when I run:

plotGeneCount(curve = sds, clusters = adata$louvain)

Running the following returns the 4 expected plots allowing me to determine a nknot value:

set.seed(6)
icMat <- evaluateK(counts = counts, sds = sds, k = 3:7, nGenes = 100,
verbose = FALSE, plot = TRUE)

But when I try to run the following:

set.seed(6)
sce <- fitGAM(counts = counts, sds = sds, nknots = 4, verbose = FALSE, sce=TRUE)

I get the resulting error: could not broadcast input array from shape (2) into shape (1136).

This makes me wonder if maybe there is something wrong with adata to SCE conversion, making my counts matrix weird?

Any thoughts or advice would be greatly appreciated. Thank you!

Unable to determine root cell

Hello,

Thank you for the wonderful package, I am very excited to utilize it for my data sets!

I have created a cds object from an integrated Seurat object, but am having issues calculating pseudotime with the code provided in the tradeSeq tutorial. I can calculate it fine following the method in Monocle3's tutorial, but then I don't know how to modify the tradeSeq tutorial to extract the pseudotime and cell weight values.

When I run:
exprs_human<- GetAssayData(Alpha_Beta, slot="counts", assay = "RNA")
counts <- as.matrix(exprs_human)

meta_data <- Alpha_Beta@meta.data
keep <- ("CellfindR")
meta_data <- meta_data[keep]
names(meta_data)[names(meta_data)=="CellfindR"] <- "cellType"

df <- data.frame(cells = colnames(counts), cellType = meta_data)
rownames(df) <- df$cells

cds <- new_cell_data_set(counts, cell_metadata = df,
gene_metadata = data.frame(gene_short_name = rownames(counts),
row.names = rownames(counts)))

cds <- preprocess_cds(cds, method = "PCA")
cds <- reduce_dimension(cds, preprocess_method = "PCA",
reduction_method = "UMAP")

plot_cells(cds, label_groups_by_cluster = FALSE, cell_size = 1,
color_cells_by = "cellType")

umaps = Embeddings(Alpha_Beta, assay="integrated",reduction="umap")
reducedDims(cds)$UMAP <- umaps
plot_cells(cds,color_cells_by="cellType")

cds <- cluster_cells(cds, reduction_method = "UMAP")

cds <- learn_graph(cds)
plot_cells(cds, label_groups_by_cluster = FALSE, cell_size = 1,
color_cells_by = "cellType")

celltype <- as.character(meta_data$cellType)
cell_ids <- which(celltype == "3.1.0")
closest_vertex <-
cds@principal_graph_aux[["UMAP"]]$pr_graph_cell_proj_closest_vertex
closest_vertex <- closest_vertex[colnames(cds), 1]
closest_vertex <- closest_vertex[cell_ids]
closest_vertex <- paste0("Y_", closest_vertex)
root <- names(which(igraph::degree(principal_graph(cds)[["UMAP"]]) == 1))
root <- root[root %in% closest_vertex]
cds <- order_cells(cds, root_pr_nodes = root)

I get the error:
Error in data_matrix[, ((((i - 1) * block_size) + 1):(ncol(data_matrix)))] :
subscript out of bounds

Everything seems to be working well until I run the line:
root <- root[root %in% closest_vertex]

in which it seems to return an empty character string.
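
The symptom described here can be reproduced without Monocle: if none of the degree-one (leaf) nodes of the graph is among the vertices closest to the chosen cells, the `%in%` filter returns a zero-length character vector, which is then passed on as the root. A minimal sketch with made-up node names; the fallback of choosing the most frequent closest vertex is an assumption for illustration only, not a recommendation from the tradeSeq or Monocle authors.

```r
# Made-up stand-ins for the objects in the code above.
leaf_nodes     <- c("Y_1", "Y_40", "Y_77")           # graph nodes with degree 1
closest_vertex <- c("Y_12", "Y_12", "Y_15", "Y_12")  # closest node per selected cell

# Same filter as in the post: keep leaf nodes that are closest to a selected cell.
root <- leaf_nodes[leaf_nodes %in% closest_vertex]
n_matches <- length(root)
print(n_matches)

# Guard before ordering cells: fall back to the node that is closest to
# the largest number of selected cells (illustrative choice only).
if (n_matches == 0) {
  root <- names(which.max(table(closest_vertex)))
}
print(root)
```

Checking `length(root)` before calling `order_cells` turns the later "subscript out of bounds" failure into an explicit, earlier signal that no root was found.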

Any idea what might be happening? Or, alternatively, how I might be able to extract the pseudotime and cell weights from the cds object after selecting the root cell by the current method suggested by Monocle3's tutorial? I don't have the data 'root' if I use their method...

Sorry if that was confusing, happy to clarify! Thanks again.

-Sean

Placing knots, on more than two lineages

When fitting the GAM models with lineages, the end points of the lineages replace the values of the closest knots (as previously defined by quantiles).
However, this does not work if several end times are close to the same knot (we run into the error "Can't get all knots to equal endpoints of trajectories").
I see two options:

1. Add them all as knots. We then have more knots around those times, so we may bias the isIdenticalTest towards them.
2. Add the mean as a knot. The knot is then not exactly the end time, but it is still quite close, so it should be good. However, this impacts more tests (startVsEndTest, diffEndTest).

Ideas?
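
The two options can be compared on a toy example. This is a sketch of the idea only, not tradeSeq's internal knot-placement code: knots start at pseudotime quantiles, and the hypothetical helper `snap_to_endpoints` moves the nearest knot onto each lineage endpoint, failing when two endpoints compete for the same knot.

```r
set.seed(42)
pseudotime <- runif(500, min = 0, max = 10)
nknots <- 5

# Initial knots at evenly spaced quantiles of pseudotime.
knots <- unname(quantile(pseudotime, probs = seq(0, 1, length.out = nknots)))

# Hypothetical helper: replace the knot nearest to each endpoint by that endpoint.
snap_to_endpoints <- function(knots, endpoints) {
  nearest <- vapply(endpoints, function(e) which.min(abs(knots - e)), integer(1))
  if (anyDuplicated(nearest) > 0) {
    stop("Several endpoints share the same closest knot")
  }
  knots[nearest] <- endpoints
  sort(knots)
}

# Two lineages ending close together both claim the same knot, so snapping fails.
endpoints <- c(7.4, 7.6)
res <- tryCatch(snap_to_endpoints(knots, endpoints), error = function(e) NULL)
is.null(res)

# Option 1: add every endpoint as an extra knot (more knots near the end times).
knots_opt1 <- sort(unique(c(knots, endpoints)))

# Option 2: use the mean of the close endpoints as a single knot.
knots_opt2 <- snap_to_endpoints(knots, mean(endpoints))
```

Option 1 changes the number of knots, while option 2 keeps it fixed at the cost of a knot that only approximates each end time.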

install_github fails (maybe DESCRIPTION file??)

Hey there--I finally got around to trying to install the package and can't seem to successfully install. I also just updated to R 3.6.0, so I'm not sure if it's something weird on my end or not.

> devtools::install_github("statOmics/tradeSeq")
Downloading GitHub repo statOmics/tradeSeq@master
Skipping 24 packages ahead of CRAN: annotate, AnnotationDbi, Biobase, BiocGenerics, BiocParallel, clusterExperiment, DelayedArray, edgeR, genefilter, GenomeInfoDb, GenomeInfoDbData, GenomicRanges, HDF5Array, IRanges, limma, rhdf5, Rhdf5lib, S4Vectors, SingleCellExperiment, slingshot, SummarizedExperiment, XVector, zinbwave, zlibbioc
✔  checking for file ‘/private/var/folders/y9/j0gdhnsd1mn1lj2gsfqr532h0000gp/T/Rtmp4dXEG1/remotes10b217516c8a/statOmics-tradeSeq-cd514ec/DESCRIPTION’ ...
─  preparing ‘tradeSeq’:
E  checking DESCRIPTION meta-information ...
   Authors@R field gives more than one person with maintainer role:
     Koen Van den Berge <[email protected]> [aut, cre]
     Hector Roux de Bézieux <[email protected]> [aut, cre] (<https://orcid.org/0000-0002-1489-8339>)
   
   See section 'The DESCRIPTION file' in the 'Writing R Extensions'
   manual.
   
Error in (function (command = NULL, args = character(), error_on_status = TRUE,  : 
  System command error

I'm not familiar with requirements for the DESCRIPTION file, so I'm not sure if the final error is related to the >1 maintainer role that it mentions above.
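
For reference, the check above complains because two entries carry the "cre" (maintainer) role. The sketch below builds an Authors@R value with a single "cre" and counts the maintainers. Which author keeps the role is an assumption made purely for illustration, and e-mail addresses are left out (a real DESCRIPTION needs one for the maintainer).

```r
# Hypothetical Authors@R value: the choice of maintainer is an assumption,
# and e-mail addresses are omitted in this sketch.
authors <- c(
  person("Koen", "Van den Berge", role = c("aut", "cre")),
  person("Hector", "Roux de B\u00e9zieux", role = "aut")
)

# Count how many entries carry the maintainer ("cre") role.
n_maintainers <- sum(vapply(unclass(authors),
                            function(p) "cre" %in% p$role,
                            logical(1)))
```

A value of 1 for n_maintainers is what the DESCRIPTION check expects.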

> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.4

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1        ps_1.3.0          prettyunits_1.0.2 rprojroot_1.3-2   withr_2.1.2       digest_0.6.18    
 [7] crayon_1.3.4      assertthat_0.2.1  R6_2.4.0          backports_1.1.4   magrittr_1.5      rlang_0.3.4      
[13] cli_1.1.0         curl_3.3          fs_1.3.0          remotes_2.0.4     rstudioapi_0.10   callr_3.2.0      
[19] desc_1.2.0        devtools_2.0.2    tools_3.6.0       glue_1.3.1        pkgload_1.0.2     compiler_3.6.0   
[25] processx_3.3.0    pkgbuild_1.0.3    sessioninfo_1.1.1 memoise_1.1.0     usethis_1.5.0

If this isn't reproducible, just let me know and I'll figure out what's going on on my end. Thanks!

David

Extremely long fitGAM runtime?

Hello,

I am running fitGAM with parallelization on counts from an object with 11,768 cells and 26,152 genes on a machine with 16 cores and 256 GB of RAM:
BPPARAM$workers <- 16
sce <- fitGAM(counts = counts, pseudotime = pseudotime, cellWeights = cellWeights,
nknots = 7, verbose = TRUE, BPPARAM = BPPARAM)

The count file was made from non-normalized counts of a Seurat object:
exprs_human<- GetAssayData(Alpha_Beta, slot="counts", assay = "RNA")
counts <- as.matrix(exprs_human)

Currently, it is telling me that it will take 6 days to run. I would assume that this is an unusually long wait time?

Error: package or namespace load failed for 'tradeSeq': objects 'rowSums', 'colSums', 'rowMeans', 'colMeans' are not exported by 'namespace:S4Vectors'

Hi,

I am getting the following error when I try to load tradeSeq v1.0.1 or tradeSeq v1.3.03.

Error: package or namespace load failed for 'tradeSeq':
 objects 'rowSums', 'colSums', 'rowMeans', 'colMeans' are not exported by 'namespace:S4Vectors'

My sessionInfo is as follows:

R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.4

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] grDevices datasets  parallel  stats     graphics  utils     stats4    methods   base     

other attached packages:
 [1] preprocessCore_1.46.0       impute_1.58.0               SCENIC_1.1.2-2              org.Hs.eg.db_3.8.2         
 [5] AnnotationDbi_1.46.1        BUSpaRse_0.99.25            monocle3_0.2.0              TSCAN_1.22.0               
 [9] harmony_1.0                 Rcpp_1.0.3                  DropletUtils_1.4.3          destiny_2.14.0             
[13] slingshot_1.3.2             princurve_2.1.4             scater_1.12.2               ggplot2_3.2.1              
[17] SingleCellExperiment_1.8.0  SummarizedExperiment_1.14.1 DelayedArray_0.10.0         BiocParallel_1.18.1        
[21] matrixStats_0.55.0          Biobase_2.44.0              GenomicRanges_1.36.1        GenomeInfoDb_1.20.0        
[25] IRanges_2.18.3              S4Vectors_0.24.4            BiocGenerics_0.32.0         M3Drop_3.10.4              
[29] numDeriv_2016.8-1.1         ReactomePA_1.28.0           biomaRt_2.40.5             

loaded via a namespace (and not attached):
  [1] rsvd_1.0.3               vcd_1.4-5                Hmisc_4.3-1              class_7.3-15             Rsamtools_2.0.3         
  [6] foreach_1.4.8            lmtest_0.9-37            crayon_1.3.4             laeken_0.5.1             MASS_7.3-51.4           
 [11] nlme_3.1-142             backports_1.1.5          qlcMatrix_0.9.7          GOSemSim_2.10.0          rlang_0.4.6             
 [16] XVector_0.24.0           readxl_1.3.1             irlba_2.3.3              limma_3.40.6             phylobase_0.8.8         
 [21] smoother_1.1             plyranges_1.4.4          bit64_0.9-7              glue_1.3.1               pheatmap_1.0.12         
 [26] rngtools_1.5             vipor_0.4.5              UpSetR_1.4.0             VGAM_1.1-2               DOSE_3.10.2             
 [31] haven_2.2.0              tidyselect_1.0.0         rio_0.5.16               XML_3.99-0.3             tidyr_1.0.2             
 [36] zoo_1.8-7                packrat_0.5.0            GenomicAlignments_1.20.1 xtable_1.8-4             magrittr_1.5            
 [41] bibtex_0.4.2.2           zlibbioc_1.30.0          rstudioapi_0.11          sp_1.3-2                 rpart_4.1-15            
 [46] fastmatch_1.1-0          locfdr_1.1-8             ensembldb_2.8.1          RcppEigen_0.3.3.7.0      shiny_1.4.0             
 [51] BiocSingular_1.0.0       xfun_0.12                cluster_2.1.0            caTools_1.18.0           tidygraph_1.1.2         
 [56] tibble_2.1.3             ggrepel_0.8.1            ape_5.3                  stabledist_0.7-1         Biostrings_2.52.0       
 [61] png_0.1-7                zeallot_0.1.0            withr_2.1.2              slam_0.1-47              bitops_1.0-6            
 [66] ggforce_0.3.1            ranger_0.12.1            plyr_1.8.5               cellranger_1.1.0         GSEABase_1.46.0         
 [71] sparsesvd_0.2            pcaPP_1.9-73             AnnotationFilter_1.8.0   e1071_1.7-3              dqrng_0.2.1             
 [76] pillar_1.4.3             RcppParallel_4.4.4       gplots_3.0.1.2           GenomicFeatures_1.36.4   reldist_1.6-6           
 [81] kernlab_0.9-29           scatterplot3d_0.3-41     TTR_0.23-6               graphite_1.30.0          europepmc_0.3           
 [86] DelayedMatrixStats_1.6.1 xts_0.12-0               vctrs_0.2.4              urltools_1.7.3           NMF_0.22.0              
 [91] tools_3.6.2              foreign_0.8-72           rncl_0.8.4               beeswarm_0.2.3           munsell_0.5.0           
 [96] tweenr_1.0.1             fgsea_1.10.1             proxy_0.4-23             HSMMSingleCell_1.4.0     fastmap_1.0.1           
[101] compiler_3.6.2           abind_1.4-5              httpuv_1.5.2             rtracklayer_1.44.4       pkgmaker_0.31           
[106] GenomeInfoDbData_1.2.1   gridExtra_2.3            edgeR_3.26.8             lattice_0.20-38          later_1.0.0             
[111] dplyr_0.8.4              jsonlite_1.6.1           scales_1.1.0             docopt_0.6.1             graph_1.62.0            
[116] carData_3.0-3            lazyeval_0.2.2           promises_1.1.0           car_3.0-6                doParallel_1.0.15       
[121] latticeExtra_0.6-29      R.utils_2.9.2            checkmate_2.0.0          openxlsx_4.1.4           cowplot_1.0.0           
[126] statmod_1.4.34           Rtsne_0.15               forcats_0.4.0            copula_0.999-20          BSgenome_1.52.0         
[131] igraph_1.2.4.2           HDF5Array_1.12.3         survival_3.1-8           DDRTree_0.1.5            htmltools_0.4.0         
[136] memoise_1.1.0            locfit_1.5-9.1           graphlayouts_0.5.0       viridisLite_0.3.0        digest_0.6.24           
[141] assertthat_0.2.1         mime_0.9                 rappdirs_0.3.1           densityClust_0.3         registry_0.5-1          
[146] RSQLite_2.2.0            data.table_1.12.8        blob_1.2.1               R.oo_1.23.0              RNeXML_2.4.2            
[151] fastICA_1.2-2            splines_3.6.2            Formula_1.2-3            Rhdf5lib_1.6.3           ProtGenerics_1.16.0     
[156] RCurl_1.98-1.1           monocle_2.12.0           hms_0.5.3                rhdf5_2.28.1             colorspace_1.4-1        
[161] base64enc_0.1-3          BiocManager_1.30.10      ggbeeswarm_0.6.0         nnet_7.3-12              RANN_2.6.1              
[166] ADGofTest_0.3            mclust_5.4.5             mvtnorm_1.0-12           enrichplot_1.4.0         pspline_1.0-18          
[171] VIM_5.1.0                R6_2.4.1                 grid_3.6.2               ggridges_0.5.2           lifecycle_0.1.0         
[176] acepack_1.4.1            zip_2.0.4                curl_4.3                 gdata_2.18.0             robustbase_0.93-5       
[181] DO.db_2.9                Matrix_1.2-18            howmany_0.3-1            qvalue_2.16.0            RColorBrewer_1.1-2      
[186] iterators_1.0.12         stringr_1.4.0            htmlwidgets_1.5.1        polyclip_1.10-0          triebeard_0.3.0         
[191] purrr_0.3.3              gridGraphics_0.4-1       reactome.db_1.68.0       mgcv_1.8-31              htmlTable_1.13.3        
[196] bdsmatrix_1.3-4          codetools_0.2-16         FNN_1.1.3                GO.db_3.8.2              gtools_3.8.1            
[201] prettyunits_1.1.1        gridBase_0.4-7           RSpectra_0.16-0          R.methodsS3_1.8.0        gtable_0.3.0            
[206] DBI_1.1.0                httr_1.4.1               KernSmooth_2.23-16       stringi_1.4.6            progress_1.2.2          
[211] reshape2_1.4.3           farver_2.0.3             uuid_0.1-2               annotate_1.62.0          viridis_0.5.1           
[216] ggthemes_4.2.0           xml2_1.2.2               combinat_0.0-8           rvcheck_0.1.7            bbmle_1.0.23.1          
[221] boot_1.3-23              BiocNeighbors_1.2.0      AUCell_1.6.1             ade4_1.7-15              ggplotify_0.0.4         
[226] DEoptimR_1.0-8           bit_1.1-15.2             jpeg_0.1-8.1             ggraph_2.0.1             pkgconfig_2.0.3         
[231] gsl_2.1-6                knitr_1.28

Is there a way to fix this?

Best wishes,
Lucy

Issues with associationTest results with a single lineage

Hi,

I have been running into problems with associationTest results when a single lineage is fit. Specifically, results differ wildly when changing the number of knots from 4 to 5 in the fitGAM statement, with Wald statistics using 5 or more knots being much too large to be logical.

The code to reproduce this issue is given below, with the attached PDF showing summary information on the output Wald statistics with 4 and 5 knots as well as versions of packages I am using.

In particular, I am using the most recent version of the package (1.1.17). This issue seems like it might be related to the one here: #17 (comment), though it only seems to happen if the number of knots is greater than 4.

Also note that this only seems to occur when a single lineage is fit by slingshot; I have not been able to reproduce this issue when 2 or more lineages are fit.
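
As background, here is a generic Wald statistic for a set of contrasts, written in base R. It is a textbook sketch, not the tradeSeq implementation, and the coefficient vector, covariance matrix and contrast matrix are toy values. It only shows how the statistic scales with the covariance estimate; it is not a diagnosis of this issue.

```r
# Generic Wald statistic for H0: t(L) %*% beta = 0.
# beta: coefficients, Sigma: their covariance, L: contrast matrix.
wald_stat <- function(beta, Sigma, L) {
  est <- t(L) %*% beta            # estimated contrasts
  V <- t(L) %*% Sigma %*% L       # covariance of the contrasts
  drop(t(est) %*% solve(V, est))
}

# Toy inputs (assumed values).
beta <- c(0.2, -0.1, 0.05)
Sigma <- diag(c(0.04, 0.04, 0.04))
L <- cbind(c(1, -1, 0), c(0, 1, -1))

w <- wald_stat(beta, Sigma, L)
p <- pchisq(w, df = ncol(L), lower.tail = FALSE)

# Dividing the covariance by 100 multiplies the statistic by 100.
w_small_sigma <- wald_stat(beta, Sigma / 100, L)
```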

Thanks!

library(splatter)
library(tradeSeq)
library(SingleCellExperiment)
library(slingshot)

nGenes <- 60179
numCells <- 100
current_seed <- 328585
params <- newSplatParams()

#Simulate no genes to be DE such that we would expect a small number of rejections
SplatterSimObject <- splatSimulate(params,
                                   method="paths",
                                   nGenes=nGenes,
                                   batchCells=numCells,
                                   seed=current_seed,
                                   lib.loc=11.49293, de.prob = 0, de.facLoc = log(2)*3,
                                   out.prob = 0)

countsT <- counts(SplatterSimObject)


filt_func <- function(x){
  ncells_high_exp <- sum(x >= 10)
  return(ncells_high_exp)
}
rows_to_keep <- apply(countsT, 1, filt_func)
counts <- countsT[rows_to_keep > 10,]

FQnorm <- function(counts){
  rk <- apply(counts,2,rank,ties.method='min')
  counts.sort <- apply(counts,2,sort)
  refdist <- apply(counts.sort,1,median)
  norm <- apply(rk,2,function(r){ refdist[r] })
  rownames(norm) <- rownames(counts)
  return(norm)
}
norm_counts <- FQnorm(counts)

SCEObj <- SingleCellExperiment(assays = List(counts = counts, norm_counts = norm_counts))


pca <- prcomp(t(log1p(assays(SCEObj)$norm_counts)), scale. = FALSE)

rd <- pca$x[,1:2]

reducedDims(SCEObj) <- SimpleList(PCA = rd)

cl <- kmeans(rd, centers = 4)$cluster
colData(SCEObj)$kMeans <- cl

slingshot_results <- slingshot(SCEObj, clusterLabels = 'kMeans', reducedDim = 'PCA')

lin <- getLineages(SCEObj, clusterLabels = colData(slingshot_results)$kMeans, reducedDim = 'PCA')

crv <- SlingshotDataSet(getCurves(lin))

sce4 <- fitGAM(counts = counts, pseudotime = slingPseudotime(crv, na = FALSE),
              sds = crv, nknots = 4, sce = TRUE)

sce5 <- fitGAM(counts = counts, pseudotime = slingPseudotime(crv, na = FALSE),
               sds = crv, nknots = 5, sce = TRUE)


assoc4Knots <- associationTest(sce4)
assoc5Knots <- associationTest(sce5)

SummaryofWaldStatsAndPackageInfo.pdf

Error when trying to install tradeSeq - related to predictSmooth.R

Hi,

I am getting the following error when I try to install tradeSeq from GitHub.

Error in .install_package_code_files(".", instdir) :
files in '/tmp/Rtmp53eb56/R.INSTALL3afa38d2d837/tradeSeq/R' missing from 'Collate' field:
predictSmooth.R
ERROR: unable to collate and parse R files for package 'tradeSeq'
* removing '/t1-data/user/lgarner/py36-v1/conda-install/envs/tradeseq/lib/R/library/tradeSeq'

I can see that predictSmooth was added in a recent commit. It appears to be missing in the Collate section of the DESCRIPTION file.
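
A quick way to spot this kind of mismatch is to compare the files in R/ with the Collate field. The sketch below does so on a throwaway package skeleton in a temporary directory; the package name and fitGAM.R are placeholders, and only predictSmooth.R comes from the error message above.

```r
# Build a throwaway package skeleton (hypothetical names).
pkg <- file.path(tempdir(), "toyPkg")
dir.create(file.path(pkg, "R"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(pkg, "R", c("fitGAM.R", "predictSmooth.R")))
writeLines(c("Package: toyPkg",
             "Version: 0.0.1",
             "Collate: 'fitGAM.R'"),
           file.path(pkg, "DESCRIPTION"))

# Read the Collate field and split it into file names (quotes are stripped).
collate <- read.dcf(file.path(pkg, "DESCRIPTION"), fields = "Collate")[1, 1]
listed <- scan(text = collate, what = character(), quiet = TRUE)

# R files present on disk but absent from Collate.
missing_from_collate <- setdiff(list.files(file.path(pkg, "R")), listed)
```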

Best wishes,
Lucy

plotGeneCount with Monocle pseudotime

Hello,

I am trying to use the plotGeneCount function but the first argument is a Slingshot object. Is there a way to use this plotting function downstream of Monocle? I tried cds@phenoData@data$Pseudotime but that returned:

donor1_startRes <- startVsEndTest(donor1_fibroblast_sce)
oStart <- order(donor1_startRes$waldStat, decreasing = TRUE)
sigGeneStart <- names(donor1_fibroblast_sce)[oStart[3]]

> plotGeneCount(donor1_fibroblast_cds@phenoData@data$Pseudotime, exprs(donor1_fibroblast_cds), gene = sigGeneStart)
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘reducedDim’ for signature ‘"numeric", "missing"’

I would guess that the function is looking for some class of object specific to the Slingshot output, but I'm not entirely sure.

The plotSmoothers function works just fine with these data.

Thanks for your continued help.

Brian

"tidy" output for predictSmooth

If I am interpreting correctly, predictSmooth returns a matrix with y-hat values ~~for each cell~~ along the pseudotime grid with lineages as concurrent sets of columns, and genes on rows. It would be useful to also provide optional "tidy" output with gene, lineage, time and y-hat as columns of a data.frame. I started to work this up using your plotSmoothers code which works for one gene but maybe an integrated solution would be a nice feature.

https://gist.github.com/mikelove/c91d293312b1cd2f9d71657819dc41a9
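
To make the request concrete, here is a base R sketch of the reshaping on a toy matrix. The layout (genes on rows, one block of grid-point columns per lineage) follows the description above; the matrix values, the grid and the column order within each block are assumptions, and this is not the predictSmooth code.

```r
# Toy stand-in for the wide matrix of fitted values (assumed layout):
# genes on rows; columns are n_points grid points for lineage 1,
# followed by n_points grid points for lineage 2.
n_points <- 3
n_lineages <- 2
yhat <- matrix(seq_len(2 * n_points * n_lineages), nrow = 2,
               dimnames = list(c("geneA", "geneB"), NULL))

# Hypothetical pseudotime grid shared by both lineages.
time_grid <- seq(0, 1, length.out = n_points)

# as.vector() walks the matrix column by column, so gene varies fastest,
# then grid point, then lineage.
tidy <- data.frame(
  gene    = rep(rownames(yhat), times = ncol(yhat)),
  lineage = rep(seq_len(n_lineages), each = nrow(yhat) * n_points),
  time    = rep(rep(time_grid, each = nrow(yhat)), times = n_lineages),
  yhat    = as.vector(yhat)
)
```

Each row of tidy then holds one gene, lineage and time combination, which is the shape ggplot2-style tools expect.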

plotGeneCount without SlingshotDataSet object

Currently, plotGeneCount only works in conjunction with a Slingshot object. This could be generalized to any TI method.

Pseudotime values and clustering

In the vignette there are two different values called 'Pseudotime'. The first runs from 0 to ~1 and is based on the curve; the second, after the clustering analysis, runs from 1 to 20 and is based on the nPoints used.

For easier interpretation could those points based on nPoints be labeled with the corresponding pseudotime value from above?
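
A sketch of the relabelling being asked for, in base R. It assumes that the 1 to nPoints indices correspond to an evenly spaced grid over the pseudotime range of the lineage, which is not confirmed here; the pseudotime values are made up.

```r
# Hypothetical pseudotime values for one lineage (range 0 to ~1).
pseudotime <- c(0.00, 0.12, 0.35, 0.58, 0.77, 0.98)
n_points <- 20

# Assumption: the 1..nPoints indices are an evenly spaced grid over the
# observed pseudotime range of the lineage.
grid <- seq(min(pseudotime), max(pseudotime), length.out = n_points)

# Labels that show the pseudotime value for each grid index.
labels <- setNames(round(grid, 2), seq_len(n_points))
```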

Also, is this code not fixed in the other function I mentioned? #23 (comment)

Error running patternTest

Hello!

Thanks for putting together this package, it provides a great complement to slingshot! I was testing it out on one of my datasets, and I'm running into an error when running patternTest:

patternRes <- patternTest(sce)
Error in t(L) %*% beta : non-conformable arguments
traceback()
7: getEigenStatGAM(beta, Sigma, L)
6: FUN(X[[i]], ...)
5: lapply(seq_len(nrow(models)), function(ii) {
beta <- t(rowData(models)$tradeSeq$beta[[1]][ii, ])
Sigma <- rowData(models)$tradeSeq$Sigma[[ii]]
getEigenStatGAM(beta, Sigma, L)
})
4: .earlyDETest(models = models, global = global, pairwise = pairwise,
knots = NULL, nPoints = nPoints)
3: .local(models, ...)
2: patternTest(sce)
1: patternTest(sce)

I'm able to run all of the other differential test functions without any issues (and before this point I followed the code in the vignette exactly). Any chance you know what could be causing this?

Thanks!

Gabi

Analysis with existing slingshot object

Hello,

Thank you very much for this package. I am trying to identify DE genes among different lineages identified with slingshot, and I was wondering if you could help me adapt the initial part of the vignette to that effect?

I get the following error when I try to follow through with the vignette

counts <- assays(sce_hem)$counts %>% as.matrix()
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘assays’ for signature ‘"SlingshotDataSet"’

Please note that sce_hem was created as shown below, with integrated_umap.data.hem and cluster_labels_hem taken from a V3 integrated Seurat object:

sce_hem <- slingshot(integrated_umap.data.hem,clusterLabels=cluster_labels_hem)
lin1_hem <- getLineages(sce_hem, cluster_labels_hem, start.clus= 'Prohem. Inactive')
lin1_hem
crv1_hem <- getCurves(lin1_hem)
crv1_hem
plot(integrated_umap.data.hem, col = ucols.states[cluster_labels_hem], pch=16, asp = 1, cex = 0.4)
lines(lin1_hem, lwd = 3, show.constraints = TRUE, col = 'black')
plot(integrated_umap.data.hem, col = ucols.states[cluster_labels_hem], pch=16, asp = 1, cex = 0.4)
lines(crv1_hem, lwd = 3, col = 'black')

Thank you for your time!

Error when plotting with greater than 9 clusters

I would like to plot the knots for my data, which contains 10 clusters, but the built-in palette only allows for 9 colors, so I get the following error when trying to use plotGeneCount:

n too large, allowed maximum for palette Set1 is 9. Returning the palette you asked for with that many colors

Can you please allow for the input of a color palette that isn't limited by "Set1" (e.g. brewer.paired())?
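
For illustration, base R can already produce a palette of any length by interpolating between a few colours. How such a vector would be passed to plotGeneCount is not shown because that argument does not exist in the version discussed here; the anchor colours below are arbitrary.

```r
# Ten clusters, one more than the nine colours the palette in the
# error message allows.
n_clusters <- 10

# Interpolate between a few anchor colours to get as many as needed.
# The anchor colours are arbitrary choices for this sketch.
make_palette <- grDevices::colorRampPalette(c("red", "blue", "darkgreen", "orange"))
cluster_cols <- make_palette(n_clusters)
```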

How to change the dot size in function plotSmoothers()?

Hello,
I tried the plotSmoothers() function. I want to make the dot size smaller than the default dot size. But I can't find a parameter to tune the dot size. Usually, we use the cex parameter in plot() to control the dot size. Therefore, is it possible to add a parameter in the plotSmoothers() function to tune the dot size? Thanks.

look into updated parallelization

We are currently using BiocParallel::bpparam, but seems like DoparParam might be a better choice, e.g. drisso/zinbwave#38

subscript out of bounds error

Hi,

I am testing your tradeSeq R package. However, I am getting an error I can't understand.
Here is my code:

load("tradseSeq_example.RData")
lin <- getLineages(dm_tmp, clusterLabels = rep(0, nrow(dm_tmp)))
crv <- getCurves(lin)
plotGeneCount(rd = dm_tmp, curve = crv, counts = counts_tmp, gene = "Lcn2")
gamList <- fitGAM(counts_tmp, pseudotime = slingPseudotime(crv), cellWeights = slingCurveWeights(crv))

Error in pseudotime[ii, which(as.logical(wSamp[ii, ]))] : subscript out of bounds

My example data can be downloaded from here: https://drive.google.com/file/d/1SZyeXXMQzXXANsRw2wiaw1l6cKh-MOyN/view?usp=sharing

I would greatly appreciate any help!

plotSmoothers error

I'm running the tutorial starting with a monocle3 generated trajectory (also from the tutorial). However, I get the following error for each call to plotSmoothers:

Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘plotSmoothers’ for signature ‘"list"’

showMethods("plotSmoothers")
Function: plotSmoothers (package tradeSeq)
models="gam"
models="SingleCellExperiment"

The object sce is a list of 240 elements, named with gene names.

Thanks for your help!
Theresa

PS - I wholeheartedly support the addition of within-trajectory comparisons across factors as suggested by jeremymsimon - we have the identical situation.

Session Info
R version 3.6.3 (2020-02-29)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] cowplot_1.0.0 clusterExperiment_2.6.1 slingshot_1.4.0 princurve_2.1.4 monocle3_0.2.1
[6] tidyr_1.0.2 dplyr_0.8.5 monocle_2.14.0 DDRTree_0.1.5 irlba_2.3.3
[11] VGAM_1.1-2 ggplot2_3.3.0 Matrix_1.2-18 SingleCellExperiment_1.8.0 SummarizedExperiment_1.16.1
[16] DelayedArray_0.12.3 BiocParallel_1.20.1 matrixStats_0.56.0 GenomicRanges_1.38.0 GenomeInfoDb_1.22.1
[21] IRanges_2.20.2 S4Vectors_0.24.4 RColorBrewer_1.1-2 tradeSeq_1.0.1 Biobase_2.46.0
[26] BiocGenerics_0.32.0 knitr_1.28

loaded via a namespace (and not attached):
[1] uuid_0.1-4 NMF_0.22.0 plyr_1.8.6 igraph_1.2.5 lazyeval_0.2.2 densityClust_0.3
[7] rncl_0.8.4 fastICA_1.2-2 gridBase_0.4-7 digest_0.6.25 foreach_1.5.0 viridis_0.5.1
[13] magrittr_1.5 memoise_1.1.0 cluster_2.1.0 doParallel_1.0.15 limma_3.42.2 annotate_1.64.0
[19] docopt_0.6.1 prettyunits_1.1.1 colorspace_1.4-1 blob_1.2.1 ggrepel_0.8.2 xfun_0.13
[25] sparsesvd_0.2 crayon_1.3.4 RCurl_1.98-1.2 genefilter_1.68.0 phylobase_0.8.10 survival_3.1-12
[31] iterators_1.0.12 ape_5.3 glue_1.4.0 registry_0.5-1 gtable_0.3.0 zlibbioc_1.32.0
[37] XVector_0.26.0 kernlab_0.9-29 Rhdf5lib_1.8.0 HDF5Array_1.14.4 scales_1.1.0 pheatmap_1.0.12
[43] DBI_1.1.0 edgeR_3.28.1 rngtools_1.5 bibtex_0.4.2.2 Rcpp_1.0.4.6 viridisLite_0.3.0
[49] xtable_1.8-4 progress_1.2.2 bit_1.1-15.2 proxy_0.4-23 httr_1.4.1 FNN_1.1.3
[55] ellipsis_0.3.0 farver_2.0.3 pkgconfig_2.0.3 XML_3.99-0.3 uwot_0.1.8 locfit_1.5-9.4
[61] labeling_0.3 howmany_0.3-1 tidyselect_1.0.0 rlang_0.4.5 softImpute_1.4 reshape2_1.4.4
[67] AnnotationDbi_1.48.0 munsell_0.5.0 tools_3.6.3 RSQLite_2.2.0 ade4_1.7-15 stringr_1.4.0
[73] yaml_2.2.1 bit64_0.9-7 purrr_0.3.4 RANN_2.6.1 pbapply_1.4-2 nlme_3.1-147
[79] slam_0.1-47 xml2_1.3.1 compiler_3.6.3 rstudioapi_0.11 tibble_3.0.1 RNeXML_2.4.3
[85] stringi_1.4.6 RSpectra_0.16-0 lattice_0.20-41 HSMMSingleCell_1.6.0 vctrs_0.2.4 pillar_1.4.3
[91] lifecycle_0.2.0 BiocManager_1.30.10 combinat_0.0-8 RcppAnnoy_0.0.16 zinbwave_1.8.0 bitops_1.0-6
[97] R6_2.4.1 gridExtra_2.3 codetools_0.2-16 MASS_7.3-51.5 assertthat_0.2.1 leidenbase_0.1.0
[103] rhdf5_2.30.1 pkgmaker_0.31.1 withr_2.2.0 qlcMatrix_0.9.7 GenomeInfoDbData_1.2.2 locfdr_1.1-8
[109] mgcv_1.8-31 hms_0.5.3 grid_3.6.3 DelayedMatrixStats_1.8.0 Rtsne_0.15

Allow SingleCellExperiment input for `fitGAM`

The object would have counts in assays and the pseudotime and cell-level weights in colData.

install_github error (Error: object ‘as.phylo.dendrogram’ is not exported by 'namespace:dendextend')

Hello tradeSeq team,
I am trying to install tradeSeq, but got the following error. I am not sure if this is caused by some dependency packages or not. Could you please have a look? Thanks.

installation error:

devtools::install_github("statOmics/tradeSeq")
Downloading GitHub repo statOmics/tradeSeq@master
Skipping 2 packages ahead of CRAN: Biobase, BiocGenerics
✔ checking for file ‘/private/var/folders/x9/m8c7s8ms3dl8m7wnl60hc3fw0000gn/T/RtmpU98tv9/remotesa9fd5959d33a/statOmics-tradeSeq-46836ab/DESCRIPTION’ ...
─ preparing ‘tradeSeq’: (418ms)
✔ checking DESCRIPTION meta-information ...
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ looking to see if a ‘data/datalist’ file should be added
─ building ‘tradeSeq_0.9.0.tar.gz’

installing source package ‘tradeSeq’ ...
** using staged installation
** R
** data
*** moving datasets to lazyload DB
** byte-compile and prepare package for lazy loading
Warning: multiple methods tables found for ‘rowSums’
Warning: multiple methods tables found for ‘colSums’
Warning: multiple methods tables found for ‘rowMeans’
Warning: multiple methods tables found for ‘colMeans’
Error: object ‘as.phylo.dendrogram’ is not exported by 'namespace:dendextend'
Execution halted
ERROR: lazy loading failed for package ‘tradeSeq’
removing ‘/Library/Frameworks/R.framework/Versions/3.6/Resources/library/tradeSeq’
Error in i.p(...) :
(converted from warning) installation of package ‘/var/folders/x9/m8c7s8ms3dl8m7wnl60hc3fw0000gn/T//RtmpU98tv9/filea9fd183ab2aa/tradeSeq_0.9.0.tar.gz’ had non-zero exit status

sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.4

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] usethis_1.5.0 devtools_2.0.2

loaded via a namespace (and not attached):
[1] Rcpp_1.0.1 ps_1.3.0 prettyunits_1.0.2 rprojroot_1.3-2 withr_2.1.2 digest_0.6.18 crayon_1.3.4
[8] assertthat_0.2.1 R6_2.4.0 backports_1.1.4 magrittr_1.5 rlang_0.3.4 cli_1.1.0 curl_3.3
[15] fs_1.3.1 remotes_2.0.4 rstudioapi_0.10 callr_3.2.0 desc_1.2.0 tools_3.6.0 glue_1.3.1
[22] pkgload_1.0.2 compiler_3.6.0 processx_3.3.1 pkgbuild_1.0.3 sessioninfo_1.1.1 memoise_1.1.0

Change colour for plotSmoothers

Hi, how can I change the colours when using plotSmoothers()?
When I try to use geom_point() etc., I just plot new points over the existing ones.
Thanks!

cluster gene expression patterns within a lineage

Thank you for developing/maintaining a great package.

Would it be possible to cluster different expression patterns within a single lineage?

I am interested in which genes' expression show the same pattern as my gene of interest along a specific lineage.

What do you think is the best way to do this?
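
One generic approach, sketched in base R on toy data: scale each gene's fitted curve along the lineage, rank genes by correlation with the gene of interest, and optionally cluster all curves. The gene names and curves are invented, and this is not a tradeSeq function.

```r
# Toy fitted expression along one lineage (genes x grid points); in
# practice these would be smoothed values on a common pseudotime grid.
grid <- seq(0, 1, length.out = 20)
fitted <- rbind(
  geneUp1  = 1 + 2 * grid,
  geneUp2  = 5 + 10 * grid,
  geneDown = 8 - 6 * grid,
  genePeak = 4 - 16 * (grid - 0.5)^2
)

# Scale each gene so that shape, not magnitude, drives the comparison.
scaled <- t(scale(t(fitted)))

# Correlation of every gene with the gene of interest.
goi <- "geneUp1"
similarity <- cor(t(scaled), scaled[goi, ])[, 1]
ranked <- sort(similarity, decreasing = TRUE)

# Optional: group all genes by pattern with hierarchical clustering.
clusters <- cutree(hclust(as.dist(1 - cor(t(scaled)))), k = 3)
```

Here geneUp2 ranks next to geneUp1 despite its larger magnitude, because only the shape of the curve is compared.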

Please let me know if anything's unclear.
Thank you so much for your help.

Association tests yield weird results in single lineage case

Hi,
Thanks for this nice package! I am trying to apply tradeSeq to single lineage simulation data, but the results seem to be incorrect.
Here is my code:

suppressMessages({library(devtools)
library(tidyverse, quietly = TRUE)
library(TSCAN)
library(plyr)
library(slingshot, quietly = TRUE)
library(RColorBrewer)
library(SingleCellExperiment)
library(scales)
library(gridExtra)
library(splatter)
library(dyntoy)
library(dynwrap)
library(mgcv)
library(mclust, quietly = TRUE)
library(tradeSeq, quietly = TRUE)
library(dynplot)
library(destiny, quietly = TRUE)})

set.seed(12)

dataset <- generate_dataset(
  model = model_linear(num_milestones = 2),
  num_cells = 500,
  num_features = 5000,
  differentially_expressed_rate = 0.01,
  normalise = TRUE
)

counts <- dataset$counts %>% as.matrix() %>% t()
colnames(counts) <- dataset$cell_ids
norms <- dataset$expression %>% t()

simu_dat <- SingleCellExperiment(assays = List(counts = counts, norms = norms))

pca <- prcomp(t(assays(simu_dat)$norms), scale. = FALSE)
rd <- pca$x[,1:3]

reducedDims(simu_dat) <- SimpleList(PCA = rd)
cl <- Mclust(rd, G = 4)$classification
colData(simu_dat)$GMM <- cl

simu_slingshot <- slingshot(simu_dat, clusterLabels = 'GMM', reducedDim = 'PCA')

lin <- getLineages(simu_dat, clusterLabels = colData(simu_slingshot)$GMM)

crv <- getCurves(lin) %>% SlingshotDataSet()

norms <- assays(simu_slingshot)$norms %>% as.matrix()

sce <- fitGAM(counts = counts, pseudotime = slingPseudotime(crv, na = FALSE),
                  sds = crv, nknots = 3, sce = FALSE)

assoRes <- associationTest(sce)
assoRes %>% as_tibble(rownames = "gene") %>% dplyr::arrange(desc(waldStat))

The results show that all genes are significant, and the Wald statistics are extremely high, which does not make sense. Would you mind checking this? Thanks!

Unable to reproduce vignette tutorial

Hi, I have been trying to follow version 1.1.13 of the vignette published here: https://statomics.github.io/tradeSeq/articles/tradeSeq.html#association-of-gene-expression-with-pseudotime
I encountered several issues:

The link to the fitGAM vignette is not working as of now (https://bioconductor.org/packages/release/bioc/vignettes/tradeSeq/inst/doc/fitGAM.html)
The function returns a list when run with the parameters specified in the vignette, however a list is not accepted as input to plotSmoothers():

> sce <- fitGAM(counts = counts, pseudotime = pseudotime, cellWeights = cellWeights,
              nknots = 19, verbose = FALSE)
> class(sce)
[1] "list"
> plotSmoothers(sce, counts, gene = sigGeneStart)
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘plotSmoothers’ for signature ‘"list"’

My output of startVsEndTest() varies from the vignette in that it lacks the medianLogFC column:

> head(startRes)
        waldStat df       pvalue
Acin1   5.673787  2 5.860744e-02
Actb    9.593747  2 8.255516e-03
Ak2     8.595465  2 1.359936e-02
Alad   58.379980  2 2.103873e-13
Alas1 172.851291  2 0.000000e+00
Aldoa  35.055247  2 2.442586e-08

If you could look into these that would be great!

I am running R 3.6.2 and the current Bioconductor release of tradeSeq (1.0.0). More info on package versions below.

> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 
 
locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
 [1] stats4    parallel  grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] SingleCellExperiment_1.8.0  SummarizedExperiment_1.16.1 DelayedArray_0.12.2         BiocParallel_1.20.1         matrixStats_0.55.0         
 [6] GenomicRanges_1.38.0        GenomeInfoDb_1.22.0         IRanges_2.20.2              S4Vectors_0.24.3            RColorBrewer_1.1-2         
[11] tradeSeq_1.0.0              Biobase_2.46.0              BiocGenerics_0.32.0         R.filesets_2.13.0           R.oo_1.23.0                
[16] R.methodsS3_1.8.0           reshape2_1.4.3              psych_1.9.12.31             cTPnet_1.0.2                reticulate_1.14            
[21] circlize_0.4.8              ComplexHeatmap_2.2.0        dplyr_0.8.3                 scales_1.1.0                slingshot_1.4.0            
[26] princurve_2.1.4             ggplot2_3.2.1               Seurat_3.1.3               

loaded via a namespace (and not attached):
  [1] R.utils_2.9.2           tidyselect_0.2.5        RSQLite_2.2.0           AnnotationDbi_1.48.0    htmlwidgets_1.5.1       Rtsne_0.15             
  [7] RNeXML_2.4.3            munsell_0.5.0           codetools_0.2-16        mutoss_0.1-12           ica_1.0-2               future_1.16.0          
 [13] withr_2.1.2             colorspace_1.4-1        uuid_0.1-4              zinbwave_1.8.0          ROCR_1.0-7              gbRd_0.4-11            
 [19] listenv_0.8.0           NMF_0.22.0              Rdpack_0.11-1           labeling_0.3            GenomeInfoDbData_1.2.2  mnormt_1.5-5           
 [25] bit64_0.9-7             farver_2.0.1            rhdf5_2.30.1            vctrs_0.2.1             TH.data_1.0-10          R6_2.4.1               
 [31] doParallel_1.0.15       clue_0.3-57             rsvd_1.0.3              locfit_1.5-9.1          bitops_1.0-6            assertthat_0.2.1       
 [37] multcomp_1.4-12         gtable_0.3.0            phylobase_0.8.10        npsurv_0.4-0            globals_0.12.5          sandwich_2.5-1         
 [43] rlang_0.4.2             genefilter_1.68.0       zeallot_0.1.0           GlobalOptions_0.1.1     splines_3.6.2           lazyeval_0.2.2         
 [49] BiocManager_1.30.10     backports_1.1.5         tools_3.6.2             gridBase_0.4-7          gplots_3.0.1.2          ggridges_0.5.2         
 [55] TFisher_0.2.0           Rcpp_1.0.3              plyr_1.8.5              progress_1.2.2          zlibbioc_1.32.0         purrr_0.3.3            
 [61] RCurl_1.98-1.1          prettyunits_1.1.1       pbapply_1.4-2           GetoptLong_0.1.8        cowplot_1.0.0           zoo_1.8-6              
 [67] ggrepel_0.8.1           cluster_2.1.0           magrittr_1.5            data.table_1.12.8       RSpectra_0.16-0         lmtest_0.9-37          
 [73] RANN_2.6.1              mvtnorm_1.0-11          packrat_0.5.0           fitdistrplus_1.0-14     R.cache_0.14.0          hms_0.5.3              
 [79] lsei_1.2-0              xtable_1.8-4            XML_3.99-0.3            gridExtra_2.3           shape_1.4.4             compiler_3.6.2         
 [85] tibble_2.1.3            KernSmooth_2.23-16      crayon_1.3.4            htmltools_0.4.0         mgcv_1.8-31             tidyr_1.0.0            
 [91] RcppParallel_4.4.4      DBI_1.1.0               howmany_0.3-1           MASS_7.3-51.5           rappdirs_0.3.1          Matrix_1.2-18          
 [97] ade4_1.7-15             gdata_2.18.0            metap_1.3               igraph_1.2.4.2          pkgconfig_2.0.3         rncl_0.8.4             
[103] sn_1.5-5                registry_0.5-1          locfdr_1.1-8            numDeriv_2016.8-1.1     plotly_4.9.2            xml2_1.2.5             
[109] foreach_1.4.8           annotate_1.64.0         rngtools_1.5            pkgmaker_0.31           multtest_2.42.0         XVector_0.26.0         
[115] bibtex_0.4.2.2          stringr_1.4.0           digest_0.6.23           sctransform_0.2.1       RcppAnnoy_0.0.14        tsne_0.1-3             
[121] softImpute_1.4          leiden_0.3.3            edgeR_3.28.1            uwot_0.1.5              sloop_1.0.1             kernlab_0.9-29         
[127] gtools_3.8.1            rjson_0.2.20            lifecycle_0.1.0         nlme_3.1-143            jsonlite_1.6            Rhdf5lib_1.8.0         
[133] clusterExperiment_2.6.1 viridisLite_0.3.0       limma_3.42.2            pillar_1.4.3            lattice_0.20-38         httr_1.4.1             
[139] plotrix_3.7-7           survival_3.1-8          glue_1.3.1              png_0.1-7               iterators_1.0.12        bit_1.1-15.2           
[145] HDF5Array_1.14.3        stringi_1.4.4           blob_1.2.1              memoise_1.1.0           caTools_1.17.1.3        irlba_2.3.3            
[151] future.apply_1.4.0      ape_5.3

Thanks,
Jerry

fitGAM: Error in Biobase::exprs(cds) : object 'cds' not found

Hello,

I started with the same issue as #32, but after correcting the sparse matrix problem I get another error:

Error in Biobase::exprs(cds) : object 'cds' not found

I am working with the devel install of tradeSeq via GitHub, installed today.

# make Seurat object
path <- Read10X(data.dir = "./16249X1/outs/filtered_feature_bc_matrix/")
donor1 <- CreateSeuratObject(counts = path, project = "16249X1", min.cells = 0, min.features = 0)

# read in list of cell barcodes of interest and subset
keep1 = read.csv("16249X1/outs/analysis/clustering/graphclust/clusters.csv")
keep1 = keep1[keep1$Cluster == "2", "Barcode"]
keep1 = gsub("-1", "", keep1, perl = FALSE)
donor1_fibroblast = subset(donor1, cells = keep1)

# convert to Monocle format and generate pseudotime
data <- as.matrix(donor1_fibroblast@assays$RNA@data) # no sparse matrix
pd <- new('AnnotatedDataFrame', data = donor1_fibroblast@meta.data)
fData <- data.frame(gene_short_name = row.names(data), row.names = row.names(data))
fd <- new('AnnotatedDataFrame', data = fData)
donor1_fibroblast_cds <- newCellDataSet(data, phenoData = pd,  featureData = fd,
          expressionFamily = negbinomial.size())

donor1_fibroblast_cds <- estimateSizeFactors(donor1_fibroblast_cds)
donor1_fibroblast_cds <- reduceDimension(donor1_fibroblast_cds, max_components = 2)
donor1_fibroblast_cds <- orderCells(donor1_fibroblast_cds)

# run tradeSeq
library(tradeSeq)
donor1_fibroblast_sce <- fitGAM(donor1_fibroblast_cds, verbose = TRUE)
Error in Biobase::exprs(cds) : object 'cds' not found

donor1_fibroblast_cds is a valid CellDataSet object and I can generate the plot of pseudotime with Monocle.

Any help would be much appreciated.

Cheers,

Brian

fitGAM error in quantile.default

When trying to run fitGAM on raw count data (a filtered matrix of 5 highly expressed genes x ~35,000 cells), I get the following error:

gamList <- fitGAM(counts = counts, pseudotime = slingPseudotime(crv, na = FALSE), cellWeights = slingCurveWeights(crv))

Error in quantile.default(x, p = p) : missing values and NaN's not allowed if 'na.rm' is FALSE

Do you have any idea what could be causing this error? I already confirmed that there are no NAs in counts, pseudotime, or cellWeights, and there are no genes with all zero counts on this filtered list.
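
The message itself comes from base R's quantile(), which rejects input containing NA or NaN unless na.rm = TRUE. Below is a minimal sketch of how the message arises, together with a check that is stricter than a plain NA test. The helper name check_finite and the toy matrices are hypothetical, and which intermediate value fitGAM passes to quantile() is not confirmed anywhere in this thread.

```r
# Reproduce the base R error with a toy vector that contains an NA.
msg <- tryCatch(
  quantile(c(0.2, NA, 0.8)),
  error = function(e) conditionMessage(e)
)

# Hypothetical helper: count values that are NA, NaN, or infinite.
# is.finite() is FALSE for all three, so it is stricter than is.na().
check_finite <- function(x) {
  sum(!is.finite(as.matrix(x)))
}

# Toy stand-ins for the real pseudotime and cellWeights matrices.
toy_pseudotime <- cbind(curve1 = c(0, 1.5, Inf), curve2 = c(0, NaN, 2))
toy_weights <- cbind(curve1 = c(1, 0.5, 0), curve2 = c(0, 0.5, 1))

n_bad_pseudotime <- check_finite(toy_pseudotime)  # counts the Inf and the NaN
n_bad_weights <- check_finite(toy_weights)        # all weights are finite
```

Running the same helper on the real counts, pseudotime, and cellWeights objects would show whether any infinite or NaN values slipped past an NA-only check.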

unable to find an inherited method for function ‘fitGAM’ for signature ‘"CellDataSet"’

Dear tradeSeq team,

Thanks for the tool.
I am working on scRNA-seq data. I installed tradeSeq and loaded it, then followed the online vignette that uses monocle. However, the function fitGAM is not working. Thanks in advance for any help!

Regards,
Rahul

library(Seurat)
library(monocle)
library(tradeSeq)
library(RColorBrewer)
library(SingleCellExperiment)
library(dplyr)
library(tidyr)
set.seed(42)

omd.data <- Read10X(data.dir = "D:/plant_new/outs/filtered_feature_bc_matrix/",gene.column = 1)
omd.dataset <- CreateSeuratObject(counts = omd.data, min.cells = 2, min.features = 200, project = "OMD")
data <- as(as.matrix(omd.dataset@assays$RNA@counts), 'sparseMatrix')

pd <- new('AnnotatedDataFrame', data = omd.dataset@meta.data)

fData <- data.frame(gene_short_name = row.names(data), row.names = row.names(data))
fd <- new('AnnotatedDataFrame', data = fData)
cds <- newCellDataSet(data,
phenoData = pd,
featureData = fd)

cds <- estimateSizeFactors(cds)
cds <- reduceDimension(cds)

cds <- orderCells(cds)
cds <- orderCells(cds, root_state = 5)
sce <- fitGAM(cds, verbose = TRUE)
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘fitGAM’ for signature ‘"CellDataSet"’

Comparing trajectories of different cell types to same stimulus

Hey @HectorRDB, fantastic work on this--it looks great! I'm just getting some trajectories prepared to try this out. I'm wondering if you had any thoughts about how best to approach a situation where, for example, you have separate trajectories for Cell Type A and Cell Type B (and perhaps more) exposed to some experimental condition and you want to do tests for differential responses. The key thing here I suppose would be to ignore baseline differences at pseudotime=0 and just test for changes relative to that starting point.

Would it make sense to include cell type as a covariate? For other models, I could imagine centring the data on the values at pseudotime=0, but given that the model takes counts as input, I don't think that would work.

Any thoughts? Looking forward to trying this out!

clusterExpressionPattern producing 'spiky' output

Hi,

I've been working through your vignette using my own data and have come to the clusterExpressionPattern section which is producing some odd looking graphs.

Any idea on why they are so spiky / what I might be doing wrong?

Thanks

Error running evaluateK

Hi,

I'm getting an error running evaluateK on my data:

"Fitting terminated with step failure - check results carefully"

I believe it's because my data are zero-inflated: if I add a pseudocount of one to the count matrix, it runs without problems. However, I'm not convinced this is a sensible thing to do.

I've tried calculating weights using the zinbwave package. However, even when setting the weights parameter of evaluateK, I receive the same error message.

Many thanks for your help!

cellWeights are 0/1

Hello,

I'm looking forward to testing this package. I have a similar goal as in issue #19, and was planning to calculate separate trajectories for mutant and wildtype cells. I can see in Monocle3 that several gene trajectories differ between the genotypes, but there's no way of quantifying it. Hopefully your package will help with that!

I am running into a problem with the fitGAM preparation from a Monocle CDS. I was able to find what I believe are the root nodes from the Monocle3 shiny interface using:

root <- cds@principal_graph_aux$UMAP$root_pr_nodes

(which may be helpful for #38), and otherwise I have followed your vignette exactly. However, when I calculate the cellWeights, I get a binary data frame of 0s and 1s. When I try to make the sce object using fitGAM, I get the following error:

Error in .assignCells(cellWeights) :
Some cells have no positive cell weights.

I understand this is because my cellWeights is largely populated by 0s, but I'm not sure how I'm ending up with the wrong weights. Do you have any suggestions for what could be happening?

Thank you,
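
The error text says that at least one cell has no positive weight in any lineage. Below is a small base R sketch for locating such cells. The toy matrix and variable names are hypothetical stand-ins for the real cellWeights, and the sketch makes no claim about why the Monocle3 conversion produced them.

```r
# Toy cell-by-lineage weight matrix; cell3 is assigned to no lineage.
cellWeights <- rbind(
  cell1 = c(lineage1 = 1, lineage2 = 0),
  cell2 = c(lineage1 = 0, lineage2 = 1),
  cell3 = c(lineage1 = 0, lineage2 = 0)
)

# Names of cells with no positive weight in any lineage.
unassigned <- rownames(cellWeights)[rowSums(cellWeights > 0) == 0]

# One possible workaround (an assumption, not advice from the developers):
# drop those cells here, and from the counts and pseudotime as well.
keep <- rowSums(cellWeights > 0) > 0
cellWeights_kept <- cellWeights[keep, , drop = FALSE]
```

If unassigned is non-empty on the real data, those are the cells that trigger the check.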

fitGAM with pre-normalized expression

I'm using Seurat to process my data and have used sctransform to transform the raw count data. I'm wondering how to go about using the fitGAM function with that data. I'm currently creating an offset matrix with all 1s, but I wanted to see if that was appropriate.

Mike

installation problems

Hi there,

When I try to install tradeSeq, I get the following error:

Error in parse_repo_spec(repo) : Invalid git repo specification: 'statOmics/tradeSeq/'

It is not a proxy problem. Any ideas?

Thank you!
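
The repo specification quoted in the error message ends with a slash, whereas the README uses the form owner/repo. Assuming the trailing slash is what the parser rejects, here is a minimal sketch of cleaning the string before installing. The variable names are illustrative only.

```r
# Repo spec as it appears in the error message, with a trailing slash.
repo <- "statOmics/tradeSeq/"

# Strip any trailing slashes so the spec has the form "owner/repo".
repo_clean <- sub("/+$", "", repo)

# Install from GitHub (left commented out so the sketch has no side effects).
# devtools::install_github(repo_clean)
```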

Error running plotSmoothers with one lineage sce object

I'm getting an error running plotSmoothers:
> plotSmoothers(sce, counts, gene = 'Fasn')
Error in pseudotime[lineageID == ii, ii] : incorrect number of dimensions

The counts, pca and clusters are from Seurat:
counts <- as.matrix(so@assays$RNA@counts)
lin <- getLineages(so@reductions$pca@cell.embeddings[, 1:10], clusterLabels = cl, start.clus = 4, end.clus=2)
crv <- getCurves(lin)
sce <- fitGAM(counts = counts[id, ], sds = crv, nknots = 7 )

I think this is due to having only one lineage and an sce object:

In the .plotSmoothers_sce function, the code pseudotime <- slingshotColData[,grep(x = colnames(slingshotColData), pattern = "pseudotime")] returns a vector, not a data frame, when there is just one curve.

Thanks,
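
The behaviour described above can be reproduced with a plain data.frame. This is a sketch only: the column names and values are made up, and the real object inside tradeSeq may be a different class. It shows that single-column subsetting drops to a vector by default, and that drop = FALSE keeps the two-dimensional shape.

```r
# Toy stand-in for the slingshot column data with a single lineage.
slingshotColData <- data.frame(
  slingshot.pseudotime.curve1 = c(0.1, 0.5, 0.9),
  slingshot.cellWeights.curve1 = c(1, 1, 1)
)
lineageID <- c(1, 1, 1)

cols <- grep(x = colnames(slingshotColData), pattern = "pseudotime")

# Default subsetting: one matching column collapses to a numeric vector.
pseudotime_vec <- slingshotColData[, cols]

# Two-index subsetting of a vector fails with the reported error.
err <- tryCatch(
  pseudotime_vec[lineageID == 1, 1],
  error = function(e) conditionMessage(e)
)

# drop = FALSE keeps a one-column data.frame, so two indices still work.
pseudotime_df <- slingshotColData[, cols, drop = FALSE]
ok <- pseudotime_df[lineageID == 1, 1]
```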

add option to compare one lineage against all others

Sometimes it can be interesting to discover marker genes for a specific lineage by comparing gene expression in that lineage against the expression of all other lineages in the trajectory. It would be nice to implement an automatic way to do this, either by aggregating p-values of the pairwise comparisons or by constructing contrast matrices for omnibus tests.
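
As an illustration of the p-value aggregation option, here is a sketch using Fisher's method in base R. This is a swapped-in, generic technique, not something tradeSeq is known to implement. The function name and the p-values are hypothetical. Fisher's method assumes independent tests, which pairwise comparisons sharing one lineage do not satisfy, so the result is illustrative only.

```r
# Fisher's method: under independence, -2 * sum(log(p)) follows a
# chi-squared distribution with 2 * k degrees of freedom (k = number of tests).
fisher_combine <- function(p) {
  stat <- -2 * sum(log(p))
  pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
}

# Hypothetical pairwise p-values for one gene: lineage 1 vs 2, and 1 vs 3.
pairwise_p <- c(l1_vs_l2 = 0.01, l1_vs_l3 = 0.02)
combined_p <- fisher_combine(pairwise_p)
```

An omnibus contrast matrix, the other option mentioned, would avoid the independence assumption but depends on the fitted model's internals.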
