
rezakj / icellr


Single (i) Cell R package (iCellR) is an interactive R package for working with high-throughput single-cell sequencing technologies (i.e., scRNA-seq, scVDJ-seq, scATAC-seq, CITE-seq and Spatial Transcriptomics (ST)).

R 99.41% C++ 0.59%
singel-cell-sequencing clustering-algorithm clustering normalization umap pseudotime scrna-seq scvdj-seq cite-seq 3d


icellr's Issues

no plots available for ADT data when using ADT + HTO

Hi all,
I hope you can help me here.
I've been following your tutorials to set up my data in the following order:

mat <- as.data.frame(as.matrix(mat.so@assays$RNA@data))
mat.adt <- as.data.frame(as.matrix(mat.so@assays$ADT@data))
mat.hto <- as.data.frame(as.matrix(mat.so@assays$HTO@data))

htos <- hto.anno(hto.data = mat.hto, cov.thr = 10, assignment.thr = 80)
htos <- subset(htos,htos$percent.match > 80)

# cell ids to hashtags
sample1 <- row.names(subset(htos,htos$assignment.annotation == "HTO1"))
sample1.rna <- mat[ , which(names(mat) %in% sample1)]
my.data <- data.aggregation(samples = c("sample1.rna","sample2.rna"), 
   condition.names = c("c","t"))
mat.icellr <- make.obj(my.data)

# Then add the adt data frame
mat.icellr <- add.adt(mat.icellr, adt.data = mat.adt)

# following with qc.stat and cell.filter

# normalize RNA
mat.icellr <- norm.data(mat.icellr, norm.method = "ranked.glsf", top.rank = 500) 
# normalize ADT
mat.icellr <- norm.adt(mat.icellr)

# 2nd QC 
mat.icellr <- qc.stats(mat.icellr,which.data = "main.data")
# gene stats
mat.icellr <- gene.stats(mat.icellr, which.data = "main.data")
# genes for PCA
# merge RNA + ADT
mat.icellr <- adt.rna.merge(mat.icellr, adt.data = "main")
# run PCA and so on. 

With this workflow I cannot plot anything for the "ADT_" features. However, when I use the ADT data alone, I am able to.
I checked the ADT data frames from both objects and they look the same, except that with HTO I end up with fewer cells because I removed doublets and negatives.

Object with ADT + HTO

###################################
,--. ,-----.       ,--.,--.,------.
`--''  .--./ ,---. |  ||  ||  .--. '
,--.|  |    | .-. :|  ||  ||  '--'.'
|  |'  '--'\   --. |  ||  ||  |
`--' `-----' `----'`--'`--'`--' '--'
###################################
An object of class iCellR version: 1.6.5
Raw/original data dimentions (rows,columns): 32285,11236
Data conditions in raw data: o,y (5822,5414)
Row names: 0610005C13Rik,0610006L08Rik,0610009B22Rik ...
Columns names: y_AAACCCAAGCAGGCAT,y_AAACCCACAGCTCTGG,y_AAACCCATCGCTGTCT ...
###################################
   QC stats performed:TRUE, PCA performed:TRUE
   Clustering performed:FALSE, Number of clusters:0
   tSNE performed:FALSE, UMAP performed:TRUE, DiffMap performed:FALSE
   Main data dimensions (rows,columns): 32310,9840
   Data conditions in main data:o,y(4958,4882)
   Normalization factors:0.689367228711468,...
   Imputed data dimensions (rows,columns):0,0
############## scVDJ-seq ###########
VDJ data dimentions (rows,columns):0,0
############## CITE-seq ############
   ADT raw data  dimensions (rows,columns):25,13190
   ADT main data  dimensions (rows,columns):25,13190
   ADT columns names:AAACCCAAGCAGGCAT...
   ADT row names:ADT_B220...
############## scATAC-seq ############
   ATAC raw data  dimensions (rows,columns):0,0
   ATAC main data  dimensions (rows,columns):0,0
   ATAC columns names:...
   ATAC row names:...
############## Spatial ###########
Spatial data dimentions (rows,columns):0,0
########### iCellR object ##########

head([email protected])[1:3]
          AAACCCAAGCAGGCAT AAACCCAAGTATGATG AAACCCAAGTCGAATA
ADT_B220          0.000000          0.00000         0.000000
ADT_CD115         9.870273          0.00000         3.290091
ADT_CD11b         0.000000         16.85752       101.145095
ADT_CD11c         0.000000         90.52983         0.000000
ADT_CD127         0.000000          0.00000        77.677947
ADT_Flt3         27.195452         45.32575        27.195452

Object with ADT only

###################################
,--. ,-----.       ,--.,--.,------.
`--''  .--./ ,---. |  ||  ||  .--. '
,--.|  |    | .-. :|  ||  ||  '--'.'
|  |'  '--'\   --. |  ||  ||  |
`--' `-----' `----'`--'`--'`--' '--'
###################################
An object of class iCellR version: 1.6.5
Raw/original data dimentions (rows,columns): 32285,13190
Data conditions: no conditions/single sample
Row names: Xkr4,Gm1992,Gm19938 ...
Columns names: AAACCCAAGCAGGCAT,AAACCCAAGTATGATG,AAACCCAAGTCGAATA ...
###################################
   QC stats performed:TRUE, PCA performed:TRUE
   Clustering performed:FALSE, Number of clusters:0
   tSNE performed:FALSE, UMAP performed:TRUE, DiffMap performed:FALSE
   Main data dimensions (rows,columns): 32310,11278
   Normalization factors:0.640626466090974,...
   Imputed data dimensions (rows,columns):0,0
############## scVDJ-seq ###########
VDJ data dimentions (rows,columns):0,0
############## CITE-seq ############
   ADT raw data  dimensions (rows,columns):25,13190
   ADT main data  dimensions (rows,columns):25,13190
   ADT columns names:AAACCCAAGCAGGCAT...
   ADT row names:ADT_B220...
############## scATAC-seq ############
   ATAC raw data  dimensions (rows,columns):0,0
   ATAC main data  dimensions (rows,columns):0,0
   ATAC columns names:...
   ATAC row names:...
############## Spatial ###########
Spatial data dimentions (rows,columns):0,0
########### iCellR object ##########

head([email protected])[1:3]
          AAACCCAAGCAGGCAT AAACCCAAGTATGATG AAACCCAAGTCGAATA
ADT_B220          0.000000          0.00000         0.000000
ADT_CD115         9.870273          0.00000         3.290091
ADT_CD11b         0.000000         16.85752       101.145095
ADT_CD11c         0.000000         90.52983         0.000000
ADT_CD127         0.000000          0.00000        77.677947
ADT_Flt3         27.195452         45.32575        27.195452

If I run

A = gene.plot(my.obj, 
	gene = "ADT_Flt3",
	plot.data.type = "umap",
	interactive = F,
	cell.transparency = 0.5)

it works as expected with the ADT-only object; with the ADT + HTO object I get just a grey plot with no expression signal.
I would appreciate it if you could help me pin down the reason.

Thanks!
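
One detail visible in the two printouts above: in the ADT + HTO object the main data columns carry a condition prefix (`y_AAACCCAAGCAGGCAT`), while the ADT columns are bare barcodes (`AAACCCAAGCAGGCAT`). Whether that mismatch is what produces the grey plot is an assumption, not something the report confirms, but it is cheap to check. Below is a minimal base-R sketch on toy data frames (all values and the short barcodes are made up) that tests for shared cell IDs and renames the ADT columns to match; in the real workflow this would presumably have to happen before `add.adt`.

```r
# Toy stand-ins for the RNA and ADT tables (values and barcodes are made up).
rna <- data.frame(y_AAAC = c(1, 0), o_CCCG = c(0, 2),
                  row.names = c("GeneA", "GeneB"))
adt <- data.frame(AAAC = c(27.2, 0), CCCG = c(45.3, 16.9), TTTG = c(3.3, 101.1),
                  row.names = c("ADT_Flt3", "ADT_CD11b"))

# The RNA columns carry a condition prefix and the ADT columns do not,
# so the two tables share no cell IDs at all.
shared.before <- intersect(colnames(rna), colnames(adt))

# Strip the "<condition>_" prefix to recover the bare barcode of each RNA cell.
# Assumes condition names themselves contain no underscore.
bare.barcodes <- sub("^[^_]+_", "", colnames(rna))

# Keep only the ADT cells that are still present in the RNA table,
# in the same order, and give them the prefixed RNA cell IDs.
adt.matched <- adt[, bare.barcodes, drop = FALSE]
colnames(adt.matched) <- colnames(rna)

shared.after <- intersect(colnames(rna), colnames(adt.matched))
print(length(shared.before))
print(length(shared.after))
```

If `shared.before` is empty on the real objects, the ADT rows merged into the main data may simply have no values for any cell that is being plotted.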

Memory management problem

Hello,

I am trying to test out iCellR, but I can't get through the data aggregation step. As soon as I load one 10x dataset, system memory usage goes above 80%, and the aggregation step fails because it cannot allocate a vector of 274 KB. This is on a system with 8 GB of RAM and nothing else running.
A system with 16 GB of RAM seems to work fine, but memory usage was still above 80%. Is there a way to use iCellR with less than 16 GB of RAM?

Thanks!
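
As a rough guide to why 8 GB fills up so quickly: a dense numeric table in R costs 8 bytes per value, and each intermediate copy costs the same again. The sketch below is only back-of-the-envelope arithmetic; the helper name `dense_gib` is made up for illustration, and the dimensions are borrowed from the object printout in the first issue, not from this reporter's dataset.

```r
# Size in GiB of one dense table of doubles (8 bytes per value).
# `dense_gib` is a hypothetical helper, not part of iCellR.
dense_gib <- function(n.rows, n.cols, bytes.per.value = 8) {
  n.rows * n.cols * bytes.per.value / 1024^3
}

# 32285 genes x 13190 cells, the raw-data dimensions printed in the first issue.
one.copy <- dense_gib(32285, 13190)
print(round(one.copy, 2))
```

A single dense copy of a table that size is already a bit over 3 GiB, so two or three temporary copies during aggregation would be enough to exhaust an 8 GB machine.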

pseudotime.tree

Hi, Rezakj,
Thanks for this powerful tool.
I ran into an issue when running pseudotime.tree. The error is

Error in `[.data.frame`(x, r, vars, drop = drop) : 
  undefined columns selected

MyGenes are correct

> MyGenes[1:5]
[1] "Cd209a" "Cd209d" "Il4i1"  "H2.Oa"  "Ccr7" 

But when I run colnames():

> colnames(my.obj)
NULL

Does this mean there are no columns, and that this is why I cannot run the function? I do have marker.genes and MyGenes.

Thanks for your help,
Yale
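
Two base-R facts may help narrow this down. First, `colnames()` on an S4 object returns `NULL` because the names live on the data frames stored in its slots, not on the object itself, so that result alone does not mean the data are missing. Second, `undefined columns selected` is what a data frame reports when at least one requested name is absent. The sketch below uses a toy S4 class (`ToyObj` is made up and is not the iCellR class, and the genes-as-columns layout is only for illustration) to show both, plus a `setdiff()` check for the names that are missing.

```r
library(methods)

# A toy S4 class with one data.frame slot (hypothetical, not the iCellR class).
setClass("ToyObj", representation(main.data = "data.frame"))
toy <- new("ToyObj", main.data = data.frame(Cd209a = c(1, 2), H2.Oa = c(0, 3)))

# colnames() on the S4 object itself is NULL; the slot has the names.
obj.colnames <- colnames(toy)
slot.colnames <- colnames(toy@main.data)

# Which requested genes are not available as columns?
MyGenes <- c("Cd209a", "H2.Oa", "Ccr7")
missing.genes <- setdiff(MyGenes, slot.colnames)

# Selecting a missing column reproduces the reported error message.
err <- tryCatch(toy@main.data[, MyGenes],
                error = function(e) conditionMessage(e))

print(obj.colnames)
print(missing.genes)
print(err)
```

Running the same `setdiff()` against the table that pseudotime.tree actually reads from would show whether any of the marker genes are absent or spelled differently there.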

Seeking help to create different types of plots

I would be very grateful if you could answer the following questions:

  1. How to make a vector for more than one gene and display it in DotPlot?
  2. How to do co-expression analysis of two markers and create bicolored co-marker gradients in FeaturePlot?
  3. I want to present the average gene expression of a specific in Violin plots?
  4. How to do RidgePlot in iCellR?
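
For the first and third questions, the underlying data step can be sketched in base R independently of any plotting function: put the gene names in a character vector with `c()`, then average each gene within each cluster. This is only an illustration on a made-up expression matrix; it does not use or describe iCellR's own plotting API.

```r
# Toy expression matrix: genes in rows, cells in columns (values are made up).
expr <- matrix(c(1, 3, 5, 7,
                 2, 4, 6, 8,
                 0, 0, 1, 1),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("Cd3e", "Cd19", "Ccr7"), paste0("cell", 1:4)))
clusters <- c(1, 1, 2, 2)  # cluster of each cell, in column order

# A vector holding more than one gene.
genes <- c("Cd3e", "Ccr7")

# Mean expression of each selected gene within each cluster
# (rows = clusters, columns = genes).
avg <- t(sapply(split(seq_along(clusters), clusters),
                function(idx) rowMeans(expr[genes, idx, drop = FALSE])))
print(avg)
```

The resulting cluster-by-gene table is the kind of summary a dot plot or a per-cluster average display is drawn from.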

convert seurat object to iCellR object

Hi:

I tried to convert a Seurat object to iCellR object with the command
my.obj <- make.obj(seurat.obj)
but got the following error. What is the reason?

Thanks in advance

Error in dimnames(x) <- dn: 'dimnames' applied to non-array
Traceback:

  1. make.obj(seurat.obj)
  2. `row.names<-`(`*tmp*`, value = gsub("-", ".", row.names(x)))
  3. `row.names<-.default`(`*tmp*`, value = gsub("-", ".", row.names(x)))
  4. `rownames<-`(x, value)
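
The traceback shows `make.obj` trying to set row names on its input, which only works on a table-like object; the other reports on this page pass it a data frame built with `as.data.frame(as.matrix(...))` from one assay of the Seurat object. The sketch below is a base-R illustration of that difference on toy data (it does not call `make.obj` or Seurat, and the exact error text for a non-table input can differ between R versions).

```r
# A small count matrix standing in for one assay (values are made up).
counts <- matrix(c(5, 0, 2, 7), nrow = 2,
                 dimnames = list(c("HLA-A", "CD3E"), c("cell1", "cell2")))

# A data frame accepts the same row-name clean-up seen in the traceback.
my.data <- as.data.frame(counts)
row.names(my.data) <- gsub("-", ".", row.names(my.data))

# An object without rows and columns does not.
not.a.table <- list(1, 2)
res <- tryCatch({
  row.names(not.a.table) <- c("a", "b")
  "ok"
}, error = function(e) "error")

print(row.names(my.data))
print(res)
```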

load10x failed

Hello,

I am trying to test out iCellR, however, I can't load my 10x data.
rm(list = ls())
options(stringsAsFactors = F)

library(iCellR)

my.data <- load10x(dir.10x ='../reads' )

Error: cannot allocate vector of size 6288.3 Gb

The reads directory contains 3 files (shown in an attached image).

Thanks!

Transform my seurat object into iCellR

Hello,
Is there a way of transforming my final Seurat object (after all the QC, Batch alignment and Clustering) into iCellR?
There is some information I would like to obtain but I can't get it from the Seurat vignettes (like Cell cycle pie graph, Graph bar illustrating the change between conditions in one cluster, among others).

I appreciate the help

an error message - running the run.knetl

Hi guys

I want to test the KNetL map on geochemical data (sparse geochemical data with 5600 rows and 19 columns). However, when I run run.knetl I get an error message.

#This is my code:

library(iCellR)
my.obj <- make.obj(data)

# Filtering the data
my.obj <- cell.filter(my.obj)
# Normalizing the data
my.obj <- norm.data(my.obj, norm.method = "ranked.glsf")

my.obj <- gene.stats(my.obj, which.data = "main.data")

# Run PCA
my.obj <- run.pca(my.obj, method = "base.mean.rank", data.type = "main")
opt.pcs.plot(my.obj)

# Run DR
my.obj <- run.pc.tsne(my.obj, dims = 1:10, perplexity = 6)
my.obj <- run.umap(my.obj, dims = 1:10)
my.obj <- run.knetl(my.obj, dims = 1:10, zoom = 300, dim.redux = "umap")

#I got an error here

Getting PCA data
Calculating euclidean distance ...
Finding 300 neighboring cells per cell ...
Generating graph from root to neighboring cells ...
Generating 2D Layouts ...
Error in dimnames(x) <- dn :
length of 'dimnames' [1] not equal to array extent
In addition: Warning messages:
1: In graph.data.frame(data, directed = FALSE) :
In d' NA' elements were replaced with string "NA"
2: In get(layout.2d)(g) :
Non-positive edge weight found, ignoring all weights during graph layout.

Can somebody help me?

Thank you in advance.
Kind regards
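
One thing worth checking, offered as a guess and not a diagnosis: the object printouts elsewhere on this page list cell barcodes under "Columns names", so columns appear to be treated as the units being embedded. With 5600 rows and 19 columns that would mean only 19 units, far fewer than the 300 neighbours requested with `zoom = 300`. The base-R sketch below just compares the neighbour count with the number of columns before and after transposing a random stand-in matrix.

```r
# Random stand-in for the geochemical table: 5600 samples x 19 variables.
geo <- matrix(runif(5600 * 19), nrow = 5600, ncol = 19)
zoom <- 300  # neighbours requested per unit in the report

# If columns are the units being embedded, the table as given has only 19.
n.units.as.given <- ncol(geo)
# Transposed, the 5600 samples become the columns.
n.units.transposed <- ncol(t(geo))

# The neighbour count has to be smaller than the number of units.
fits.as.given <- zoom < n.units.as.given
fits.transposed <- zoom < n.units.transposed

print(c(as.given = n.units.as.given, transposed = n.units.transposed))
print(c(as.given = fits.as.given, transposed = fits.transposed))
```

If that is the cause, the `NA` warning from `graph.data.frame` would be consistent with neighbours being requested that do not exist; note that `dims = 1:10` would also need at least 10 principal components to be available.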

gg.cor produces incorrect plots

Thank you for the great package.
I have noticed that the implementation of gg.cor produces incorrect results.
For example, gene1 = "TCF7" with gene2 = "IFNG" produces a plot identical to gene1 = "IFNG" with gene2 = "TCF7".

I believe this happens because the "subset" function returns the rows in the data frame's own order, while the columns used for x and y in the ggplot call are hard-coded by position.
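
The mechanism described above can be reproduced in base R without gg.cor itself. In this sketch (toy data, and both helper functions are made up for illustration) the positional version filters rows with `%in%`, which keeps the table's own row order, and then takes columns 1 and 2; the by-name version indexes with the requested gene names, so swapping the genes swaps the axes.

```r
# Toy expression table: genes in rows, cells in columns (values are made up).
expr.df <- data.frame(cell1 = c(1, 5, 9), cell2 = c(2, 6, 10),
                      row.names = c("IFNG", "TCF7", "GAPDH"))

# Filtering with %in% keeps the table's row order, so taking columns by
# position ignores which gene was requested first.
pick_positional <- function(g1, g2) {
  sub <- expr.df[rownames(expr.df) %in% c(g1, g2), ]
  d <- as.data.frame(t(sub))
  list(x = d[[1]], y = d[[2]])
}

# Indexing by name keeps gene1 on x and gene2 on y.
pick_by_name <- function(g1, g2) {
  d <- as.data.frame(t(expr.df[c(g1, g2), ]))
  list(x = d[[g1]], y = d[[g2]])
}

same.either.way <- identical(pick_positional("TCF7", "IFNG"),
                             pick_positional("IFNG", "TCF7"))
print(same.either.way)
print(pick_by_name("TCF7", "IFNG"))
```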

Mutual Nearest Neighbor Batch correction not working

I am not able to use the MNN method right now. Here is my code, followed by the error:

# Create iCellR object here
my.obj <- make.obj(ag.data)

# QC
my.obj <- qc.stats(my.obj,
s.phase.genes = s.phase, 
g2m.phase.genes = g2m.phase)

# filter
my.obj <- cell.filter(my.obj)
my.obj <- gene.stats(my.obj, which.data = "main.data")

my.obj <- make.gene.model(my.obj, my.out.put = "data",
	dispersion.limit = 1.5,
	base.mean.rank = 500,
	no.mito.model = T,
	mark.mito = T,
	interactive = F,
	no.cell.cycle = T,
	out.name = "gene.model")

library(scran)
library(BiocNeighbors)
my.obj <- run.mnn(my.obj,
    method = "gene.model",
    gene.list = [email protected],
    k=20,
    d=50)

Prepering samples ...
Running fast MNN ...
'fastMNN' is deprecated.
Use 'batchelor::fastMNN' instead.
'package:stats' may not be available when loading
'package:stats' may not be available when loading
'package:stats' may not be available when loading
'package:stats' may not be available when loading
'package:stats' may not be available when loading
'package:stats' may not be available when loading
'package:stats' may not be available when loading
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘queryKNN’ for signature ‘"missing", "NULL"’

data input from other object

Thanks for this wonderful tool!
Does this package have any functions to input data from other objects like seurat, cds or singlecellexperiment? Or is there any way to create an object with established metadata containing the clustering information? I am really looking forward to that.

Can not run findMarkers

Hi iCellR team,

Thanks for developing such a great package. The plots look really cool.
Recently I have been using iCellR to make some plots, but I get an error when I run findMarkers. It's a long description, so thanks for your patience.

I made the object from a Seurat object that includes 21 samples. I chose the integrated data to transfer. The code is here:

# transfer data from seurat object

my.data <- as.data.frame(as.matrix(Seurat_object@assays[["integrated"]]@data))

#Create iCellR object
my.obj <- make.obj(my.data)

# make sure it's made properly

slotNames(my.obj)
my.obj

###add seurat parameter
#add main data
[email protected] <- as.data.frame(as.matrix(Seurat_object@assays[["integrated"]]@data))

#add scaled data
[email protected] <- as.data.frame(as.matrix(Seurat_object@assays[["integrated"]]@scale.data))

# add cluster

Idents(Seurat_object) <- Seurat_object$seurat_clusters
Clust <- as.data.frame(as.matrix([email protected]))
colnames(Clust) <- "clusters"
[email protected] <- Clust

# add condition1

Idents(Seurat_object) <- Seurat_object$treat
treat <- as.data.frame(as.matrix([email protected]))
colnames(treat) <- "treat"
[email protected] <- treat

# add condition2

Idents(Seurat_object) <- Seurat_object$cond
Cond <- as.data.frame(as.matrix([email protected]))
colnames(Cond) <- "Cond"
[email protected] <- Cond

# add dim reductions

[email protected] <- as.data.frame(as.matrix(Seurat_object@reductions[["tsne"]]@cell.embeddings))
[email protected] <- as.data.frame(as.matrix(Seurat_object@reductions[["umap"]]@cell.embeddings))
[email protected] <- as.data.frame(as.matrix(Seurat_object@reductions[["pca"]]@cell.embeddings))

All these steps run without error, but when I run findMarkers, I get the following error again and again.

Finding markers for cluster: 1 ...
Finding markers for cluster: 2 ...
Finding markers for cluster: 3 ...
Finding markers for cluster: 4 ...
Finding markers for cluster: 5 ...
Finding markers for cluster: 6 ...
Finding markers for cluster: 7 ...
Finding markers for cluster: 8 ...
Finding markers for cluster: 9 ...
Finding markers for cluster: 10 ...
Finding markers for cluster: 11 ...
Finding markers for cluster: 12 ...
Finding markers for cluster: 13 ...
Finding markers for cluster: 14 ...
Finding markers for cluster: 15 ...
Finding markers for cluster: 16 ...
Finding markers for cluster: 17 ...
Finding markers for cluster: 18 ...
Finding markers for cluster: 19 ...
Finding markers for cluster: 20 ...
Finding markers for cluster: 21 ...
Finding markers for cluster: 22 ...
Finding markers for cluster: 23 ...
Error in t.test.default(x = mrgd[Cond1_Start:Cond1_End], y = mrgd[Cond2_Start:Cond2_End]) :
not enough 'x' observations
In addition: There were 22 warnings (use warnings() to see them)

At first, I thought the problem was MyClusts, because after running the following code the cluster numbers change from 0, 1, 2, ... to 1, 2, 3, ...

MyClusts <- as.numeric(unique(DATA$clusters))
MyClusts <- sort(MyClusts)

Then I added one step before MyClusts <- sort(MyClusts):

MyClusts <- MyClusts-1

But I still got the same error.

I also extracted the loop body and ran it for only one cluster, and I got the following error:

Finding markers for cluster: 1 ...
Error in dat : object 'dat' not found
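
The message `not enough 'x' observations` is what base R's `t.test` reports when the first group has fewer than two values under its default settings. One possible reading of the log above, stated as an assumption, is that the cluster being processed when the loop stops contains a single cell or none. The sketch below reproduces the message on toy data and lists clusters that are too small for a t-test.

```r
# Toy data: cluster 3 has a single cell (values are made up).
clusters <- c(1, 1, 1, 2, 2, 2, 3)
expr <- c(5.1, 4.8, 5.3, 1.2, 0.9, 1.1, 7.0)

# Clusters with fewer than two cells cannot be the 'x' group of a t-test.
sizes <- table(clusters)
too.small <- names(sizes)[as.vector(sizes) < 2]

# Testing the single-cell cluster against the rest reproduces the error.
msg <- tryCatch(
  t.test(x = expr[clusters == 3], y = expr[clusters != 3])$p.value,
  error = function(e) conditionMessage(e)
)

print(too.small)
print(msg)
```

Tabulating the cluster column of the real object in the same way would show whether any cluster is that small, or whether a cluster number in the loop has no cells at all.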

My sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] grid stats graphics grDevices utils datasets methods base

other attached packages:
[1] iCellR_1.4.5 plotly_4.9.2.1 gridExtra_2.3 gtable_0.3.0 RColorBrewer_1.1-2
[6] pheatmap_1.0.12 forcats_0.5.0 stringr_1.4.0 purrr_0.3.3 readr_1.3.1
[11] tidyr_1.0.2 tibble_3.0.0 tidyverse_1.3.0 pryr_0.1.4 magrittr_1.5
[16] data.table_1.12.8 cowplot_1.0.0 Matrix_1.2-18 dplyr_0.8.5 formattable_0.2.0.1
[21] classInt_0.4-3 xlsx_0.6.3 clustree_0.4.2 ggraph_2.0.2 ggplot2_3.3.0
[26] Seurat_3.1.4

loaded via a namespace (and not attached):
[1] reticulate_1.14 tidyselect_1.0.0 htmlwidgets_1.5.1 Rtsne_0.15 munsell_0.5.0
[6] codetools_0.2-16 mutoss_0.1-12 ica_1.0-2 future_1.17.0 withr_2.2.0
[11] colorspace_1.4-1 Biobase_2.46.0 knitr_1.28 rstudioapi_0.11 stats4_3.6.2
[16] ROCR_1.0-7 ggsignif_0.6.0 rJava_0.9-11 gbRd_0.4-11 listenv_0.8.0
[21] Rdpack_0.11-1 mnormt_1.5-6 polyclip_1.10-0 bit64_0.9-7 farver_2.0.3
[26] vctrs_0.2.4 generics_0.0.2 TH.data_1.0-10 xfun_0.13 R6_2.4.1
[31] graphlayouts_0.6.0 rsvd_1.0.3 hdf5r_1.3.2 reshape_0.8.8 bitops_1.0-6
[36] assertthat_0.2.1 promises_1.1.0 scales_1.1.0 nnet_7.3-12 multcomp_1.4-13
[41] npsurv_0.4-0 globals_0.12.5 tidygraph_1.1.2 sandwich_2.5-1 rlang_0.4.5
[46] scatterplot3d_0.3-41 splines_3.6.2 lazyeval_0.2.2 acepack_1.4.1 checkmate_2.0.0
[51] broom_0.5.6 reshape2_1.4.3 modelr_0.1.6 backports_1.1.6 httpuv_1.5.2
[56] Hmisc_4.4-0 tools_3.6.2 ellipsis_0.3.0 gplots_3.0.3 ggdendro_0.1-20
[61] BiocGenerics_0.32.0 ggridges_0.5.2 TFisher_0.2.0 Rcpp_1.0.4.6 plyr_1.8.6
[66] base64enc_0.1-3 progress_1.2.2 prettyunits_1.1.1 ggpubr_0.2.5 rpart_4.1-15
[71] pbapply_1.4-2 viridis_0.5.1 zoo_1.8-7 haven_2.2.0 ggrepel_0.8.2
[76] cluster_2.1.0 fs_1.4.1 NbClust_3.0 lmtest_0.9-37 reprex_0.3.0
[81] RANN_2.6.1 mvtnorm_1.1-0 fitdistrplus_1.0-14 hms_0.5.3 xlsxjars_0.6.1
[86] patchwork_1.0.0.9000 lsei_1.2-0 mime_0.9 evaluate_0.14 xtable_1.8-4
[91] jpeg_0.1-8.1 readxl_1.3.1 compiler_3.6.2 KernSmooth_2.23-16 crayon_1.3.4
[96] htmltools_0.4.0 later_1.0.0 Formula_1.2-3 lubridate_1.7.8 DBI_1.1.0
[101] tweenr_1.0.1 dbplyr_1.4.3 MASS_7.3-51.4 rappdirs_0.3.1 cli_2.0.2
[106] gdata_2.18.0 parallel_3.6.2 metap_1.3 igraph_1.2.4.2 pkgconfig_2.0.3
[111] sn_1.6-1 foreign_0.8-72 numDeriv_2016.8-1.1 xml2_1.3.0 multtest_2.42.0
[116] bibtex_0.4.2.2 rvest_0.3.5 digest_0.6.25 sctransform_0.2.1 RcppAnnoy_0.0.16
[121] tsne_0.1-3 rmarkdown_2.1 cellranger_1.1.0 leiden_0.3.3 htmlTable_1.13.3
[126] uwot_0.1.8 shiny_1.4.0.2 gtools_3.8.1 lifecycle_0.2.0 nlme_3.1-142
[131] jsonlite_1.6.1 viridisLite_0.3.0 fansi_0.4.1 pillar_1.4.3 lattice_0.20-38
[136] fastmap_1.0.1 httr_1.4.1 plotrix_3.7-8 survival_3.1-8 glue_1.4.0
[141] png_0.1-7 bit_1.1-15.2 ggforce_0.3.1 class_7.3-15 stringi_1.4.6
[146] latticeExtra_0.6-29 caTools_1.18.0 irlba_2.3.3 e1071_1.7-3 future.apply_1.5.0
[151] ape_5.3

Thanks for your help.

Best,
Yale

Can not run the cluster.plot and the run.diff.exp

Hello Reza,

I am using a data matrix to make an iCellR object and run the workflow. But when I run cluster.plot and run.diff.exp, I get errors again and again.

The cluster.plot error is:

Error in `[.data.frame`(DATA, , 2) : undefined columns selected

I tested this and found that the following code works:

`[.data.frame`(DATA)

But I don't know how to change it.

The run.diff.exp error is:

Error in `[.data.frame`(dat, , Cluster0) : object 'Cluster0' not found

As my clusters are 1 to 11, I changed my clusters to start at 0 (-1) or to Cluster0 (paste0("Cluster", Info$cluster-1)), but it still doesn't work.

I keep running into this cluster issue. When I transfer the data from Seurat, the clusters are 0-10, and findMarkers fails because of the unmatched clusters. If the clusters start from 1, findMarkers works but run.diff.exp doesn't.
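
The shift from 0, 1, 2, ... to 1, 2, 3, ... described in these reports is consistent with a general R behaviour: calling `as.numeric()` on a factor returns the internal level codes, which start at 1, not the printed labels. Assuming the cluster column arrives from Seurat as a factor, converting through `as.character()` first keeps the labels, and adding 1 afterwards gives explicit 1-based IDs. A toy sketch:

```r
# Cluster labels as a factor (toy values; levels sort as strings here).
seurat.clusters <- factor(c("0", "1", "2", "10", "1"))

# as.numeric() on a factor gives level codes, not the labels.
codes <- as.numeric(seurat.clusters)

# Going through as.character() recovers the labels themselves.
labels <- as.numeric(as.character(seurat.clusters))

# Explicit 1-based cluster IDs in a one-column data frame named "clusters".
one.based <- labels + 1
Clust <- data.frame(clusters = one.based,
                    row.names = paste0("cell", seq_along(one.based)))

print(codes)
print(labels)
print(Clust)
```

In this toy the codes are not even a clean shift of the labels, because the levels are sorted as strings; deciding on one numbering up front and using it for every later step avoids that ambiguity.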

The print(my.obj)
###################################
An object of class iCellR version: 1.4.5
Raw/original data dimentions (rows,columns): 19831,92889
Data conditions: no conditions/single sample
Row names: RP11-34P13.7,FO538757.2,AP006222.2 ...
Columns names: AAACCTGAGACTGTAA.1,AAACCTGCAAGGTTTC.1,AAACCTGCATCTGGTA.1 ...
###################################
QC stats performed:FALSE, PCA performed:FALSE
Clustering performed:TRUE, Number of clusters:11
tSNE performed:TRUE, UMAP performed:FALSE, DiffMap performed:FALSE
Main data dimensions (rows,columns): 2468,92889
Normalization factors:,...
Imputed data dimensions (rows,columns):0,0
############## scVDJ-seq ###########
VDJ data dimentions (rows,columns):0,0
############## CITE-seq ############
ADT raw data dimensions (rows,columns):0,0
ADT main data dimensions (rows,columns):0,0
ADT columns names:...
ADT row names:...
############## scATAC-seq ############
ATAC raw data dimensions (rows,columns):0,0
ATAC main data dimensions (rows,columns):0,0
ATAC columns names:...
ATAC row names:...
############## Spatial ###########
Spatial data dimentions (rows,columns):0,0
########### iCellR object ##########

My sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] grid stats graphics grDevices utils datasets methods base

other attached packages:
[1] iCellR_1.4.5 plotly_4.9.1 hdf5r_1.3.0 gtable_0.3.0 RColorBrewer_1.1-2
[6] pheatmap_1.0.12 forcats_0.4.0 stringr_1.4.0 purrr_0.3.3 readr_1.3.1
[11] tidyr_1.0.0 tibble_2.1.3 ggplot2_3.2.1 tidyverse_1.3.0 pryr_0.1.4
[16] magrittr_1.5 data.table_1.12.6 xlsx_0.6.1 cowplot_1.0.0 Matrix_1.2-17
[21] dplyr_0.8.3 Seurat_3.1.1 formattable_0.2.0.1 classInt_0.4-2

loaded via a namespace (and not attached):
[1] rappdirs_0.3.1 R.methodsS3_1.7.1 acepack_1.4.1 bit64_0.9-7 knitr_1.26
[6] irlba_2.3.3 multcomp_1.4-11 R.utils_2.9.2 rpart_4.1-15 generics_0.0.2
[11] metap_1.2 BiocGenerics_0.30.0 TH.data_1.0-10 RSQLite_2.1.4 RANN_2.6.1
[16] europepmc_0.3 future_1.15.1 bit_1.1-14 enrichplot_1.4.0 mutoss_0.1-12
[21] xml2_1.2.2 lubridate_1.7.4 httpuv_1.5.2 assertthat_0.2.1 viridis_0.5.1
[26] xfun_0.11 hms_0.5.2 rJava_0.9-11 evaluate_0.14 promises_1.1.0
[31] fansi_0.4.0 progress_1.2.2 caTools_1.17.1.2 dbplyr_1.4.2 readxl_1.3.1
[36] igraph_1.2.4.1 DBI_1.1.0 htmlwidgets_1.5.1 reshape_0.8.8 stats4_3.6.1
[41] ggpubr_0.2.5 backports_1.1.5 gbRd_0.4-11 RcppParallel_4.4.4 vctrs_0.2.0
[46] Biobase_2.44.0 Cairo_1.5-10 ROCR_1.0-7 withr_2.1.2 ggforce_0.3.1
[51] triebeard_0.3.0 checkmate_1.9.4 sctransform_0.2.0 prettyunits_1.0.2 mnormt_1.5-5
[56] cluster_2.1.0 DOSE_3.10.2 ape_5.3 lazyeval_0.2.2 crayon_1.3.4
[61] labeling_0.3 pkgconfig_2.0.3 tweenr_1.0.1 nlme_3.1-140 nnet_7.3-12
[66] rlang_0.4.1 globals_0.12.5 lifecycle_0.1.0 sandwich_2.5-1 modelr_0.1.5
[71] rsvd_1.0.2 cellranger_1.1.0 polyclip_1.10-0 lmtest_0.9-37 graph_1.62.0
[76] urltools_1.7.3 zoo_1.8-6 reprex_0.3.0 base64enc_0.1-3 ggridges_0.5.1
[81] png_0.1-7 viridisLite_0.3.0 bitops_1.0-6 R.oo_1.23.0 KernSmooth_2.23-15
[86] blob_1.2.0 qvalue_2.16.0 jpeg_0.1-8.1 NbClust_3.0 gridGraphics_0.4-1
[91] ggsignif_0.6.0 S4Vectors_0.22.1 reactome.db_1.68.0 scales_1.1.0 memoise_1.1.0
[96] graphite_1.30.0 plyr_1.8.4 ica_1.0-2 gplots_3.0.1.1 bibtex_0.4.2
[101] gdata_2.18.0 compiler_3.6.1 lsei_1.2-0 plotrix_3.7-7 fitdistrplus_1.0-14
[106] cli_2.0.0 listenv_0.8.0 pbapply_1.4-2 htmlTable_1.13.3 Formula_1.2-3
[111] MASS_7.3-51.4 tidyselect_0.2.5 stringi_1.4.3 GOSemSim_2.10.0 latticeExtra_0.6-29
[116] ggrepel_0.8.1 fastmatch_1.1-0 tools_3.6.1 future.apply_1.3.0 parallel_3.6.1
[121] rstudioapi_0.10 foreign_0.8-71 gridExtra_2.3 scatterplot3d_0.3-41 farver_2.0.1
[126] Rtsne_0.15 ggraph_2.0.0 digest_0.6.22 rvcheck_0.1.7 BiocManager_1.30.10
[131] shiny_1.4.0 Rcpp_1.0.2 broom_0.5.3 SDMTools_1.1-221.1 later_1.0.0
[136] RcppAnnoy_0.0.13 httr_1.4.1 ggdendro_0.1-20 AnnotationDbi_1.46.1 npsurv_0.4-0
[141] Rdpack_0.11-1 colorspace_1.4-1 rvest_0.3.5 fs_1.3.1 reticulate_1.13
[146] IRanges_2.18.3 splines_3.6.1 uwot_0.1.4 sn_1.5-4 graphlayouts_0.5.0
[151] xlsxjars_0.6.1 multtest_2.40.0 ggplotify_0.0.4 xtable_1.8-4 jsonlite_1.6
[156] tidygraph_1.1.2 UpSetR_1.4.0 zeallot_0.1.0 R6_2.4.1 TFisher_0.2.0
[161] Hmisc_4.3-0 pillar_1.4.3 htmltools_0.4.0 mime_0.8 glue_1.3.1
[166] fastmap_1.0.1 BiocParallel_1.18.1 class_7.3-15 codetools_0.2-16 fgsea_1.10.1
[171] tsne_0.1-3 mvtnorm_1.0-11 lattice_0.20-38 numDeriv_2016.8-1.1 leiden_0.3.1
[176] gtools_3.8.1 ReactomePA_1.28.0 zip_2.0.4 GO.db_3.8.2 survival_2.44-1.1
[181] rmarkdown_2.0 munsell_0.5.0 e1071_1.7-3 DO.db_2.9 haven_2.2.0
[186] reshape2_1.4.3

Thanks,
Yale

Can I integrate data with iCellR?

I appreciate this great tool for analysis. Is there an integration function for multiple datasets?
I have data from about 50 patients (all 10x files) that I want to merge into one object. I found the "Combined Principal Component Alignment (CPCA)" method, but how can I use it for my data? Thank you very much.

TPM instead of count data & no conditions

Hi @rezakj

Thank you for this great resource! I was just wondering, if I only have TPM data instead of count data, would that be okay for make.obj()?

Also, if I have no conditions like in example 1, how would I use data.aggregation()? And would it affect the later function calls?

Thanks!

How to normalize data with single condition?

Hello,

I created an object from single-condition data and tried to normalize it with ranked.glsf, but main.data is empty. I also tried to normalize it with no.norm, but main.data is still empty. Could you tell me how to fix this?

my.obj <- norm.data(my.obj, norm.method = "ranked.glsf", top.rank = 500)
my.obj <- norm.data(my.obj, norm.method = "no.norm")

data.scale question

Hi,

Does the data.scale function process raw counts or normalized data?

Thanks

VDJ analysis: can I use filtered.contig? Error when I run clono.plot

Hi rezakj, thank you for your useful package!

I have two questions:

  1. Can I run prep.vdj on filtered_contig_annotations.csv, or is it better to use all_contig_annotations.csv?
  2. I have an issue with the clono.plot function. I'll summarize the code so that maybe you can help me find the problem:

#Convert Seurat to iCellR object
my.data <- as.data.frame(as.matrix(sobj@assays$RNA@counts))
my.obj <- make.obj(my.data)

#Add Seurat's data
myUMAP <- as.data.frame(sobj@reductions$[email protected])
myPCA <- as.data.frame(sobj@reductions$[email protected])
Clust <- as.data.frame(as.matrix(Idents(sobj)))
colnames(Clust) <- "clusters"
[email protected] <- as.data.frame(sobj[["RNA"]]@data)
[email protected] <- myUMAP
[email protected] <- myPCA
[email protected] <- Clust
my.obj@metadata <- [email protected]

#Check the new object
slotNames(my.obj)
my.obj

#Read in contig annotated files
contig_batch1 <- paste(PrimaryDirectory, "data_VDJ/20018-01", sep ='/')
contig_batch2 <- paste(PrimaryDirectory, "data_VDJ/20018-04", sep ='/')
my.vdj.1 <- read.csv(paste(contig_batch1,"06/S06.csv", sep = "/"))
my.vdj.2 <- read.csv(paste(contig_batch1,"07/S07.csv", sep = "/"))
my.vdj.3 <- read.csv(paste(contig_batch2, "01/filtered_contig_annotations.csv", sep = "/"))
my.vdj.4 <- read.csv(paste(contig_batch2, "02/filtered_contig_annotations.csv", sep = "/"))
my.vdj.5 <- read.csv(paste(contig_batch2, "03/filtered_contig_annotations.csv", sep = "/"))
my.vdj.6 <- read.csv(paste(contig_batch2, "04/filtered_contig_annotations.csv", sep = "/"))
my.vdj.7 <- read.csv(paste(contig_batch2, "05/filtered_contig_annotations.csv", sep = "/"))
my.vdj.8 <- read.csv(paste(contig_batch2, "06/filtered_contig_annotations.csv", sep = "/"))

#Run prep.vdj
my.vdj.1 <- prep.vdj(vdj.data = my.vdj.1, cond.name = "CR_bas_229")
my.vdj.2 <- prep.vdj(vdj.data = my.vdj.2, cond.name = "CR_post_229")
my.vdj.3 <- prep.vdj(vdj.data = my.vdj.3, cond.name = "NR_bas_V09")
my.vdj.4 <- prep.vdj(vdj.data = my.vdj.4, cond.name = "NR_post_V09")
my.vdj.5 <- prep.vdj(vdj.data = my.vdj.5, cond.name = "NR_bas_219")
my.vdj.6 <- prep.vdj(vdj.data = my.vdj.6, cond.name = "NR_post_219")
my.vdj.7 <- prep.vdj(vdj.data = my.vdj.7, cond.name = "CR_bas_264")
my.vdj.8 <- prep.vdj(vdj.data = my.vdj.8, cond.name = "CR_post_264")

#Create vdj dataframe and add it to iCellR object
my.vdj.data <- rbind(my.vdj.1, my.vdj.2, my.vdj.3, my.vdj.4, my.vdj.5, my.vdj.6, my.vdj.7, my.vdj.8)
my.obj <- add.vdj(my.obj, vdj.data = my.vdj.data)

head(my.vdj.data)
raw_clonotype_id barcode is_cell contig_id high_confidence length chain v_gene d_gene j_gene c_gene full_length productive
1 clonotype1 CR_bas_229_TGGTTCCGTTGCTCCT.1 True TGGTTCCGTTGCTCCT-1_contig_1 True 530 TRB TRBV18 None TRBJ1-2 TRBC1 True True
2 clonotype1 CR_bas_229_GATCAGTGTTAGATGA.1 True GATCAGTGTTAGATGA-1_contig_1 True 489 TRA TRAV38-1 None TRAJ38 TRAC True True
3 clonotype1 CR_bas_229_GTTACAGTCCACGCAG.1 True GTTACAGTCCACGCAG-1_contig_2 True 530 TRB TRBV18 None TRBJ1-2 TRBC1 True True
4 clonotype1 CR_bas_229_CACCAGGAGTTAGCGG.1 True CACCAGGAGTTAGCGG-1_contig_2 True 573 TRA TRAV38-1 None TRAJ38 TRAC True True
5 clonotype1 CR_bas_229_CGATCGGTCGTCTGAA.1 True CGATCGGTCGTCTGAA-1_contig_1 True 479 TRA TRAV38-1 None TRAJ38 TRAC True True
6 clonotype1 CR_bas_229_TAAGTGCAGATAGTCA.1 True TAAGTGCAGATAGTCA-1_contig_2 True 514 TRB TRBV18 None TRBJ1-2 TRBC1 True True
cdr3 cdr3_nt reads umis raw_consensus_id my.raw_clonotype_id clonotype.Freq proportion total.colonotype
1 CASGAGPRGGYTF TGTGCCAGCGGCGCCGGGCCCCGAGGGGGCTACACCTTC 8090 10 clonotype1_consensus_1 clonotype1 153 0.05103402 2058
2 CAFMKHVISNNRKLIW TGTGCTTTCATGAAGCATGTTATCTCCAACAACCGTAAGCTGATTTGG 10824 6 clonotype1_consensus_2 clonotype1 153 0.05103402 2058
3 CASGAGPRGGYTF TGTGCCAGCGGCGCCGGGCCCCGAGGGGGCTACACCTTC 9800 12 clonotype1_consensus_1 clonotype1 153 0.05103402 2058
4 CAFMKHVISNNRKLIW TGTGCTTTCATGAAGCATGTTATCTCCAACAACCGTAAGCTGATTTGG 4944 6 clonotype1_consensus_2 clonotype1 153 0.05103402 2058
5 CAFMKHVISNNRKLIW TGTGCTTTCATGAAGCATGTTATCTCCAACAACCGTAAGCTGATTTGG 770 5 clonotype1_consensus_2 clonotype1 153 0.05103402 2058
6 CASGAGPRGGYTF TGTGCCAGCGGCGCCGGGCCCCGAGGGGGCTACACCTTC 8696 10 clonotype1_consensus_1 clonotype1 153 0.05103402 2058

clonotype.frequency <- as.data.frame(sort(table(as.character(as.matrix(([email protected])[1]))),decreasing = TRUE))

head(clonotype.frequency)
Var1 Freq
1 clonotype1 1577
2 clonotype2 870
3 clonotype3 611
4 clonotype4 516
5 clonotype7 370
6 clonotype6 361

clono.plot(my.obj, plot.data.type = "umap",
clonotype.column = 1,
barcode.column = 2,
clono = "clonotype1",
conds.to.plot = NULL,
cell.transparency = 1,
clust.dim = 2,
interactive = F)

Error in `$<-.data.frame`(`*tmp*`, "MyCol", value = "red") :
replacement has 1 row, data has 0

sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] scales_1.1.1 circlize_0.4.10 scater_1.16.2 SingleCellExperiment_1.12.0 SummarizedExperiment_1.20.0 Biobase_2.50.0
[7] GenomicRanges_1.42.0 GenomeInfoDb_1.26.1 IRanges_2.24.0 S4Vectors_0.28.0 BiocGenerics_0.36.0 MatrixGenerics_1.2.0
[13] matrixStats_0.57.0 rstudioapi_0.13 scRepertoire_0.99.16 reshape2_1.4.4 ggalluvial_0.12.2 dplyr_1.0.2
[19] cowplot_1.1.0 Seurat_3.2.0 iCellR_1.5.8 plotly_4.9.2.1 ggplot2_3.3.2

loaded via a namespace (and not attached):
[1] reticulate_1.18 tidyselect_1.1.0 htmlwidgets_1.5.2 BiocParallel_1.22.0 grid_4.0.2 Rtsne_0.15
[7] munsell_0.5.0 codetools_0.2-16 ica_1.0-2 future_1.19.1 miniUI_0.1.1.1 withr_2.3.0
[13] colorspace_2.0-0 knitr_1.30 ROCR_1.0-11 ggsignif_0.6.0 tensor_1.5 listenv_0.8.0
[19] GenomeInfoDbData_1.2.4 polyclip_1.10-0 bit64_4.0.5 pheatmap_1.0.12 vctrs_0.3.5 generics_0.1.0
[25] xfun_0.19 R6_2.5.0 doParallel_1.0.16 ggbeeswarm_0.6.0 rsvd_1.0.3 VGAM_1.1-3
[31] hdf5r_1.3.3 bitops_1.0-6 spatstat.utils_1.17-0 reshape_0.8.8 DelayedArray_0.16.0 promises_1.1.1
[37] nnet_7.3-14 beeswarm_0.2.3 gtable_0.3.0 globals_0.13.1 goftest_1.2-2 rlang_0.4.8
[43] scatterplot3d_0.3-41 GlobalOptions_0.1.2 splines_4.0.2 rstatix_0.6.0 lazyeval_0.2.2 broom_0.7.2
[49] checkmate_2.0.0 abind_1.4-5 backports_1.2.0 httpuv_1.5.4 Hmisc_4.4-1 tools_4.0.2
[55] cubature_2.0.4.1 ellipsis_0.3.1 RColorBrewer_1.1-2 ggdendro_0.1.22 ggridges_0.5.2 Rcpp_1.0.5
[61] plyr_1.8.6 base64enc_0.1-3 progress_1.2.2 zlibbioc_1.36.0 purrr_0.3.4 RCurl_1.98-1.2
[67] prettyunits_1.1.1 ggpubr_0.4.0 rpart_4.1-15 deldir_0.1-29 viridis_0.5.1 pbapply_1.4-3
[73] zoo_1.8-8 haven_2.3.1 ggrepel_0.8.2 cluster_2.1.0 colorRamps_2.3 magrittr_2.0.1
[79] data.table_1.13.2 openxlsx_4.2.2 SparseM_1.78 NbClust_3.0 lmtest_0.9-38 RANN_2.6.1
[85] truncdist_1.0-2 fitdistrplus_1.1-1 gsl_2.1-6 hms_0.5.3 patchwork_1.1.0 mime_0.9
[91] xtable_1.8-4 rio_0.5.16 jpeg_0.1-8.1 readxl_1.3.1 shape_1.4.5 gridExtra_2.3
[97] compiler_4.0.2 tibble_3.0.4 KernSmooth_2.23-17 crayon_1.3.4 htmltools_0.5.0 mgcv_1.8-33
[103] later_1.1.0.1 Formula_1.2-4 tidyr_1.1.2 powerTCR_1.8.0 MASS_7.3-53 rappdirs_0.3.1
[109] Matrix_1.2-18 car_3.0-10 permute_0.9-5 evd_2.3-3 igraph_1.2.6 forcats_0.5.0
[115] pkgconfig_2.0.3 foreign_0.8-80 foreach_1.5.1 vipor_0.4.5 stringdist_0.9.6.3 XVector_0.30.0
[121] stringr_1.4.0 digest_0.6.27 sctransform_0.2.1 RcppAnnoy_0.0.17 vegan_2.5-6 spatstat.data_1.4-3
[127] Biostrings_2.56.0 cellranger_1.1.0 leiden_0.3.3 htmlTable_2.1.0 uwot_0.1.8 DelayedMatrixStats_1.10.1
[133] curl_4.3 evmix_2.12 shiny_1.5.0 lifecycle_0.2.0 nlme_3.1-149 jsonlite_1.7.1
[139] BiocNeighbors_1.6.0 carData_3.0-4 viridisLite_0.3.0 pillar_1.4.7 lattice_0.20-41 tcR_2.3.2
[145] fastmap_1.0.1 httr_1.4.2 survival_3.2-7 glue_1.4.2 zip_2.1.1 spatstat_1.64-1
[151] png_0.1-7 iterators_1.0.13 bit_4.0.4 stringi_1.5.3 ggfittext_0.9.0 BiocSingular_1.4.0
[157] latticeExtra_0.6-29 irlba_2.3.3 future.apply_1.6.0 ape_5.4-1
Any suggestion on how to solve this issue?

Thank you very much

Francesco

converting old iCellR object to new object

Hi!

I received the following error while trying to remove clusters in one of my objects:

Error in clust.rm(my.obj, clust.to.rm = 1):
no slot of name "knetl.data" for this object of class "iCellR"

Given that KNetL is a new feature in iCellR, my older object is likely incompatible. Is there a way to convert my old object into a new iCellR object?
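The error suggests the saved object was created before the `knetl.data` slot existed. As a minimal sketch of the general S4 technique (not an iCellR function), one could create a fresh object of the current class and copy over every slot the old object actually carries. The classes `OldObj` and `NewObj` and the helper `upgrade_object` below are hypothetical stand-ins; whether empty defaults in the new slots are enough for a real iCellR object is an assumption.

```r
library(methods)

# Toy classes standing in for an older and a newer version of a package class
# (hypothetical; these are not iCellR's real class definitions).
setClass("OldObj", slots = c(raw.data = "data.frame", umap.data = "data.frame"))
setClass("NewObj", slots = c(raw.data = "data.frame", umap.data = "data.frame",
                             knetl.data = "data.frame"))

# Build a fresh object of the new class and copy every slot the old object has.
# S4 slots are stored as attributes, so names(attributes(old)) lists what an
# instance really contains, even if the class definition has changed since.
upgrade_object <- function(old, new_class) {
  new_obj <- new(new_class)
  shared <- intersect(names(attributes(old)), slotNames(new_class))
  for (s in shared) {
    slot(new_obj, s) <- attr(old, s, exact = TRUE)
  }
  new_obj
}

old <- new("OldObj", raw.data = data.frame(g1 = 1:3))
upgraded <- upgrade_object(old, "NewObj")
slotNames(upgraded)  # now includes "knetl.data", left at its empty default
```

Slots that exist only in the new class keep their prototype values, so any analysis that fills them would still have to be rerun.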

Error: cannot allocate vector of size 6288.3 Gb

Hi there,

I tried to analyze 10x scRNA-seq data with the iCellR workflow, but I failed to load the data (barcodes.tsv, features.tsv, matrix.mtx) on RStudio Server. The error was 'Error: cannot allocate vector of size 6288.3 Gb'. I also tried on my Mac, which failed as well, with 'Error: vector memory exhausted (limit reached?)'. I tried to raise the memory limit on my Mac, but that did not help.

I then used the fread function to inspect the structure of my data and compare it with the example data. The only differences are the dimensions of the data frame and my features.tsv: the example genes.tsv has only 2 columns, while mine has 3, so I deleted the extra column. Loading still fails with the same error.

  1. Session info of the RStudio server
    `R version 3.6.3 (2020-02-29)
    Platform: x86_64-pc-linux-gnu (64-bit)
    Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /home/fany/.local/share/r-miniconda/envs/r-reticulate/lib/libmkl_rt.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] Matrix_1.2-18 readr_1.3.1 data.table_1.12.8 SeuratData_0.2.1 patchwork_1.0.1
[6] stringr_1.4.0 dplyr_1.0.0 Seurat_3.1.5 iCellR_1.5.1 plotly_4.9.2.1
[11] ggplot2_3.3.2

loaded via a namespace (and not attached):
[1] Rtsne_0.15 colorspace_1.4-1 ggsignif_0.6.0 ellipsis_0.3.1 rio_0.5.16
[6] ggridges_0.5.2 htmlTable_2.0.0 base64enc_0.1-3 ggdendro_0.1-20 rstudioapi_0.11
[11] leiden_0.3.3 ggpubr_0.4.0 listenv_0.8.0 ggrepel_0.8.2 bit64_0.9-7
[16] fansi_0.4.1 codetools_0.2-16 splines_3.6.3 knitr_1.29 Formula_1.2-3
[21] jsonlite_1.7.0 ica_1.0-2 broom_0.5.6 cluster_2.1.0 png_0.1-7
[26] pheatmap_1.0.12 uwot_0.1.8 sctransform_0.2.1 shiny_1.5.0 compiler_3.6.3
[31] httr_1.4.1 backports_1.1.8 assertthat_0.2.1 fastmap_1.0.1 lazyeval_0.2.2
[36] cli_2.0.2 later_1.1.0.1 acepack_1.4.1 htmltools_0.5.0 prettyunits_1.1.1
[41] tools_3.6.3 rsvd_1.0.3 igraph_1.2.5 gtable_0.3.0 glue_1.4.1
[46] reshape2_1.4.4 RANN_2.6.1 rappdirs_0.3.1 Rcpp_1.0.4.6 carData_3.0-4
[51] cellranger_1.1.0 vctrs_0.3.1 ape_5.4 nlme_3.1-144 lmtest_0.9-37
[56] xfun_0.15 globals_0.12.5 openxlsx_4.1.5 irlba_2.3.3 mime_0.9
[61] lifecycle_0.2.0 rstatix_0.6.0 future_1.17.0 zoo_1.8-8 MASS_7.3-51.5
[66] scales_1.1.1 hms_0.5.3 promises_1.1.1 parallel_3.6.3 NbClust_3.0
[71] RColorBrewer_1.1-2 curl_4.3 pbapply_1.4-2 reticulate_1.16 gridExtra_2.3
[76] rpart_4.1-15 reshape_0.8.8 latticeExtra_0.6-29 stringi_1.4.6 checkmate_2.0.0
[81] zip_2.0.4 rlang_0.4.6 pkgconfig_2.0.3 lattice_0.20-40 ROCR_1.0-11
[86] purrr_0.3.4 htmlwidgets_1.5.1 cowplot_1.0.0 bit_1.1-15.2 tidyselect_1.1.0
[91] RcppAnnoy_0.0.16 plyr_1.8.6 magrittr_1.5 R6_2.4.1 generics_0.0.2
[96] Hmisc_4.4-0 pillar_1.4.4 haven_2.3.1 foreign_0.8-75 withr_2.2.0
[101] fitdistrplus_1.1-1 survival_3.1-11 scatterplot3d_0.3-41 abind_1.4-5 nnet_7.3-13
[106] tsne_0.1-3 tibble_3.0.1 future.apply_1.5.0 crayon_1.3.4 car_3.0-8
[111] hdf5r_1.3.2 KernSmooth_2.23-16 jpeg_0.1-8.1 progress_1.2.2 grid_3.6.3
[116] readxl_1.3.1 forcats_0.5.0 digest_0.6.25 xtable_1.8-4 tidyr_1.1.0
[121] httpuv_1.5.4 munsell_0.5.0 viridisLite_0.3.0 `

  2. Session info of my Mac
    `R version 3.6.3 (2020-02-29)
    Platform: x86_64-apple-darwin15.6.0 (64-bit)
    Running under: macOS Catalina 10.15.2

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] iCellR_1.5.4 plotly_4.9.2.1 ggplot2_3.3.2

loaded via a namespace (and not attached):
[1] nlme_3.1-148 bit64_0.9-7 RColorBrewer_1.1-2 progress_1.2.2
[5] httr_1.4.1 tools_3.6.3 backports_1.1.8 R6_2.4.1
[9] rpart_4.1-15 Hmisc_4.4-0 uwot_0.1.8 lazyeval_0.2.2
[13] colorspace_1.4-1 nnet_7.3-14 withr_2.2.0 tidyselect_1.1.0
[17] gridExtra_2.3 prettyunits_1.1.1 bit_1.1-15.2 curl_4.3
[21] compiler_3.6.3 htmlTable_2.0.0 hdf5r_1.3.2 ggdendro_0.1-20
[25] scales_1.1.1 checkmate_2.0.0 stringr_1.4.0 digest_0.6.25
[29] foreign_0.8-75 rmarkdown_2.3 rio_0.5.16 base64enc_0.1-3
[33] jpeg_0.1-8.1 pkgconfig_2.0.3 htmltools_0.5.0 fastmap_1.0.1
[37] readxl_1.3.1 htmlwidgets_1.5.1 rlang_0.4.6 rstudioapi_0.11
[41] shiny_1.5.0 generics_0.0.2 jsonlite_1.7.0 acepack_1.4.1
[45] dplyr_1.0.0 zip_2.0.4 car_3.0-8 magrittr_1.5
[49] Formula_1.2-3 NbClust_3.0 Matrix_1.2-18 Rcpp_1.0.4.6
[53] munsell_0.5.0 abind_1.4-5 ape_5.4 lifecycle_0.2.0
[57] scatterplot3d_0.3-41 stringi_1.4.6 yaml_2.2.1 carData_3.0-4
[61] MASS_7.3-51.6 Rtsne_0.15 plyr_1.8.6 grid_3.6.3
[65] parallel_3.6.3 promises_1.1.1 ggrepel_0.8.2 forcats_0.5.0
[69] crayon_1.3.4 lattice_0.20-41 haven_2.3.1 splines_3.6.3
[73] hms_0.5.3 knitr_1.29 pillar_1.4.4 igraph_1.2.5
[77] ggpubr_0.4.0 ggsignif_0.6.0 glue_1.4.1 evaluate_0.14
[81] latticeExtra_0.6-29 data.table_1.12.8 png_0.1-7 vctrs_0.3.1
[85] httpuv_1.5.4 cellranger_1.1.0 gtable_0.3.0 RANN_2.6.1
[89] purrr_0.3.4 tidyr_1.1.0 reshape_0.8.8 xfun_0.15
[93] openxlsx_4.1.5 mime_0.9 xtable_1.8-4 broom_0.5.6
[97] rstatix_0.6.0 later_1.1.0.1 survival_3.2-3 viridisLite_0.3.0
[101] tibble_3.0.1 pheatmap_1.0.12 cluster_2.1.0 ellipsis_0.3.1 `

  3. Screenshots (three images)

I'm really puzzled. Can you kindly help me?

Best.

Fan
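A request of this size is what one would expect if the sparse count matrix is being expanded into a dense one: if the vector holds 8-byte doubles, 6288.3 Gb corresponds to roughly 8.4e11 values. The sketch below, in base R, reads only the dimension line of a Matrix Market file and estimates the dense size before anything is loaded. The helper name `dense_size_gb` and the toy file are hypothetical, and the assumption that the loader densifies the matrix is not confirmed by the report.

```r
# Estimate how much RAM a dense double matrix would need, using only the
# header of a Matrix Market (.mtx) file. Nothing else is read into memory.
dense_size_gb <- function(mtx_path) {
  con <- file(mtx_path, "r")
  on.exit(close(con))
  repeat {
    line <- readLines(con, n = 1)
    if (length(line) == 0) stop("no dimension line found")
    if (!startsWith(line, "%")) break   # skip banner and comment lines
  }
  dims <- as.numeric(strsplit(trimws(line), "[[:space:]]+")[[1]])
  list(rows = dims[1], cols = dims[2], nonzero = dims[3],
       dense_gb = dims[1] * dims[2] * 8 / 1024^3)
}

# Toy file standing in for matrix.mtx (hypothetical dimensions).
toy <- tempfile(fileext = ".mtx")
writeLines(c("%%MatrixMarket matrix coordinate integer general",
             "%metadata_json: {}",
             "1000 2000 3",
             "1 1 5", "2 1 1", "3 2 7"), toy)
info <- dense_size_gb(toy)
info$dense_gb                 # size of the dense equivalent in Gb

# Number of 8-byte values implied by the reported allocation
6288.3 * 1024^3 / 8
```

If the column count turns out to be in the millions, the folder may hold an unfiltered barcode matrix rather than a filtered one; that is one common cause, but it cannot be confirmed from the information given here.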

VDJ data loading

Hi, thanks for this helpful tool.
I have a question about loading VDJ data from 10x sequencing. I wanted to use the prep.vdj function to load the data, but the call fails with the following error:
my.vdj.1 <- prep.vdj(vdj.data = "G3_all_contig_annotations.csv", cond.name = "Pre")

Error in subset.default(my.vdj, productive == "True") :
object 'productive' not found

In my data, the 'productive' column exists, with the values True and False. I don't know why this is not working.
Please let me know what needs to be corrected.
Many thanks.
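One possible explanation, offered as an assumption rather than a confirmed diagnosis: the error is raised by `subset.default`, which R uses when the object being subset is not a data frame. That is what happens if a file path string is passed where a data frame is expected. The base R sketch below reproduces the message with a toy CSV (hypothetical contents) and shows the same filter working once the file has been read into a data frame.

```r
# Toy stand-in for a contig annotation file (hypothetical values).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("barcode,productive,raw_clonotype_id",
             "AAAC-1,True,clonotype1",
             "AAAG-1,False,clonotype2",
             "AAAT-1,True,clonotype1"), csv_path)

# Subsetting the path itself: a character vector dispatches to subset.default,
# and the column name is then looked up as an ordinary variable.
err <- tryCatch(subset(csv_path, productive == "True"),
                error = function(e) conditionMessage(e))
err

# Reading the file first gives a data frame, so the column is found.
# colClasses = "character" keeps "True"/"False" as plain strings.
vdj_df <- read.csv(csv_path, colClasses = "character")
productive_rows <- subset(vdj_df, productive == "True")
nrow(productive_rows)
```

With a data frame as input, `subset` evaluates `productive` inside the table rather than in the calling environment.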

'replacement has 1 row, data has 0' when creating iCellR object

After reading the all_contig_annotations.csv file in as a data frame and calling prep.vdj() on it, exactly as in the tutorial, I get this error (which seems to come from prep.vdj itself):

my.vdj <- prep.vdj(vdj.data = my.vdj, cond.name = "NULL")
Error in `$<-.data.frame`(`*tmp*`, "total.colonotype", value = 0L) : replacement has 1 row, data has 0

Possible to cluster based on ADT data alone?

Hi, I was wondering whether it is possible to cluster on the ADT data alone and, if so, how to go about it. When I try Seurat with PCA, I get the warning message "The object only has information for 25 reductions", since we only ran 25 antibodies, and we subsequently get only 1-2 PCs.

If I try creating a distance matrix as per the Seurat vignette, the object is too large and the computer cannot handle it. Is this possible in iCellR?
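With only 25 markers, dimensionality reduction is arguably optional, and a method that never builds the full cell-by-cell distance matrix avoids the memory problem. The sketch below uses plain k-means from base R on simulated data as a stand-in; it is not an iCellR or Seurat function, and the simulated matrix, the choice of two clusters, and the helper `dist_gb` are all assumptions made for illustration.

```r
set.seed(1)
n_markers <- 25

# Simulated ADT-like matrix: cells in rows, markers in columns.
# Two well-separated populations of 100 cells each (hypothetical data).
grp_a <- matrix(rnorm(100 * n_markers, mean = 0), nrow = 100)
grp_b <- matrix(rnorm(100 * n_markers, mean = 5), nrow = 100)
adt <- rbind(grp_a, grp_b)

# Scale each marker, then cluster the cells directly on all 25 markers.
adt_scaled <- scale(adt)
km <- kmeans(adt_scaled, centers = 2, nstart = 10)
table(km$cluster)

# For comparison: memory a full pairwise distance object would need,
# n * (n - 1) / 2 doubles of 8 bytes each.
dist_gb <- function(n_cells) n_cells * (n_cells - 1) / 2 * 8 / 1024^3
dist_gb(100000)
```

k-means only stores the data matrix and the cluster centers, so its memory use grows linearly with the number of cells, whereas the distance object grows quadratically.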

Coloring not working for TSNE and UMAP plots for option color by = "Cluster"

Coloring by cluster is not working for UMAP and tSNE plots:

> cluster.plot(my.obj,plot.type = "umap",interactive = F)

Error: Aesthetics must be either length 1 or the same as the data (36689): colour

> cluster.plot(my.obj,plot.type = "umap",interactive = F,col.by="clusters")

Error: Aesthetics must be either length 1 or the same as the data (36689): colour

conds.to.plot in gene.plot not functional

Hi Reza,

I am trying to look at gene expression of a specific condition in my UMAP from a converted Seurat object, however, whenever I try to specify the condition this plot comes up blank. I've checked to make sure the names of the conditions when I initially aggregated the data match the name that I specify in the command, however that does not seem to be the issue. When I try to plot without conds.to.plot it works, however when I add the command back in I receive the blank image. Any suggestions?

Here is the code I used when aggregating the data:
my.data <- data.aggregation(samples = c("sample1","sample2", "sample3", "sample4", "sample5"), condition.names = c("S1","S2", "S3", "S4", "S5"))

and here is the code I used to call the plot, which comes out blank

C <- gene.plot(my.obj, gene = "NFIB", plot.type = "scatterplot", conds.to.plot = "S2", interactive = F, out.name = "scatter_plot", plot.data.type = "umap")

could not generate interactive figures.

Hi,

I am trying to produce the interactive figures by following the introduction. However, I can only generate the non-interactive plots, not the interactive ones. Could you please help me with this issue?

Many thanks!

My sessioninfo:

R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.14.5

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] parallel stats4 grid stats graphics grDevices utils datasets methods
[10] base

other attached packages:
[1] bindrcpp_0.2.2 scatterplot3d_0.3-41 rgl_0.100.19 plot3D_1.1.1
[5] GSEABase_1.44.0 graph_1.60.0 annotate_1.58.0 XML_3.98-1.11
[9] DOSE_3.6.1 org.Hs.eg.db_3.7.0 AnnotationDbi_1.44.0 IRanges_2.14.10
[13] S4Vectors_0.18.3 Biobase_2.40.0 BiocGenerics_0.26.0 circlize_0.4.6
[17] clusterProfiler_3.10.1 ReactomePA_1.26.0 ComplexHeatmap_1.18.1 Rtsne_0.13
[21] umap_0.2.2.0 iCellR_0.99.0 plotly_4.9.0 ggplot2_3.0.0
[25] plyr_1.8.4 dplyr_0.7.6 readxl_1.1.0 readr_1.1.1

Method for gene-gene correlations

Hi iCellR,

Thanks for the great cell-cell correlation function. Would you mind telling me which method is used for the correlation analysis, e.g. Pearson or Spearman? Also, we don't quite understand this part:

if (PVal == 0) {
       PVal = "2.2e-16"
   }

Is this the reason our PVal is very small?

thanks,
Yale
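For context on the quoted snippet: base R's `cor.test` uses Pearson by default and takes a `method` argument for Spearman or Kendall, and for strong correlations its p-value can underflow to exactly 0. R's own printout then shows `< 2.2e-16`, which is the machine epsilon for doubles. The sketch below illustrates this in base R on simulated data; which method iCellR calls internally is not stated here, so treating the snippet as a floor on underflowed p-values is an assumption.

```r
set.seed(42)
x <- rnorm(1000)
y <- x + rnorm(1000, sd = 0.01)    # almost perfectly correlated toy data

pearson  <- cor.test(x, y)                       # default method: Pearson
spearman <- cor.test(x, y, method = "spearman")  # rank-based alternative

pearson$method
spearman$method

# The p-value is smaller than anything a double can resolve next to 1,
# so a report would show it as "< 2.2e-16" rather than as 0.
pearson$p.value < .Machine$double.eps
eps_label <- format(.Machine$double.eps, digits = 2)
eps_label
```

A reported value of 2.2e-16 therefore means "at most this small", not that the p-value equals that number.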

iCellR object fails to function when using multiple samples with same condition

I tried aggregating two different datasets into one object, as they have the same conditions/groups.
The data aggregation and iCellR object creation steps were successful, but the resulting object fails with every other function (for instance, qc.stats does not work).

ag.data <- data.aggregation(samples=c("h1.data","h1.data_2","h2.data","h2.data_2","h3.data","h3.data_2","h5.data","h5.data_2","h7.data","h7.data_2"),condition.names = c("H1","H1","H2","H2","H3","H3","H5","H5","H7","H7"))

my.data <- make.obj(ag.data)
qc.obj <- qc.stats(my.data)

Error in if (mito.genes[1] != "default.genes") { :
missing value where TRUE/FALSE needed


Here is the iCellR object description

###################################
An object of class iCellR version: 1.1.4
Raw/original data dimentions (rows,columns): 11966,11900
Data conditions in raw data: H1,H2,H3,H5,H7 (2459,2561,3018,2527,1335)
Row names: A1CF,A2ML1,A2ML1.AS1 ...
Columns names: H7_Hashtag7,H7_Hashtag7.1,H7_Hashtag7.2 ...
###################################
QC stats performed:FALSE, PCA performed:FALSE, CCA performed:FALSE
Clustering performed:FALSE, Number of clusters:0
tSNE performed:FALSE, UMAP performed:FALSE, DiffMap performed:FALSE
Main data dimentions (rows,columns):0,0
Normalization factors:,...
Imputed data dimentions (rows,columns):0,0
############## scVDJ-Seq ###########
VDJ data dimentions (rows,columns):0,0
############## CITE-Seq ############
ADT raw data dimentions (rows,columns):0,0
ADT main data dimentions (rows,columns):0,0
ADT columns names:...
ADT row names:...
########### iCellR object ##########


how to produce a coverage corrected matrix

Thank you for sharing the pipeline!

I wonder if we can create a coverage corrected matrix and export it for other methods. The batch.aligned.data slot is empty after running combined coverage correction alignment.

Thank you,

James

qc.stats() missing value where TRUE/FALSE needed

Hi, dear developer. iCellR is a powerful tool, but I ran into an error right at the start of my test. My code is as follows:

library(iCellR)
datt=as.data.frame(t(read.table("A11_Positive_DataFrame.csv",sep=",",header = T,stringsAsFactors = F,row.names = 1)))
dim(datt)
##[1] 19148 192
head(datt)[1:2]

##           AAAGCAATCTGTGCAA-1 AACCGCGCATGCCTAA-1
##FO538757.3                  0                  0
##FO538757.2                  0                  0
##AP006222.2                  0                  0
##FAM87B                      0                  0
##LINC00115                   0                  0
##FAM41C                      0                  0

my.obj <- make.obj(datt)

###################################
An object of class iCellR version: 1.2.5
Raw/original data dimentions (rows,columns): 19148,192
Data conditions: no conditions/single sample
Row names: FO538757.3,FO538757.2,AP006222.2 ...
Columns names: AAAGCAATCTGTGCAA-1,AACCGCGCATGCCTAA-1,AACTCCCTCCAAATGC-1 ...
###################################
QC stats performed:FALSE, PCA performed:FALSE, CCA performed:FALSE
Clustering performed:FALSE, Number of clusters:0
tSNE performed:FALSE, UMAP performed:FALSE, DiffMap performed:FALSE
Main data dimentions (rows,columns):0,0
Normalization factors:,...
Imputed data dimentions (rows,columns):0,0
############## scVDJ-Seq ###########
VDJ data dimentions (rows,columns):0,0
############## CITE-Seq ############
ADT raw data dimentions (rows,columns):0,0
ADT main data dimentions (rows,columns):0,0
ADT columns names:...
ADT row names:...
########### iCellR object ##########

Creating the iCellR object gives no error. I then ran QC on my.obj like this:
my.obj <- qc.stats(my.obj)

Error in if (mito.genes[1] != "default.genes") { :
missing value where TRUE/FALSE needed

Is my gene name format wrong, or am I missing some genes?
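On the error message itself: in R, `if` fails with "missing value where TRUE/FALSE needed" whenever its condition evaluates to `NA`. The quoted condition compares `mito.genes[1]` with a string, so it becomes `NA` if that first element is missing, for example when the vector is empty. The base R sketch below reproduces the message and shows an NA-safe check. Why the vector would be empty inside `qc.stats` is not known from the report, and the helper `is_default` is hypothetical.

```r
# An empty character vector: its first element is NA.
mito.genes <- character(0)
is.na(mito.genes[1])

# Same comparison as in the quoted error; NA in an if() condition is an error.
msg <- tryCatch({
  if (mito.genes[1] != "default.genes") "custom list" else "default list"
}, error = function(e) conditionMessage(e))
msg

# NA-safe variant of the check (hypothetical helper, not part of iCellR).
is_default <- function(genes) {
  length(genes) > 0 && !is.na(genes[1]) && identical(genes[1], "default.genes")
}
is_default(mito.genes)
is_default("default.genes")
```

Checking what the first element of the relevant vector is before the call would show whether this mechanism applies.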

Labelling cells with cluster

Hi @rezakj

Thanks again for this resource. I was wondering if there is any function or iCellR object parameter that can label each cell with the specific cluster?

Thank you!

Plotting clonotypes

Hi there,
I am trying to use this package to analyze my scTCR data. When I try to plot clonotypes, I am quite confused about the clono parameter of the clono.plot function. It seems to accept only a number rather than the raw_clonotype_id from the VDJ data. How can I pre-define the clonotype name?
Best,
Yan
