zjdaye / mdseq
MDSeq: Gene expression mean and variability analysis for RNA-seq counts
License: Other
Hi, I think there is an error in the calculation of normalised counts using normalize.counts() when using TMM or RLE.
The function gets normalisation factors using edgeR's calcNormFactors() and divides the counts by the resulting factors, but the factors returned by calcNormFactors should be applied to library sizes, not directly to counts. I think that the factors (nf) calculated in the vignette to be used as offsets are the factors that should be used to directly normalise counts, i.e.:
cnf <- calcNormFactors(dat.filtered, method="TMM")
libsize <- colSums(dat.filtered)
rellibsize <- libsize/exp(mean(log(libsize)))
nf <- cnf * rellibsize
In normalize.counts(), these could be obtained as:
y$samples$norm.factors * y$samples$lib.size / exp(mean(log(y$samples$lib.size)))
rather than just using y$samples$norm.factors.
Please let me know if I've misinterpreted something here, but I think this is right. Thanks.
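The difference between the two scalings can be illustrated with a small base-R sketch. The count matrix below is made up, and cnf is set by hand to 1 for every sample as a stand-in for the calcNormFactors() output (an assumption, chosen because these toy samples differ only in sequencing depth); nothing here calls edgeR or MDSeq.

```r
# Hypothetical toy counts: 4 genes x 3 samples that differ only in depth.
counts <- matrix(c(10, 20, 30, 40,
                   20, 40, 60, 80,
                    5, 10, 15, 20),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("s", 1:3)))

# Assumed stand-in for calcNormFactors(counts, method = "TMM");
# these values are set by hand, not computed by edgeR.
cnf <- c(s1 = 1, s2 = 1, s3 = 1)

# Factors as computed in the vignette snippet quoted above.
libsize <- colSums(counts)
rellibsize <- libsize / exp(mean(log(libsize)))
nf <- cnf * rellibsize

# Dividing by cnf alone leaves the depth differences in place.
by_cnf <- sweep(counts, 2, cnf, "/")

# Dividing by nf also removes the library-size differences.
by_nf <- sweep(counts, 2, nf, "/")

print(by_cnf)
print(by_nf)
```

In this toy case the columns of by_cnf still differ by the depth ratio, while the columns of by_nf agree with each other, which is the behaviour the issue argues normalize.counts() should have.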
I tried using MDSeq(Seuratobject), so all parameters are default, and got this message:
Error in as.data.frame.default(count) :
cannot coerce class ‘structure("Seurat", package = "Seurat")’ to a data.frame
This data is PBMC 5k and has been through a more or less "standard" Seurat pipeline, having just completed DE analysis. I am looking to find "Cluster vs All" genes, similar to Seurat's FindAllMarkers function.
I've run MDSeq(...), but the results don't seem to include results of this particular test (and its p-value), although I'm not sure if I'm interpreting the results correctly.
Is it there? Is it extract.ZIMD? Or something else?
Hi!
I would like your help. I have read your paper titled "Gene expression variability and the analysis of large-scale RNA-seq studies with the MDSeq", which is excellent work, so I plan to use MDSeq to detect gene expression variability. However, I run into a problem when I try to detect outliers: every gene appears to be flagged with status 2.
Why is that? Could you give me any hints on how to fix this issue?
> dat.checked <- remove.outlier(dat.normalized[1:1000, ], X=covars, U=covars,
+ contrast = groups, mc.cores = 4)
4 threads are using!
Total time elapsed:3.9 seconds
> head(dat.checked$outliers)
                  status num.outliers
ENSG00000227232.4      2           NA
ENSG00000238009.2      2           NA
ENSG00000237683.5      2           NA
ENSG00000239906.1      2           NA
ENSG00000241860.2      2           NA
ENSG00000228463.4      2           NA
> table(dat.checked$outliers$status)
   2
1000
>
> covars
        GENDER AGE RACE   BMI SMRIN SMTSISCH            X1            X2            X3            X4
Normal1      1  63    3 29.53   5.9     1017  0.1630506541 -1.095766e-01 -0.0056186635  0.0540478146
Normal2      1  62    3 30.78   8.1      133 -0.0688636685 -1.749981e-02  0.0345506232  0.0869748239
COPD1        2  66    3 25.82   7.6      840  0.1037386933 -9.394710e-02 -0.0855297718  0.0384818582
Normal3      1  58    3 29.83   6.2      825  0.1425086228  6.758439e-02  0.0474234960 -0.0076811603
COPD2        1  58    3 27.02   6.4     1218  0.0328722772  5.871772e-02  0.0569128696 -0.0170814274
COPD3        1  62    3 29.54   5.8      991  0.1076800161  7.706005e-02  0.0719706122 -0.0013471444
COPD4        2  66    3 20.12   7.6      670 -0.0160420656  9.598426e-02 -0.0436259659  0.1806682021
Normal4      2  66    3 29.05   6.9      773  0.0994010357  3.385225e-02 -0.0921304267  0.0165312554
Normal5      2  51    3 29.35   7.2       80 -0.0787204805 -6.110502e-02 -0.0235813528  0.0084930880
Finally, here is my sessionInfo().
Best wishes
Kevin
> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936 LC_CTYPE=Chinese (Simplified)_China.936
[3] LC_MONETARY=Chinese (Simplified)_China.936 LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_China.936
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] gtools_3.5.0 edgeR_3.18.1 limma_3.32.2
[4] MDSeq_1.0.5 BiocInstaller_1.26.0 devtools_1.13.2
[7] org.Hs.eg.db_3.4.0 AnnotationDbi_1.38.1 BiocParallel_1.10.1
[10] WGCNA_1.51 fastcluster_1.1.22 dynamicTreeCut_1.63-1
[13] R.matlab_3.6.1 RColorBrewer_1.1-2 clusterProfiler_3.4.3
[16] DOSE_3.2.0 gplots_3.0.1 ggplot2_2.2.1
[19] sva_3.24.0 genefilter_1.58.1 mgcv_1.8-17
[22] nlme_3.1-131 DESeq2_1.16.1 SummarizedExperiment_1.6.3
[25] DelayedArray_0.2.7 matrixStats_0.52.2 Biobase_2.36.2
[28] GenomicRanges_1.28.3 GenomeInfoDb_1.12.2 IRanges_2.10.2
[31] S4Vectors_0.14.3 BiocGenerics_0.22.0
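One thing that may be worth checking in the post above: in the covars rows shown, RACE is 3 for every sample. A constant column is collinear with a model intercept, which can make a design matrix rank-deficient. Whether this is what produces status 2 in remove.outlier() is not confirmed here. The sketch below uses base R only; covars_demo is a hypothetical data frame copied from the first five rows and three columns of the posted table, not the poster's full data.

```r
# Hypothetical subset of the posted covariates (first five rows, three columns).
covars_demo <- data.frame(
  GENDER = c(1, 1, 2, 1, 1),
  AGE    = c(63, 62, 66, 58, 58),
  RACE   = c(3, 3, 3, 3, 3),
  row.names = c("Normal1", "Normal2", "COPD1", "Normal3", "COPD2")
)

# Flag columns that take a single value across all samples.
is_constant <- vapply(covars_demo, function(x) length(unique(x)) == 1, logical(1))
constant_cols <- names(covars_demo)[is_constant]

# Compare the rank of an intercept-plus-covariates design with its column count.
design_full <- model.matrix(~ ., data = covars_demo)
rank_deficient <- qr(design_full)$rank < ncol(design_full)

# Repeat the check after dropping the constant columns.
design_reduced <- model.matrix(~ ., data = covars_demo[, !is_constant, drop = FALSE])
rank_deficient_after <- qr(design_reduced)$rank < ncol(design_reduced)

print(constant_cols)
print(c(before = rank_deficient, after = rank_deficient_after))
```

The same rank comparison also catches the case where there are more covariate columns than samples, so it can be run on the full covars object before passing it as X and U.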
Install MDSeq from local source with
install.packages("MDSeq_1.0.5.tar.gz", repos=NULL, type="source")
Where can I get MDSeq_1.0.5.tar.gz from? This repository has no releases. I can download the entire repository (default MDSeq-master.zip) and run
install.packages("MDSeq-master.zip", repos=NULL, type="source")
but I get
Installing package into ‘C:/Users/<NAME>/Documents/R/win-library/4.0’ (as ‘lib’ is unspecified)
and then
library(MDSeq) and library(MDSeq-master) both give "no package called..." errors.
Did it install correctly but under a different name?
Install MDSeq from GitHub with
library(devtools)
install_github("zjdaye/MDSeq")
Gives this error
Error: Failed to install 'MDSeq' from GitHub: (converted from warning) package ‘cqn’ is not available (for R version 4.0.2)
I'm having trouble running your code with two columns of groups:
> XX
DataFrame with 300 rows and 2 columns
      Strain   Condition
    <factor>    <factor>
17d    RIM15         SDC
19d     SLT2         SDC
20d     SNF1         SDC
21d     SSN3         SDC
22d    STE11         SDC
...      ...         ...
30s     YGK3        Salt
32s     YPK3        Salt
8r      IRE1   Rapamycin
9r      KIN1   Rapamycin
15t     PKC1 Tunicamycin
Which I can use to run MDSeq:
fit <- MDSeq(counts, contrast = XX, mc.cores = 30)
But now I can't run extract.ZIMD. Here is what I have tried:
extract.ZIMD(fit, compare = list(A='StrainWT', B='StrainPBS2'))
extract.ZIMD(fit, compare = list(A='WT', B='PBS2'))
Any advice?