danko-lab / rtfbs_db
Parse TF motifs from public databases, read into R, and scan using 'rtfbs'.
License: GNU General Public License v3.0
Following the package vignette exactly, but with lowest.reads.RPKM=5, causes this error:
> tfs2 <- tfbs.selectExpressedMotifs(tfs,
+ file.hg19.twobit.chr19,
+ file.gencode.gtf.chr19,
+ file.bigwig.plus.chr19,
+ file.bigwig.minus.chr19,
+ pvalue.threshold=0.001,
+ include.DBID.missing=TRUE,
+ lowest.reads.RPKM=5,
+ seq.datatype="PRO-seq");
12758 transcripts are selected from GENCODE dataset for PRO-seq .
13591245 Reads in /home/gvillafano/R/x86_64-pc-linux-gnu-library/3.3/rtfbsdb/extdata/GSM1480327_K562_PROseq_chr19_plus.bw and /home/gvillafano/R/x86_64-pc-linux-gnu-library/3.3/rtfbsdb/extdata/GSM1480327_K562_PROseq_chr19_minus.bw
* 1851 motifs did not find DBID in the Gencode file.
* 1957 expressed TFs are selected from 1964 motifs after filtering by the gene expression.
> tfs <- tfbs.clusterMotifs(tfs2, method="agnes", pdf.heatmap="heatmap.pdf", ncores=11)
Error in 1 - mat : non-numeric argument to binary operator
In addition: Warning message:
In mclapply(1:tfbs@ntfs, function(i) { :
all scheduled cores encountered errors in user code
The reason is that include.DBID.missing=TRUE here results in NULL values, and any NULL values in the tfs.filt@pwm list are problematic because they cause tfbs.clusterMotifs() to fail.
> sum(sapply(tfs2@pwm, is.null))
[1] 1851
As a workaround, one can set include.DBID.missing=FALSE.
Not sure what the best solution here would be. Perhaps subset the pwm matrix list to remove the NULL values, and store the problematic DBID indices in another S4 slot for later inspection?
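The subsetting idea could look roughly like the sketch below. It works on a plain list standing in for the pwm slot, so the toy matrices and the names pwm, is.null.flag, null.idx and pwm.clean are all made up for illustration and are not part of rtfbsdb. In the real object, any other per-motif slots (and the motif count) would presumably need the same subsetting, which this sketch does not attempt.

```r
# Toy stand-in for the `pwm` slot: a list of matrices in which some
# entries are NULL (hypothetical data, not taken from rtfbsdb).
pwm <- list(
  matrix(0.25, nrow = 4, ncol = 4),
  NULL,
  matrix(0.25, nrow = 6, ncol = 4),
  NULL
)

# Flag the NULL entries; vapply() guarantees a logical vector back.
is.null.flag <- vapply(pwm, is.null, logical(1))

# Positions of the problematic entries, kept for later inspection.
null.idx <- which(is.null.flag)

# Logical subsetting stays correct even when there are no NULL entries
# (negative indexing with an empty index would drop everything).
pwm.clean <- pwm[!is.null.flag]
```

Here null.idx could be stored alongside the filtered list, so that the motifs without a DBID match remain traceable after they are dropped.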
Using git version 5dbf18d of the rtfbs_db package.
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.2 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] grid parallel stats4 stats graphics grDevices utils datasets
[9] methods base
other attached packages:
[1] rGADEM_2.22.0 seqLogo_1.40.0 BSgenome_1.42.0 rtracklayer_1.34.2
[5] TFBSTools_1.12.2 rtfbsdb_0.4.0 Biostrings_2.42.1 XVector_0.14.1
[9] GenomicRanges_1.26.4 GenomeInfoDb_1.10.3 IRanges_2.8.2 S4Vectors_0.12.2
[13] BiocGenerics_0.20.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.10 lattice_0.20-34 GO.db_3.4.0
[4] png_0.1-7 Rsamtools_1.26.2 gtools_3.5.0
[7] digest_0.6.12 R6_2.2.0 plyr_1.8.4
[10] RSQLite_1.1-2 httr_1.2.1 ggplot2_2.2.1
[13] zlibbioc_1.20.0 lazyeval_0.2.0 annotate_1.52.1
[16] rtfbs_0.3.5 R.utils_2.5.0 R.oo_1.21.0
[19] Matrix_1.2-8 apcluster_1.4.3 splines_3.3.2
[22] BiocParallel_1.8.2 readr_1.1.0 stringr_1.2.0
[25] CNEr_1.10.2 bigWig_0.2-9 RCurl_1.95-4.8
[28] munsell_0.4.3 DirichletMultinomial_1.16.0 SummarizedExperiment_1.4.0
[31] KEGGREST_1.14.1 tibble_1.3.0 XML_3.98-1.6
[34] TFMPvalue_0.0.6 GenomicAlignments_1.10.1 bitops_1.0-6
[37] R.methodsS3_1.7.1 xtable_1.8-2 gtable_0.2.0
[40] DBI_0.6-1 magrittr_1.5 scales_0.4.1
[43] stringi_1.1.5 reshape2_1.4.2 latticeExtra_0.6-28
[46] vioplot_0.2 RColorBrewer_1.1-2 tools_3.3.2
[49] Biobase_2.34.0 poweRlaw_0.70.0 hms_0.3
[52] rphast_1.6.5 AnnotationDbi_1.36.2 colorspace_1.3-2
[55] cluster_2.0.6 caTools_1.17.1 memoise_1.0.0
[58] VGAM_1.0-3
>
Hello.
Please consider including motifs from Hocomoco-10 (http://hocomoco.autosome.ru/), the biggest and a very well-performing human and mouse motif database. It contains motifs for 601 human TFs (and where possible these motifs are non-redundant).
On the downloads page (http://hocomoco.autosome.ru/downloads) you can get the motifs in several formats.
Hope you will find our database interesting and useful for your project.
Best, Ilya.