cxzhu / simple-seq
Joint analysis of 5mC and 5hmC from single cells
Dear authors:
I am using the SIMPLE-seq preprocessing pipeline available on GitHub to process the data from the article, GSE197740 (mESC_SIMPLE, PBMC_SIMPLE, and mouse_brain_SIMPLE). However, the pipeline only supports paired-end sequencing data, so I have only processed GSM7291623 (mouse_brain_SIMPLE). It would be great if you could provide a workflow for processing SIMPLE-seq single-end data.
While processing the GSM7291623 (mouse_brain_SIMPLE) data, I ran into some issues that I hope you can help me resolve. During the genome-mapping step, Bowtie2 reported the following error:
"Error:ReadSRR24423085.6972:AAATTGCTAG:TTAATACAACTATGTGGTCCAGCGGCCACTATCCCACTCCTATCCAAGGTCGAGGGCCAACATTAGGAAGGCTGCCGCCACCACTGTGAGGGCGCCTGACCATGACCACTGCTTTAGAACAAATGGAAGCACTTGTTTGCAAATTGTCCG:F:,,F:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFF:45:72:04 has more quality values than read characters." Later, I located this particular read and examined it, as shown in the following figure.
By trial and error, I replaced the symbols with corresponding letters in the second or third line of the record, and the script then ran successfully. However, the step "zcat SRR24423085_mapped.sam.gz | samtools sort - -o SRR24423085_mapped_sorted.bam" failed with the error "query name too long", as shown in the figure below.
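Both errors above can be screened for before alignment. The following is a minimal sketch, not part of the SIMPLE-seq pipeline: it assumes standard four-line FASTQ records, and the function name `find_problem_reads` is hypothetical. It flags reads whose sequence and quality strings differ in length (the condition Bowtie2 complains about) and reads whose name exceeds the 254-character QNAME limit of the SAM specification, which is the likely cause of "query name too long".

```python
from itertools import zip_longest
from typing import Iterable, List, Tuple

# QNAME length limit from the SAM specification.
MAX_QNAME_LEN = 254


def find_problem_reads(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (read_name, problem) pairs for four-line FASTQ records.

    `lines` can be any iterable of text lines, e.g. an open file handle.
    """
    problems: List[Tuple[str, str]] = []
    it = iter(lines)
    # Consume the same iterator four times to group lines into records.
    for header, seq, _plus, qual in zip_longest(it, it, it, it, fillvalue=""):
        header = header.rstrip("\r\n")
        if not header:
            continue  # skip trailing blank lines
        seq = seq.rstrip("\r\n")
        qual = qual.rstrip("\r\n")
        # The read name is the header without '@', up to the first whitespace.
        fields = header[1:].split()
        name = fields[0] if fields else ""
        if len(seq) != len(qual):
            problems.append((name, "sequence and quality lengths differ"))
        if len(name) > MAX_QNAME_LEN:
            problems.append((name, "read name exceeds SAM QNAME limit"))
    return problems
```

For a gzipped FASTQ file, the handle returned by `gzip.open(path, "rt")` can be passed directly as `lines`.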
I have downloaded and aligned the samples from the Nature paper using this pipeline, but I have the impression that around 20% of the runtime is spent on (de)compression of the raw and intermediate files.
Given that most HPC environments have relatively slow I/O, it might be better to work with uncompressed files directly.
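One way to check this impression on a given system is to time the same read over a compressed and an uncompressed copy of a file. The sketch below is only an illustration under stated assumptions: the helper names `open_text` and `time_line_count` are hypothetical and not part of the pipeline, and counting lines stands in for real processing.

```python
import gzip
import time
from typing import IO, Tuple

# First two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def open_text(path: str) -> IO[str]:
    """Open a file as text, decompressing only if it is gzip-compressed."""
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == GZIP_MAGIC
    return gzip.open(path, "rt") if is_gzip else open(path, "rt")


def time_line_count(path: str) -> Tuple[int, float]:
    """Count the lines in a file and report the elapsed wall-clock seconds."""
    start = time.perf_counter()
    with open_text(path) as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines, time.perf_counter() - start
```

Running `time_line_count` on a `.sam` file and on its `.sam.gz` counterpart gives a rough measure of the decompression overhead. Because `open_text` inspects the file content rather than the extension, the same code path can serve both compressed and uncompressed intermediates.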
Here
SIMPLE-seq/perlscripts/03.bam2srf.pl
Line 106 in 4856b32
and here
SIMPLE-seq/perlscripts/03.bam2srf.pl
Line 95 in 4856b32
you are missing a "\t" after $bc.
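To illustrate why the missing delimiter matters, here is a small sketch in Python rather than the script's own Perl. The column layout and the values are hypothetical and are not taken from 03.bam2srf.pl; the point is only that concatenating a barcode with the next field without a tab fuses two columns into one.

```python
from typing import List, Sequence


def format_record(fields: Sequence[object]) -> str:
    """Join all fields with tabs so that no delimiter can be forgotten."""
    return "\t".join(str(field) for field in fields) + "\n"


def parse_record(line: str) -> List[str]:
    """Split a tab-delimited line back into its columns."""
    return line.rstrip("\n").split("\t")


# Hypothetical values: a chromosome, a position, a barcode, and a count.
chrom, pos, bc, count = "chr1", 100, "AAATTGCTAG", 5

# Manual concatenation with the tab after the barcode left out.
buggy_line = chrom + "\t" + str(pos) + "\t" + bc + str(count) + "\n"
fixed_line = format_record([chrom, pos, bc, count])
```

Here `parse_record(buggy_line)` yields three columns, with the barcode and count merged into one, while `parse_record(fixed_line)` yields the intended four.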