Comments (7)
- The cluster step trims polyA tails. It also identifies concatemers by searching for the SMRTbell hairpin sequence.
- UMIs are not supported. You must trim them before running the isoseq pipeline. Clustering with UMIs is beyond the scope of this bioconda support channel.
from pbbioconda.
1] We trained an HMM for polyA detection; residual errors are allowed.
2a] SMRTbell hairpin sequence detection is performed across the whole read.
2b] Reads in the wrong orientation are not written to the output file by lima, so they don't even make it into the clustering step.
3] Yes, at least 20 As, but the read has to start with a polyA. The Viterbi decoding may allow a few residual bases, but likely not your full UMI.
4] no-polish takes the subreads, creates a partial-order alignment, and calls a consensus sequence. The result is noisier than polished CCS, which takes much longer to generate, but it is of sufficient quality. The motivation is purely speed. You can also take fully polished CCS as input.
5] I don't trust a single molecule. I'd rather be concerned about a ton of FPs than about your FNs. To counter the FN argument, sequence more. If you use polished CCS as input, there is no need to polish the FLNCs. In the upcoming version 3.1, we will introduce a pre-processing step that takes CCS as input and generates FLNCs that will be used for clustering.
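To make answers 1 and 3 more concrete, here is a toy two-state HMM decoded with Viterbi. It is only a sketch under stated assumptions, not the model isoseq ships: the two states, every probability, the choice to walk the read from the 3' end, and the function name `polya_tail_length` are all made up for illustration.

```python
import math

# Toy parameters (assumptions for illustration, not isoseq's trained values).
P_A_IN_TAIL = 0.98   # probability that a tail base is read as A
P_STAY_TAIL = 0.95   # probability that the tail continues to the next base
MIN_TAIL = 20        # minimum accepted tail length ("at least 20 As")

LOG_A = math.log(P_A_IN_TAIL)
LOG_NOT_A = math.log((1.0 - P_A_IN_TAIL) / 3.0)  # residual error inside the tail
LOG_BODY = math.log(0.25)                        # body emits all bases uniformly
LOG_STAY = math.log(P_STAY_TAIL)
LOG_LEAVE = math.log(1.0 - P_STAY_TAIL)


def _emit_tail(base):
    """Log-probability of observing `base` while in the TAIL state."""
    return LOG_A if base == "A" else LOG_NOT_A


def polya_tail_length(read):
    """Return the 3' polyA tail length found by Viterbi decoding, or 0."""
    seq = read.upper()[::-1]  # walk from the 3' end inwards
    if not seq:
        return 0
    # Left-to-right model: TAIL -> TAIL or TAIL -> BODY; BODY is absorbing.
    v_tail = math.log(0.5) + _emit_tail(seq[0])
    v_body = math.log(0.5) + LOG_BODY
    body_split = 0  # tail length on the best path that currently ends in BODY
    for i in range(1, len(seq)):
        from_tail = v_tail + LOG_LEAVE  # leave TAIL right before base i
        if from_tail > v_body:
            v_body, body_split = from_tail, i
        v_body += LOG_BODY
        v_tail += LOG_STAY + _emit_tail(seq[i])
    tail_len = len(seq) if v_tail >= v_body else body_split
    return tail_len if tail_len >= MIN_TAIL else 0


body = "ACGTACGTAC" * 3
print(polya_tail_length(body + "A" * 25))                     # clean tail
print(polya_tail_length(body + "A" * 12 + "G" + "A" * 12))    # one residual error
print(polya_tail_length(body + "A" * 25 + "GTCCTTGCTGCT"))    # untrimmed 12-base UMI
```

With these toy parameters the three calls print 25, 25, and 0: a single non-A base inside the tail is tolerated, while an untrimmed UMI after the tail prevents the decoder from ever entering the tail state cheaply enough. Longer tails or different probabilities would shift that boundary, so treat the numbers as illustrative only.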
That's exactly the reason I don't trust a single molecule. What are the chances that you detect the same chimeric read twice? The human genome contains homopolymer A stretches longer than 20 bp, so looking for polyA in the middle of the read is not the best approach. One could look for primers, but I don't, because as I said initially, those molecules are unlikely to form clusters. My goal is to create meaningful clustering results. If your goal is to refine FLNCs without running clustering, then you are on your own for now.
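The point about internal homopolymer A stretches can be shown with a few lines of Python. This is a hypothetical helper (`find_a_runs` is not part of isoseq) and the sequence is invented; it only illustrates that a naive "20 or more As" search cannot tell a genomic A-rich stretch from a real polyA junction.

```python
import re


def find_a_runs(seq, min_len=20):
    """Return (start, length) for every run of at least `min_len` As."""
    pattern = re.compile("A{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(seq.upper())]


# Invented read: an internal 22-base A stretch plus a genuine 30-base tail.
read = "GATTACG" * 4 + "A" * 22 + "GCGTCC" * 5 + "A" * 30
print(find_a_runs(read))  # [(28, 22), (80, 30)]
```

Both runs pass the length threshold, but only the second one ends at the end of the read. Flagging the first as a chimera junction would be a false positive.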
If you happen to find a massive amount of chimeric reads, go and find your sample prep person :)