Comments (2)
Dear Juan
This is an interesting case which shows that V9 is, in some cases, not really a good marker. I aligned the sequences that matched your Tara barcode at 100%. This set of sequences contains only one Fragilariopsis strain (the sequence in bold in the list of sequences below). The others are Fragilariopsis environmental sequences, so they have been annotated as Fragilariopsis based on their similarity to that strain. Outside V9, in the V4 region, there are at least 3 sequences annotated as Raphid-pennate_X_sp. which have a very different signature from Fragilariopsis.
The Dino sequence is interesting because it is 100% similar to your barcode in the V9 region but very different outside V9. In fact, a BLAST search shows that it is most similar to some dino sequences, although still at less than 96%. I suspect it could be a chimera and I will quarantine it for the next version of PR2.
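To illustrate the kind of check described above, here is a minimal Python sketch that compares percent identity inside a barcode region with identity over the rest of a pairwise alignment. It is only an illustration, not how PR2 screens sequences: the function names, the toy sequences and the region coordinates are all made up, and it assumes the two sequences are already aligned to the same length with `-` as the gap character.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def identity_inside_outside(query, ref, start, end):
    """Return (identity inside [start, end), identity over all other columns).

    start and end are alignment column indices (hypothetical region
    coordinates; real V9 boundaries depend on the alignment used).
    """
    if len(query) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    inside = percent_identity(query[start:end], ref[start:end])
    outside = percent_identity(query[:start] + query[end:],
                               ref[:start] + ref[end:])
    return inside, outside


# Toy aligned sequences (invented for illustration): the last 8 columns
# play the role of the barcode region, the first 12 the rest of the gene.
query = "ACGTACGTACGT" + "GGCCTTAA"
ref = "ACTTAGGTTCGT" + "GGCCTTAA"

inside, outside = identity_inside_outside(query, ref, 12, 20)
print(f"inside region: {inside:.1f}%  outside region: {outside:.1f}%")
```

A perfect match inside the region combined with clearly lower identity outside it is the pattern described for the dino sequence. On its own it only suggests that a sequence deserves a closer look; it does not prove that it is a chimera.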
When you run dada2 assignTaxonomy (same as the RDP classifier) or DECIPHER with your sequence against PR2, it correctly finds Fragilariopsis...
A final word concerning "mistakes" in PR2: it is a bit more complex. PR2 relies on expert annotation for specific groups, as this is the best way to have a solid database. In this process some sequences are flagged as suspicious and removed or reassigned. In the case of the dinoflagellate, even doing a BLAST against the database will not work because it will hit other dinoflagellates. One solution would be to run a chimera detection program, but the output needs to be thoroughly processed to make sure that bona fide sequences are not excluded. So the PR2 annotation should always be taken with a grain of salt... and we welcome experts to help us improve it. Diatoms were one of the groups on which people worked during the last EukRef workshop, but unfortunately things were not finalized....
from pr2database.
The dino sequence will be labelled as a chimera in version 5.0.0, soon to be released.