Comments (2)
Indeed, StringTie currently shows only the matching reference transcripts, in the reference_id attribute of the output GTF. It seems that you'd like to see some recognizable gene IDs or gene names from the annotation; in other words, to have StringTie better "annotate" its own output. This is a valid suggestion, and we would consider adding it in a future version. We hesitated to add this because it could get complicated for assemblies that span multiple gene regions. But for those that exactly match a reference transcript (hence the reference_id), it is indeed very easy to add the gene_name and/or gene_id attributes as well. I'll add these in the next version, likely as a new ref_gene_id attribute. I hope this helps.
Unfortunately, I cannot assume that the annotation file even has gene_id attributes. If it's an NCBI GFF3 file or another annotation source, what you call "gene_id" could be named differently. The transcript ID (called "ID" in GFF3) is the most stable identifier that we can expect to find in the reference annotation file, which is why we've only provided that reference ID.
Meanwhile, you can probably use other programs or scripts that can quickly annotate StringTie transcripts based on another GTF/GFF file (e.g. cuffcompare or gffread from the Cufflinks suite). In your case, though, the way you are doing it now (looking up the reference_id in the annotation) is probably still the easiest; it just needs to be scripted.
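For what it's worth, the reference_id lookup could be scripted roughly as follows. This is only a sketch, not something StringTie does itself: it assumes the reference annotation is a GTF whose records carry both transcript_id and gene_id attributes (which, as noted above, is not guaranteed for GFF3 or other sources), and the function names and the ref_gene_id attribute name are purely illustrative.

```python
import re

# Matches GTF-style attributes of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_attrs(field):
    """Return the GTF attribute column (column 9) as a dict."""
    return dict(ATTR_RE.findall(field))


def build_gene_map(annotation_lines, gene_key="gene_id"):
    """Map each reference transcript_id to its gene identifier.

    Assumes GTF input; gene_key may be set to "gene_name" if the
    annotation provides it.
    """
    gene_of = {}
    for line in annotation_lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue
        attrs = parse_attrs(cols[8])
        if "transcript_id" in attrs and gene_key in attrs:
            gene_of[attrs["transcript_id"]] = attrs[gene_key]
    return gene_of


def annotate(stringtie_lines, gene_of, new_key="ref_gene_id"):
    """Yield StringTie GTF lines, appending new_key where reference_id is known.

    Lines without a reference_id (novel transcripts) pass through unchanged.
    """
    for line in stringtie_lines:
        line = line.rstrip("\n")
        cols = line.split("\t")
        if not line.startswith("#") and len(cols) >= 9:
            ref_id = parse_attrs(cols[8]).get("reference_id")
            if ref_id in gene_of:
                cols[8] = f'{cols[8].rstrip()} {new_key} "{gene_of[ref_id]}";'
                line = "\t".join(cols)
        yield line


# Example use (file names are placeholders):
#
# with open("annotation.gtf") as ref, open("stringtie_out.gtf") as st:
#     gene_of = build_gene_map(ref)
#     for out_line in annotate(st, gene_of):
#         print(out_line)
```

The annotated output can then be searched for known gene IDs with grep, much like the Cufflinks output.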
from stringtie.
The idea is that, with Cufflinks, the gene IDs found in the reference
transcriptome GTF are carried over into the Cufflinks output files, so I
was able to grep for them easily, since I expect to see these genes
down-regulated in case of a knockdown (KD) or similar. I'm not sure how
this is done in Cufflinks, but it is certainly more convenient than having
to look up the reference ID.
On 3/27/15 4:42 AM, Geo Pertea wrote:
Indeed, currently StringTie only shows the matching reference
transcripts in the /reference_id/ attribute of the output GTF. It
seems that you'd like to see some recognizable gene IDs or gene names
from the annotation - in other words, to have StringTie better
"annotate" its own output. This is a valid suggestion and we would
consider adding this in a future version. We hesitated adding this
because it could get complicated for those assemblies which may span
multiple gene regions, but for those which match exactly the reference
transcripts (hence the reference_id), indeed it is very easy to add
the gene_name and/or gene_id attributes as well - I'll add these in
the next version, likely as a new /ref_gene_id/ attribute, I hope this
would help?
Unfortunately I cannot assume that the annotation file would even have
/gene_id/ attributes, if it's a NCBI GFF3 or another annotation
source, what you call "gene_id" in your case, could be called
differently in other files. Transcript ID (called "ID" in GFF3) is the
most stable identifier that we can expect to find in the reference
annotation file, that's why we've only provided that reference ID.
Meanwhile you can probably use other programs or scripts which could
quickly annotate StringTie transcripts based on another GTF/GFF file
(e.g. cuffcompare or gffread from the Cufflinks suite), though
probably in your case, the way you are doing it now (looking up the
reference_id back in the annotation) is still the easiest way, it just
needs to be scripted somehow.
Dr. Abdullah H. Sahyoun
Computational Biologist
Chromatin Regulation/Bioinformatics and Deep Sequencing Unit
Max Planck Institute of Immunobiology and Epigenetics
Stübeweg 51, D-79108 Freiburg, Germany
Tel.: +49 761 5108 707