Comments (21)
Please have a look if those work for you
#549 (comment)
from pbbioconda.
Weirdly, I am having a new issue: lima is having this issue, but it seems like pigeon and isoseq3 are not (I think).
I created a new VM with Ubuntu 20.04 because that is listed as supported. I can install from bioconda and run pigeon --version and isoseq3 --version (which I couldn't do before, in #549), but lima --isoseq segfaults, as does lima --log-level TRACE.
At this point my suspicion is that these tools rely on system libraries that are not being properly checked/versioned for compatibility. lima is surprising because it hasn't received an update since I started using it, so it hasn't changed, but something else must have.
from pbbioconda.
I think I'm tracking down my issue as something else...maybe bad data or bad output that is causing lima to fail (unfortunately without a useful error message). When I know more I'll open a separate issue.
from pbbioconda.
@armintoepfer thank you. This works for me. However, I notice a minor difference in this new version of isoseq3, particularly for isoseq3 collapse. I now get a warning message: '>|> 20230106 03:24:09.947 -|- WARN -|- Run -|- 0x7f88b4d36d40|| -|- Transcripts do not contain quality values, will not output Id.fastq'
I should mention that the warning did not appear in the previous version of isoseq3 using the same data.
from pbbioconda.
@armintoepfer also, I just noticed that pigeon classify splits scaffolds into _classification and _junction, but the program seems to be stuck on scaffold 45 for hours. My reference genome has 49 scaffolds.
from pbbioconda.
Let me ping @derekwbarnett @jmattick for support
from pbbioconda.
Hi @gushiro. This sounds like it might be a multithreading issue. Can you re-run and see if it hangs at the same point in the data? Let me know what you see. Also, you should know the answer long before 2 days of runtime.
from pbbioconda.
@derekwbarnett thank you for your reply. Yes, it still hangs at the same point.
from pbbioconda.
In that case, is it possible to provide a minimal working dataset to reproduce the issue? https://github.com/PacificBiosciences/pbbioconda#file-sharing
from pbbioconda.
@armintoepfer yes, where can I send them to you?
from pbbioconda.
Please click on the link; it explains how to upload data to us.
from pbbioconda.
@gushiro We've received the data and will triage. Once we have more information, we will update here.
from pbbioconda.
I am getting the same warning in isoseq3 collapse. Interestingly, after pigeon classify not a single transcript is classified as 'coding'. Any ideas why that would happen? Is it only the column 30 annotations that are incorrect, or all of them? Many of them are annotated as 'full-splice_match', 'reference_match' and 'canonical', but none as 'coding'.
from pbbioconda.
There are many among them annotated as 'full-splice_match' and 'reference_match' and 'canonical' but none of them as 'coding'.
'coding' is not one of the options
from pbbioconda.
I'm sorry, I was unclear. I mean column 30, "coding". Every isoform is annotated as "non_coding", even though column 6 (which you linked the options for) has many "full-splice_match" isoforms. I am confused about how it is possible for every transcript to be "non_coding".
from pbbioconda.
@armintoepfer thanks. To add a little:
I think for non-model species with long UTRs, it might be good for pigeon to count a read toward a gene if the read falls XXXX bp downstream of the last exon. In my data I can see a lot of short cDNAs, and cDNAs in UTRs, that would be classified as intergenic (and therefore not produce a count?). I can see the importance of classifying transcripts first (removing antisense reads, for example), but I wonder what would happen during gene count matrix generation if a transcript falls only in the UTR region.
from pbbioconda.
@gushiro I have identified the problem. In your genome annotations file, the following record is out-of-order:
chr46 rRNA gene 15556622 15556733 . + . gene_id "LSU_316"; gene_id "LSU_316"; gene_id "LSU_316"; gene_id "LSU_316"; gene_id "LSU_316";
chr46 rRNA exon 15556622 15556733 . + . gene_id "SSU_316"; transcript_id "LSU_316.1";
chr46 rRNA transcript 15556622 15556733 . + . gene_id "SSU_316"; transcript_id "LSU_316.1"; exon_number "1";
The exon entry appears before the transcript. pigeon is expecting exon records to be "children" of the transcript. The rest of the file follows this convention. It's just this one record that is different.
Re-ordering to this allows pigeon to run to completion:
chr46 rRNA gene 15556622 15556733 . + . gene_id "LSU_316"; gene_id "LSU_316"; gene_id "LSU_316"; gene_id "LSU_316"; gene_id "LSU_316";
chr46 rRNA transcript 15556622 15556733 . + . gene_id "SSU_316"; transcript_id "LSU_316.1"; exon_number "1";
chr46 rRNA exon 15556622 15556733 . + . gene_id "SSU_316"; transcript_id "LSU_316.1";
This situation triggered an internal error state that was not properly handled and suspended processing. I will include a fix in the next release here.
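Until that fix is available, a small pre-check on the annotation file may help spot records like the one above. The following is a minimal sketch, not part of pigeon: the function name find_orphan_exons and the demo records are hypothetical, and it assumes a tab-separated GTF in which the feature type is in column 3 and transcript_id is in the column 9 attributes. It only checks the ordering rule described here (an exon must come after a transcript record with the same transcript_id).

```python
import re
from typing import Iterable, List, Tuple

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def find_orphan_exons(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, transcript_id) for every exon record that
    appears before a transcript record with the same transcript_id."""
    seen_transcripts = set()
    problems = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        feature = fields[2]
        attrs = dict(ATTR_RE.findall(fields[8]))
        tid = attrs.get("transcript_id")
        if tid is None:
            continue  # e.g. gene records carry no transcript_id
        if feature == "transcript":
            seen_transcripts.add(tid)
        elif feature == "exon" and tid not in seen_transcripts:
            problems.append((lineno, tid))
    return problems


# Hypothetical demo records modelled on the out-of-order case in this thread.
demo = [
    'chr46\trRNA\tgene\t15556622\t15556733\t.\t+\t.\tgene_id "LSU_316";',
    'chr46\trRNA\texon\t15556622\t15556733\t.\t+\t.\tgene_id "SSU_316"; transcript_id "LSU_316.1";',
    'chr46\trRNA\ttranscript\t15556622\t15556733\t.\t+\t.\tgene_id "SSU_316"; transcript_id "LSU_316.1";',
]
print(find_orphan_exons(demo))
```

To check a real file, pass an open file handle instead of the demo list, for example find_orphan_exons(open(path)) with path pointing at your own GTF. An empty result only means this one ordering rule holds; it does not guarantee that pigeon accepts the file.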
from pbbioconda.
@derekwbarnett thanks, it works perfectly now.
I still have many "intergenic" transcripts. When I check my data, I can see that each gene has many of these small intergenic transcripts downstream of the gene, spanning 1000 - 3000 bp. Would it be possible for pigeon make-seurat, or any step before it, to count these short transcripts as part of the gene if the gene is, for example, within 1000 bp upstream?
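To make the request concrete, here is a rough sketch of the rule I have in mind. It is not an existing pigeon option as far as this thread shows; the names Interval and rescue_gene and the 1000 bp default are hypothetical. It assumes 1-based inclusive coordinates and treats "downstream" in a strand-aware way: past the gene end on the + strand, before the gene start on the - strand.

```python
from typing import NamedTuple, Optional, Sequence


class Interval(NamedTuple):
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    strand: str  # "+" or "-"


def rescue_gene(tx: Interval, genes: Sequence[Interval],
                window: int = 1000) -> Optional[str]:
    """Return the name of the closest same-strand gene whose 3' end lies
    within `window` bp upstream of the transcript, or None if there is none."""
    best_name = None
    best_dist = None
    for g in genes:
        if g.chrom != tx.chrom or g.strand != tx.strand:
            continue
        # Distance from the gene's 3' end to the nearest transcript edge.
        if g.strand == "+":
            dist = tx.start - g.end
        else:
            dist = g.start - tx.end
        # dist <= 0 means the transcript overlaps or is not downstream.
        if 0 < dist <= window and (best_dist is None or dist < best_dist):
            best_name, best_dist = g.name, dist
    return best_name


# Hypothetical example coordinates.
genes = [
    Interval("geneA", "chr1", 1000, 5000, "+"),
    Interval("geneB", "chr1", 9000, 12000, "-"),
]
print(rescue_gene(Interval("tx1", "chr1", 5400, 6200, "+"), genes))
```

In this example tx1 starts 400 bp past the end of geneA on the same strand, so it would be counted toward geneA; a transcript more than 1000 bp away would stay intergenic.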
from pbbioconda.
I have uploaded a new version. If that still segfaults, but not the github binary itself, then conda is corrupting the binary.
from pbbioconda.
Just tested and it looks like they work now, thanks for fixing the issue
from pbbioconda.
Okay great. Conda is weird sometimes.
from pbbioconda.