Operating system: Linux64

Latest version of Isoseq3 and Pigeon segmentation fault about pbbioconda HOT 21 CLOSED

gushiro commented on September 27, 2024

Latest version of Isoseq3 and Pigeon segmentation fault

from pbbioconda.

Comments (21)

armintoepfer commented on September 27, 2024

Please have a look if those work for you
#549 (comment)

jamestwebber commented on September 27, 2024

Weirdly, I am having a new issue: lima now segfaults, but it seems like pigeon and isoseq3 do not (I think).

I created a new VM with Ubuntu 20.04 because that is listed as supported. I can install from bioconda and run pigeon --version and isoseq3 --version (which I couldn't do before, in #549), but lima --isoseq segfaults, as does lima --log-level TRACE.

At this point my suspicion is that these tools are relying on system libraries that are not being properly checked/versioned for compatibility? lima is surprising because it hasn't received an update since I started using it, so it hasn't changed but something else must have.
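The check described above (run each tool's --version and see which one crashes) can be scripted. The following is a minimal Python sketch, not something from the original thread: probe is a hypothetical helper name, and it assumes lima accepts --version like the other two tools. It relies on the POSIX convention that a child process killed by a signal reports a negative return code.

```python
import shutil
import signal
import subprocess


def probe(cmd, timeout=60):
    """Run cmd and return 'ok', 'segfault', 'missing', 'timeout', or 'exit N'."""
    if shutil.which(cmd[0]) is None:
        return "missing"
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    # On POSIX, a negative return code means the child was killed by that signal.
    if result.returncode == -signal.SIGSEGV:
        return "segfault"
    return "ok" if result.returncode == 0 else f"exit {result.returncode}"


if __name__ == "__main__":
    # Tool names are taken from the thread; --version for lima is an assumption.
    for tool in ("pigeon", "isoseq3", "lima"):
        print(tool, probe([tool, "--version"]))
```

Running this in both the conda environment and against a binary downloaded separately would show whether only the conda-installed copy crashes.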

jamestwebber commented on September 27, 2024

I think I'm tracking down my issue as something else...maybe bad data or bad output that is causing lima to fail (unfortunately without a useful error message). When I know more I'll open a separate issue.

gushiro commented on September 27, 2024

@armintoepfer thank you. This works for me. However, I noticed a minor difference in this new version of Isoseq3, particularly for isoseq3 collapse. I now get a warning message: '>|> 20230106 03:24:09.947 -|- WARN -|- Run -|- 0x7f88b4d36d40|| -|- Transcripts do not contain quality values, will not output Id.fastq'

I should mention that the warning did not appear in the previous version of Isoseq3 using the same data.

gushiro commented on September 27, 2024

@armintoepfer also, I just noticed that pigeon classify splits scaffolds into _classification and _junction, but the program seems to have been stuck on scaffold 45 for hours. My reference genome has 49 scaffolds.

armintoepfer commented on September 27, 2024

Let me ping @derekwbarnett @jmattick for support

derekwbarnett commented on September 27, 2024

Hi @gushiro This sounds like it might be a multithreading issue. Can you re-run and see if it hangs at the same point in the data? Let me know what you see. Also, you should know the answer long before 2 days of runtime.

gushiro commented on September 27, 2024

@derekwbarnett thank you for your reply. Yes, it still hangs at the same point.

armintoepfer commented on September 27, 2024

In that case, is it possible to provide a minimal working dataset to reproduce the issue? https://github.com/PacificBiosciences/pbbioconda#file-sharing

gushiro commented on September 27, 2024

@armintoepfer yes, where can I send them to you?

armintoepfer commented on September 27, 2024

Please click on the link; it explains how to upload data to us.

armintoepfer commented on September 27, 2024

@gushiro We've received the data and will triage. Once we have more information, we will update here.

astulaaa commented on September 27, 2024

I am getting the same warning in isoseq3 collapse. Interestingly, after pigeon classify, not a single transcript is classified as 'coding'. Any ideas why that would happen? Is it only the column 30 annotations that are incorrect, or all of them? There are many among them annotated as 'full-splice_match' and 'reference_match' and 'canonical' but none of them as 'coding'.

jamestwebber commented on September 27, 2024

> There are many among them annotated as 'full-splice_match' and 'reference_match' and 'canonical' but none of them as 'coding'.

'coding' is not one of the options

astulaaa commented on September 27, 2024

I'm sorry, I was unclear. I mean column 30, "coding". Every isoform is annotated as "non_coding", even though column 6 (which you linked the options for) has many "full-splice_match" isoforms. I am confused about how it is possible for every transcript to be "non_coding".

gushiro commented on September 27, 2024

@armintoepfer thanks. To add a little:
I think for non-model species with long UTRs, it might be good for pigeon to count a read toward a gene if the read falls within XXXX bp downstream of the last exon. I can see in my data a lot of short cDNAs, and cDNAs in UTRs, that would be classified as intergenic (and therefore not produce a count?). I can see the importance of classifying transcripts beforehand (removing antisense reads, for example), but I wonder what would happen during gene count matrix generation if a transcript falls only in the UTR region.

derekwbarnett commented on September 27, 2024

@gushiro I have identified the problem. In your genome annotations file, the following record is out-of-order:

chr46   rRNA    gene    15556622        15556733        .       +       .       gene_id "LSU_316"; gene_id "LSU_316"; gene_id "LSU_316"; gene_id "LSU_316"; gene_id "LSU_316";
chr46   rRNA    exon    15556622        15556733        .       +       .       gene_id "SSU_316"; transcript_id "LSU_316.1";
chr46   rRNA    transcript      15556622        15556733        .       +       .       gene_id "SSU_316"; transcript_id "LSU_316.1"; exon_number "1";

The exon entry appears before the transcript. pigeon is expecting exon records to be "children" of the transcript. The rest of the file follows this convention. It's just this one record that is different.

Re-ordering the records as follows allows pigeon to run to completion:

chr46   rRNA    gene    15556622        15556733        .       +       .       gene_id "LSU_316"; gene_id "LSU_316"; gene_id "LSU_316"; gene_id "LSU_316"; gene_id "LSU_316";
chr46   rRNA    transcript      15556622        15556733        .       +       .       gene_id "SSU_316"; transcript_id "LSU_316.1"; exon_number "1";
chr46   rRNA    exon    15556622        15556733        .       +       .       gene_id "SSU_316"; transcript_id "LSU_316.1";

This situation triggered an internal error state that was not properly handled and suspended processing. I will include a fix in the next release here.
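For anyone who wants to check their own annotation file for the same ordering problem, here is a minimal Python sketch. It is not pigeon's own validation logic; exons_before_transcript is a hypothetical helper, and it assumes a tab-separated GTF in which exon and transcript records both carry a transcript_id attribute. It reports every exon record that appears before the transcript record with the same transcript_id.

```python
import re

# Matches the transcript_id attribute in the 9th GTF column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def exons_before_transcript(lines):
    """Return (line_number, transcript_id) for exons seen before their transcript."""
    seen_transcripts = set()
    problems = []
    for lineno, line in enumerate(lines, start=1):
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        feature, attributes = fields[2], fields[8]
        match = TRANSCRIPT_ID.search(attributes)
        if match is None:
            continue  # e.g. gene records have no transcript_id
        transcript_id = match.group(1)
        if feature == "transcript":
            seen_transcripts.add(transcript_id)
        elif feature == "exon" and transcript_id not in seen_transcripts:
            problems.append((lineno, transcript_id))
    return problems
```

Passing an open file handle for the annotation file to exons_before_transcript returns an empty list when every exon follows its transcript; for the out-of-order record shown above it would flag the exon line.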

gushiro commented on September 27, 2024

@derekwbarnett thanks, it works perfectly now.
I still have many "intergenic" transcripts. When I check my data, I can see that each gene has many of these small intergenic transcripts downstream of the gene, spanning between 1000 and 3000 bp. Would it be possible for pigeon make-seurat, or any step before it, to count these short transcripts as part of the gene if the gene is within, for example, 1000 bp upstream?
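To make the request concrete, the rule could look roughly like the Python sketch below. This is purely illustrative and is not an existing pigeon feature: Gene, assign_downstream and the window parameter are hypothetical names, coordinates are assumed to be 1-based and inclusive as in GTF, and "downstream" is measured in the direction of transcription, so it depends on strand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    gene_id: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    strand: str  # "+" or "-"


def assign_downstream(chrom, start, end, strand, genes, window=1000):
    """Return the gene_id of the nearest same-strand gene whose 3' end lies
    at most `window` bp upstream of the transcript, or None if there is none."""
    best_id = None
    best_gap = None
    for gene in genes:
        if gene.chrom != chrom or gene.strand != strand:
            continue
        # Gap between the gene's 3' end and the transcript, in transcription direction.
        if strand == "+":
            gap = start - gene.end
        else:
            gap = gene.start - end
        # gap <= 0 means the transcript overlaps or lies upstream of the gene end.
        if 0 < gap <= window and (best_gap is None or gap < best_gap):
            best_id, best_gap = gene.gene_id, gap
    return best_id
```

With a rule like this, a transcript currently labelled intergenic would be counted toward the closest upstream gene on the same strand, and left unassigned if no gene ends within the window.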

armintoepfer commented on September 27, 2024

I have uploaded a new version. If that still segfaults, but the binary from GitHub itself does not, then conda is corrupting the binary.

jamestwebber commented on September 27, 2024

Just tested and it looks like they work now, thanks for fixing the issue

armintoepfer commented on September 27, 2024

Okay great. Conda is weird sometimes.
