Comments (6)

ylipacbio commented on May 27, 2024

The "negative inf in mutation testing" message is just debugging information for consensus developers and should be harmless to the results.

from pbbioconda.

splaisan commented on May 27, 2024

Thanks. I guess this can be closed then.


armintoepfer commented on May 27, 2024

  1. Those are only warnings and should not lead to empty output.

  2. Is your input to CCS a single subreads.bam file? I ask because I see many ccs input files, but you use one subreads.bam input.

  3. Is your subreads.bam file accompanied by a .pbi file?

  4. How long does the polish step take?
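
The .pbi question can be checked quickly from the shell. The following is a minimal sketch, not part of any PacBio tool: `missing_pbi` is a hypothetical helper name, and it assumes the index sits next to the BAM file as `<name>.bam.pbi`, as in the file listing in this thread.

```shell
# List BAM files in a directory that lack a companion .pbi index.
# missing_pbi is a hypothetical helper, not a PacBio command.
missing_pbi() {
    dir="${1:-.}"
    for bam in "$dir"/*.bam; do
        # Skip the unexpanded glob when the directory holds no BAM files.
        [ -e "$bam" ] || continue
        # Print the BAM path only when no index file sits next to it.
        [ -f "$bam.pbi" ] || printf '%s\n' "$bam"
    done
}
```

Calling `missing_pbi .` in the working directory prints one path per BAM file without an index and prints nothing when every file has one.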


splaisan commented on May 27, 2024

My commands were:

# merging CCS data to one (worked fine) => 352'214 reads
samtools merge -@24 -n -O BAM -b ./ccs_bam.list merged_ccs.bam

# demux (worked fine) => 285'173 reads
lima --isoseq --dump-clips --no-pbi -j 48 merged_ccs.bam primers.fasta demux.bam

# cluster (OK) => 3'901 reads
isoseq3 cluster demux.P5--P3.bam unpolished.bam -j 32

# convert (OK 1GB output) => 285'170 reads
bamtools convert -format fasta -in unpolished.flnc.bam > flnc.fasta

-rw-r--r--  1 u0002316 domain users 6.1M Oct  3 15:56 unpolished.bam
-rw-r--r--  1 u0002316 domain users  17K Oct  3 15:56 unpolished.bam.pbi
-rw-r--r--  1 u0002316 domain users  18M Oct  3 15:56 unpolished.cluster
-rw-r--r--  1 u0002316 domain users  14M Oct  3 15:56 unpolished.fasta
-rw-r--r--  1 u0002316 domain users   72 Oct  3 14:43 unpolished.filter_summary.json
-rw-r--r--  1 u0002316 domain users 393M Oct  3 14:44 unpolished.flnc.bam
-rw-r--r--  1 u0002316 domain users 2.4M Oct  3 14:44 unpolished.flnc.bam.pbi
-rw-r--r--  1 u0002316 domain users 1.3K Oct  3 14:44 unpolished.flnc.consensusreadset.xml
-rw-r--r--  1 u0002316 domain users 1.3K Oct  3 15:56 unpolished.transcriptset.xml

  1. Only one CCS file was used as input, merged from the 24 BAM files retrieved from the CCS output folder of the SMRT Link CCS job.

bams in job folder

./tasks/pbccs.tasks.ccs-1/ccs.bam
./tasks/pbccs.tasks.ccs-2/ccs.bam
./tasks/pbccs.tasks.ccs-3/ccs.bam
./tasks/pbccs.tasks.ccs-4/ccs.bam
./tasks/pbccs.tasks.ccs-5/ccs.bam
./tasks/pbccs.tasks.ccs-6/ccs.bam
./tasks/pbccs.tasks.ccs-7/ccs.bam
./tasks/pbccs.tasks.ccs-8/ccs.bam
./tasks/pbccs.tasks.ccs-9/ccs.bam
./tasks/pbccs.tasks.ccs-10/ccs.bam
./tasks/pbccs.tasks.ccs-11/ccs.bam
./tasks/pbccs.tasks.ccs-12/ccs.bam
./tasks/pbccs.tasks.ccs-13/ccs.bam
./tasks/pbccs.tasks.ccs-14/ccs.bam
./tasks/pbccs.tasks.ccs-15/ccs.bam
./tasks/pbccs.tasks.ccs-16/ccs.bam
./tasks/pbccs.tasks.ccs-17/ccs.bam
./tasks/pbccs.tasks.ccs-18/ccs.bam
./tasks/pbccs.tasks.ccs-19/ccs.bam
./tasks/pbccs.tasks.ccs-20/ccs.bam
./tasks/pbccs.tasks.ccs-21/ccs.bam
./tasks/pbccs.tasks.ccs-22/ccs.bam
./tasks/pbccs.tasks.ccs-23/ccs.bam
./tasks/pbccs.tasks.ccs-24/ccs.bam
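
The thread does not show how ccs_bam.list was built for `samtools merge -b`. One way it could be produced from a job folder like the one above is sketched here; `make_ccs_list` is a hypothetical helper name, and `sort -V` assumes GNU coreutils.

```shell
# Write the paths of all per-chunk ccs.bam files under a job folder to a
# list file, one per line, in natural numeric order (ccs-2 before ccs-10).
# make_ccs_list is a hypothetical helper; sort -V requires GNU sort.
make_ccs_list() {
    job_dir="$1"
    out_list="$2"
    find "$job_dir" -type f -name 'ccs.bam' | sort -V > "$out_list"
}
```

For example, `make_ccs_list . ./ccs_bam.list` run from the job folder would write a list that can then be passed to `samtools merge -b ./ccs_bam.list`.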

merged_ccs.bam header

@HD     VN:1.5  SO:unknown      pb:3.0.1
@RG     ID:83ce91c5     PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-260D0ACC    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-247F3B98    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-7D5BE60E    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-27C20FD7    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-3B0DFDD8    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-46D6FF96    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-BFFAF09     PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-3D549618    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-257A8961    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-6D1417D6    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-493E1710    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-3956BE17    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-3DD952D8    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-39D4D228    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-54AD16E1    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-5DD10239    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-27EA6E19    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-5855E9D4    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-5CE45245    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-50E967FA    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-30BB00A9    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-EE0FCDC     PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@RG     ID:83ce91c5-775FAF50    PL:PACBIO       DS:READTYPE=CCS;BINDINGKIT=101-365-900;SEQUENCINGKIT=101-309-400;BASECALLERVERSION=5.0.0;FRAMERATEHZ=80.000000  PU:m54094_180927_125111 PM:SEQUEL
@PG     ID:ccs-3.0.0    PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-35184AC6   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-4FD8F07    PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-3751DD58   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-7CD52484   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-4CEBA27A   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-39AE5771   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-4DB78682   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-364747FC   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-53693C07   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-4A82D30B   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-73D12D9F   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-6BC74ED3   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-39D589C3   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-625B3E8A   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-323CEACF   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-5BD44F34   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-47D43D26   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-66600F9A   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-4E31A573   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-3872F55A   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-4D70490A   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-23616117   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
@PG     ID:ccs-3.0.0-37DCFA6C   PN:ccs  VN:3.0.0        DS:Generate circular consensus sequences (ccs) from subreads.   CL:ccs 
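
The header above lists 24 @RG and 24 @PG records, which matches the 24 merged input files. A quick way to tally header record types without reading the whole header could look like the sketch below; `count_header_records` is a hypothetical helper name and it expects SAM header text on standard input.

```shell
# Tally SAM header lines by record type (@HD, @RG, @PG, ...).
# count_header_records is a hypothetical helper reading header text on stdin.
count_header_records() {
    # Keep header lines only, take the record type (first tab-separated
    # field), count each type, and print "type<TAB>count".
    grep '^@' | cut -f1 | sort | uniq -c | awk '{print $2 "\t" $1}'
}
```

It could be fed with `samtools view -H merged_ccs.bam | count_header_records`.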

  1. Yes for unpolished.bam and subreads...bam; no for merged_ccs.bam and the later BAM files, but merged_ccs.bam was used several commands earlier.

  2. Polishing runs multi-threaded for only minutes, whereas the CCS run took a long time (21Gb subreads).

real    16m44.805s
user    857m38.528s
sys     13m28.972s


splaisan commented on May 27, 2024

Hi,
Could this be because I polished during the SMRT Link CCS step, whereas the full CLI tutorial says not to polish when running CCS at the CLI?
I am rerunning ccs at the CLI now, but it will take a day to finish...


splaisan commented on May 27, 2024

I am answering myself since there was no further reaction.

After rerunning CCS at the CLI with the command from the tutorial (instead of the SMRT Link CCS job used for my first attempt), polishing now works.

What is weird is that I got no error telling me something was wrong in the first place.
During polishing, I got multiple warnings like the ones below (with -v), which I hope are OK. There were many more of the second kind.

>|> 20181011 09:13:55.574 -|- WARN       -|- operator() -|- 0x7f75bb7e6700|| -|- No subreads for cluster transcript/4231 in window 0-500
>|> 20181011 09:14:00.398 -|- INFO       -|- Polish -|- 0x7f75be7ec700|| -|- negative inf in mutation testing: 'm54094_180927_125111/6095081/114882_117024'
>|> 20181011 09:14:00.849 -|- INFO       -|- Polish -|- 0x7f75b4fd9700|| -|- negative inf in mutation testing: 'm54094_180927_125111/68485256/107706_109725'
>|> 20181011 09:14:11.151 -|- INFO       -|- Polish -|- 0x7f75b6fdd700|| -|- negative inf in mutation testing: 'm54094_180927_125111/27591428/40848_42828'
>|> 20181011 09:15:16.477 -|- INFO       -|- Polish -|- 0x7f75affcf700|| -|- negative inf in mutation testing: 'm54094_180927_125111/28508729/127119_128264'
>|> 20181011 09:16:04.578 -|- INFO       -|- Polish -|- 0x7f75b67dc700|| -|- negative inf in mutation testing: 'm54094_180927_125111/71172803/68421_69750'
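
To see how many messages of each kind a verbose run produced, the log could be summarized roughly as follows. This is a sketch that assumes the ` -|- ` field layout shown in the lines above (timestamp, level, source, thread, message); `summarize_log` is a hypothetical helper name and reads the log on standard input.

```shell
# Count log messages per level and message type, most frequent first.
# summarize_log is a hypothetical helper; it assumes five " -|- " separated fields.
summarize_log() {
    awk -F ' -[|]- ' '
        NF >= 5 {
            level = $2
            sub(/ +$/, "", level)      # drop padding after the level name
            msg = $5
            sub(/:.*/, "", msg)        # drop the read name after the colon
            gsub(/[0-9]+/, "N", msg)   # collapse transcript ids and window coordinates
            count[level "\t" msg]++
        }
        END {
            for (key in count) print count[key] "\t" key
        }
    ' | sort -rn
}
```

Each output line holds a count, the level, and the normalized message, separated by tabs, so the "No subreads for cluster" warnings and the "negative inf in mutation testing" notices are reported as two separate totals.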
