covstats supporting cram (goleft issue, OPEN)

brentp commented on May 27, 2024
covstats supporting cram

from goleft.

Comments (10)

brentp commented on May 27, 2024

Actually, I forgot what it was doing. It makes a system call to samtools, which converts the cram to bam, and then parses the bam. So the sampling stuff (the parts that you noticed are filled out by covstats) works fine with cram. But the mapped-read count is taken from the bam index and is not present in the cram index, so it can't estimate coverage.

hdashnow commented on May 27, 2024

P.S. I think I've figured out how to calculate median coverage from the mosdepth output. It's not as fast as covstats, but still faster than most other tools out there.
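The exact method isn't spelled out above, so the following is only a sketch of one way it could be done. It assumes the input is mosdepth's `<prefix>.mosdepth.global.dist.txt`, with tab-separated rows of chromosome (or `total`), depth, and the proportion of bases covered at that depth or higher. The function name `median_depth_from_dist` is hypothetical.

```python
def median_depth_from_dist(lines, chrom="total"):
    """Return the largest depth d such that at least half of the bases
    have coverage >= d, or None if no row matches `chrom`.

    Assumes each line is: chrom <TAB> depth <TAB> proportion_of_bases_at_or_above_depth
    (the layout assumed for mosdepth's *.mosdepth.global.dist.txt).
    """
    best = None
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3 or fields[0] != chrom:
            continue  # skip other chromosomes and malformed rows
        depth, proportion = int(fields[1]), float(fields[2])
        if proportion >= 0.5 and (best is None or depth > best):
            best = depth
    return best


# Usage (hypothetical file name):
# with open("sample.mosdepth.global.dist.txt") as fh:
#     print(median_depth_from_dist(fh))
```

The row order doesn't matter here, since the function just keeps the highest depth whose cumulative proportion is still at least 0.5.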

brentp commented on May 27, 2024

I think it should be straightforward to support CRAM in covstats; I'll just do a system call to samtools view. I'll have a look at this next week, if not sooner.

hdashnow commented on May 27, 2024

Much appreciated, thank you!

brentp commented on May 27, 2024

I just had a look at this, and the cram index does not store the total number of mapped reads, so it's not possible to estimate the coverage that way as it is with the bam index. We could iterate over a few cram slices, count reads, and note the byte offsets in the index, then use that rate and the total file size to estimate. This will be relatively accurate, but less so than even the bam index estimate.
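The arithmetic behind that idea can be sketched roughly as below. This is not code from covstats; the function name and all parameters are hypothetical, and the sampled read count and byte span are assumed to come from whatever reads the slices. It also counts every sampled read, so unmapped reads and the file header would bias the result.

```python
def estimate_coverage(sampled_reads, sampled_bytes, file_size_bytes,
                      mean_read_length, genome_length):
    """Extrapolate coverage from a few sampled CRAM slices.

    sampled_reads / sampled_bytes gives a reads-per-byte rate, which is
    scaled up to the whole file and converted to depth.
    """
    if sampled_bytes <= 0 or genome_length <= 0:
        raise ValueError("sampled_bytes and genome_length must be positive")
    reads_per_byte = sampled_reads / sampled_bytes
    estimated_total_reads = reads_per_byte * file_size_bytes
    return estimated_total_reads * mean_read_length / genome_length


# Example with made-up numbers: 1,000 reads in 100 kB of slices,
# a 10 MB file, 150 bp reads, and a 1 Mb genome.
coverage = estimate_coverage(1_000, 100_000, 10_000_000, 150, 1_000_000)
```

The estimate is only as good as the assumption that the sampled slices compress about as well as the rest of the file.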

aofarrel commented on May 27, 2024

Are there any updates on this? I've been trying to implement a workflow involving covstats, and when it runs on CRAM files, the coverage reports as zero. Other stats appear to be accurate.

brentp commented on May 27, 2024

Given the lack of a cram parser in Go, I don't plan to support this. It's possible to get actual coverage for a 30X WGS cram in < 5 minutes using mosdepth.

aofarrel commented on May 27, 2024

Thanks for your reply. I've noticed that covstats seems to run slowly on crams, even small ones. When you say there's no cram parser in go... what is it doing for the other stats? Apologies if this is a silly question, I'm very new to go.

In other words -- is covstats running on crams supported, just not for coverage? Or should I consider crams as not supported, period?

jmonlong commented on May 27, 2024

What would you recommend to very quickly extract mean/median coverage (+the insert size info) for CRAMs?
5 mins with mosdepth is pretty good but say we don't want that much info, just the median depth.

Would it make sense to slice the CRAM to a few random regions, convert to a small BAM and then use covstats?
We'd need to do this manually outside covstats, right? I see a "Regions" option but I'm not sure how it's used in the code, including if the full CRAM would be converted to BAM before computing coverage on these regions or not.
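Done manually, the idea might look something like the sketch below, which only builds the command lines and does not run them. The helper name `subset_commands` and the file names are hypothetical. It assumes the CRAM has a `.crai` index so region queries work, and that `goleft covstats` accepts a BAM path as a positional argument. If covstats derives coverage from the full genome length in the header, the reported value for a region subset would probably need rescaling.

```python
def subset_commands(cram, reference, regions, out_bam):
    """Build the commands for: slice regions out of a CRAM into a small BAM,
    index that BAM, then run covstats on it. Nothing is executed here."""
    if not regions:
        raise ValueError("at least one region is required")
    view = ["samtools", "view", "-b", "-T", reference, "-o", out_bam, cram, *regions]
    index = ["samtools", "index", out_bam]
    covstats = ["goleft", "covstats", out_bam]
    return [view, index, covstats]


# Usage (hypothetical paths and regions):
# import subprocess
# for cmd in subset_commands("sample.cram", "ref.fa",
#                            ["chr1:1000000-2000000", "chr7:5000000-6000000"],
#                            "subset.bam"):
#     subprocess.run(cmd, check=True)
```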

brentp commented on May 27, 2024

I suppose I could make it calculate the mean for the first part of the chromosome with covstats.
As soon as you start converting to bam, the run time goes up, even if it's only part of the file.
