Comments (5)
Hi, I tagged a release here: https://github.com/brentp/goleft/releases/tag/v0.2.6
let me know if you have any further suggestions. I see what you mean about increasing the number of sampled reads.
from goleft.
thanks for the clear report and diagnosis.
I have pushed a fix. Would you try the attached binary (gunzip, chmod +x, and ./goleft_linux64 covstats ...) and verify it looks good for you? If so, I will make a new release.
goleft_linux64.gz
Updated version
coverage | insert_mean | insert_sd | insert_5th | insert_95th | template_mean | template_sd | pct_unmapped | pct_bad_reads | pct_duplicate | pct_proper_pair | read_length | bam | sample |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4.34 | -0.44 | 17.25 | -29 | 29 | 299.57 | 17.24 | 95.33 | 0 | 0 | 4.6 | 150 | SAMN18146222_to_H37Rv.bam | SAMN18146222 |
3.93 | -0.5 | 17.25 | -28 | 28 | 299.47 | 17.47 | 97.35 | 0 | 0 | 2.6 | 150 | SAMN18146202_to_H37Rv.bam | SAMN18146202 |
18.61 | -0.55 | 17.35 | -29 | 29 | 299.45 | 17.36 | 0.2 | 0 | 0 | 98.7 | 150 | SAMN18146198_to_H37Rv.bam | SAMN18146198 |
Previous version
coverage | insert_mean | insert_sd | insert_5th | insert_95th | template_mean | template_sd | pct_unmapped | pct_bad_reads | pct_duplicate | pct_proper_pair | read_length | bam | sample |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4.34 | -0.44 | 17.25 | -29 | 29 | 299.57 | 17.24 | 2043.06 | 0 | 0 | 98.6 | 150 | SAMN18146222_to_H37Rv.bam | SAMN18146222 |
3.93 | -0.5 | 17.25 | -28 | 28 | 299.47 | 17.47 | 3671.93 | 0 | 0 | 98.9 | 150 | SAMN18146202_to_H37Rv.bam | SAMN18146202 |
18.61 | -0.55 | 17.35 | -29 | 29 | 299.45 | 17.36 | 0.2 | 0 | 0 | 98.9 | 150 | SAMN18146198_to_H37Rv.bam | SAMN18146198 |
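The bug in the previous version shows up as pct_unmapped values far above 100. A minimal sketch of a range check that would catch this kind of output, assuming the rows have already been parsed into dictionaries (the `out_of_range` helper and the dictionary layout are hypothetical and not part of goleft; the values are copied from the "Previous version" table):

```python
# Values copied from the "Previous version" table above.
# The dict layout is an assumption for illustration, not goleft's output format.
previous = {
    "SAMN18146222": {"pct_unmapped": 2043.06, "pct_proper_pair": 98.6},
    "SAMN18146202": {"pct_unmapped": 3671.93, "pct_proper_pair": 98.9},
    "SAMN18146198": {"pct_unmapped": 0.2, "pct_proper_pair": 98.9},
}


def out_of_range(rows):
    """Return (sample, column, value) for every pct_* value outside 0-100."""
    bad = []
    for sample, cols in rows.items():
        for name, value in cols.items():
            if name.startswith("pct_") and not 0.0 <= value <= 100.0:
                bad.append((sample, name, value))
    return bad


flagged = out_of_range(previous)
for sample, name, value in flagged:
    print(f"{sample}: {name} = {value} is not a valid percentage")
```

Run against the "Updated version" values, the same check flags nothing, since every pct_* column there falls between 0 and 100.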
TBProfiler
TBProfiler was also run on the fastqs before variants were called. Because it uses a different aligner, its results can be expected to differ somewhat from covstats, but they should be similar since both align to the same reference genome (H37Rv).
sample | % mapped | 100 - % mapped | median coverage (always rounds to integer) |
---|---|---|---|
SAMN18146222 | 18.45% | 81.55% | 4 |
SAMN18146202 | 16.8% | 83.2% | 4 |
SAMN18146198 | 99.83% | 0.17% | 18 |
It is worth noting that TBProfiler reports 81.55% of a sample as unmapped while neo-covstats reports 95.33% for the same sample, but I don't think that's unreasonable, especially considering these are samples specifically designed to be ornery.
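To put a number on that difference for all three samples, here is a small sketch that subtracts TBProfiler's "100 - % mapped" from the updated covstats pct_unmapped. The `unmapped_gap` helper is hypothetical; the values are copied from the two tables above.

```python
# pct_unmapped from the "Updated version" covstats table.
covstats_pct_unmapped = {
    "SAMN18146222": 95.33,
    "SAMN18146202": 97.35,
    "SAMN18146198": 0.2,
}

# "% mapped" from the TBProfiler table.
tbprofiler_pct_mapped = {
    "SAMN18146222": 18.45,
    "SAMN18146202": 16.8,
    "SAMN18146198": 99.83,
}


def unmapped_gap(covstats, tbprofiler_mapped):
    """Return covstats pct_unmapped minus TBProfiler (100 - % mapped), per sample."""
    gaps = {}
    for sample, pct_unmapped in covstats.items():
        tb_unmapped = 100.0 - tbprofiler_mapped[sample]
        gaps[sample] = round(pct_unmapped - tb_unmapped, 2)
    return gaps


gaps = unmapped_gap(covstats_pct_unmapped, tbprofiler_pct_mapped)
for sample, gap in gaps.items():
    print(f"{sample}: covstats reports {gap:+.2f} percentage points more unmapped")
```

The gap is roughly 14 percentage points for the two low-coverage samples and close to zero for SAMN18146198.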
Hi, thanks for following up. You could try increasing the number of sampled reads, e.g.:
goleft covstats -n 10000000 ...
to see if it converges to match TBProfiler a bit better.
Sorry for the slow response, this fell off my radar. The samples I'm processing vary hugely in size -- we're running this on almost every Illumina-processed tuberculosis sample on SRA, and some of that is in a bit of a rough state -- so we're concerned adjusting the number of sampled reads may cause issues with the smaller samples.
In any case, this is definitely much more accurate than what we were seeing earlier. Would it be appropriate to make a release?