brentp / goleft
goleft is a collection of bioinformatics tools distributed under MIT license in a single static binary
License: MIT License
When dealing with many samples, the bam coverage plots in indexcov quickly become rather noisy. Currently I can try and highlight tracks to get the sample, but this doesn't always distinguish the track from all the others. Curious if there might be a way to deselect sample tracks or rather just select the desired samples that you want to view for more individualized inspection? In my case I was hoping to view specific samples for potential large-scale duplications or deletions. Or maybe some kind of zoom in feature or something like that would suffice.
I interrupted goleft indexcov after 43 minutes, whereas samtools depth completed in 2 minutes.
❯❯❯ time goleft indexcov foo.bam
^C interrupt
real 42m56.588s
user 44m41.414s
sys 0m26.671s
❯❯❯ time samtools depth -a foo.bam
real 2m11.683s
user 2m3.448s
sys 0m3.460s
❯❯❯ du -h foo.bam
3.0G foo.bam
The BAM file is of reads aligned to a de novo assembled draft genome with 205 Mbp in 1.6 million contigs. The files reside on an NFS file system. Any thoughts?
Testing goleft on Ubuntu 16.04 under zsh shell 5.1.1.
I was very confused with this error:
depth: invalid option -- 'd'
open: No such file or directory
depth: invalid option -- 'd'
open: No such file or directory
ERROR with command: Command('echo 'chrM:1-16571'; samtools depth -Q 1 -d 2500 -r 'chrM:1-16571' '/data/...', , stdout[:20]: '/data/', exit-code: -1, error: signal: segmentation fault (core dumped), run-time: 1.616101091s)
After some checking, I found that I have several versions of samtools lying around in various paths. I want to run goleft with the latest samtools from bioconda, as follows:
export PATH="/path/to/bioconda/samtools:$PATH"
then call goleft.
Goleft, however, does not know about the samtools in "/path/to/bioconda/samtools". It keeps using the samtools in /usr/local/bin. When I removed the samtools in /usr/local/bin, it used the one in /usr/bin instead. I had to remove all the other old versions of samtools.
Should there be an option to specify full samtools path?
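For anyone debugging the same situation, here is a minimal sketch for checking which samtools an ordinary PATH lookup would pick. It assumes goleft locates samtools through the usual PATH search, which this report does not confirm, and `first_on_path` is a hypothetical helper name.

```python
import os
import shutil


def first_on_path(tool, path_dirs):
    """Return the first executable named `tool` found in `path_dirs`, or None.

    `path_dirs` is a list of directories in lookup order, like a split $PATH.
    """
    return shutil.which(tool, path=os.pathsep.join(path_dirs))


if __name__ == "__main__":
    # Show what the current environment would resolve "samtools" to.
    dirs = os.environ.get("PATH", "").split(os.pathsep)
    print("samtools resolves to:", first_on_path("samtools", dirs))
```

Note that PATH entries have to be directories. If the exported entry points at the samtools binary itself rather than at the directory containing it, the lookup skips that entry and falls through to the next one.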
Hi,
I got the following error messages when running goleft_linux64 covstats.
$ goleft_linux64 covstats CN000245.marked.realigned.recal.bam
coverage insert_mean insert_sd insert_5th insert_95th template_mean template_sd pct_unmapped pct_bad_reads pct_duplicate pct_proper_pair read_length bam sample
panic: EOF
goroutine 1 [running]:
github.com/brentp/goleft/covstats.pcheck(...)
/home/brentp/go/src/github.com/brentp/goleft/covstats/covstats.go:30
github.com/brentp/goleft/covstats.Main()
/home/brentp/go/src/github.com/brentp/goleft/covstats/covstats.go:249 +0xdd1
main.main()
/home/brentp/go/src/github.com/brentp/goleft/cmd/goleft/goleft.go:68 +0x179
Could you please help me with this? Please tell me if you need other information.
Bests,
Yiwei Niu
With input bam files, the scaled coverage in indexcov is about 1.0 but with cram input the coverage is about 50000.
I have some BAM files from sequenced maternal blood cfDNA (containing both maternal and fetal DNA). Can this method be used to predict fetal sex?
Greetings. I was tasked with installing STRetch (https://github.com/Oshlack/STRetch/wiki/Installing-STRetch) This "goleft" is listed as a dependency.
Given a Python 3 at /python3.6.1/, how does one install this? I am not using Conda. Is this a Python module, or standalone? Is there source that can be compiled with configure & make, with the output in PATH at the time STRetch is used?
Environment is CentOS 6.9.
Hi Brent,
I would like to propose a new option for goleft indexsplit (and would have implemented it myself if my Go was any good): --exclude, to be able to exclude chromosomes. A typical use case would be unplaced contigs, decoys etc. They are part of the BAM header, but usually you don't want to call variants on them. Sure, I can always remove them after running indexsplit, but then I cannot control the number of splits N properly.
Thanks,
Andreas
PS: Thanks for Goleft!
Wanted to check coverage on hg19/chrM. Ran the following command w/ latest master goleft:
~/code/goleft/bin/goleft indexcov --directory possorted_bam possorted_bam.bam
chrM isn't in the HTML output, even though it doesn't appear to match the default --excludepatt.
chrM does appear in the bed file, but the results seem to be incorrect. Is there a problem getting good results on small chromosomes?
$ zmore possorted_bam/possorted_bam-indexcov.bed.gz | grep chrM
chrM 0 16384 6
$ samtools view possorted_bam.bam chrM:0-20000 | wc -l
212317980
Dear Brent;
I built a customized genome sequence by adding a new sequence "VectorA" (~10 kb) to hg19, and I would like to get the coverage of VectorA with your tool indexcov. I have tried different combinations of command arguments, including --includegl and -c. However, none of them gives the VectorA depth, either in the plot or in the file indexcov-indexcov.bed.gz. The vector is shorter than 16 kb, so it is reasonable that there is no plot. But could you tell me how to get the result in the *-indexcov.bed.gz file?
Thanks,
Wei
$ wget https://github.com/brentp/goleft/releases/download/v0.1.9/goleft_linux64 -O goleft
$ chmod +x goleft
$ goleft -h
goleft Version: 0.1.6
depth : parallelize calls to samtools in user-defined windows
Seems like +00:00 is breaking the parsing of the time RG tag. Not sure if + is allowed here, I would assume it is.
$ goleft indexcov --directory goleft_output/ my.bam
panic: parsing time "2017-08-28T14:53:35.804417+00:00": extra text: +00:00: line 88: "@RG\tID:MY_SAMPLE\tSM:MY_SAMPLE\tLB:MY_SAMPLE\tPL:ILLUMINA\tPU:HNNKFAFXX-L004\tCN:GL\tDT:2017-08-28T14:53:35.804417+00:00"
goroutine 1 [running]:
github.com/brentp/goleft/indexcov.RefsFromBam(0x7fff82d67923, 0x38, 0x0, 0x0, 0x1, 0x0, 0xc42011ee10)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:294 +0x509
github.com/brentp/goleft/indexcov.getReferences(0x7fff82d67914, 0xe, 0xc420120f01)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:311 +0xb9
github.com/brentp/goleft/indexcov.Main()
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:340 +0x20e
main.main()
/home/brentp/go/src/github.com/brentp/goleft/cmd/goleft/goleft.go:68 +0x17f
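As a point of comparison, a parser that accepts ISO 8601 offsets reads this value without trouble. The sketch below pulls the DT field out of an @RG line and parses it with Python's standard library; `rg_date` is a hypothetical helper, and this only illustrates that the timestamp is well formed, not how goleft parses it.

```python
from datetime import datetime


def rg_date(header_line):
    """Return the DT value of an @RG header line as a datetime, or None.

    Each field is split on the FIRST colon only, because the timestamp
    itself contains colons.
    """
    for field in header_line.rstrip("\n").split("\t")[1:]:
        tag, _, value = field.partition(":")
        if tag == "DT":
            # fromisoformat (Python 3.7+) accepts a trailing +HH:MM offset.
            return datetime.fromisoformat(value)
    return None


if __name__ == "__main__":
    line = "@RG\tID:MY_SAMPLE\tSM:MY_SAMPLE\tDT:2017-08-28T14:53:35.804417+00:00"
    print(rg_date(line))
```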
Hi all,
I have some bam files and corresponding bai files (of unknown gender).
I used the indexcov function but didn't get any X-Y plot. What arguments need to be supplied to generate the X-Y plot? I am getting the following messages:
ngslab@ngslab-OptiPlex-3050:~/Downloads/goleft-master/indexcov/input$ goleft indexcov --directory /home/ngslab/Downloads/goleft-master/indexcov/ '/home/ngslab/Downloads/goleft-master/indexcov/input/NA-180.bam'
2018/10/19 16:18:02 indexcov: running on 1 indexes
2018/10/19 16:18:04 indexcov: found chromosome "chrX", wanted "X" please use exact chromosome names for --sex.
2018/10/19 16:18:04 indexcov: found chromosome "chrY", wanted "Y" please use exact chromosome names for --sex.
(WARNING) indexcov: expected 2 sex chromosomes, found: 0.
you can set the expected with --sex ''
2018/10/19 16:18:04 sex chromosomes not found.
2018/10/19 16:18:04 got: 1 principal components
2018/10/19 16:18:04 indexcov: 1 principal components, not plotting
indexcov finished: see /home/ngslab/Downloads/goleft-master/indexcov//index.html for overview of output
Please help me to resolve the issue.
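The log asks for exact chromosome names, so the value given to --sex has to match the names in the BAM header ("chrX" and "chrY" here rather than "X" and "Y"). A small sketch of building that value from a list of reference names follows; `sex_flag` is a hypothetical helper, and the "chr" prefix handling is an assumption about common naming schemes.

```python
def sex_chrom_names(reference_names):
    """Return the names that are X or Y once an optional 'chr' prefix is removed."""
    found = []
    for name in reference_names:
        bare = name[3:] if name.startswith("chr") else name
        if bare in ("X", "Y"):
            found.append(name)
    return found


def sex_flag(reference_names):
    """Build a comma-separated value using the names exactly as they appear."""
    return ",".join(sex_chrom_names(reference_names))


if __name__ == "__main__":
    print(sex_flag(["chr1", "chr2", "chrX", "chrY", "chrM"]))
```

The resulting string can then be passed on the command line in the same form as the `--sex "chrX,chrY"` call that appears in another report on this page.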
Brent;
Thanks for the new goleft version. I'm testing on some hg38 runs and getting an issue parsing the BAM reference headers:
bash run_header_problem.sh
panic: runtime error: index out of range
goroutine 1 [running]:
github.com/biogo/hts/sam.equalRefs(0xc42028d5e0, 0xc420464bd0, 0xc420466260)
/home/brentp/go/src/github.com/biogo/hts/sam/reference.go:292 +0x4ad
github.com/biogo/hts/sam.(*Header).AddReference(0xc420132c60, 0xc420464bd0, 0x0, 0x0)
/home/brentp/go/src/github.com/biogo/hts/sam/header.go:400 +0xbf
github.com/biogo/hts/sam.(*Header).DecodeBinary(0xc420132c60, 0xc640c0, 0xc42015a200, 0x0, 0x0)
/home/brentp/go/src/github.com/biogo/hts/sam/parse_header.go:72 +0x49b
github.com/biogo/hts/bam.NewReader(0xc64e80, 0xc420108098, 0x2, 0x0, 0x0, 0x110)
/home/brentp/go/src/github.com/biogo/hts/bam/reader.go:50 +0x115
github.com/brentp/goleft/indexcov.Main()
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:245 +0x302
main.main()
/home/brentp/go/src/github.com/brentp/goleft/cmd/goleft/goleft.go:64 +0x191
Normally the naming on the HLA contigs is the issue since they have both colon and asterisks which can mess up different parsing assumptions:
@SQ SN:HLA-A*01:01:01:01 LN:3503 AH:*
Here is a reproducible test case:
https://s3.amazonaws.com/chapmanb/testcases/goleft_indexcov_hg38.tar.gz
Thanks much for looking at this, I need to try to get a biogo development environment setup so I can provide fixes directly.
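To illustrate the parsing assumption mentioned above: an @SQ line survives HLA-style names if fields are split on tabs and each field is split on its first colon only. This is a generic sketch in Python, not the biogo implementation, and `parse_sq_line` is a hypothetical name.

```python
def parse_sq_line(line):
    """Parse one @SQ header line into a dict of tag -> value.

    Values may contain ':' and '*' (as in HLA contig names), so each
    field is split on the first colon only.
    """
    fields = line.rstrip("\n").split("\t")
    if fields[0] != "@SQ":
        raise ValueError("not an @SQ line: %r" % line)
    tags = {}
    for field in fields[1:]:
        tag, sep, value = field.partition(":")
        if not sep:
            raise ValueError("field without a colon: %r" % field)
        tags[tag] = value
    return tags


if __name__ == "__main__":
    print(parse_sq_line("@SQ\tSN:HLA-A*01:01:01:01\tLN:3503\tAH:*"))
```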
Running the latest release of indexcov on this test BAM fails:
tiny_bam.zip
user@d8fb9fcad6fb:/data$ goleft indexcov --directory tmp/muster/NA12878_TinyTest out/NA12878_TinyTest.dupmarked.realigned.recalibrated.bam
2017/01/22 04:51:01 indexcov: running on 1 indexes
2017/01/22 04:51:02 got: 1, principal components
2017/01/22 04:51:02 indexcov: 1 principal components, not plotting
panic: runtime error: index out of range
goroutine 1 [running]:
github.com/brentp/goleft/indexcov.writeIndex(0xc42023cd50, 0xc420258030, 0x1, 0x1, 0xebfc00, 0x2, 0x2, 0xc4202e0190, 0x1, 0x1, ...)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:541 +0x29f5
github.com/brentp/goleft/indexcov.Main()
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:288 +0x857
main.main()
/home/brentp/go/src/github.com/brentp/goleft/cmd/goleft/goleft.go:64 +0x191
user@d8fb9fcad6fb:/data$ goleft --version
goleft Version: 0.1.11
covmed : calculate median coverage on a bam by sampling
depth : parallelize calls to samtools in user-defined windows
depthwed : matricize output from depth to n-sites * n-samples
indexcov : quick coverage estimate using only the bam index
thanks!
Hi,
an issue similar to #33 ?
I got the following message for indexcov :
panic: sam: malformed header line: line 89: "@PG\tID:bwa\tPN:bwa\tVN:0.7.12-r1039\tCL:/commun/data/packages/bwa/bwa-0.7.12/bwa mem -t 10 -M -H @CO\t20180608.isidor.: Mapping de bams pour BI/ CHU-Nantes. Les Bams viennent de [email protected] -R @RG\\tID:18D0609\\tLB:18D0609\\tSM:18D0609\\tPL:illumina\\tCN:Nantes /commun/data/pubdb/broadinstitute.org/bundle/1.5/b37/index-bwa-0.7.12/human_g1k_v37.fasta /mnt/beegfs/lindenb/WORK/2018/20180607.XXX.YY/FASTQS/18D0609_S1_R1_001.fastq.gz /mnt/beegfs/lindenb/WORK/2018/20180607.XX.YY/FASTQS/18D0609_S1_R2_001.fastq.gz"
goroutine 24 [running]:
github.com/brentp/goleft/indexcov.readIndex(0x7fffffffda38, 0x76, 0x39, 0xc4204d81e0, 0xc421e88590, 0x5, 0x35)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:460 +0x6ba
github.com/brentp/goleft/indexcov.Main.func1(0xc4201627e0, 0xc4200c1c00, 0x3a, 0x3a, 0xc42028a000, 0x3a, 0x3a, 0xc420229340)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:365 +0x8a
created by github.com/brentp/goleft/indexcov.Main
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:363 +0x386
$ ./goleft -v
goleft Version: 0.1.19
samtools view -c 18D0609.bam works fine.
I'm running indexcov on three test .bam files. My code is
goleft indexcov -d ./ *.bam --sex ''
I get the following output
2017/08/09 16:52:39 indexcov: running on 3 indexes
2017/08/09 16:52:42 sex chromosomes not found.
2017/08/09 16:52:43 got: 3 principal components
indexcov finished: see .//index.html for overview of output
No other files except index.html file are produced and the coverage plots are missing from the .html output.
The .bam.bai files can be found here https://osf.io/c4bdx/
Any help would be greatly appreciated.
Works fine with a single crai but appears to fail when running multiple crais. Has anyone seen this error before?
~/src/go/bin/goleft indexcov -d SEQCAP_WGS_GDAP_Uganda/goleft --sex "chrX,chrY" --fai fasta/Homo_sapiens.GRCh38_full_analysis_set_plus_decoy_hla.fa.fai 21601_6.cram.crai 21722_8.cram.crai
panic: runtime error: makeslice: cap out of range
goroutine 23 [running]:
github.com/brentp/goleft/indexcov/crai.makeSizes(0xc421132f00, 0x1, 0x10, 0xc4204224d8, 0x0, 0x1)
/nfs/users/nfs_j/jpm/src/go/src/github.com/brentp/goleft/indexcov/crai/crai.go:64 +0x13d
github.com/brentp/goleft/indexcov/crai.(*Index).Sizes(0xc4202a5b20, 0xc4201340f0, 0x50, 0x50)
/nfs/users/nfs_j/jpm/src/go/src/github.com/brentp/goleft/indexcov/crai/crai.go:48 +0xa6
github.com/brentp/goleft/indexcov.(*Index).init(0xc4201340f0)
/nfs/users/nfs_j/jpm/src/go/src/github.com/brentp/goleft/indexcov/indexcov.go:80 +0x4cb
github.com/brentp/goleft/indexcov.readIndex(0x7ffd139db8b1, 0x33, 0x0, 0x1, 0x0, 0x0, 0x0)
/nfs/users/nfs_j/jpm/src/go/src/github.com/brentp/goleft/indexcov/indexcov.go:417 +0x67d
github.com/brentp/goleft/indexcov.Main.func1(0xc42012a3c0, 0xc4202a5ae0, 0x2, 0x2, 0xc4201299f0, 0x2, 0x2, 0xc420129a00)
/nfs/users/nfs_j/jpm/src/go/src/github.com/brentp/goleft/indexcov/indexcov.go:351 +0xaa
created by github.com/brentp/goleft/indexcov.Main
/nfs/users/nfs_j/jpm/src/go/src/github.com/brentp/goleft/indexcov/indexcov.go:356 +0x3b9
we want to be able to filter on the flag.
❯❯❯ goleft_linux64 indexcov -d indexcov foo.bam
2017/05/18 15:33:16 indexcov: running on 1 indexes
(WARNING) indexcov: expected 2 sex chromosomes, found: 0.
you can set the expected with --sex ''
2017/05/18 15:33:17 sex chromosomes not found.
2017/05/18 15:33:17 got: 1 principal components
2017/05/18 15:33:17 indexcov: 1 principal components, not plotting
panic: runtime error: index out of range
goroutine 1 [running]:
github.com/brentp/goleft/indexcov.plotSex(0xc4203f8840, 0xc4201591a0, 0x2, 0x2, 0xc420294600, 0x1, 0x1, 0x0, 0x0, 0x0, ...)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/plot.go:371 +0xc22
github.com/brentp/goleft/indexcov.writeIndex(0xc4203f8840, 0xc42000c130, 0x1, 0x1, 0xc4201591a0, 0x2, 0x2, 0xc420294600, 0x1, 0x1, ...)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:742 +0x1cda
github.com/brentp/goleft/indexcov.Main()
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:382 +0x8ca
main.main()
/home/brentp/go/src/github.com/brentp/goleft/cmd/goleft/goleft.go:68 +0x191
❯❯❯ goleft_linux64 --version
goleft Version: 0.1.16
The BAM is of reads mapped to an assembly of those reads. I can send along the BAI file if it helps troubleshooting.
and remove profiling.
Hi,
I wanted to test indexcov on a sample I have, but it seems this tool does not function without a sex chromosome present. It wasn't clear in the documentation that sex chromosomes need to be present for the tool to function, and it's also not clear how to provide an argument to the --sex parameter.
$ goleft_linux64_v0.1.18 indexcov --directory VirusA_results VirusA_bwa_alignment.bam
2018/03/27 10:48:12 indexcov: running on 1 indexes
(WARNING) indexcov: expected 2 sex chromosomes, found: 0.
you can set the expected with --sex ''
2018/03/27 10:48:12 sex chromosomes not found.
2018/03/27 10:48:12 got: 0 principal components
2018/03/27 10:48:12 indexcov: 0 principal components, not plotting
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x8fe719]
goroutine 1 [running]:
github.com/brentp/goleft/indexcov.plotBins(0xc42000c148, 0x1, 0x1, 0xc4200672f0, 0x1, 0x1, 0x0, 0x0, 0x0, 0x0, ...)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/plot.go:194 +0x5c9
github.com/brentp/goleft/indexcov.writeIndex(0xc420075e60, 0xc42000c148, 0x1, 0x1, 0xc4200f06c0, 0x2, 0x2, 0xc4200672f0, 0x1, 0x1, ...)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:697 +0x2b1
github.com/brentp/goleft/indexcov.Main()
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:396 +0x82f
main.main()
/home/brentp/go/src/github.com/brentp/goleft/cmd/goleft/goleft.go:68 +0x17f
$ goleft_linux64_v0.1.18 indexcov --sex 0 --directory VirusA_results VirusA_bwa_alignment.bam
2018/03/27 10:48:30 indexcov: running on 1 indexes
2018/03/27 10:48:30 (FATAL) indexcov: expected 1 sex chromosomes, found: 0.
you can set the expected with --sex ''
$ goleft_linux64_v0.1.18 indexcov --sex false --directory VirusA_results VirusA_bwa_alignment.bam
2018/03/27 10:49:07 indexcov: running on 1 indexes
2018/03/27 10:49:07 (FATAL) indexcov: expected 1 sex chromosomes, found: 0.
you can set the expected with --sex ''
Regards,
Mahesh.
Great tool - we've been using it to look at coverage on some plant genome samples. One minor issue is that right now we have to specify dummy sex chromosomes. It would be great to have a flag to just ignore that option.
Hi, I am trying to run indexsplit for around 2000 samples with CRAM indices as input.
The command line works perfectly fine for 10 samples. But even with 20 samples, I get the following error.
Is there a limit on the number of samples? Thanks.
goleft indexsplit --n 5000 --fai Homo_sapiens_assembly38.fasta.fai 1.crai 2.crai 3.crai.......2000.crai
`panic:runtime error: index out of range
goroutine 19 [running]:
github.com/brentp/goleft/indexsplit.Split.func1(0xc4201c2200, 0x14, 0x20, 0xc4203aa000, 0xd26, 0xd26, 0xc4201c4060, 0x1388, 0x0)
/home/brentp/go/src/github.com/brentp/goleft/indexsplit/indexsplit.go:101 +0x1058
created by github.com/brentp/goleft/indexsplit.Split
/home/brentp/go/src/github.com/brentp/goleft/indexsplit/indexsplit.go:84 +0xc3
`
Regards
Hi there,
I'm using "goleft depth" for estimating the sequencing depth of sliding window (10 kb window and 2 kb step) on the genome. I tried to use bed file "--bed" with sliding window information, however, the result looks like this:
Scaffold_3 0 10000 1
Scaffold_3 2000 10000 0
Scaffold_3 4000 10000 0
Scaffold_3 6000 10000 0
Scaffold_3 8000 10000 0
Scaffold_3 10000 12000 0
Scaffold_3 10000 14000 0
Scaffold_3 10000 16000 0
So I was wondering if I use the bed file properly. My command is as follows:
goleft_linux64 depth --reference genome.fasta --prefix sample1 sample1_sorted.bam --processes 20 --windowsize 10000 --mincov 0 --bed genome.windows.bed &
Thank you for any advice on this!
YY
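For reference, a sliding-window BED file of the kind described (10 kb windows, 2 kb step) can be generated with a few lines of Python. This is only a sketch of the input side; `sliding_windows` is a hypothetical helper and the scaffold length in the example is made up.

```python
def sliding_windows(chrom, length, window=10_000, step=2_000):
    """Yield overlapping 0-based, half-open BED intervals along one sequence.

    Windows near the end of the sequence are clipped to `length`.
    """
    start = 0
    while start < length:
        yield chrom, start, min(start + window, length)
        start += step


if __name__ == "__main__":
    # Hypothetical 14 kb scaffold, printed as tab-separated BED lines.
    for interval in sliding_windows("Scaffold_3", 14_000):
        print("\t".join(str(x) for x in interval))
```

The intervals in the reported output are what one gets by cutting each of these windows at multiples of 10000 (2000-12000 becomes 2000-10000 and 10000-12000). Whether `--windowsize 10000` is responsible for that cut is an assumption that the report itself does not confirm.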
Brent;
I'm running into a goleft indexcov error when running small integration tests with goleft indexcov. I'm guessing this is due to the file being too small to have reasonable bins but it would be great if it didn't fail for these edge cases so I don't have to add checks about when to run it.
Here is a small test case with a run.sh to reproduce the problem:
wget https://s3.amazonaws.com/chapmanb/testcases/goleft_indexcov_small.tar.gz
Let me know if I'm doing anything wrong on my side, looking forward to having these coverage estimates integrated.
2017/01/18 14:23:13 indexcov: running on 1 indexes
panic: runtime error: index out of range
goroutine 1 [running]:
github.com/brentp/goleft/indexcov.(*Index).init(0xc4200177d0)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:70 +0x3df
github.com/brentp/goleft/indexcov.(*Index).NormalizedDepth(0xc4200177d0, 0x0, 0x0, 0x40bb, 0x0, 0x30d40, 0x1)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:86 +0x304
github.com/brentp/goleft/indexcov.run(0xc42010a420, 0x2, 0x2, 0xc42000c0d8, 0x1, 0x1, 0xc42010a650, 0x1, 0x1, 0x0, ...)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:315 +0x73c
github.com/brentp/goleft/indexcov.Main()
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:263 +0x99d
main.main()
/home/brentp/go/src/github.com/brentp/goleft/cmd/goleft/goleft.go:65 +0x191
I've noticed that covmed estimates higher median coverage than other tools. For example for a particular whole genome covmed estimates 33.4, while Picard CollectWgsMetrics estimates 27.
I've performed similar calculations on exomes where I get median coverage of 199.71 with covmed (using the region argument) compared with 189 using bedtools (take the median of counts per base over target region). I've found consistently higher results from covmed compared with picard and bedtools across a number of exomes and genomes. The size of the difference is variable.
I wonder if you have any idea why this is occurring?
One possibility that springs to mind for exomes in particular is that reads outside the target region could be counted and so cause it to overestimate the coverage.
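To make the bedtools-style number concrete, here is a minimal sketch of "median of counts per base over the target region" for a single chromosome. It says nothing about how covmed samples reads; `median_target_depth` is a hypothetical helper, and counting uncovered target bases as depth 0 is an assumption of this sketch.

```python
from statistics import median


def median_target_depth(depth_by_pos, targets):
    """Median per-base depth restricted to the target intervals.

    depth_by_pos: dict mapping 0-based position -> depth (one chromosome)
    targets: list of (start, end) 0-based, half-open intervals
    Positions inside a target but missing from depth_by_pos count as 0;
    positions outside every target are ignored.
    """
    depths = [
        depth_by_pos.get(pos, 0)
        for start, end in targets
        for pos in range(start, end)
    ]
    return median(depths)


if __name__ == "__main__":
    depth = {0: 10, 1: 10, 2: 30, 3: 30, 10: 100}
    print(median_target_depth(depth, [(0, 4)]))
```

With this definition, reads that fall outside the targets cannot raise the median, which is the property the last paragraph above is asking about.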
Hi Brent,
I tried to run goleft indexcov on two small BAM files that we store on GitHub here.
Unfortunately I get this error (I trimmed it; the first line below is repeated from the 0th to the 85th reference):
2018/10/10 15:07:33 no reference stats found for 79th reference
2018/10/10 15:07:33 no reference stats found for 80th reference
2018/10/10 15:07:33 no reference stats found for 81th reference
2018/10/10 15:07:33 no reference stats found for 82th reference
2018/10/10 15:07:33 no reference stats found for 83th reference
2018/10/10 15:07:33 no reference stats found for 84th reference
2018/10/10 15:07:33 no reference stats found for 85th reference
2018/10/10 15:07:33 indexcov: running on 1 indexes
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x8568f6]
goroutine 1 [running]:
github.com/brentp/goleft/indexcov/crai.(*Index).Sizes(0x0, 0x0, 0x0, 0x0)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/crai/crai.go:46 +0x26
github.com/brentp/goleft/indexcov.(*Index).init(0xc0001a6410)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:83 +0x471
github.com/brentp/goleft/indexcov.(*Index).NormalizedDepth(0xc0001a6410, 0x0, 0x30d40, 0xc00038e000, 0x0)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:124 +0x1ed
github.com/brentp/goleft/indexcov.run(0xc0002ac000, 0x56, 0x80, 0xc0000de0d0, 0x1, 0x1, 0xc00019e6a0, 0x1, 0x1, 0xc00034c580, ...)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:559 +0x829
github.com/brentp/goleft/indexcov.Main()
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:402 +0x51e
main.main()
/home/brentp/go/src/github.com/brentp/goleft/cmd/goleft/goleft.go:68 +0x179
I have no idea about what is happening here...
Brent;
Thanks again for this implementation. I've got it integrated into bcbio and working on checking differences with current callable calculation output. I found a couple of differences I wasn't able to resolve with my poor golang skills. An example run is here:
wget https://s3.amazonaws.com/chapmanb/testcases/goleft_callable.tar.gz
When comparing against current bcbio output I get two major differences. The first is that initial blocks defined by the input BED file have an extra base relative to the input regions (99 instead of 100 for the first block here):
@@ -1,6 +1,6 @@
-chrM 100 1000 CALLABLE
-chrM 2000 5000 CALLABLE
-chr22 14250 14257 LOW_COVERAGE
+chrM 99 1000 CALLABLE
+chrM 1999 5000 CALLABLE
+chr22 14249 14257 LOW_COVERAGE
chr22 14257 14258 NO_COVERAGE
chr22 14258 14270 LOW_COVERAGE
chr22 14270 14277 CALLABLE
This is caused by the -1 when retrieving from the cache, but I couldn't figure out the right way to push these initial blocks onto the cache with a +1 for the start.
The second is that the depth output is missing some NO_COVERAGE regions. I guess this is due to not using -a for depth:
@@ -15,6 +15,5 @@
chr22 14487 14495 LOW_COVERAGE
chr22 14495 14496 CALLABLE
chr22 14496 14595 LOW_COVERAGE
-chr22 14595 15068 NO_COVERAGE
chr22 15068 15128 LOW_COVERAGE
chr22 15128 15500 CALLABLE
Thanks for any thoughts and suggestions.
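Both differences come down to coordinate conventions: samtools depth reports 1-based positions, BED intervals are 0-based and half-open, and positions that samtools omits have to be treated as depth 0. The sketch below shows one way to turn per-position depths into labelled blocks under those conventions. It is not the goleft code; the function names and the `min_depth=4` threshold are hypothetical.

```python
def classify(depth, min_depth=4):
    """Label a single depth value (the threshold is a hypothetical default)."""
    if depth == 0:
        return "NO_COVERAGE"
    if depth < min_depth:
        return "LOW_COVERAGE"
    return "CALLABLE"


def callable_blocks(chrom, region_start, region_end, depths, min_depth=4):
    """Merge per-base labels into BED blocks for one input region.

    region_start/region_end: 0-based, half-open BED coordinates
    depths: dict mapping 1-based position -> depth (samtools depth style);
            positions missing from the dict are treated as depth 0
    """
    blocks = []
    for pos0 in range(region_start, region_end):
        # Convert the 0-based BED position to the 1-based depth key.
        label = classify(depths.get(pos0 + 1, 0), min_depth)
        if blocks and blocks[-1][3] == label:
            blocks[-1][2] = pos0 + 1  # extend the current block
        else:
            blocks.append([chrom, pos0, pos0 + 1, label])
    return [tuple(b) for b in blocks]


if __name__ == "__main__":
    depths = {101: 10, 102: 10, 103: 2, 105: 9}
    for block in callable_blocks("chrM", 100, 106, depths):
        print("\t".join(str(x) for x in block))
```

In this sketch the first block starts at the region start (100) rather than one base before it, and gaps with no reported depth show up as NO_COVERAGE blocks.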
should take a vcf and a .ped and show transmission rates, rates of denovos for various cutoffs.
a cutoff can be variant quality, genotype quality, depth, etc.
should output in a way that makes it possible to make something like a ROC curve.
Hi,
I'm trying to run indexcov on my WGS samples. Here is the error I got when I ran it:
panic: parsing time "2010-10-19T00:00:00.000+00:00": extra text: +00:00: line 88: "@RG\tID:DKFZ:100630_SN143_0256_A15006043_5\tPL:ILLUMINA\tCN:DKFZ\tPI:353\tDT:2010-10-19T00:00:00.000+00:00\tLB:WGS:DKFZ:ICGC_BL12\tSM:52c198b4-7bda-4f81-8101-a322787a10a6\tPU:DKFZ:100630_SN143_0256_A15006043_5\tPG:fastqtobam"
goroutine 1 [running]:
github.com/brentp/goleft/indexcov.RefsFromBam(0x7fff9331bb5e, 0x48, 0x0, 0x0, 0x1, 0x0, 0xc420174e10)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:294 +0x509
github.com/brentp/goleft/indexcov.getReferences(0x7fff9331bb52, 0xb, 0xc420177001)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:311 +0xb9
github.com/brentp/goleft/indexcov.Main()
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:340 +0x20e
main.main()
/home/brentp/go/src/github.com/brentp/goleft/cmd/goleft/goleft.go:68 +0x17f
Is something wrong with my BAM files, or are any dependencies missing on my machine? Any help would be greatly appreciated.
Thank you.
Hi, I am getting this error while running indexcov with a folder of BAM files. Is there a way to find out which BAM has a problem?
goleft indexcov --directory indexCovQGP/ indexQGP/*.bam
panic: bam: magic number mismatch
goroutine 12 [running]:
github.com/brentp/goleft/indexcov.readIndex(0x7fff6666d84a, 0x27, 0x129, 0xc454a61101, 0xc45995ab20, 0x12, 0x125)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:313 +0x2d5
github.com/brentp/goleft/indexcov.Main.func1(0xc420032840, 0xc4202e8000, 0xcef, 0xcef, 0xc42017f500, 0xcef, 0xcef, 0xc420213c10)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:270 +0xaa
created by github.com/brentp/goleft/indexcov.Main
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:275 +0x5e4
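One way to narrow this down outside goleft is to check the magic bytes of each file directly. A BAM file is BGZF (gzip-compatible) and its decompressed stream starts with `BAM\1`; a .bai index is uncompressed and starts with `BAI\1`. The sketch below lists files that fail either check. The function names are hypothetical, and whether the panic refers to the BAM or to its index is not clear from the trace.

```python
import gzip
import sys
import zlib

BAM_MAGIC = b"BAM\x01"
BAI_MAGIC = b"BAI\x01"


def has_bam_magic(path):
    """True if the decompressed file starts with the BAM magic bytes."""
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == BAM_MAGIC
    except (OSError, EOFError, zlib.error):
        # Not gzip/BGZF, truncated, unreadable, or missing.
        return False


def has_bai_magic(path):
    """True if the (uncompressed) index file starts with the BAI magic bytes."""
    try:
        with open(path, "rb") as handle:
            return handle.read(4) == BAI_MAGIC
    except OSError:
        return False


def files_without_bam_magic(paths):
    """Return the subset of `paths` that do not look like BAM files."""
    return [p for p in paths if not has_bam_magic(p)]


if __name__ == "__main__":
    for bad in files_without_bam_magic(sys.argv[1:]):
        print(bad)
```

Saved under a hypothetical name such as check_bams.py, it could be run as `python check_bams.py indexQGP/*.bam` to print the suspicious files.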
I ran into this error:
$ goleft_linux64 indexcov --prefix merged/30J.rmdup.bam
panic: runtime error: index out of range
goroutine 1 [running]:
github.com/brentp/goleft/indexcov.Main()
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:218 +0xe70
main.main()
/home/brentp/go/src/github.com/brentp/goleft/cmd/goleft/goleft.go:65 +0x191
Could this be due to the fact that our chromosome names are not in "chr" format? Is there any way to change the defaults so that we can use genomes without chromosomal scaffolds? Here is the output of samtools idxstats on the "30J.rmdup.bam" alignment file:
CM002977.3 225584828 8338981 465136
CM002980.3 204787373 7537439 390998
CM002984.2 185818997 6517117 332638
CM002983.2 172585720 6076609 306025
CM002981.2 190429646 6355739 298694
CM002982.3 180051392 6521339 342746
CM002991.3 169600520 6305670 350299
CM002985.3 144306982 5341132 285889
CM002987.3 129882849 4826300 276518
CM002992.3 92844088 3560246 217738
CM002989.3 133663169 4869039 254070
CM002979.2 125506784 4406793 224347
CM002978.2 108979918 4114634 231117
CM002988.2 127894412 4823764 274264
CM002986.2 111343173 4088611 220325
CM002994.2 77216781 2713115 147784
CM002990.2 95684472 3447325 168426
CM002995.2 70235451 2527018 132779
CM002996.3 53671032 1790224 86788
CM002993.2 74971481 2857223 168165
CM002997.3 149150640 5750773 315248
CM003438.1 11753682 38388 3424
Thanks,
Noah
Trying to run indexcov (v0.1.18 from bioconda) on long read WGS (fai and crai attached) but keep getting an obscure error message:
$ goleft indexcov --excludepatt '[a-zA-VZ]' --fai hs37d5_viral.fa.fai --directory test 9370NK.filtered.sorted.cram.crai
2018/04/26 09:39:33 -19749 16384 {2355931 707216 13182576 1395 3515035} 2355931 2375680
panic: logic error
goroutine 16 [running]:
github.com/brentp/goleft/indexcov/crai.makeSizes(0xc420211500, 0x13c, 0x220, 0xc420686000, 0xd5e, 0xd5e)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/crai/crai.go:83 +0x958
github.com/brentp/goleft/indexcov/crai.(*Index).Sizes(0xc420184060, 0x7feb6fc45070, 0x450920, 0x7feb6fc9f000)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/crai/crai.go:48 +0xc4
github.com/brentp/goleft/indexcov.(*Index).init(0xc420260050)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:81 +0x465
github.com/brentp/goleft/indexcov.readIndex(0x7ffc2b92815b, 0x2d, 0x0, 0x0, 0x0, 0x0, 0x0)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:431 +0x5a9
github.com/brentp/goleft/indexcov.Main.func1(0xc420256000, 0xc4201641a0, 0x1, 0x1, 0xc42025a000, 0x1, 0x1, 0xc4204ac140)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:365 +0x8a
created by github.com/brentp/goleft/indexcov.Main
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:363 +0x386
Hi Brent,
I've installed version 0.1.17 as a binary and ran indexcov on 100 BAM files (and for testing also on the 100 corresponding crai files). I get all expected files except the interactive depth html files. indexcov completed without any error or warning. Digging in the code, it seems that maxSamples=100 is a hardcoded cutoff for these files, which is fine (it would be nice to have a command line option), but there must be an off-by-one error, because if it's exactly 100 no warning is printed and no html files are created.
Thanks a lot for goleft!
Andreas
Hi @brentp,
Love the indexcov tool! It would be very useful for cancer BAMs, but the limitation of allowing only one @RG record is a blocker for this. We tag each sequencing lane with its own @RG. Is this a restriction that can be lifted?
panic: bam reagroup: more than one RG for tumour.bam
cheers,
Mark
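A header with one @RG per lane can still describe a single sample, since the lanes usually share one SM value. The sketch below collects the distinct SM values from a SAM header text, which is one way a tool could tell "many read groups, one sample" apart from "several samples". It is an illustration only, not what indexcov does, and `sample_names` is a hypothetical helper.

```python
def sample_names(header_text):
    """Return the distinct SM values across all @RG lines, in order of appearance."""
    names = []
    for line in header_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "@RG":
            continue
        for field in fields[1:]:
            tag, _, value = field.partition(":")
            if tag == "SM" and value not in names:
                names.append(value)
    return names


if __name__ == "__main__":
    header = (
        "@HD\tVN:1.5\tSO:coordinate\n"
        "@RG\tID:lane1\tSM:tumour\tPL:ILLUMINA\n"
        "@RG\tID:lane2\tSM:tumour\tPL:ILLUMINA\n"
    )
    print(sample_names(header))
```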
Any chance of covstats supporting cram format? Unsurprisingly, when I tried I got an error:
$ goleft covstats input.cram
panic: gzip: invalid header
goroutine 1 [running]:
github.com/brentp/goleft/covstats.pcheck(0xa8ae60, 0xc4200d8410)
/home/brentp/go/src/github.com/brentp/goleft/covstats/covstats.go:28 +0x4a
github.com/brentp/goleft/covstats.Main()
/home/brentp/go/src/github.com/brentp/goleft/covstats/covstats.go:231 +0x913
main.main()
/home/brentp/go/src/github.com/brentp/goleft/cmd/goleft/goleft.go:68 +0x17f
Brent;
The new indexcov command and outputs are fabulous, and a brilliant way to get a quick overview of coverage across chromosomes. What would you think about outputting raw data we could incorporate into MultiQC (https://github.com/ewels/MultiQC)?
MultiQC consolidates all the output reporting in bcbio into a single HTML and allows displays like tabbing, which we could use to provide the multiple chromosome plots. It has built-in interactive charts, customized hiding of samples and other nice features to scale up for bigger sample sizes.
Practically this would require dumping the outputs you currently plot into tab delimited files we could suck up as a MultiQC module for indexcov (http://multiqc.info/docs/#plotting-functions). I haven't compiled and run the latest indexcov but looking at the code it looks like you might already do some of this. Thanks for considering this approach for making indexcov outputs available.
Hi Brent
I tried to run indexcov on my bam files but it excludes all my chromosomes except one, which has a different naming.
I am working with an NCBI reference genome and chromosomes are named NC_12344. How can I override the default setting which excludes this pattern?
I tried this way:
./goleft_linux64 indexcov --chrom NC_031983 --directory /scicore/home/salzburg/boehne/test/ /scicore/home/salzburg/boehne/*realn.bam
which did not work, and neither did the same command without the --chrom flag.
Any help would be great
Thanks
Astrid
Is it somehow possible to disable the sex chromosome plotting or reporting ? Having worked well on mammals I'd like to apply it to bacteria (for global missing regions detection).
Is that in scope ?
Thanks
Colin
Hi Brent, here are two suggestions for indexcov:
* using a file containing the paths to the bams (to avoid something like xargs)
* if we could include the fact that some samples are 'cases' or 'controls', would it improve your algorithm?
thanks
Hi @brentp. Any chance to add support for CSI indexes for indexcov? Generating lots of assemblies, and many have larger scaffolds, which require the CSI rather than BAI...
Have you compared goleft to Sambamba?
There are a few small issues with it in terms of flexibility, but would you say goleft performs better or provides orthogonal functionality? (For example, one issue is that sambamba view can't take bitwise flags, unlike samtools view.)
Thanks! A formula for Homebrew-science would be great.
Brent;
Would it be possible to add a general option to limit goleft indexcov output to standard chromosomes (1-22 + gender)? The current GL removal works for GRCh37 but not hg19 or hg38 with chr prefixes and won't handle other non-standard alt contigs. I'd be happy to pass a list of chromosomes to the command line so the tool itself doesn't need to hard code these, but it seems like the current --chr option is meant for specifying a single chromosome only. Thanks for any thoughts/suggestions.
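The kind of filter being requested can be illustrated with a name pattern. This is a sketch of the idea, not goleft's behavior: `is_standard_chrom` and the regular expression are hypothetical, and "standard" is assumed to mean 1-22, X, and Y with an optional `chr` prefix.

```python
import re

# Hypothetical pattern: optional "chr" prefix followed by 1-22, X, or Y.
STANDARD = re.compile(r"(chr)?([1-9]|1[0-9]|2[0-2]|X|Y)")


def is_standard_chrom(name):
    """Return True for names such as "1", "chr22", "X", or "chrY"."""
    return STANDARD.fullmatch(name) is not None


names = ["chr1", "22", "chrX", "GL000207.1", "chr1_KI270706v1_random", "chrM"]
kept = [n for n in names if is_standard_chrom(n)]
print(kept)
```

Using `fullmatch` rather than a prefix match is what excludes alt and random contigs whose names start with a standard chromosome name.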
In the bin plot description here:
https://github.com/brentp/goleft/blob/master/docs/indexcov/help-bin.md
you say that samples:
* to the left of the plot have many regions with low or missing coverage.
Shouldn't that be to the right of the plot? (Or maybe I've just totally misunderstood - equally possible).
I'm observing different error checking depending on whether the input file is a CRAM or a BAM. The files in question have multiple sample names listed (a different one for each @RG line). When the input is a CRAM, no error is thrown. When the input is a BAM, I see: panic: bam reagroup: more than one RG for /build/test.bam
At the moment, it seems as if indexcov doesn't check CRAM headers? i.e. https://github.com/brentp/goleft/blob/master/indexcov/indexcov.go#L202-L231
I assume this error is thrown because the assumption is that there is a single sample for the whole file and there isn't handling of multiple samples. What is being reported when these problem CRAMs are provided? Stats for all the samples pooled together?
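To check up front whether a file carries more than one sample, the header text (for example, the output of `samtools view -H`) can be scanned for the `SM` tags on its `@RG` lines. The sketch below is independent of goleft, and `sample_names` is a hypothetical helper name.

```python
def sample_names(header_text):
    """Collect the distinct SM values from the @RG lines of a SAM-format header."""
    samples = set()
    for line in header_text.splitlines():
        if not line.startswith("@RG\t"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs.
        for field in line.split("\t")[1:]:
            if field.startswith("SM:"):
                samples.add(field[3:])
    return samples


header = "\n".join([
    "@HD\tVN:1.6\tSO:coordinate",
    "@RG\tID:rg1\tSM:sampleA",
    "@RG\tID:rg2\tSM:sampleB",
])
found = sample_names(header)
print(sorted(found))
```

A file with several read groups that all share one `SM` value yields a single name, so this distinguishes multiple read groups from multiple samples.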
I am trying to run goleft indexcov on crai files but it doesn't seem to work. Has anyone else found the same problem?
jpm@farm3-head3> goleft_linux64 indexcov --sex "chrX,chrY" -d goleft/ --fai fasta/Homo_sapiens.GRCh38_full_analysis_set_plus_decoy_hla.fa.fai 21772_6.cram.crai
panic: runtime error: index out of range
goroutine 22 [running]:
github.com/brentp/goleft/indexcov/crai.ReadIndex(0xc80c80, 0xc420160580, 0xc420160580, 0x0, 0x0)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/crai/crai.go:159 +0x5f5
github.com/brentp/goleft/indexcov.readIndex(0x7ffe764c7913, 0x11, 0x0, 0x1, 0x0, 0x0, 0x0)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:412 +0x5e1
github.com/brentp/goleft/indexcov.Main.func1(0xc4201203c0, 0xc42011fa40, 0x1, 0x1, 0xc4201240a8, 0x1, 0x1, 0xc42011fa50)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:351 +0xaa
created by github.com/brentp/goleft/indexcov.Main
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:356 +0x3b9
31 samples work fine but with 32 the output ends in:
2017/11/06 10:57:13 indexcov: running on 32 indexes
panic: runtime error: index out of range
goroutine 1 [running]:
github.com/brentp/goleft/indexcov.(*Index).NormalizedDepth(0xc4200b6820, 0x19, 0x0, 0xc488f646c0, 0x46)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:113 +0x1c6
github.com/brentp/goleft/indexcov.run(0xc4200f2840, 0x56, 0x56, 0xc4200d4300, 0x20, 0x20, 0xc4200d6600, 0x20, 0x20, 0xc48801d710, ...)
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:520 +0x8fb
github.com/brentp/goleft/indexcov.Main()
/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:364 +0x529
main.main()
/home/brentp/go/src/github.com/brentp/goleft/cmd/goleft/goleft.go:68 +0x17f