czc / nb_distribution

novoBreak: local assembly for breakpoint detection in cancer genomes

License: MIT License

Perl 97.68% Shell 2.32%
bioinformatics-pipeline structural-variations

nb_distribution's Issues

Running novobreak on a single sample

Hi,

In the paper "Novel sequences, structural variations and gene presence variations of Asian cultivated rice", the authors ran novoBreak on each of 3,010 rice accessions independently with default parameters.

How is it possible to run the tool on a single sample? The bash script ./run_novoBreak.sh asks for both a control and a treated sample.

I have also sent an email to the corresponding author of the paper, but I thought I might get a faster response here.

Thanks!

germline sv detection

Can I run novoBreak to detect germline SVs? I was reading #2; should I mock up a "tumor" BAM and then use that as input? Thanks.

Running without normal bam

Hi, is it possible to run novoBreak with just the tumor BAM and without the normal BAM? If yes, how do I do it?

Thanks

infer_bp_v4.pl line 13 error

Hi,

I tried to run the tool using the run_novoBreak.sh script and got the following error at the end of the log file. Any ideas?

1433560 kmers passed the minimum frequency cutoff in tumor (3) and maximum frequency cutoff in normal (3)
Program exit normally
[M::bam2fq_mainloop] processed 1007748 reads
begin kmer2id ...
kmer2id takes 2 seconds
begin id2pair...
id2pair takes 151 seconds
begin sorting ids...
sorting ids takes 5 seconds
begin output results...
Outputting results takes 2 seconds
Finished
0.txt
10.txt
11.txt
12.txt
13.txt
14.txt
15.txt
1.txt
2.txt
3.txt
4.txt
5.txt
6.txt
7.txt
8.txt
9.txt
[E::bwa_idx_load] fail to locate the index files
x??
No such file or directory at /mnt/local_tools/nb_distribution/infer_bp_v4.pl line 13.

test

This is just a test.

Does novoBreak report duplicated results?

For example:

chr11	70503195	N	.	<TRA>	60	PASS	PRECISE;CT=5to3;CIPOS=-10,10;CIEND=-10,10;SOMATIC;CONSENSUS=GTCTTATGAGAAGTCCCAAGGAAGGGAGGGCCTTGTAGAAAGGTAGAGAACAAAAGACATAAAGTGACAGAAAGACAAAATGTGTTATTTTAACAAAAGGAAGCATTGGCCAGGTCTAGCAGATGGCTTCAGAACAAATGAAGCACATGTATATGAGGTTGAAACAACTAGGGGGATTGGAATTATTTTGAGAGTGAAAGAAGACATTATTAATGATTAATCCTGGGCTCTGGGAATACACCGTGACTGTCTTGTACAAATGAGACTTATGGTCACCCTAGCAATAGCTGACCCCACAGTGAAGGAAGTGCAGTGCCTTTGGGGTGGCGTCGTGGGAGTGTGGTTGATGCCCGCATCCTGGCTATTATAGGGGAGGTCATTATTGAAAACATTAAGAATTTGTACGGTGAAGGCAGAAACCTTGCCAGCTTTATTTAACCACTGCAGAATTAAAAGTGTTTTAGGTTAATACGTTGTATAACTATAACGTGTAAATATATCTACCATTTACAGAAGGGTTTGAATGAAAGAATGTACCAACCTCTTGTTTCATGAATTGAATAAATTGGGGACTTAGGTGTTTATTCCACAGCTGCTGCAGAGGAAGGGCTGATTTCTATTTAACCTGAGCATCCTTAGGTGCTTACTTTTACAGGCCTGGGGCAATTTTCAGGGCCACCAAGAGACTGCATGTGAAGAGCCTGACAGGCGAGTTCCAGGGCTGGGCTGCAGTGCCAGGCTTTGCATCCTCACAGGAGGAGGAAACGTGGCGCTTAATTGCTTCTCATGTATTCTGGGAATGCACATCACACTGAAGCAGCCCCTTTCCTCCCACTCCCTCCCTCCCTGCTCTCCCGCTCCAGAGGCCTGGGAAGGCAGAAAAGCCTGGCCCAGCAGGGTGCTTACAAGAGTGGGGGCCCTGTCCCTGTGCTGGTCTGTACAACTAGCTGCCCCAGTGC;SVTYPE=TRA;CHR2=chr2;END=16369412;SVLEN=0	GT	./.	E35384	contig2	size959	377	cov58.82	45	418	59.877990430622	418	59.877990430622	29	0	0	0	0	1826	464	60	464	60	49	0	0	0	0	564	4	580	3
chr11	70503199	N	.	<TRA>	60	PASS	PRECISE;CT=5to3;CIPOS=-10,10;CIEND=-10,10;SOMATIC;CONSENSUS=TTCATGAATTGAATAAATTGGGGACTTAGGTGTTTATTCCCCAGCTGCTGCAGAGGAAGGGCTGATTTCTATTTAACCTGAGCATCCTTAGGTGCTTACTTTTACAGGCCTGGGGCAATTTTCAGGGCCACCAAGAGACTGCATGTGAAGAGCCTGA;SVTYPE=TRA;CHR2=chr2;END=16369412;SVLEN=0	GT	./.	E35384	contig66	size157	2	cov1.91	492	7	60	7	60	30	0	0	0	0	1826	464	60	464	60	49	0	0	00	564	4	580	3
chr2	16369412	N	.	<TRA>	60	PASS	PRECISE;CT=3to5;CIPOS=-10,10;CIEND=-10,10;SOMATIC;CONSENSUS=ACTGCAGAATTAAAAGTGTTTTAGGTTAATACGTTGTATAACTATAACGTGTAAATATATCTACCATTTACAGAAGGGTTTGAATGAAAGAATGTACCAACCTCTTGTTTCATGAATTGAATAAATTGGGGACTTAGGTGTTTATTCCACAGCTGCTGCAGAGGAAGGGCTGATTTCTATTTAACCTGTGCATCCTTAGGTGCTTACTTTTACAGGCCTGGGGCAATTTTCAGGGCCACCAAGAGACTGCATGTGAAGGGCCTGACAGGCGAGTTCCAGGGCT;SVTYPE=TRA;CHR2=chr11;END=70503195;SVLEN=0	GT	./.	E35384	contig4	size283	5	cov2.65	1826	464	60	464	60	49	0	0	0	0	45	418	59.877990430622	418	59.877990430622	29	0	0	0	0	580	3	564	4

These are actually the same event:

chr2	16369412	16369413	chr11	70503195	70503196	N	377	+	-	TRA	NovoBreak

I ran Manta, DELLY and SvABA; each of them reports only one unique result, but novoBreak reports four.
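One possible way to collapse such calls after the fact is sketched below. It assumes that two records describe the same event when they join the same pair of chromosomes and both breakpoints fall within the ±10 bp confidence interval (CIPOS/CIEND) shown in the records above. The names `collapse_breakends` and `tolerance` are hypothetical and not part of novoBreak; this is a post-processing idea, not how the tool itself filters.

```python
from typing import List, Tuple

# (chrom, pos, chr2, end) as read from CHROM, POS, CHR2 and END of each record.
Call = Tuple[str, int, str, int]


def canonical(call: Call) -> Tuple[Tuple[str, int], Tuple[str, int]]:
    """Order the two breakends so that reciprocal records compare equal."""
    chrom, pos, chr2, end = call
    first, second = sorted([(chrom, pos), (chr2, end)])
    return first, second


def collapse_breakends(calls: List[Call], tolerance: int = 10) -> List[Call]:
    """Greedily keep one representative per group of nearby breakend pairs."""
    kept: List[Call] = []
    for call in calls:
        (c1, p1), (c2, p2) = canonical(call)
        duplicate = any(
            c1 == k1 and c2 == k2
            and abs(p1 - q1) <= tolerance and abs(p2 - q2) <= tolerance
            for (k1, q1), (k2, q2) in map(canonical, kept)
        )
        if not duplicate:
            kept.append(call)
    return kept


# Coordinates copied from the three records quoted above.
calls = [
    ("chr11", 70503195, "chr2", 16369412),
    ("chr11", 70503199, "chr2", 16369412),
    ("chr2", 16369412, "chr11", 70503195),
]
unique_calls = collapse_breakends(calls)
print(unique_calls)
```

The first record encountered is kept as the representative; choosing instead by contig size or coverage would need the extra columns from the output and is left out here.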

fail to locate the index file

I have tried every way I can think of to run novoBreak, but this error always occurs: [E::bwa_idx_load] fail to locate the index files. The same error appears in some closed issues.
I'm sure I have indexed my reference with bwa, and I even put all the input files in the software folder. It still doesn't work. I wonder whether there are rules for naming the input files, especially the index files?
The full error output is as follows:

Program exit normally
[M::bam2fq_mainloop] processed 9822 reads
begin kmer2id ...
kmer2id takes 0 seconds
begin id2pair...
id2pair takes 1 seconds
begin sorting ids...
sorting ids takes 0 seconds
begin output results...
Outputting results takes 1 seconds
Finished
[E::bwa_idx_load] fail to locate the index files
No such file or directory at /public/home/zhangjing1/software/novoBreak/nb_distribution/infer_bp_v4.pl line 13.
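On the naming question: `bwa index ref.fa` writes its index next to the reference using the full FASTA file name as the prefix (ref.fa.amb, ref.fa.ann, ref.fa.bwt, ref.fa.pac, ref.fa.sa), and the `bwa mem` command line logged elsewhere on this page passes the reference path directly. A minimal sketch for checking that all five files exist under that prefix follows; the helper name `missing_bwa_index_files` is hypothetical, and whether a missing file is the cause in this particular case is an assumption.

```python
import os
import tempfile
from pathlib import Path
from typing import List

# Suffixes written by `bwa index <reference>` next to the reference file.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_index_files(reference: str) -> List[str]:
    """Return the expected index paths that do not exist for `reference`."""
    expected = [reference + suffix for suffix in BWA_INDEX_SUFFIXES]
    return [path for path in expected if not Path(path).is_file()]


# Demo on a throwaway directory in which only two of the five files exist.
with tempfile.TemporaryDirectory() as tmp:
    ref = os.path.join(tmp, "ref.fa")
    for suffix in (".amb", ".ann"):
        Path(ref + suffix).touch()
    missing = [os.path.basename(p) for p in missing_bwa_index_files(ref)]

print(missing)
```

In practice the function would be called with the exact reference path given to run_novoBreak.sh. Passing an absolute path may also be worth trying, since a relative path stops resolving if the script changes directory; that the script does so is an assumption here.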

Inclusion in Bioconda

Dear authors,
I would like to provide novobreak via bioconda. Is there a source code repository that could be used to build the tool? Pre-compiled binaries are problematic because we cannot guarantee minimal OS requirements in this case. Also, we would refer to other packages for bwa and samtools. Are there any version requirements for these two?

Can't find bwa index

Hi!
I'm pretty sure I built the index with BWA. It is in the same directory as my reference FASTA file, and the index file names differ from the reference name only in their suffix: the reference ends in .fa, the index files in .bwt, .amb, and so on.
I'm also pretty sure my environment is set up correctly: I can type SSAKE, press Enter, and get the expected output.
I installed this through git clone 2 days ago, so it is probably not a version problem.

Thanks for your work, but this bug is really hard for me to deal with.

x??

Error running novoBreak

I'm getting an error trying to run novoBreak. It's in my path (I placed export PATH=$PWD/nb_distribution/:$PATH in ~/.bashrc and sourced it). I've also made each of the scripts executable with chmod +x *.

[moldach@cdr767 alignment]$ bash /scratch/moldach/bin/nb_distribution/run_novoBreak.sh /scratch/moldach/bin/nb_distribution/ ~/projects/def-mtarailo/common/indexes/WS265_wormbase/c_elegans.PRJNA13758.WS265.genomic.fa 470.sorted.dedupped.bam control.bam 1 novoBreak
[Tue May  5 17:04:43 2020]
Building reference kmer...
Calculating number of paired reads ...
[fillin_bitvec] processed   72635794 reads
There are 23315106 reference kmers and 59276238 non-reference kmers
Finished reference kmer building
[Tue May  5 17:06:58 2020]
Freed reference hash and begin building reads hash table...
[Tue May  5 17:06:58 2020]
/scratch/moldach/bin/nb_distribution/run_novoBreak.sh: line 26: 15470 Killed                  $novobreak -i $tumor_bam -c $normal_bam -r $ref -o kmer.stat
[M::bam2fq_mainloop] processed 0 reads
begin kmer2id ...
kmer2id takes 0 seconds
begin id2pair...
id2pair takes 0 seconds
begin sorting ids...
sorting ids takes 0 seconds
begin output results...
Outputting results takes 0 seconds
Finished
(standard_in) 1: parse error
*.txt
No such file or directory at /scratch/moldach/bin/nb_distribution/run_ssake.pl line 11.
awk: fatal: cannot open file `../group_reads/split/*.ssake.asm.out' for reading (No such file or directory)
[main] Version: 0.7.10-r806-dirty
[main] CMD: /scratch/moldach/bin/nb_distribution/bwa mem -t 1 -M /project/6013424/common/indexes/WS265_wormbase/c_elegans.PRJNA13758.WS265.genomic.fa ssake.fa
[main] Real time: 1.141 sec; CPU: 0.237 sec
x??
No such file or directory at /scratch/moldach/bin/nb_distribution/infer_bp_v4.pl line 13.

It gives me this error, but the file infer_bp_v4.pl is clearly there:

[moldach@cdr767 alignment]$ ll /scratch/moldach/bin/nb_distribution/
total 5008
-rwxr-x--- 1 moldach moldach  955882 May  5 15:17 bwa
-rwxr-x--- 1 moldach moldach    2774 May  5 15:17 fetch_discordant.pl
-rwxr-x--- 1 moldach moldach     980 May  5 15:17 filter_sv2.pl
-rwxr-x--- 1 moldach moldach    1436 May  5 15:17 filter_sv.bak.pl
-rwxr-x--- 1 moldach moldach    1171 May  5 15:17 filter_sv_icgc.pl
-rwxr-x--- 1 moldach moldach    1405 May  5 15:17 filter_sv.pl
-rwxr-x--- 1 moldach moldach    4004 May  5 15:17 group_bp_reads.pl
-rwxr-x--- 1 moldach moldach    4512 May  5 15:17 infer_bp.pl
-rwxr-x--- 1 moldach moldach    4897 May  5 15:17 infer_bp_v4.pl
-rwxr-x--- 1 moldach moldach   13282 May  5 15:17 infer_sv.pl
-rwxr-x--- 1 moldach moldach    1069 May  5 15:17 LICENSE
-rwxr-x--- 1 moldach moldach  702112 May  5 15:17 novoBreak
-rwxr-x--- 1 moldach moldach    5986 May  5 15:17 README.md
-rwxr-x--- 1 moldach moldach    2853 May  5 15:17 run_novoBreak.sh
-rwxr-x--- 1 moldach moldach    2276 May  5 15:17 run_ssake.pl
-rwxr-x--- 1 moldach moldach 3299767 May  5 15:17 samtools
-rwxr-x--- 1 moldach moldach   83364 May  5 15:17 SSAKE
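A note on reading this log: "No such file or directory at ... infer_bp_v4.pl line 13." has the shape of a Perl `die` message, so it reports that line 13 of the script could not open some file, not that the script itself is missing. The earliest failure in the log is the "Killed" line for the novoBreak binary, after which every later step runs on empty input ("processed 0 reads"). The sketch below finds the first such line in a log. The pattern list and the name `first_failure` are illustrative assumptions, and reading "Killed" as a memory limit being hit is a guess about this case.

```python
import re
from typing import List, Optional, Tuple

# Patterns taken from the log lines quoted above; illustrative, not exhaustive.
FAILURE_PATTERNS = [
    re.compile(r"\bKilled\b"),
    re.compile(r"\[E::"),
    re.compile(r"parse error"),
    re.compile(r"No such file or directory"),
]


def first_failure(log_lines: List[str]) -> Optional[Tuple[int, str]]:
    """Return (1-based line number, text) of the first line matching a pattern."""
    for number, line in enumerate(log_lines, start=1):
        if any(pattern.search(line) for pattern in FAILURE_PATTERNS):
            return number, line.strip()
    return None


# Excerpt of the log above, in its original order.
log = """\
Freed reference hash and begin building reads hash table...
/scratch/moldach/bin/nb_distribution/run_novoBreak.sh: line 26: 15470 Killed                  $novobreak -i $tumor_bam -c $normal_bam -r $ref -o kmer.stat
[M::bam2fq_mainloop] processed 0 reads
(standard_in) 1: parse error
No such file or directory at /scratch/moldach/bin/nb_distribution/infer_bp_v4.pl line 13.
""".splitlines()

hit = first_failure(log)
print(hit)
```

On this excerpt the first match is the "Killed" line, which suggests the later errors are downstream symptoms rather than independent problems.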
