
Documentation for the ANNOVAR software

Home Page: http://annovar.openbioinformatics.org

doc-annovar's Introduction

ANNOVAR Documentation

ANNOVAR is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes (including human genome builds hg18, hg19 and hg38, as well as mouse, worm, fly, yeast and many others).

This is the GitHub repository for the documentation of the ANNOVAR software, described in the paper listed below. Any edit to this repository is reflected instantly on the ANNOVAR home page at http://annovar.openbioinformatics.org.

If you like this repository, please click the "Star" button at the top of this page to show appreciation to the repository maintainer. If you want to receive notifications on changes to this repository, please click the "Watch" button at the top of this page.

Reference

Wang K, Li M, Hakonarson H. ANNOVAR: Functional annotation of genetic variants from next-generation sequencing data. Nucleic Acids Research, 38:e164, 2010.


doc-annovar's Issues

-xref annotation issues with new version

Hello,

I am trying to run table_annovar.pl with the new -xref and -polish options and am running into a problem. Update: it appears to be a -polish problem; without that switch it runs without issue.

table_annovar.pl avinput.temp /ghi/butlerr/opt/annovar/humandb/ -buildver hg19 -out rslist -remove -protocol refGene,avsnp147,dbnsfp33a,exac03,gnomad_genome,intervar_20170202 -operation gx,f,f,f,f,f -nastring "-" -polish -xref /ghi/butlerr/opt/annovar/example/gene_fullxref.txt
-----------------------------------------------------------------
NOTICE: Processing operation=gx protocol=refGene

NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg19 -dbtype refGene -outfile rslist.refGene -exonsort avinput.temp /ghi/butlerr/opt/annovar/humandb/>
NOTICE: Output files were written to rslist.refGene.variant_function, rslist.refGene.exonic_variant_function
NOTICE: Reading gene annotation from /ghi/butlerr/opt/annovar/humandb/hg19_refGene.txt ... Done with 63481 transcripts (including 15216 without coding sequence annotation) for 27720 unique genes
NOTICE: Processing next batch with 377 unique variants in 377 input lines
NOTICE: Reading FASTA sequences from /ghi/butlerr/opt/annovar/humandb/hg19_refGeneMrna.fa ... Done with 22 sequences
WARNING: A total of 405 sequences will be ignored due to lack of correct ORF annotation

NOTICE: Running with system command <coding_change.pl rslist.refGene.exonic_variant_function.orig /ghi/butlerr/opt/annovar/humandb//hg19_refGene.txt /ghi/butlerr/opt/annovar/humandb//hg19_refGeneMrna.fa -alltranscript -out rslist.refGene.fa -newevf rslist.refGene.exonic_variant_function>
Error: invalid record found in exonic_variant_function file (exonic format error): <line2       frameshift substitution CFTR:NM_000492:exon1:c.-13_10G  7       117120135 117120158       GCGCCCGAGAGACCATGCAGAGGT        G       rs397508136> at /ghi/butlerr/opt/annovar/coding_change.pl line 51, <EVF> line 2.
Error running system command: <coding_change.pl rslist.refGene.exonic_variant_function.orig /ghi/butlerr/opt/annovar/humandb//hg19_refGene.txt /ghi/butlerr/opt/annovar/humandb//hg19_refGeneMrna.fa -alltranscript -out rslist.refGene.fa -newevf rslist.refGene.exonic_variant_function>

It runs with other avinput files, just not this one (the second line seems to be the issue). The body of the file was generated from avsnp147 lines (below):

3 15676984 15676990 GCGGCTG TCC rs80338684
7 117120135 117120158 GCGCCCGAGAGACCATGCAGAGGT G rs397508136
7 117120136 117120158 CGCCCGAGAGACCATGCAGAGGT - rs397508136
7 117120149 117120149 A G rs397508328
7 117120159 117120159 C A rs397508173
7 117120159 117120159 C T rs397508173
7 117120191 117120192 CT C rs397508742
7 117120192 117120192 T - rs397508742
7 117120202 117120202 G T rs397508746
7 117144332 117144332 G A rs397508796
7 117144332 117144332 G C rs397508796
7 117144332 117144332 G T rs397508796
7 117144368 117144368 C T rs397508168
7 117144390 117144390 C A rs151020603
7 117144390 117144390 C T rs151020603
7 117144418 117144418 G A rs397508243
7 117144418 117144418 G C rs397508243
7 117144418 117144418 G T rs397508243
7 117149087 117149087 G A rs397508249
7 117149089 117149089 G A rs397508256
7 117149093 117149093 G A rs397508279
7 117149094 117149094 G A rs121909025
7 117149097 117149097 - A rs397508294
7 117149097 117149097 T TA rs397508294
7 117149101 117149101 G A rs77284892
7 117149101 117149101 G T rs77284892
7 117149123 117149123 C T rs368505753
7 117149146 117149146 C T rs121908749
7 117149150 117149150 G GT rs397508360
7 117149150 117149150 - T rs397508360
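
The record named in the error (line 2) carries both a multi-base reference and a non-empty alternate allele, while line 3 describes the same deletion with `-` as the alternate. The following is a minimal Python sketch, assuming the avinput layout shown above (chromosome, start, end, ref, alt, then extra columns), that flags such block-substitution lines so they can be inspected before running with -polish. Whether these lines are the actual trigger of the coding_change.pl error is an assumption, and `classify_avinput_line` is a hypothetical helper, not part of ANNOVAR.

```python
def classify_avinput_line(line):
    """Classify one avinput line by the shape of its ref/alt alleles."""
    fields = line.split()
    ref, alt = fields[3], fields[4]
    if ref == "-":
        return "insertion"
    if alt == "-":
        return "deletion"
    if len(ref) == 1 and len(alt) == 1:
        return "snv"
    # Both alleles are present and at least one is longer than one base
    return "block_substitution"


lines = [
    "7 117120135 117120158 GCGCCCGAGAGACCATGCAGAGGT G rs397508136",
    "7 117120136 117120158 CGCCCGAGAGACCATGCAGAGGT - rs397508136",
    "7 117120149 117120149 A G rs397508328",
    "7 117149097 117149097 - A rs397508294",
]
for text in lines:
    print(classify_avinput_line(text), text.split()[5])
```

Running the classifier over the whole avinput file and removing (or re-encoding) the `block_substitution` lines is one way to test whether they are what -polish trips over.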

Database warning

When I use ANNOVAR, I get this warning:

WARNING: Your index file hg19_gnomad_exome.txt.idx is out of date and will not be used. ANNOVAR can still generate correct results without index file.

What should I do?

Unable to download regsnpintron

Hi,
I am unable to download the regsnpintron database. The command & the corresponding error are shown below. I was able to download other databases without any problems. Is the regsnpintron database still available?
Thanks,
Mike

./annotate_variation.pl -downdb -buildver hg19 -webfrom annovar regsnpintron humandb/
NOTICE: Web-based checking to see whether ANNOVAR new version is available ... Done
NOTICE: Downloading annotation database http://www.openbioinformatics.org/annovar/download/hg19_regsnpintron.txt.gz ... Failed
NOTICE: Downloading annotation database http://www.openbioinformatics.org/annovar/download/hg19_regsnpintron.txt.idx.gz ... Failed
WARNING: Some files cannot be downloaded, including http://www.openbioinformatics.org/annovar/download/hg19_regsnpintron.txt.idx.gz, http://www.openbioinformatics.org/annovar/download/hg19_regsnpintron.txt.gz

Annovar & Bioconda

Dear Mr. Wang,

Would it be possible to reach some kind of licensing agreement that allows ANNOVAR to be made available as a package in the conda package manager?

Conda is being widely used throughout the scientific community, and adding your excellent software to this system would be of great advantage to everyone. It would also greatly improve the visibility and ease of use of your software.

Thank you for your time
Matthias De Smet and all of the bioconda community

PS: check us out at https://bioconda.github.io

minqueryfrac is not compatible with geneanno

Hello,

Could you let me know how to set --minqueryfrac when using -geneanno?
Shown below are the two commands that I tested.

Thank you in advance,

Run successfully
./annotate_variation.pl -geneanno -buildver hg19 -dbtype knownGene -outfile myanno.knownGene -exonsort ./avinput.txt ./annovar/humandb/

Failed
./annotate_variation.pl -geneanno -buildver hg19 -dbtype knownGene -outfile myanno.knownGene --minqueryfrac 0.1 -exonsort ./avinput.txt ./annovar/humandb/
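
For context, a query-fraction threshold is usually understood as the share of the query interval that a database region must cover, which is a region-overlap notion rather than a gene-annotation one. The Python sketch below only illustrates that fraction. The function name and the 1-based inclusive coordinate convention are assumptions, and this is not ANNOVAR's code.

```python
def query_overlap_fraction(q_start, q_end, db_start, db_end):
    """Fraction of the query interval [q_start, q_end] covered by [db_start, db_end].

    Coordinates are assumed to be 1-based and inclusive.
    """
    overlap = min(q_end, db_end) - max(q_start, db_start) + 1
    if overlap <= 0:
        return 0.0
    return overlap / (q_end - q_start + 1)


# A 100 bp query whose second half lies inside a database region
print(query_overlap_fraction(100, 199, 150, 300))
```

With a threshold of 0.1, as in the failing command, a query would count as a match only when this fraction is at least 0.1.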

How to handle SV annotation of big DEL or DUP?

When I try to annotate an SV VCF file, large DUP, DEL or INV events are either not annotated or only the first matching gene in the region is annotated. I ran the command shown below:

perl ${path}/annovar/table_annovar.pl file.vcf ${path}/annovar/humandb/ -buildver hg19 --regionanno -out final_annotation -remove -protocol refGene,clinvar_20170905,exac03,ALL.sites.2015_0_mod8,esp6500siv2_all,avsnp150 -operation g,f,f,f,f,f -nastring . -vcfinput

When a region is large, the VCF does not spell out the altered sequence in the ALT column; instead, the column is filled with DEL, DUP or INV. Example of the VCF file:

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
chr1 1044050 - TCACCACAGCCACCATGTC TC 65 PASS END=; GT:GQ:PR:SR 0/1:65:7,0:13,4
chr1 1431164 - G DEL 56 PASS END=1469606 GT:GQ:PR:SR 0/1:56:73,4:62,17

Thinking that this might be the problem, I ran the command below, using as input a file that contains the start and end positions of each region so that ANNOVAR knows the length of the region. The problem appears again: large DUP, DEL or INV events are not annotated.

perl ${path}/annovar/table_annovar.pl file.avinput ${path}/annovar/humandb/ -buildver hg19 -out test -remove -protocol refGene,clinvar_20170905,exac03,ALL.sites.2015_0_mod8,esp6500siv2_all,avsnp150 -operation g,f,f,f,f,f -nastring .

Example of the avinput file:
1 537588 537647 TTCTCTCCATCCCCCCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCATCCC T
1 1431164 1469606 G <DEL>

I know that one solution might be to fill in the ALT column of the VCF file (since we know the length of each region), but this column is empty for a reason: because these regions are very large (some have a size of 217,860,463), doing so would create a huge file that is computationally expensive to process.

I would like to know whether there is a way to handle annotation of large SVs.

Thank you.
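
The conversion described above (taking the start from POS and the end from the END= entry in INFO) can be sketched in Python as follows. This is only an illustration under the assumptions that the record is whitespace-delimited with the column order of the header shown above and that INFO is the eighth column. `sv_record_to_interval` is a hypothetical helper, and how ANNOVAR treats the resulting interval is not addressed here.

```python
def sv_record_to_interval(vcf_line):
    """Return (chrom, start, end, svtype) for a VCF record with END= in INFO.

    Returns None when END is missing or empty, as in the first example record.
    """
    fields = vcf_line.split()
    chrom, pos, alt, info = fields[0], int(fields[1]), fields[4], fields[7]
    end = None
    for item in info.split(";"):
        if item.startswith("END=") and item[4:].isdigit():
            end = int(item[4:])
    if end is None:
        return None
    if chrom.startswith("chr"):
        chrom = chrom[3:]  # the avinput example above omits the "chr" prefix
    return chrom, pos, end, alt.strip("<>")


record = "chr1 1431164 - G DEL 56 PASS END=1469606 GT:GQ:PR:SR 0/1:56:73,4:62,17"
print(sv_record_to_interval(record))
```

The interval keeps the file small because only the two coordinates are stored, not the deleted or duplicated sequence.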

downdb argument fails

annotate_variation.pl --downdb gnomad_genome -buildver hg38 humandb/
NOTICE: Web-based checking to see whether ANNOVAR new version is available ... Done
NOTICE: Downloading annotation database http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/gnomad_genome.txt.gz ... Failed
WARNING: Some files cannot be downloaded, including http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/gnomad_genome.txt.gz

The same error occurs for other databases too.
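
Comparing the download logs on this page: commands that pass `-webfrom annovar` request files from www.openbioinformatics.org with the build version prefixed to the file name, while the command above omits that option and its log shows a UCSC goldenPath URL. The Python sketch below just reproduces those two URL patterns as they appear in the logs; it is read off the log output rather than the ANNOVAR source, and `downdb_url` is a hypothetical name.

```python
def downdb_url(dbname, buildver, webfrom=None):
    """Rebuild the download URL seen in the logs for a database name and build."""
    if webfrom == "annovar":
        return (
            "http://www.openbioinformatics.org/annovar/download/"
            f"{buildver}_{dbname}.txt.gz"
        )
    # Pattern seen in the log when -webfrom is not given
    return f"http://hgdownload.cse.ucsc.edu/goldenPath/{buildver}/database/{dbname}.txt.gz"


print(downdb_url("gnomad_genome", "hg38"))
print(downdb_url("gnomad_genome", "hg38", webfrom="annovar"))
```

If gnomad_genome is hosted on the ANNOVAR server rather than at UCSC, adding `-webfrom annovar` to the command would point the request at the second URL.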

convert vcf4 to avinput

Hi developers,
I am confused by how a VCF (vcf4) file is converted to avinput format.
For instance, in the output file I see a conversion like this:
Original record:
chr1 9671257 . CTTTT C,CT,CTT,CTTT,CTTTTT
Conversion:
chr1 9671259 9671261 TTT -
chr1 9671260 9671261 TT -
chr1 9671261 9671261 T -
chr1 9671261 9671261 - T
The end positions are the same.

But some records are converted like this:
Original record:
chr1 36203712 . AAAATATATATAT A,AAT,AATAT,AATATAT
Conversion:
chr1 36203713 36203724 AAATATATATAT -
chr1 36203713 36203722 AAATATATAT -
chr1 36203713 36203720 AAATATAT -
chr1 36203713 36203718 AAATAT -
However, the start positions are the same in this conversion.

Dr. Wang mentions in the documentation that ANNOVAR left-aligns both the input VCF and the database. I also compared the results with the GATK4 LeftAlignAndTrimVariants tool, which converts the two records like this:

chr1 9671257 . CTTTT C
chr1 9671257 . CTTT C
chr1 9671257 . CTT C
chr1 9671257 . CT C
chr1 9671257 . C CT

chr1 36203712 . AAAATATATATAT A
chr1 36203712 . AAAATATATAT A
chr1 36203712 . AAAATATAT A
chr1 36203712 . AAAATAT A

Apart from how the ref and alt bases are shown, both GATK conversions keep the same POS.

Do these conversions make sense? Could you please give me a brief description of the algorithm used to convert vcf4 to avinput? (I am not familiar with Perl, so it is hard for me to read the source code.)

Best regards!
xinchang zheng
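
The coordinates listed in the two conversions above are consistent with a simple trimming rule: when one allele is the head of the other, strip that head; otherwise remove the shared suffix first and then the shared prefix. The Python sketch below implements that rule and reproduces the listed numbers. It is inferred from the output shown in this post, not taken from convert2annovar.pl, so it should be read as a guess at the behaviour rather than a description of ANNOVAR's algorithm.

```python
def vcf_allele_to_avinput(pos, ref, alt):
    """Return (start, end, ref, alt) in avinput style for one REF/ALT pair."""
    if len(ref) > len(alt) and ref.startswith(alt):
        # ALT is the head of REF: pure deletion of the remaining bases
        return pos + len(alt), pos + len(ref) - 1, ref[len(alt):], "-"
    if len(alt) > len(ref) and alt.startswith(ref):
        # REF is the head of ALT: pure insertion after the last REF base
        last = pos + len(ref) - 1
        return last, last, "-", alt[len(ref):]
    # Otherwise trim the shared suffix first, then the shared prefix
    while ref and alt and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    while ref and alt and ref[0] == alt[0]:
        ref, alt, pos = ref[1:], alt[1:], pos + 1
    if not ref:
        # Nothing left of REF: insertion after the base preceding pos
        return pos - 1, pos - 1, "-", alt
    return pos, pos + len(ref) - 1, ref, alt or "-"


for alt in ["CT", "CTT", "CTTT", "CTTTTT"]:
    print("chr1", *vcf_allele_to_avinput(9671257, "CTTTT", alt))
for alt in ["A", "AAT", "AATAT", "AATATAT"]:
    print("chr1", *vcf_allele_to_avinput(36203712, "AAAATATATATAT", alt))
```

Under this rule the first record keeps a common end because every ALT is a head of REF, while in the second record only `A` is a head of REF, so the other alleles fall through to suffix-first trimming and share a start.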

Creating an avinput from avsnp file error

I am having an issue with convert2annovar.pl from an rsID list. Using the following:

convert2annovar.pl -format rsid rslist.txt -dbsnpfile ${annovar}humandb/hg19_avsnp147.txt > rslist.avinput

I get:
NOTICE: Scanning dbSNP file /opt/annovar/humandb/hg19_avsnp147.txt...
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 1.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 1.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 1.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 1.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 2.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 2.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 2.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 2.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 3.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 3.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 3.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 3.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 4.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 4.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 4.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 4.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 5.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 5.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 5.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 5.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 6.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 6.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 6.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 6.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 7.

And so forth for all lines (I stopped checking after more than a million lines). Input file looks like:
rs150829393
rs112039851
rs1800562
rs200401432
rs118161496
rs1799945

It does the same with or without the rs prefix. Is there a different dbSNP file to use?
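
As a possible workaround while the -dbsnpfile question is open, the avsnp lines quoted elsewhere on this page already have the avinput column order (chromosome, start, end, ref, alt, rsID), so the lookup can be sketched directly in Python. This assumes the file is whitespace-delimited with the rsID in the sixth column; `rsids_to_avinput` is a hypothetical helper and not a replacement for convert2annovar.pl.

```python
def rsids_to_avinput(rsids, avsnp_lines):
    """Yield tab-joined avinput lines for every avsnp record whose rsID is wanted."""
    wanted = set(rsids)
    for line in avsnp_lines:
        fields = line.split()
        if len(fields) >= 6 and fields[5] in wanted:
            yield "\t".join(fields[:6])


avsnp_sample = [
    "7 117120135 117120158 GCGCCCGAGAGACCATGCAGAGGT G rs397508136",
    "7 117120136 117120158 CGCCCGAGAGACCATGCAGAGGT - rs397508136",
    "7 117120149 117120149 A G rs397508328",
]
for out in rsids_to_avinput(["rs397508136"], avsnp_sample):
    print(out)
```

For a real run, an open file handle can be passed as `avsnp_lines` so the database is streamed line by line. Note that one rsID can match several records, as the two rs397508136 lines show.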

Not updated exonic variant function in documentation

There are missing values in the documentation of annovar for the exonic variant function: http://annovar.openbioinformatics.org/en/latest/user-guide/gene/

The values that I found by looking through annotate_variation.pl (I am not sure whether the list is complete):

frameshift insertion
frameshift deletion
frameshift substitution
stopgain
stoploss
nonframeshift insertion
nonframeshift deletion
nonframeshift substitution
nonsynonymous SNV
synonymous SNV
unknown

issue annotating GATK Haplotype Caller VCFs with table_annovar.pl

Hello,

I'm attempting to annotate a vcf produced by GATK's haplotype caller using the following command:
##GATKCommandLine=<ID=HaplotypeCaller,CommandLine="HaplotypeCaller --contamination-fraction-to-filter 0.0 --output K000049_1_lane_dupsFlagged_sm_tagged.vcf.gz --intervals /projects/trans_scratch/pedigree_calling/iTARGET_quad/vcfs/cromwell-executions/run_haplotypecaller_on_directory/509c4a23-7811-4d20-abc9-ce5bd7fe6d45/call-HaplotypeCallerGvcf_GATK4/shard-3/haplotypecaller.HaplotypeCallerGvcf_GATK4/3b9dfdc2-7cd5-4212-830c-d89263fe84fa/call-HaplotypeCaller/shard-97/inputs/1764290932/0097-scattered.intervals --input /projects/trans_scratch/pedigree_calling/iTARGET_quad/vcfs/cromwell-executions/run_haplotypecaller_on_directory/509c4a23-7811-4d20-abc9-ce5bd7fe6d45/call-HaplotypeCallerGvcf_GATK4/shard-3/haplotypecaller.HaplotypeCallerGvcf_GATK4/3b9dfdc2-7cd5-4212-830c-d89263fe84fa/call-HaplotypeCaller/shard-97/inputs/-1081452564/K000049_1_lane_dupsFlagged_sm_tagged.bam --reference /projects/trans_scratch/pedigree_calling/iTARGET_quad/vcfs/cromwell-executions/run_haplotypecaller_on_directory/509c4a23-7811-4d20-abc9-ce5bd7fe6d45/call-HaplotypeCallerGvcf_GATK4/shard-3/haplotypecaller.HaplotypeCallerGvcf_GATK4/3b9dfdc2-7cd5-4212-830c-d89263fe84fa/call-HaplotypeCaller/shard-97/inputs/-533456238/GRCh37-lite.fa --emit-ref-confidence NONE --gvcf-gq-bands 1 --gvcf-gq-bands 2 --gvcf-gq-bands 3 --gvcf-gq-bands 4 --gvcf-gq-bands 5 --gvcf-gq-bands 6 --gvcf-gq-bands 7 --gvcf-gq-bands 8 --gvcf-gq-bands 9 --gvcf-gq-bands 10 --gvcf-gq-bands 11 --gvcf-gq-bands 12 --gvcf-gq-bands 13 --gvcf-gq-bands 14 --gvcf-gq-bands 15 --gvcf-gq-bands 16 --gvcf-gq-bands 17 --gvcf-gq-bands 18 --gvcf-gq-bands 19 --gvcf-gq-bands 20 --gvcf-gq-bands 21 --gvcf-gq-bands 22 --gvcf-gq-bands 23 --gvcf-gq-bands 24 --gvcf-gq-bands 25 --gvcf-gq-bands 26 --gvcf-gq-bands 27 --gvcf-gq-bands 28 --gvcf-gq-bands 29 --gvcf-gq-bands 30 --gvcf-gq-bands 31 --gvcf-gq-bands 32 --gvcf-gq-bands 33 --gvcf-gq-bands 34 --gvcf-gq-bands 35 --gvcf-gq-bands 36 --gvcf-gq-bands 37 --gvcf-gq-bands 38 --gvcf-gq-bands 39 --gvcf-gq-bands 40 
--gvcf-gq-bands 41 --gvcf-gq-bands 42 --gvcf-gq-bands 43 --gvcf-gq-bands 44 --gvcf-gq-bands 45 --gvcf-gq-bands 46 --gvcf-gq-bands 47 --gvcf-gq-bands 48 --gvcf-gq-bands 49 --gvcf-gq-bands 50 --gvcf-gq-bands 51 --gvcf-gq-bands 52 --gvcf-gq-bands 53 --gvcf-gq-bands 54 --gvcf-gq-bands 55 --gvcf-gq-bands 56 --gvcf-gq-bands 57 --gvcf-gq-bands 58 --gvcf-gq-bands 59 --gvcf-gq-bands 60 --gvcf-gq-bands 70 --gvcf-gq-bands 80 --gvcf-gq-bands 90 --gvcf-gq-bands 99 --indel-size-to-eliminate-in-ref-model 10 --use-alleles-trigger false --disable-optimizations false --just-determine-active-regions false --dont-genotype false --max-mnp-distance 0 --dont-trim-active-regions false --max-disc-ar-extension 25 --max-gga-ar-extension 300 --padding-around-indels 150 --padding-around-snps 20 --kmer-size 10 --kmer-size 25 --dont-increase-kmer-sizes-for-cycles false --allow-non-unique-kmers-in-ref false --num-pruning-samples 1 --recover-dangling-heads false --do-not-recover-dangling-branches false --min-dangling-branch-length 4 --consensus false --max-num-haplotypes-in-population 128 --error-correct-kmers false --min-pruning 2 --debug-graph-transformations false --kmer-length-for-read-error-correction 25 --min-observations-for-kmer-to-be-solid 20 --likelihood-calculation-engine PairHMM --base-quality-score-threshold 18 --pair-hmm-gap-continuation-penalty 10 --pair-hmm-implementation FASTEST_AVAILABLE --pcr-indel-model CONSERVATIVE --phred-scaled-global-read-mismapping-rate 45 --native-pair-hmm-threads 4 --native-pair-hmm-use-double-precision false --debug false --use-filtered-reads-for-annotations false --bam-writer-type CALLED_HAPLOTYPES --dont-use-soft-clipped-bases false --capture-assembly-failure-bam false --error-correct-reads false --do-not-run-physical-phasing false --min-base-quality-score 10 --smith-waterman JAVA --use-new-qual-calculator false --annotate-with-num-discovered-alleles false --heterozygosity 0.001 --indel-heterozygosity 1.25E-4 --heterozygosity-stdev 0.01 
--standard-min-confidence-threshold-for-calling 10.0 --max-alternate-alleles 6 --max-genotype-count 1024 --sample-ploidy 2 --num-reference-samples-if-no-call 0 --genotyping-mode DISCOVERY --genotype-filtered-alleles false --output-mode EMIT_VARIANTS_ONLY --all-site-pls false --min-assembly-region-size 50 --max-assembly-region-size 300 --assembly-region-padding 100 --max-reads-per-alignment-start 50 --active-probability-threshold 0.002 --max-prob-propagation-distance 50 --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false --minimum-mapping-quality 20 --disable-tool-default-annotations false --enable-all-annotations false",Version=4.0.10.0,Date="March 5, 2019 4:08:02 PM PST">

Some of the resulting VCF records have their chromosome removed when run through ANNOVAR. As an example:
Before annovar
1 9407759 . AC . 86.73 . AN=2;DP=34;MQ=60.0 GT:AD:DP 0/0:34:34
After annovar
9407759 . AC . 86.73 . AN=2;DP=34;MQ=60.00 GT:AD:DP;ANNOVAR_DATE=2018-04-16;cosmic70=.;Func.refGene=intronic;Gene.refGene=SPSB1;GeneDetail.refGene=.;ExonicFunc.refGene=.;AAChange.refGene=.;esp6500siv2_all=.;1000g2015aug_all=.;avsnp147=.;SIFT_score=.;SIFT_converted_rankscore=.;SIFT_pred=.;Polyphen2_HDIV_score=.;Polyphen2_HDIV_rankscore=.;Polyphen2_HDIV_pred=.;Polyphen2_HVAR_score=.;Polyphen2_HVAR_rankscore=.;Polyphen2_HVAR_pred=.;LRT_score=.;LRT_converted_rankscore=.;LRT_pred=.;MutationTaster_score=.;MutationTaster_converted_rankscore=.;MutationTaster_pred=.;MutationAssessor_score=.;MutationAssessor_score_rankscore=.;MutationAssessor_pred=.;FATHMM_score=.;FATHMM_converted_rankscore=.;FATHMM_pred=.;PROVEAN_score=.;PROVEAN_converted_rankscore=.;PROVEAN_pred=.;VEST3_score=.;VEST3_rankscore=.;MetaSVM_score=.;MetaSVM_rankscore=.;MetaSVM_pred=.;MetaLR_score=.;MetaLR_rankscore=.;MetaLR_pred=.;M-CAP_score=.;M-CAP_rankscore=.;M-CAP_pred=.;CADD_raw=.;CADD_raw_rankscore=.;CADD_phred=.;DANN_score=.;DANN_rankscore=.;fathmm-MKL_coding_score=.;fathmm-MKL_coding_rankscore=.;fathmm-MKL_coding_pred=.;Eigen_coding_or_noncoding=.;Eigen-raw=.;Eigen-PC-raw=.;GenoCanyon_score=.;GenoCanyon_score_rankscore=.;integrated_fitCons_score=.;integrated_fitCons_score_rankscore=.;integrated_confidence_value=.;GERP++_RS=.;GERP++_RS_rankscore=.;phyloP100way_vertebrate=.;phyloP100way_vertebrate_rankscore=.;phyloP20way_mammalian=.;phyloP20way_mammalian_rankscore=.;phastCons100way_vertebrate=.;phastCons100way_vertebrate_rankscore=.;phastCons20way_mammalian=.;phastCons20way_mammalian_rankscore=.;SiPhy_29way_logOdds=.;SiPhy_29way_logOdds_rankscore=.;Interpro_domain=.;GTEx_V6_gene=.;GTEx_V6_tissue=.;CLINSIG=.;CLNDBN=.;CLNACC=.;CLNDSDB=.;CLNDSDBID=.;ExAC_ALL=.;ExAC_AFR=.;ExAC_AMR=.;ExAC_EAS=.;ExAC_FIN=.;ExAC_NFE=.;ExAC_OTH=.;ExAC_SAS=.;dbscSNV_ADA_SCORE=.;dbscSNV_RF_SCORE=.;Interpro_domain=.;rmsk=.;Func.ensGene=intronic;Gene.ensGene=ENSG00000171621;GeneDetail.ensGene=.;ExonicFunc.ensGene=.;AAChange.ensGene=.;
Func.knownGene=intronic;Gene.knownGene=SPSB1;GeneDetail.knownGene=.;ExonicFunc.knownGene=.;AAChange.knownGene=.;ALLELE_END 0/0:34:34

The annovar command used to generate this file is as follows
perl /projects/tvira_prj/tools/annovar/table_annovar.pl error_testing_vcf.vcf /projects/tvira_prj/tools/annovar/humandb/ -buildver hg19 -vcfinput -out /projects/trans_scratch/pedigree_calling/iTARGET_quad/pedcall/K000049/K000049_error_records.vcf_Annovar -remove -protocol cosmic70,refGene,esp6500siv2_all,1000g2015aug_all,avsnp147,dbnsfp33a,clinvar_20170905,exac03,dbscsnv11,dbnsfp31a_interpro,rmsk,ensGene,knownGene -operation f,g,f,f,f,f,f,f,f,f,f,g,g

I've attached a vcf which when annotated with the above command reproduces the error. Please advise if there is an option I am missing to handle the vcf records formatted by gatk.

K000049_error_records.txt
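
The example record above has `.` in the ALT column, i.e. it is a non-variant site. A minimal Python sketch for separating such records before annotation is given below, on the assumption (not confirmed in this thread) that the missing chromosome is tied to records without an alternate allele. `split_nonvariant_records` is a hypothetical helper.

```python
def split_nonvariant_records(vcf_lines):
    """Split VCF lines into (kept, skipped); skipped records have ALT equal to '.'."""
    kept, skipped = [], []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)  # header lines always stay
            continue
        fields = line.split()
        if len(fields) > 4 and fields[4] == ".":
            skipped.append(line)
        else:
            kept.append(line)
    return kept, skipped


sample = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE",
    "1 9407759 . AC . 86.73 . AN=2;DP=34;MQ=60.0 GT:AD:DP 0/0:34:34",
    "1 9407800 . A G 99.0 . AN=2;DP=30;MQ=60.0 GT:AD:DP 0/1:15,15:30",
]
kept, skipped = split_nonvariant_records(sample)
print(len(kept), len(skipped))
```

The second data line in `sample` is made up for illustration; only the first comes from this report.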

word Error: (obselete!) -> obsolete

One word is not correct:

1000 Genomes Project (2012 April) annotations (obselete!) -> obsolete

url: https://github.com/WGLab/doc-ANNOVAR/blob/master/docs/user-guide/filter.md

Died at /work/Software/Download/Variant_Package/annovar/coding_change.pl line 553, <FASTA> line 149454.

Hi Developer!
I tested the latest ANNOVAR and got an error.

The log is shown below:

$ table_annovar.pl $Sample.combined.vcf $Anno_db --vcfinput -buildver hg38 -out $Sample --checkfile --otherinfo -remove -polish -protocol cytoBand,refGeneWithVer,ensGene,knownGene -operation r,g,g,g -nastring .

NOTICE: Running with system command <convert2annovar.pl -includeinfo -allsample -withfreq -format vcf4 N0202G2.combined.vcf > N0202G2.avinput>
NOTICE: Finished reading 365921 lines from VCF file
NOTICE: A total of 362523 locus in VCF file passed QC threshold, representing 346476 SNPs (239967 transitions and 106509 transversions) and 16547 indels/substitutions
NOTICE: Finished writing allele frequencies based on 346476 SNP genotypes (239967 transitions and 106509 transversions) and 16547 indels/substitutions for 1 samples

NOTICE: Running with system command </work/Software/Download/Variant_Package/annovar/table_annovar.pl N0202G2.avinput /work/Database/Annovar_db/hg38_20180130 -buildver hg38 -outfile N0202G2 --checkfile --otherinfo -remove -polish -protocol cytoBand,refGeneWithVer,ensGene,knownGene -operation r,g,g,g -nastring . -otherinfo>

NOTICE: Processing operation=r protocol=cytoBand

NOTICE: Running with system command <annotate_variation.pl -regionanno -dbtype cytoBand -buildver hg38 -outfile N0202G2 N0202G2.avinput /work/Database/Annovar_db/hg38_20180130>
NOTICE: Output file is written to N0202G2.hg38_cytoBand
NOTICE: Reading annotation database /work/Database/Annovar_db/hg38_20180130/hg38_cytoBand.txt ... Done with 1293 regions
NOTICE: Finished region-based annotation on 363002 genetic variants
NOTICE: Variants with invalid input format were written to N0202G2.invalid_input

NOTICE: Processing operation=g protocol=refGeneWithVer

NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg38 -dbtype refGeneWithVer -outfile N0202G2.refGeneWithVer -exonsort N0202G2.avinput /work/Database/Annovar_db/hg38_20180130>
NOTICE: Output files were written to N0202G2.refGeneWithVer.variant_function, N0202G2.refGeneWithVer.exonic_variant_function
NOTICE: Reading gene annotation from /work/Database/Annovar_db/hg38_20180130/hg38_refGeneWithVer.txt ... Done with 74727 transcripts (including 18443 without coding sequence annotation) for 28059 unique genes
NOTICE: Processing next batch with 363002 unique variants in 363002 input lines
NOTICE: Reading FASTA sequences from /work/Database/Annovar_db/hg38_20180130/hg38_refGeneWithVerMrna.fa ... Done with 21138 sequences
WARNING: A total of 526 sequences will be ignored due to lack of correct ORF annotation
NOTICE: Variants with invalid input format were written to N0202G2.refGeneWithVer.invalid_input

NOTICE: Running with system command <coding_change.pl N0202G2.refGeneWithVer.exonic_variant_function.orig /work/Database/Annovar_db/hg38_20180130/hg38_refGeneWithVer.txt /work/Database/Annovar_db/hg38_20180130/hg38_refGeneWithVerMrna.fa -alltranscript -out N0202G2.refGeneWithVer.fa -newevf N0202G2.refGeneWithVer.exonic_variant_function>
Died at /work/Software/Download/Variant_Package/annovar/coding_change.pl line 553, <FASTA> line 149454.
Error running system command: <coding_change.pl N0202G2.refGeneWithVer.exonic_variant_function.orig /work/Database/Annovar_db/hg38_20180130/hg38_refGeneWithVer.txt /work/Database/Annovar_db/hg38_20180130/hg38_refGeneWithVerMrna.fa -alltranscript -out N0202G2.refGeneWithVer.fa -newevf N0202G2.refGeneWithVer.exonic_variant_function>
Error running system command: </work/Software/Download/Variant_Package/annovar/table_annovar.pl N0202G2.avinput /work/Database/Annovar_db/hg38_20180130 -buildver hg38 -outfile N0202G2 --checkfile --otherinfo -remove -polish -protocol cytoBand,refGeneWithVer,ensGene,knownGene -operation r,g,g,g -nastring . -otherinfo>

I then checked the temporary file $Sample.refGeneWithVer.fa and found:

$ tail $Sample.refGeneWithVer.fa
LNLGIFASRLYYHWCKPQQKGLRLLCGSQVPVEVMGFPEFADCWENFVDHEKPLSFNPYKMLEELDKNSRAIKRRLERIKQS*

line343937 NM_004711.4 WILDTYPE
MEGGAYGAGKAGGAFDPYTLVRQPHTILRVVSWLFSIVVFGSIVNEGYLNSASEGEEFCIYNRNPNACSYGVAVGVLAFLTCLLYLALDVYFPQISSVKD
RKKAVLSDIGVSAFWAFLWFVGFCYLANQWQVSKPKDNPLNEGTDAARAAIAFSFFSIFTWAGQAVLAFQRYQIGADSALFSQDYMDPSQDSSMPYAPYV
EPTGPDPAGMGGTYQQPANTFDTEPQGYQSQGY*
line343937 NM_004711.4 c.605_606insCAA p.P202_T203insN protein-altering (position 202-203 has insertion N)
MEGGAYGAGKAGGAFDPYTLVRQPHTILRVVSWLFSIVVFGSIVNEGYLNSASEGEEFCIYNRNPNACSYGVAVGVLAFLTCLLYLALDVYFPQISSVKD
RKKAVLSDIGVSAFWAFLWFVGFCYLANQWQVSKPKDNPLNEGTDAARAAIAFSFFSIFTWAGQAVLAFQRYQIGADSALFSQDYMDPSQDSSMPYAPYV
EPNTGPDPAGMGGTYQQPANTFDTEPQGYQSQGY*
WARNING: invalid triplets found in DNA sequence to be translated: in

Then I looked up line343937 in the $Sample.refGeneWithVer.exonic_variant_function.orig file, but did not find any problem.

$ grep "NM_004711" $Sample.refGeneWithVer.exonic_variant_function.orig
line343937 nonframeshift insertion SYNGR1:NM_004711.4:exon4:c.605_606insCAA:p.P202delinsPN chr22 39381817 39381817 - CAA 1 9966.73 223 chr22 39381817 rs149306472 C CCAA 9966.73 PASS AC=2;AF=1.00;AN=2;DB;DP=230;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.000;MQ=60.00;QD=44.69;SOR=1.179;set=variant2 GT:AD:DP:GQ:PL 1/1:0,223:223:99:10004,673,0

Downloading annotation database hg38_dbnsfp33a.txt.gz ... Failed

perl annotate_variation.pl -buildver hg38 -downdb -webfrom annovar dbnsfp33a humandb/
NOTICE: Web-based checking to see whether ANNOVAR new version is available ... Done
NOTICE: Downloading annotation database http://www.openbioinformatics.org/annovar/download/hg38_dbnsfp33a.txt.gz ... Failed
NOTICE: Downloading annotation database http://www.openbioinformatics.org/annovar/download/hg38_dbnsfp33a.txt.idx.gz ... OK
NOTICE: Uncompressing downloaded files
NOTICE: Finished downloading annotation files for hg38 build version, with files saved at the 'humandb' directory
WARNING: Some files cannot be downloaded, including http://www.openbioinformatics.org/annovar/download/hg38_dbnsfp33a.txt.gz

Any alternatives to download dbnsfp33a?

Creating our own indexes

Hello,
I'm looking into creating our own in-house database of variant VCFs that we can use as an input for ANNOVAR annotation. I was wondering if there is a way to generate indexes for files to improve the speed of annotation.

The only thing I've seen online is this thread from SEQanswers:
http://seqanswers.com/forums/showthread.php?t=23535

Is there any other way or should I give this random perl script a try?

Thanks a ton,
Phil Richmond
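
To make the question concrete, the general idea behind a position index is to group a sorted database file into fixed-size genomic bins and record the byte range of each bin, so a lookup can seek straight to the relevant lines. The Python sketch below illustrates only that idea. The in-memory layout, the bin size of 1000 and the name `build_bin_index` are assumptions, and this is not the index file format that ANNOVAR reads.

```python
import io


def build_bin_index(handle, bin_size=1000):
    """Map (chrom, bin_start) to the (first_byte, end_byte) range of its lines.

    `handle` must be opened in binary mode so that len(raw) is a byte count,
    and the file must be sorted by chromosome and start position.
    """
    index = {}
    offset = 0
    for raw in handle:
        end_offset = offset + len(raw)
        if raw.strip() and not raw.startswith(b"#"):
            fields = raw.split()
            chrom = fields[0].decode()
            bin_start = int(fields[1]) // bin_size * bin_size
            first, _ = index.get((chrom, bin_start), (offset, end_offset))
            index[(chrom, bin_start)] = (first, end_offset)
        offset = end_offset
    return index


data = b"1\t1500\t1500\tA\tG\n1\t1900\t1900\tC\tT\n1\t2100\t2100\tG\tA\n"
index = build_bin_index(io.BytesIO(data))
first, end = index[("1", 2000)]
print(data[first:end].decode().strip())
```

A reader would then `seek(first)` and read `end - first` bytes instead of scanning the whole file.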

annotation for plant genome

Can I annotate my plant data using ANNOVAR? I have soybean data. How do I create a database for that genome?

table_annovar.pl : Output file inconsistencies if --onetranscript is not used

Hello,

I found that with the default parameters for HGVS annotation, some lines have more fields than others.
This is because when a variant falls on multiple NM transcripts, the HGVS strings are separated by tabs.
This is problematic for scripts that parse ANNOVAR output.

Best regards
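
One quick way to confirm the inconsistency is to count the tab-separated fields on every line of the output file. The Python sketch below does that; `field_count_report` is a hypothetical helper and the sample lines are made up for illustration.

```python
from collections import Counter


def field_count_report(lines, sep="\t"):
    """Count how many lines have each number of separated fields."""
    return Counter(len(line.rstrip("\n").split(sep)) for line in lines)


sample = [
    "Chr\tStart\tAAChange\n",
    "1\t100\tGENE:NM_1:exon1:c.1A>G:p.M1V\n",
    "1\t200\tGENE:NM_1:exon2:c.5C>T:p.A2V\tGENE:NM_2:exon2:c.5C>T:p.A2V\n",
]
print(field_count_report(sample))
```

A consistent file yields a single key in the report; more than one key means some lines carry extra fields.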

AA change conflicts with cdna change

7 55242464 . AGGAATTAAGAGAAGC A . . DP=1185;ECNT=1;POP_AF=4.06e-06;TLOD=45.87;ANNOVAR_DATE=2018-04-16;Func.refGene=exonic;Gene.refGene=EGFR;GeneDetail.refGene=.;ExonicFunc.refGene=nonframeshift_deletion;AAChange.refGene=EGFR:NM_005228:exon19:c.2235_2249del:p.745_750del;ALLELE_END

I got a nonframeshift_deletion that is 15 bp in the cDNA but spans 6 amino acids in the protein.

It should be p.746_750del.
Is there anything wrong?
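
The arithmetic behind the two numbers can be checked directly. c.2235 is the third base of codon 745 and c.2249 is the second base of codon 750, so the 15 bp deletion touches six codons while removing a net of five residues. The short Python sketch below works this out; it only does the position arithmetic and does not look at the EGFR sequence, so it cannot decide which protein-level name is preferable.

```python
def codon_of(cdna_pos):
    """Codon number (1-based) that contains a 1-based coding position."""
    return (cdna_pos + 2) // 3


first, last = 2235, 2249
deleted_bases = last - first + 1
codons_touched = codon_of(last) - codon_of(first) + 1
residues_removed = deleted_bases // 3

print(codon_of(first), codon_of(last))  # codons containing the deletion ends
print(deleted_bases, codons_touched, residues_removed)
```

Whether the codon formed from the remnants of codons 745 and 750 still encodes the original residue 745 decides between a 745-based and a 746-based description, and that needs the transcript sequence.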

Input file renaming

I haven't noticed this before and I'm trying to figure out what is happening. A recent file that I'm processing with ANNOVAR is being renamed (filename is truncated) during processing, resulting in an unexpected output file name.

The command being run:

time perl /media/joannaprzybyl/4/software/annovar/table_annovar.pl \
    -vcfinput NPC.HK.12PY0019T-DNA.12PY0019-ensemble.temp3.vcf \
    /media/joannaprzybyl/4/software/annovar/humandb/ \
    -buildver hg19 \
    -out NPC.HK.12PY0019T-DNA.12PY0019-ensemble.temp3.vcf \
    -protocol refGene,ensGene,clinvar_20150330,popfreq_all_20150413,cosmic70,snp129,snp132,snp138,avsift \
    -operation g,g,f,f,f,f,f,f,f \
    -otherinfo \
    &>annovar.log

results in NPC.HK.12PY0019T-DNA.12PY0019.avinput and consequently NPC.HK.12PY0019T-DNA.12PY0019.hg19_multianno.vcf. Any idea why the -ensemble.temp3 is removed?

A question about clinvar 20180603 missing one significant position

Honored Gentlemen, I found that a significant position (chr9:136501794) does not appear in the 20180603 version of clinvar, although it is still found in the 20170905 version. The ClinVar website has information for this position and shows it as a pathogenic dbSNP variant, so I'm confused!

wANNOVAR

Hi, could you give me an indication of when the wANNOVAR server is going to be up again? Will it still be accessible at http://wannovar.wglab.org/?
Thank you for your assistance,
Elena

How can I convert this format to avinput?

Dear Dr. Wang,
I use BreakDancer to identify SVs, and the output format looks like this:
#Chr1 Pos1 Orientation1 Chr2 Pos2 Orientation2 Type Size Score num_Reads num_Reads_lib danban.bam

original_scaffold_197 5227 21+21- original_scaffold_197 5355 21+21- INS -360 33 21 danban.bam|21 NA
original_scaffold_197 6370 33+33- original_scaffold_197 6664 33+33- INS -328 43 34 danban.bam|34 NA
original_scaffold_197 8436 0+0- original_scaffold_197 9379 16+16- INS -327 38 16 danban.bam|16 5.85

For example, the first line means a 360 bp insertion detected by BreakDancer between scaffold_197:5227 and scaffold_197:5355, with 21 supporting read pairs and a confidence score of 33. The software also says that "Real SV breakpoints are expected to reside within the predicted boundaries."

① How can I convert this format to avinput? May I describe this INS in the following format, as in the ANNOVAR README, and will it work in gene-based annotation?

original_scaffold_197 5227 5355 0 0 comments:a 360bp insertion between there

② How can I describe an inter-chromosomal translocation between different chromosomes, and an intra-chromosomal translocation?
③ I noticed that BreakDancer can produce output in .bed format. I tried to convert the .bed file to .vcf but failed, and ANNOVAR does not say that it supports the .bed format, so I cannot use convert2annovar.pl to convert a BED file to avinput, right?
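
The layout proposed in question ① can be produced mechanically from a BreakDancer row. The following is a minimal sketch, assuming the 12-column layout shown in the header above; `breakdancer_to_avinput` is a hypothetical helper, and whether ANNOVAR's gene-based annotation treats the resulting line as intended is exactly what the question asks, so this only reproduces the proposed format.

```python
def breakdancer_to_avinput(line: str) -> str:
    """Turn one BreakDancer row into 'chr start end 0 0 comments:...' (tab-separated)."""
    fields = line.split()
    chr1, pos1 = fields[0], int(fields[1])
    chr2, pos2 = fields[3], int(fields[4])
    sv_type, size = fields[6], fields[7]
    if chr1 != chr2:
        # An inter-chromosomal event has no single start/end interval.
        raise ValueError("inter-chromosomal events do not fit one chr/start/end line")
    start, end = sorted((pos1, pos2))
    comment = f"comments:{sv_type},size={size}"
    return "\t".join([chr1, str(start), str(end), "0", "0", comment])


row = ("original_scaffold_197 5227 21+21- original_scaffold_197 5355 21+21- "
       "INS -360 33 21 danban.bam|21 NA")
print(breakdancer_to_avinput(row))
```

The sketch deliberately raises an error for rows whose two chromosomes differ, since the proposed one-interval layout has no place for a second chromosome.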

Thanks for your time

update the gnomAD database in ANNOVAR

Recently, gnomAD released version r2.1. Is there a plan to update the gnomAD database in ANNOVAR? Alternatively, how can I build my own gnomAD database for use in ANNOVAR, as can be done for ClinVar?

Why does ANNOVAR report a 3-bp deletion as a 2-amino-acid deletion?

For example, "chr9 139390944 TGTG T" (hg19 coordinates) is annotated as "AAChange.refGene=NOTCH1:NM_017617:exon34:c.7244_7246del:p.2415_2416del". But I think it should be annotated as "AAChange.refGene=NOTCH1:NM_017617:exon34:c.7244_7246:p.2415_2416delinsQ"

LoFTool in table_annovar.pl

I am not sure I have the correctly formatted loftool database. I downloaded the file from the readthedocs website (LoFtool_scores.txt) and renamed it to /opt/annovar/humandb/hg19_loftool.txt but when I run table_annovar.pl with this command:

table_annovar.pl rslist.avinput ${annovar}humandb/ -buildver hg19 -out rslist -remove -protocol dbnsfp33a,exac03,gnomad_exome,clinvar_20170130,loftool -operation f,f,f,f,g -nastring "-"

I get correct outputs for every other operation but loftool:
NOTICE: Processing operation=g protocol=loftool

NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg19 -dbtype loftool -outfile rslist.loftool -exonsort rslist.avinput /opt/annovar/humandb/>
NOTICE: Output files were written to rslist.loftool.variant_function, rslist.loftool.exonic_variant_function
NOTICE: Reading gene annotation from /opt/annovar/humandb/hg19_loftool.txt ... Error: invalid record in /opt/annovar/humandb/hg19_loftool.txt (>=11 fields expected in loftool gene definition file):
Error running system command: <annotate_variation.pl -geneanno -buildver hg19 -dbtype loftool -outfile rslist.loftool -exonsort rslist.avinput /opt/annovar/humandb/>

Do I need to process the LoFtool file prior to running ANNOVAR (it only has two columns, far fewer than the 11 that ANNOVAR is looking for)?
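
The mismatch reported in the error message can be confirmed by counting fields per line before handing a file to a gene-based operation. The following is a minimal sketch; `field_counts` is a hypothetical helper, and the two sample rows are made-up placeholders, not real LoFtool scores.

```python
import io


def field_counts(handle, sep="\t", limit=5):
    """Return the number of sep-delimited fields in the first few data lines."""
    counts = []
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines
        counts.append(len(line.rstrip("\n").split(sep)))
        if len(counts) >= limit:
            break
    return counts


# Hypothetical two-column content standing in for the downloaded score file.
sample = io.StringIO("GENE_A\t0.5\nGENE_B\t0.9\n")
print(field_counts(sample))
```

A file that reports two fields per line cannot satisfy a reader that expects at least 11, which is what the log above complains about when the file is used with operation g.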

ANNOVAR filters out an important complex mutation

I use ANNOVAR to annotate a VCF, but it filters out an important mutation, which is as follows:

chr7 55242468 . ATTAAGAGAAGCAA AC

How to integrate hg38_multianno.csv back to bcf file?

Does ANNOVAR offer the possibility to integrate hg38_multianno.csv back into original bcf file?

Thanks,
Serghei

gff3ToGenePred command not found

Dear Kai Wang,

This is a follow-up to "How to build database for virus in annovar".

I created a GFF3 file from each gene's GenBank file for a virus. As you mentioned, I referred to the following section of the website and the recent Nature Protocols paper.

What about GFF3 file for new species?
Then go to http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/, download the gff3ToGenePred tool, and convert the GFF3 file to a format that ANNOVAR can read. Everything else is the same as above.

As per the link, I downloaded the gff3ToGenePred program to my local machine.
gff3ToGenePred - convert a GFF3 file to a genePred file
usage:
gff3ToGenePred inGff3 outGp

I received the following message when executing this command:

$gff3ToGenePred gff_files/input.gff outputGP
gff3ToGenePred: command not found
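
A shell prints "command not found" when it cannot locate the program on PATH, which is separate from whether the download itself succeeded. The following is a minimal sketch for telling the common cases apart, assuming the binary was saved in the current directory; `diagnose` is a hypothetical helper.

```python
import os
import shutil


def diagnose(tool: str, directory: str = ".") -> str:
    """Report whether a tool is on PATH, present locally, or missing."""
    if shutil.which(tool):
        return "on PATH"
    local = os.path.join(directory, tool)
    if os.path.isfile(local):
        if os.access(local, os.X_OK):
            return f"not on PATH; run it as {local}"
        return f"found {local} but it is not executable (chmod +x)"
    return "not found"


print(diagnose("gff3ToGenePred"))
```

If the file sits in the working directory but is not on PATH, invoking it with an explicit path such as ./gff3ToGenePred, after marking it executable, is the usual remedy.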

Annotated VCF output botched for ChrM and ChrY

Hi,

I've been using ANNOVAR with great results for a long time now, thank you for great work. I recently decided to switch from using the *multianno.txt output format to the more cross-compatible *multianno.vcf. However, I've run into a formatting issue in the output for lines beginning with "chrM" and "chrY".

See example below:

Input line:
chrM 73 . G A 185.36 PASS AB=0;ABP=0;AC=1;AF=1;AN=1;AO=7;CIGAR=1X;DP=7;DPB=7;DPRA=0;EFF=INTERGENIC(MODIFIER||||||||||A);EPP=3.32051;EPPR=0;FS=0;GC=57;GTI=0;HRun=0;HaplotypeScore=0;LEN=1;MEANALT=1;MQ=60;MQ0=0;MQM=60;MQMR=0;NS=1;NUMALT=1;ODDS=42.6808;PAIRED=1;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=229;QD=26.48;QR=0;RO=0;RPL=6;RPP=10.7656;RPPR=0;RPR=1;RUN=1;SAF=4;SAP=3.32051;SAR=3;SRF=0;SRP=0;SRR=0;TYPE=snp;technology.illumina=1 GT:AO:DP:GQ:PL:QA:QR:RO 1:7:7:99:209,0:229:0:0

Incorrectly formatted output *multianno.vcf line:
185.36 PASS AB=0;ABP=0;AC=1;AF=1;AN=1;AO=7;CIGAR=1X;DP=7;DPB=7;DPRA=0;EFF=INTERGENIC(MODIFIER||||||||||A);EPP=3.32051;EPPR=0;FS=0;GC=57;GTI=0;HRun=0;HaplotypeScore=0;LEN=1;MEANALT=1;MQ=60;MQ0=0;MQM=60;MQMR=0;NS=1;NUMALT=1;ODDS=42.6808;PAIRED=1;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=229;QD=26.48;QR=0;RO=0;RPL=6;RPP=10.7656;RPPR=0;RPR=1;RUN=1;SAF=4;SAP=3.32051;SAR=3;SRF=0;SRP=0;SRR=0;TYPE=snp;technology.illumina=1 GT:AO:DP:GQ:PL:QA:QR:RO 1:7:7:99:209,0:229:0:0 ;ANNOVAR_DATE=2014-07-22;Func.refGene=;Gene.refGene=;GeneDetail.refGene=;ExonicFunc.refGene=;AAChange.refGene=;Func.ensGene=;Gene.ensGene=;GeneDetail.ensGene=;ExonicFunc.ensGene=;AAChange.ensGene=;clinvar_20150330=;PopFreqMax=;1000G_ALL=;1000G_AFR=;1000G_AMR=;1000G_EAS=;1000G_EUR=;1000G_SAS=;ExAC_ALL=;ExAC_AFR=;ExAC_AMR=;ExAC_EAS=;ExAC_FIN=;ExAC_NFE=;ExAC_OTH=;ExAC_SAS=;ESP6500siv2_ALL=;ESP6500siv2_AA=;ESP6500siv2_EA=;CG46=;cosmic70=;snp129=;snp132=;snp138=;avsift=;ALLELE_END

Note: the *multianno.txt file is also malformed but in a different way:
chrM . 185.36 chrM 73 . G A 185.36 PASS AB=0;ABP=0;AC=1;AF=1;AN=1;AO=7;CIGAR=1X;DP=7;DPB=7;DPRA=0;EFF=INTERGENIC(MODIFIER||||||||||A);EPP=3.32051;EPPR=0;FS=0;GC=57;GTI=0;HRun=0;HaplotypeScore=0;LEN=1;MEANALT=1;MQ=60;MQ0=0;MQM=60;MQMR=0;NS=1;NUMALT=1;ODDS=42.6808;PAIRED=1;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=229;QD=26.48;QR=0;RO=0;RPL=6;RPP=10.7656;RPPR=0;RPR=1;RUN=1;SAF=4;SAP=3.32051;SAR=3;SRF=0;SRP=0;SRR=0;TYPE=snp;technology.illumina=1 GT:AO:DP:GQ:PL:QA:QR:RO 1:7:7:99:209,0:229:0:0

I am manually skipping these lines for now, but it would be helpful to figure out the root of this problem. Let me know if you have any ideas, and I'll keep troubleshooting as well.

Thanks.

dbNSFP 3.5a for whole genome?

Is there a whole genome version of dbNSFP3.5a available for annotation using ANNOVAR?

Gene.refGene shows NM_ number of the upstream gene?

It seems Annovar uses the previous Gene.refGene to annotate a variant of the next gene. For example, NM_152486 is for SAMD11 but in the third line it is still in the Gene.refGene field where NM_015658 should be used (the gene is NOC2L now).

http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&highlight=hg19.chr1:881627-881627&position=chr1:881577-881677

1 877831 . T C 58.74 PASS DP=3;MQ=60.00;FractionInformativeReads=1.000;ANNOVAR_DATE=2016-02-01;Func.refGene=exonic;Gene.refGene=NM_152486;GeneDetail.refGene=.;ExonicFunc.refGene=nonsynonymous_SNV;AAChange.refGene=SAMD11:NM_152486:exo
n10:c.1027T>C:p.W343R;cytoBand=1p36.33;genomicSuperDups=.;esp6500siv2_all=.;1000g2015aug_all=1;SIFT_score=1;SIFT_pred=T;Polyphen2_HDIV_score=0.0;Polyphen2_HDIV_pred=B;Polyphen2_HVAR_score=0.0;Polyphen2_HVAR_pred=B;LRT_score=0.003;LRT_pred=N;MutationTaster_score=1
.000;MutationTaster_pred=P;MutationAssessor_score=-2.085;MutationAssessor_pred=N;FATHMM_score=.;FATHMM_pred=.;RadialSVM_score=-0.980;RadialSVM_pred=T;LR_score=0.000;LR_pred=T;VEST3_score=0.421;CADD_raw=-1.112;CADD_phred=0.132;GERP++_RS=2.51;phyloP46way_placental=
0.624;phyloP100way_vertebrate=1.209;SiPhy_29way_logOdds=7.519;ExAC_ALL=0.9999;ExAC_AFR=1;ExAC_AMR=1;ExAC_EAS=1;ExAC_FIN=1;ExAC_NFE=1;ExAC_OTH=1;ExAC_SAS=0.9999;avsnp147=rs6672356;CLINSIG=.;CLNDBN=.;CLNACC=.;CLNDSDB=.;CLNDSDBID=.;gnomAD_genome_ALL=0.9998;gnomAD_ge
nome_AFR=0.9994;gnomAD_genome_AMR=1;gnomAD_genome_ASJ=1;gnomAD_genome_EAS=1;gnomAD_genome_FIN=1;gnomAD_genome_NFE=0.9999;gnomAD_genome_OTH=1;gnomAD_exome_ALL=0.9999;gnomAD_exome_AFR=0.9994;gnomAD_exome_AMR=0.9999;gnomAD_exome_ASJ=1;gnomAD_exome_EAS=1;gnomAD_exome
FIN=1;gnomAD_exome_NFE=0.9999;gnomAD_exome_OTH=1;gnomAD_exome_SAS=0.9997;ALLELE_END;ANN=C|missense_variant|MODERATE|SAMD11|SAMD11|transcript|NM_152486.2|protein_coding|10/14|c.1027T>C|p.Trp343Arg|1107/2554|1027/2046|343/681||,C|downstream_gene_variant|MODIFIER|N
OC2L|NOC2L|transcript|NM_015658.3|protein_coding||c.*2243A>G|||||1752|;CSQ=C|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000420190|protein_coding|||||||||||1|3160|1|HGNC|28706||||||,C|downstream_gene_variant|MODIFIER|NOC2L|ENSG0000018
8976|Transcript|ENST00000496938|processed_transcript|||||||||||1|2868|-1|HGNC|24517||||||0.887,C|missense_variant|MODERATE|SAMD11|ENSG00000187634|Transcript|ENST00000342066|protein_coding|10/14||ENST00000342066.3:c.1027T>C|ENSP00000342313.3:p.Trp343Arg|1110|1027|
343|W/R|Tgg/Cgg||1||1|HGNC|28706|YES||CCDS2.2|NM_152486.2||,C|downstream_gene_variant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000327044|protein_coding|||||||||||1|1753|-1|HGNC|24517|YES||CCDS3.1|NM_015658.3||0.887,C|non_coding_transcript_exon_variant&non
coding_transcript_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000464948|retained_intron|1/2||ENST00000464948.1:n.286T>C||286||||||1||1|HGNC|28706||||||,C|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|SAMD11|ENSG00000187634
|Transcript|ENST00000466827|retained_intron|2/2||ENST00000466827.1:n.191T>C||191||||||1||1|HGNC|28706||||||,C|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000474461|retained_intron|3/4||ENST0000
0474461.1:n.389T>C||389||||||1||1|HGNC|28706||||||,C|missense_variant|MODERATE|SAMD11|ENSG00000187634|Transcript|ENST00000455979|protein_coding|4/7||ENST00000455979.1:c.507T>C|ENSP00000412228.1:p.Trp170Arg|507|508|170|W/R|Tgg/Cgg||1||1|HGNC|28706||||||,C|downstre
am_gene_variant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000483767|retained_intron|||||||||||1|1753|-1|HGNC|24517||||||0.887,C|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000478729|processed_transcript|||||||||||1|278|1|HGNC|28
706||||||,C|downstream_gene_variant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000477976|retained_intron|||||||||||1|1754|-1|HGNC|24517||||||0.887,C|missense_variant|MODERATE|SAMD11|ENSG00000187634|Transcript|ENST00000341065|protein_coding|8/12||ENST00000341
065.4:c.750T>C|ENSP00000349216.4:p.Trp251Arg|750|751|251|W/R|Tgg/Cgg||1||1|HGNC|28706|||||| GT:AD:DP:GQ:PL:SB 1/1:0,3,0:3:9:91,9,0,91,9,91:0,0,2,1
1 880238 . A G 1053.77 PASS DP=52;MQ=60.00;MQRankSum=0.839;ReadPosRankSum=1.379;FractionInformativeReads=0.981;ANNOVAR_DATE=2016-02-01;Func.refGene=intronic\x3bdownstream;Gene.refGene=NM_152486;GeneDetail.refGene=.;ExonicFunc.refGene=.
;AAChange.refGene=.;cytoBand=1p36.33;genomicSuperDups=.;esp6500siv2_all=.;1000g2015aug_all=0.920927;SIFT_score=.;SIFT_pred=.;Polyphen2_HDIV_score=.;Polyphen2_HDIV_pred=.;Polyphen2_HVAR_score=.;Polyphen2_HVAR_pred=.;LRT_score=.;LRT_pred=.;MutationTaster_score=.;Mu
tationTaster_pred=.;MutationAssessor_score=.;MutationAssessor_pred=.;FATHMM_score=.;FATHMM_pred=.;RadialSVM_score=.;RadialSVM_pred=.;LR_score=.;LR_pred=.;VEST3_score=.;CADD_raw=.;CADD_phred=.;GERP++_RS=.;phyloP46way_placental=.;phyloP100way_vertebrate=.;SiPhy_29w
ay_logOdds=.;ExAC_ALL=.;ExAC_AFR=.;ExAC_AMR=.;ExAC_EAS=.;ExAC_FIN=.;ExAC_NFE=.;ExAC_OTH=.;ExAC_SAS=.;avsnp147=rs3748592;CLINSIG=.;CLNDBN=.;CLNACC=.;CLNDSDB=.;CLNDSDBID=.;gnomAD_genome_ALL=0.9351;gnomAD_genome_AFR=0.9087;gnomAD_genome_AMR=0.9487;gnomAD_genome_ASJ=
0.9238;gnomAD_genome_EAS=0.9066;gnomAD_genome_FIN=0.9614;gnomAD_genome_NFE=0.9460;gnomAD_genome_OTH=0.9490;gnomAD_exome_ALL=.;gnomAD_exome_AFR=.;gnomAD_exome_AMR=.;gnomAD_exome_ASJ=.;gnomAD_exome_EAS=.;gnomAD_exome_FIN=.;gnomAD_exome_NFE=.;gnomAD_exome_OTH=.;gnom
AD_exome_SAS=.;ALLELE_END;ANN=G|downstream_gene_variant|MODIFIER|SAMD11|SAMD11|transcript|NM_152486.2|protein_coding||c.*705A>G|||||277|,G|intron_variant|MODIFIER|NOC2L|NOC2L|transcript|NM_015658.3|protein_coding|18/18|c.2144-58T>C||||||;CSQ=G|downstream_gene_var
iant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000496938|processed_transcript|||||||||||1|461|-1|HGNC|24517||||||0.887,G|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000342066|protein_coding|||||||||||1|283|1|HGNC|28706|YES||CCDS
2.2|NM_152486.2||,G|intron_variant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000327044|protein_coding||18/18|ENST00000327044.6:c.2144-58T>C||||||||1||-1|HGNC|24517|YES||CCDS3.1|NM_015658.3||0.887,G|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Tra
nscript|ENST00000464948|retained_intron|||||||||||1|1966|1|HGNC|28706||||||,G|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000466827|retained_intron|||||||||||1|2056|1|HGNC|28706||||||,G|downstream_gene_variant|MODIFIER|SAMD11|ENSG0000
0187634|Transcript|ENST00000474461|retained_intron|||||||||||1|1864|1|HGNC|28706||||||,G|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000455979|protein_coding|||||||||||1|599|1|HGNC|28706||||||,G|intron_variant&non_coding_transcript_va
riant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000483767|retained_intron||4/4|ENST00000483767.1:n.1015-58T>C||||||||1||-1|HGNC|24517||||||0.887,G|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000478729|processed_transcript|||||||
||||1|2685|1|HGNC|28706||||||,G|intron_variant&non_coding_transcript_variant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000477976|retained_intron||16/16|ENST00000477976.1:n.3606-58T>C||||||||1||-1|HGNC|24517||||||0.887,G|downstream_gene_variant|MODIFIER|SAMD
11|ENSG00000187634|Transcript|ENST00000341065|protein_coding|||||||||||1|283|1|HGNC|28706|||||| GT:AD:DP:GQ:PL:SB 0/1:20,31,0:51:99:1082,0,546,1781,639,1781:8,12,14,17
1 881627 . G A 332.77 PASS DP=25;MQ=60.00;MQRankSum=-0.739;ReadPosRankSum=-0.192;FractionInformativeReads=1.000;ANNOVAR_DATE=2016-02-01;Func.refGene=exonic\x3bdownstream;Gene.refGene=NM_152486;GeneDetail.refGene=.;ExonicFunc.refGene=s
ynonymous_SNV;AAChange.refGene=NOC2L:NM_015658:exon16:c.1843C>T:p.L615L;cytoBand=1p36.33;genomicSuperDups=.;esp6500siv2_all=0.4748;1000g2015aug_all=0.441893;SIFT_score=.;SIFT_pred=.;Polyphen2_HDIV_score=.;Polyphen2_HDIV_pred=.;Polyphen2_HVAR_score=.;Polyphen2_HVA
R_pred=.;LRT_score=.;LRT_pred=.;MutationTaster_score=.;MutationTaster_pred=.;MutationAssessor_score=.;MutationAssessor_pred=.;FATHMM_score=.;FATHMM_pred=.;RadialSVM_score=.;RadialSVM_pred=.;LR_score=.;LR_pred=.;VEST3_score=.;CADD_raw=.;CADD_phred=.;GERP++_RS=.;ph
yloP46way_placental=.;phyloP100way_vertebrate=.;SiPhy_29way_logOdds=.;ExAC_ALL=0.5653;ExAC_AFR=0.1397;ExAC_AMR=0.4840;ExAC_EAS=0.6560;ExAC_FIN=0.6221;ExAC_NFE=0.6283;ExAC_OTH=0.5737;ExAC_SAS=0.5648;avsnp147=rs2272757;CLINSIG=.;CLNDBN=.;CLNACC=.;CLNDSDB=.;CLNDSDBI
D=.;gnomAD_genome_ALL=0.4889;gnomAD_genome_AFR=0.1429;gnomAD_genome_AMR=0.4844;gnomAD_genome_ASJ=0.5265;gnomAD_genome_EAS=0.6751;gnomAD_genome_FIN=0.6175;gnomAD_genome_NFE=0.6336;gnomAD_genome_OTH=0.5855;gnomAD_exome_ALL=0.5703;gnomAD_exome_AFR=0.1305;gnomAD_exom
e_AMR=0.4852;gnomAD_exome_ASJ=0.5441;gnomAD_exome_EAS=0.6577;gnomAD_exome_FIN=0.6262;gnomAD_exome_NFE=0.6355;gnomAD_exome_OTH=0.5652;gnomAD_exome_SAS=0.5650;ALLELE_END;ANN=A|synonymous_variant|LOW|NOC2L|NOC2L|transcript|NM_015658.3|protein_coding|16/19|c.1843C>T|
p.Leu615Leu|1902/2800|1843/2250|615/749||,A|downstream_gene_variant|MODIFIER|SAMD11|SAMD11|transcript|NM_152486.2|protein_coding||c.*2094G>A|||||1666|;CSQ=A|upstream_gene_variant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000496938|processed_transcript||||||
|||||1|685|-1|HGNC|24517||||||0.887,A|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000342066|protein_coding|||||||||||1|1672|1|HGNC|28706|YES||CCDS2.2|NM_152486.2||,A|synonymous_variant|LOW|NOC2L|ENSG00000188976|Transcript|ENST00000327
044|protein_coding|16/19||ENST00000327044.6:c.1843C>T|ENST00000327044.6:c.1843C>T(p.%3D)|1893|1843|615|L|Ctg/Ttg||1||-1|HGNC|24517|YES||CCDS3.1|NM_015658.3||0.887,A|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000464948|retained_intron
|||||||||||1|3355|1|HGNC|28706||||||,A|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000466827|retained_intron|||||||||||1|3445|1|HGNC|28706||||||,A|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000474461|reta
ined_intron|||||||||||1|3253|1|HGNC|28706||||||,A|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000455979|protein_coding|||||||||||1|1988|1|HGNC|28706||||||,A|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|NOC
2L|ENSG00000188976|Transcript|ENST00000483767|retained_intron|2/5||ENST00000483767.1:n.699C>T||699||||||1||-1|HGNC|24517||||||0.887,A|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000478729|processed_transcript|||||||||||1|4074|1|HGNC|2
8706||||||,A|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000477976|retained_intron|14/17||ENST00000477976.1:n.3290C>T||3290||||||1||-1|HGNC|24517||||||0.887,A|downstream_gene_variant|MODIFIER|SA
MD11|ENSG00000187634|Transcript|ENST00000341065|protein_coding|||||||||||1|1672|1|HGNC|28706|||||| GT:AD:DP:GQ:PL:SB 0/1:11,14,0:25:99:361,0,319,754,360,754:9,2,10,4

HGVS annotation for intronic variants

Hi,

first of all, thank you for maintaining this great piece of software! I am trying to annotate some variants using this command:

perl ./annovar/table_annovar.pl --thread 30 final.vcf ./annovar/humandb/ -buildver hg19 -out final_annotation -remove -protocol refGene.mit,clinvar_20170905.mit,exac03,ALL.sites.2015_0_mod8,esp6500siv2_all,avsnp150.mit,gnomad_exome,gnomad_genome -operation g,f,f,f,f,f,f,f -nastring . -vcfinput

However, intronic variants are not annotated using the HGVS standard. How might I obtain these HGVS annotations?

Thanks.

Can't download Annovar

Hi,

I have been trying to download ANNOVAR by going to this webpage ('www.openbioinformatics.org/annovar/annovar_download.html'), but the page never loads. I also tried this command line: 'wget http://www.openbioinformatics.org/annovar/download/annovar.latest.tar.gz.mirror' and got the same issue. Is the website under maintenance?

Thank you very much for your help

Best,

Delphine

table_annovar.pl does not respect -nastring option when processing 1000G datasets

When annotating with a 1000G dataset, the NA string is always "." regardless of the -nastring option.

Steps to reproduce:

./table_annovar.pl example/ex1.avinput humandb/ -buildver hg19 -out myanno -remove -protocol popfreq_all_20150413 -operation f -nastring NaN -csvout -polish

Does `table_annovar.pl -vcfinput` intentionally change character encoding?

I do not know if this is a feature, bug, or user error. When providing a VCF to ANNOVAR via table_annovar.pl -vcfinput the output VCF has altered character encoding for punctuation that is part of some annotations -- i.e. ";" becomes "\x3b" and "=" becomes "\x3d" (illustrated by excerpted portions of multianno.txt and .vcf files below)

From the VCF specification, these characters are reserved as delimiters and should not appear within individual INFO fields. So, while I can see that this behavior may be intentional, I was unable to determine from documentation if this is indeed the case.

multianno.txt

Chr     Start   End     Ref     Alt     Func.refGene    Gene.refGene    GeneDetail.refGene
1       10469   10469   C       0       intergenic      NONE;DDX11L1    dist=NONE;dist=1405
1       10469   10469   C       G       intergenic      NONE;DDX11L1    dist=NONE;dist=1405

multianno.vcf

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO 
1       10469   rs370233998     C       *       2026.12 GATKCutoffSNP   ANNOVAR_DATE=2018-04-16;Func.refGene=intergenic;Gene.refGene=NONE\x3bDDX11L1;GeneDetail.refGene=dist\x3dNONE\x3bdist\x3d1405
1       10469   rs370233998     C       G       2026.12 GATKCutoffSNP   ANNOVAR_DATE=2018-04-16;Func.refGene=intergenic;Gene.refGene=NONE\x3bDDX11L1;GeneDetail.refGene=dist\x3dNONE\x3bdist\x3d1405
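
For downstream parsing, the escaped values in the VCF output can be mapped back to the text shown in multianno.txt. The following is a minimal sketch, assuming every escape has the `\xNN` two-digit hexadecimal form seen in the excerpt; `decode_hex_escapes` is a hypothetical helper, not part of ANNOVAR.

```python
import re

# Matches a literal backslash, 'x', and two hexadecimal digits, e.g. \x3b
_HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def decode_hex_escapes(value: str) -> str:
    """Replace each \\xNN escape with the character it encodes."""
    return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), value)


print(decode_hex_escapes(r"dist\x3dNONE\x3bdist\x3d1405"))  # dist=NONE;dist=1405
```

Decoding should happen only after the INFO column has been split on ";" and "=", otherwise the restored delimiters would be confused with the real ones.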

index file out of date

When running the annotation, I got a warning:
WARNING: Your index file hg19_clinvar_20170130.txt.idx is out of date and will not be used. ANNOVAR can still generate correct results without index file.

I want to know why this warning appears and how ANNOVAR determines that the index file is out of date. One last question: if the index file is out of date, what is the impact? Will annotation slow down?

Many thanks.

rs snp allele frequency deficiency

The exac03 dataset has no allele frequency for SNP rs2504779, although it is shown in the VCF file from the official ExAC website (AF=0.178). The site is thought to be a common variant that should be filtered out, but the missing AF causes some problems for my analysis.
I also think many allele frequencies are missing from the exac03 dataset. How can I deal with this problem?

avdblist is not kept up to date

The clinvar_20180603 databases are not in the avdblist file as of Jul 25.

The same mutation has two different rsIDs

The same mutation has two different rsIDs in hg19_avsnp150.txt. For example:
1 10352 10352 T TA rs145072688
1 10352 10352 T TA rs555500075
I searched for rs145072688 and rs555500075 in the dbSNP database:
rs555500075 means 1:10352:T:TA
rs145072688 means 1:10353:A:AC
dbSNP's own VCF file ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606_b150_GRCh37p13/VCF/All_20170710.vcf.gz has this problem, but ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606_b151_GRCh37p13/All_20180423.vcf.gz does not. Could you please update avsnp to v151?
Thank you very much!
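
To gauge how widespread this is, rows that share the same five variant columns but carry different identifiers can be listed. The following is a minimal sketch, assuming the six-column layout shown in the example (chr, start, end, ref, alt, rsID); `variants_with_multiple_ids` is a hypothetical helper.

```python
from collections import defaultdict


def variants_with_multiple_ids(lines):
    """Map (chr, start, end, ref, alt) to its rsIDs when more than one distinct ID is present."""
    ids = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip malformed or short lines
        ids[tuple(fields[:5])].append(fields[5])
    return {key: values for key, values in ids.items() if len(set(values)) > 1}


rows = [
    "1 10352 10352 T TA rs145072688",
    "1 10352 10352 T TA rs555500075",
]
print(variants_with_multiple_ids(rows))
```

For a full database file this keeps every key in memory, so a streaming comparison of adjacent lines may be preferable if the file is sorted by position.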

bug in wannovar

using the link for the anemia example (http://wannovar.usc.edu/done/1/THtp3lWHEh893pWw/index.html)

Click exome summary results/view. Results consist of 173 pages, 100 variants each one.

Filtering by exonic function / stopgain SNV works, reducing the list of variants, but according to the info in the ExonicFunc column, it includes several kinds of SNVs, not only stopgain variants

Annotate using .gtk file

Hi,

I have followed this link:

http://annovar.openbioinformatics.org/en/latest/user-guide/gene/#what-about-gff3-file-for-new-species

In order to be able to annotate from a gff3 file. However I cannot really make annovar work with this database, so what command exactly should I run to annotate this:

AT_refGene.txt
AT_refGeneMrna.fa
input.sorted.vcf.uniqShort

I have tried something like this:
perl annotate_variation.pl -out ex1 -dbtype AT_refGene -vcfinput input.sorted.vcf.uniqShort

Annotated VCF missing new INFO headers

When annotating with table_annovar.pl, my output VCF file contains several new entries in the INFO column, as expected. However these new INFO values are not added to the meta-information header lines at the top of the VCF file.

The VCF specification states "It is strongly encouraged that information lines describing the INFO, FILTER, and FORMAT entries used in the body of the VCF file be included in the meta-information section."

Several downstream VCF tools, including bcftools, require these INFO headers to be present in order to parse VCFs. So while they are technically optional, it seems like best practice to include them in the VCF output.

Thoughts?
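
As a stopgap, header lines for the undeclared keys can be generated from the records themselves. The following is a minimal sketch, assuming tab-separated records with INFO in the eighth column; `missing_info_headers` is a hypothetical helper, and the `Number=.`, `Type=String` and description values are generic placeholders rather than what ANNOVAR would declare.

```python
def missing_info_headers(vcf_lines):
    """Return ##INFO lines for INFO keys used in records but not declared in the header."""
    declared = set()
    found = {}  # key -> True when the key appears as a bare flag (no "=value")
    for raw in vcf_lines:
        line = raw.rstrip("\n")
        if line.startswith("##INFO=<ID="):
            declared.add(line[len("##INFO=<ID="):].split(",", 1)[0].rstrip(">"))
        elif line and not line.startswith("#"):
            fields = line.split("\t")
            if len(fields) < 8:
                continue
            for entry in fields[7].split(";"):
                key, sep, _ = entry.partition("=")
                if key and key != "." and key not in declared:
                    found.setdefault(key, sep == "")
    headers = []
    for key, is_flag in found.items():
        number, vtype = ("0", "Flag") if is_flag else (".", "String")
        headers.append(
            f'##INFO=<ID={key},Number={number},Type={vtype},'
            f'Description="{key} (placeholder description)">'
        )
    return headers


example = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Depth">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t10469\trs370233998\tC\tG\t2026.12\tPASS\tDP=3;Func.refGene=intergenic;ALLELE_END",
]
for header in missing_info_headers(example):
    print(header)
```

The generated lines would need to be inserted before the `#CHROM` line of the annotated file for tools that require declared INFO keys.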

CSV generated from ANNOVAR may violate format rules in some cases

For example, the last column contains commas:
"1/1:0,131:133:99:4445,358,0"
This also happens in other, seemingly random fields.

From Wikipedia:
"Fields with embedded commas or double-quote characters must be quoted."

The page containing information about CSV is here
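
The quoting rule cited above is what standard CSV writers apply automatically. The following is a minimal sketch using Python's `csv` module to show the expected form of such a row; the first four column values are illustrative, and only the genotype field is taken from the example above.

```python
import csv
import io

row = ["1", "10469", "C", "G", "1/1:0,131:133:99:4445,358,0"]

buffer = io.StringIO()
# QUOTE_MINIMAL quotes only fields that contain the delimiter, a quote, or a line break.
csv.writer(buffer, quoting=csv.QUOTE_MINIMAL).writerow(row)
line = buffer.getvalue()
print(line, end="")

# Reading the line back yields the original five fields.
parsed = next(csv.reader(io.StringIO(line)))
print(len(parsed))
```

With the genotype field quoted, a reader recovers five columns instead of splitting the last one at each embedded comma.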

frameshift insertion HGVS difference between Annovar and VEP

Dear developers,
I am using annovar_2018-04-16.
When annotating my VCF, I found an insertion variant whose HGVS description differs between ANNOVAR and VEP.
The insertion is as follows:
chr6 49457714 49457714 - AA

ANNOVAR result:

HGVS: p.Asp244Leufs*38

VEP online result:

HGVS: p.Asp244LeufsTer39

Difference: ANNOVAR reports Ter38 but VEP reports Ter39.
When I checked articles that record this variant, I found that most of them report it as Ter39.

What causes this difference in the protein HGVS? Which one is correct?

Annotation for transposon insertion

Hi,

I have a vcf file from transposon detection software Mobster that looks like this:

CHROM POS ID REF ALT QUAL FILTER
chr11 34288 . . INS:ME:ALU . PASS
chr11 43445 . . INS:ME:L1 . PASS
chr11 67645 . . INS:ME:SVA . PASS

I want to annotate these transposon insertion points and I've used ANNOVAR hg19 refGene. However, all of these insertion points are being treated as invalid inputs. Is it because there is no ref and alt bases? Is it possible to annotate it with ANNOVAR?

Thanks,

ClinVar Database Entries Missing Information

I'm not sure if this is a bug, but the expectation is that the number of entries (number of pipe-delimited values) should be the same for CLINSIG, CLNDBN, etc. However, for these cases, when we split on |, we get different numbers of fields:

% cat hg19_clinvar_20150330.txt | cut -f 6 | sed -e 's/;/    /g' | awk '{numsig=split($1,sig,"|");numacc=split($4,acc,"|"); if (numsig!=numacc) print}' | head
CLINSIG=pathogenic|pathogenic|pathogenic    CLNDBN=Paragangliomas_4|Pheochromocytoma|Hereditary_cancer-predisposing_syndrome,Phaeochromocytoma|Cowden-like_syndrome CLNREVSTAT=single|single|single,single|single   CLNACC=RCV000013623.23|RCV000013624.16|RCV000129929.2,RCV000148870.1|RCV000148871.1 CLNDSDB=GeneReviews:MedGen:OMIM:Orphanet|GeneReviews:MedGen:OMIM:Orphanet|MedGen:SNOMED_CT,MedGen|MedGen:OMIM:Orphanet  CLNDSDBID=NBK1548:C1861848:115310:ORPHA29072|NBK1548:C0031511:171300:ORPHA29072|C0027672:699346009,CN221602|C2676500:612359:ORPHA201
CLINSIG=pathogenic|pathogenic|pathogenic    CLNDBN=Paragangliomas_4|Pheochromocytoma|Hereditary_cancer-predisposing_syndrome,Phaeochromocytoma|Cowden-like_syndrome CLNREVSTAT=single|single|single,single|single   CLNACC=RCV000013623.23|RCV000013624.16|RCV000129929.2,RCV000148870.1|RCV000148871.1 CLNDSDB=GeneReviews:MedGen:OMIM:Orphanet|GeneReviews:MedGen:OMIM:Orphanet|MedGen:SNOMED_CT,MedGen|MedGen:OMIM:Orphanet  CLNDSDBID=NBK1548:C1861848:115310:ORPHA29072|NBK1548:C0031511:171300:ORPHA29072|C0027672:699346009,CN221602|C2676500:612359:ORPHA201
CLINSIG=pathogenic|pathogenic   CLNDBN=Elliptocytosis_1|Protein_4.1_lille,Elliptocytosis_1|Protein_4.1_madrid   CLNREVSTAT=single|single,single|single  CLNACC=RCV000018198.26|RCV000018199.26,RCV000018196.26|RCV000018197.22  CLNDSDB=MedGen:OMIM:Orphanet|.,MedGen:OMIM:Orphanet|.   CLNDSDBID=C2678497:611804:ORPHA288|.,C2678497:611804:ORPHA288|.
CLINSIG=pathogenic|pathogenic   CLNDBN=Elliptocytosis_1|Protein_4.1_lille,Elliptocytosis_1|Protein_4.1_madrid   CLNREVSTAT=single|single,single|single  CLNACC=RCV000018198.26|RCV000018199.26,RCV000018196.26|RCV000018197.22  CLNDSDB=MedGen:OMIM:Orphanet|.,MedGen:OMIM:Orphanet|.   CLNDSDBID=C2678497:611804:ORPHA288|.,C2678497:611804:ORPHA288|.
CLINSIG=other   CLNDBN=Epilepsy\x2c_idiopathic_generalized\x2c_susceptibility_to\x2c_12,not_provided|Glucose_transporter_type_1_deficiency_syndrome CLNREVSTAT=single,single|single CLNACC=RCV000082868.1,RCV000128117.1|RCV000147523.1 CLNDSDB=MedGen:OMIM,MedGen|GeneReviews:MedGen:OMIM:Orphanet CLNDSDBID=CN158708:614847,CN221809|NBK1430:C1847501:606777:ORPHA71277
CLINSIG=other   CLNDBN=Epilepsy\x2c_idiopathic_generalized\x2c_susceptibility_to\x2c_12,not_provided|Glucose_transporter_type_1_deficiency_syndrome CLNREVSTAT=single,single|single CLNACC=RCV000082868.1,RCV000128117.1|RCV000147523.1 CLNDSDB=MedGen:OMIM,MedGen|GeneReviews:MedGen:OMIM:Orphanet CLNDSDBID=CN158708:614847,CN221809|NBK1430:C1847501:606777:ORPHA71277
CLINSIG=pathogenic  CLNDBN=Glut1_deficiency_syndrome_1\x2c_autosomal_recessive,Glucose_transporter_type_1_deficiency_syndrome|not_provided  CLNREVSTAT=single,mult|single   CLNACC=RCV000017489.24,RCV000017491.27|RCV000081432.3   CLNDSDB=MedGen,GeneReviews:MedGen:OMIM:Orphanet|MedGen  CLNDSDBID=C3149117,NBK1430:C1847501:606777:ORPHA71277|CN221809
CLINSIG=pathogenic  CLNDBN=Glut1_deficiency_syndrome_1\x2c_autosomal_recessive,Glucose_transporter_type_1_deficiency_syndrome|not_provided  CLNREVSTAT=single,mult|single   CLNACC=RCV000017489.24,RCV000017491.27|RCV000081432.3   CLNDSDB=MedGen,GeneReviews:MedGen:OMIM:Orphanet|MedGen  CLNDSDBID=C3149117,NBK1430:C1847501:606777:ORPHA71277|CN221809
CLINSIG=pathogenic  CLNDBN=MYH-associated_polyposis,Hereditary_cancer-predisposing_syndrome|Carcinoma_of_colon  CLNREVSTAT=single,mult|single   CLNACC=RCV000123141.1,RCV000115749.4|RCV000144636.1 CLNDSDB=GeneReviews:MedGen:OMIM:Orphanet,MedGen:SNOMED_CT|MedGen:SNOMED_CT  CLNDSDBID=NBK107219:C1837991:608456:ORPHA220460,C0027672:699346009|C0699790:269533000
CLINSIG=probable-non-pathogenic|other   CLNDBN=MYH-associated_polyposis|Hereditary_cancer-predisposing_syndrome,MYH-associated_polyposis|Hereditary_cancer-predisposing_syndrome    CLNREVSTAT=single|mult,mult|single  CLNACC=RCV000119223.2|RCV000126890.3,RCV000005617.2|RCV000163049.1  CLNDSDB=GeneReviews:MedGen:OMIM:Orphanet|MedGen:SNOMED_CT,GeneReviews:MedGen:OMIM:Orphanet|MedGen:SNOMED_CT CLNDSDBID=NBK107219:C1837991:608456:ORPHA220460|C0027672:699346009,NBK107219:C1837991:608456:ORPHA220460|C0027672:699346009

If this is the expected behavior, how do we interpret these entries?
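
The same comparison as the awk one-liner can be written so that every key is counted, which makes it easier to see which fields disagree. The following is a minimal sketch, assuming the annotation column is a ";"-separated list of KEY=VALUE entries as in the output above; `pipe_counts` and `is_inconsistent` are hypothetical helpers.

```python
def pipe_counts(annotation: str) -> dict:
    """Count the '|'-delimited values under each KEY=VALUE entry."""
    counts = {}
    for entry in annotation.split(";"):
        key, _, value = entry.partition("=")
        counts[key] = len(value.split("|"))
    return counts


def is_inconsistent(annotation: str, left: str = "CLINSIG", right: str = "CLNACC") -> bool:
    """True when the two keys carry different numbers of '|'-delimited values."""
    counts = pipe_counts(annotation)
    return counts.get(left) != counts.get(right)


# CLINSIG and CLNACC values taken from the first record printed above.
record = ("CLINSIG=pathogenic|pathogenic|pathogenic;"
          "CLNACC=RCV000013623.23|RCV000013624.16|RCV000129929.2,RCV000148870.1|RCV000148871.1")
print(pipe_counts(record))
print(is_inconsistent(record))
```

In the affected records the longer fields also contain commas, so counting values after splitting on both "," and "|" may be worth comparing as well; how the two delimiters are meant to nest is the open question here.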

A question about Annovar

Dear Kai,
I am using your ANNOVAR software and have a question:
which version of the dbNSFP database is used to get SIFT scores?
Thank you very much!

Error: Exonic variant "unknown" when set "--splicing_threshold 10"

Hi developer,
I find some variants in exonic regions annotated as 'unknown' when setting "--splicing_threshold 10".

Before:
Normal.xlsx

I changed the table_annovar.pl script to add "--splicing_threshold 10" and "-hgvs".

387c387
< $sc = "annotate_variation.pl -geneanno -buildver $buildver --splicing_threshold 10 -hgvs -dbtype $protocol -outfile $tempfile.$protocol -exonsort $queryfile $dbloc";
---
> $sc = "annotate_variation.pl -geneanno -buildver $buildver -dbtype $protocol -outfile $tempfile.$protocol -exonsort $queryfile $dbloc";

After:
Error.xlsx

How can I resolve this problem?

Thanks!
Wufan Hai

How to build database for virus in annovar

Dear Wang,

I am working on a virus. Can you please help me to build the database for virus in order to perform annotation of variants.

ExonicFunc type is "nonframeshift substitution", but AAChange is ":p.,"

Hi developer,
I find that some variants are "nonframeshift substitution", but the AAChange column shows ":p.,".
How can I solve this problem?
