doc-annovar's Introduction

ANNOVAR Documentation

ANNOVAR is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes (including human genome builds hg18, hg19 and hg38, as well as mouse, worm, fly, yeast and many others).

This is the GitHub repository for the documentation of the ANNOVAR software, described in the paper listed below. Any edit to this repository is reflected instantly on the ANNOVAR home page at http://annovar.openbioinformatics.org.

If you like this repository, please click on the "Star" button on top of this page, to show appreciation to the repository maintainer. If you want to receive notifications on changes to this repository, please click the "Watch" button on top of this page.

Reference

doc-annovar's People

Contributors

ashishjayamohan, d0ugal, dcroote, eyherabh, kaichop, pjvandehaar, sanderjbouwman

doc-annovar's Issues

Does `table_annovar.pl -vcfinput` intentionally change character encoding?

I do not know if this is a feature, a bug, or user error. When I provide a VCF to ANNOVAR via table_annovar.pl -vcfinput, the output VCF has altered character encoding for punctuation that is part of some annotations: ";" becomes "\x3b" and "=" becomes "\x3d" (illustrated by the excerpts from the multianno.txt and multianno.vcf files below).

From the VCF specification, these characters are reserved as delimiters and should not appear within individual INFO fields. So, while I can see that this behavior may be intentional, I was unable to determine from documentation if this is indeed the case.

multianno.txt

Chr     Start   End     Ref     Alt     Func.refGene    Gene.refGene    GeneDetail.refGene
1       10469   10469   C       0       intergenic      NONE;DDX11L1    dist=NONE;dist=1405
1       10469   10469   C       G       intergenic      NONE;DDX11L1    dist=NONE;dist=1405 

multianno.vcf

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO 
1       10469   rs370233998     C       *       2026.12 GATKCutoffSNP   ANNOVAR_DATE=2018-04-16;Func.refGene=intergenic;Gene.refGene=NONE\x3bDDX11L1;GeneDetail.refGene=dist\x3dNONE\x3bdist\x3d1405
1       10469   rs370233998     C       G       2026.12 GATKCutoffSNP   ANNOVAR_DATE=2018-04-16;Func.refGene=intergenic;Gene.refGene=NONE\x3bDDX11L1;GeneDetail.refGene=dist\x3dNONE\x3bdist\x3d1405
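For readers who need the original punctuation back when parsing the annotated VCF, here is a minimal Python sketch. It assumes the only escapes present are two-digit hex sequences of the form \xNN, as in the excerpt above; the function names are illustrative and not part of ANNOVAR. The INFO column is split on the real delimiters first and each value is decoded afterwards, so the decoded ";" and "=" cannot be confused with field separators.

```python
import re

# Matches escapes such as \x3b or \x3d (backslash, 'x', two hex digits).
_HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def decode_info_value(value):
    """Replace every \\xNN escape with the character it encodes."""
    return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), value)


def parse_info(info):
    """Split an INFO column on ';' and '=', then decode each value."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        # Keys without '=' are flags; store them as True.
        fields[key] = decode_info_value(value) if sep else True
    return fields


# INFO column copied from the multianno.vcf excerpt above.
info = (r"ANNOVAR_DATE=2018-04-16;Func.refGene=intergenic;"
        r"Gene.refGene=NONE\x3bDDX11L1;"
        r"GeneDetail.refGene=dist\x3dNONE\x3bdist\x3d1405")
parsed = parse_info(info)
print(parsed["Gene.refGene"])        # NONE;DDX11L1
print(parsed["GeneDetail.refGene"])  # dist=NONE;dist=1405
```

Decoding after splitting is the important design choice: decoding the whole INFO string first would reintroduce the ambiguity the escapes avoid.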

gff3toGenePred command not found

Dear Kai Wang,

This is a follow-up to "How to build database for virus in annovar".

I created a GFF3 file from the GenBank file of each gene of a virus. As you suggested, I referred to the following section of the website and to the recent Nature Protocols paper.

What about GFF3 file for new species?
Then go to http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/, download the gff3ToGenePred tool, and convert the GFF3 file to a format that ANNOVAR can read. Everything else is the same as above.

Following the link, I downloaded the gff3ToGenePred program to my local machine.
gff3ToGenePred - convert a GFF3 file to a genePred file
usage:
gff3ToGenePred inGff3 outGp

I received the following message when executing this command:

$gff3ToGenePred gff_files/input.gff outputGP
gff3ToGenePred: command not found
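A "command not found" message usually means the shell could not find an executable with that name on PATH. With a freshly downloaded binary, two common causes are a missing execute permission and a download directory that is not on PATH. The sketch below only checks for those two conditions. It is a generic diagnostic, not something specific to gff3ToGenePred; the function name, the messages, and the temporary demo file are all made up for illustration.

```python
import os
import shutil
import tempfile

ON_PATH = "found on PATH"
MISSING = "file does not exist at the given location"
NOT_EXECUTABLE = "file exists but is not executable (try chmod +x)"
NOT_ON_PATH = "file is executable but its directory is not on PATH"


def diagnose_command(name, downloaded_file, search_path=None):
    """Explain why typing `name` might fail for a downloaded program.

    search_path works like the PATH variable; None means the real PATH.
    """
    if shutil.which(name, path=search_path) is not None:
        return ON_PATH
    if not os.path.isfile(downloaded_file):
        return MISSING
    if not os.access(downloaded_file, os.X_OK):
        return NOT_EXECUTABLE
    return NOT_ON_PATH


# Demo on a throwaway file standing in for the downloaded binary.
with tempfile.TemporaryDirectory() as tmp:
    empty_bin = os.path.join(tmp, "bin")
    downloads = os.path.join(tmp, "downloads")
    os.mkdir(empty_bin)
    os.mkdir(downloads)
    fake = os.path.join(downloads, "gff3ToGenePred")
    with open(fake, "w") as handle:
        handle.write("#!/bin/sh\n")

    os.chmod(fake, 0o644)  # as downloaded: no execute bit
    as_downloaded = diagnose_command("gff3ToGenePred", fake, empty_bin)

    os.chmod(fake, 0o755)  # after chmod +x
    after_chmod = diagnose_command("gff3ToGenePred", fake, empty_bin)

    # Once the download directory is searched, the command resolves.
    with_dir_on_path = diagnose_command("gff3ToGenePred", fake, downloads)

print(as_downloaded)
print(after_chmod)
print(with_dir_on_path)
```

In the shell, the corresponding fixes would be `chmod +x gff3ToGenePred` and then either calling the program as `./gff3ToGenePred` from its directory or adding that directory to PATH.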

minqueryfrac is not compatible with geneanno

Hello,

Could you let me know how to set --minqueryfrac when using -geneanno?
Shown below are the two commands that I tested.

Thank you in advance,

hk

Run successfully
./annotate_variation.pl -geneanno -buildver hg19 -dbtype knownGene -outfile myanno.knownGene -exonsort ./avinput.txt ./annovar/humandb/

Failed
./annotate_variation.pl -geneanno -buildver hg19 -dbtype knownGene -outfile myanno.knownGene --minqueryfrac 0.1 -exonsort ./avinput.txt ./annovar/humandb/

Annotated VCF missing new INFO headers

When annotating with table_annovar.pl, my output VCF file contains several new entries in the INFO column, as expected. However these new INFO values are not added to the meta-information header lines at the top of the VCF file.

The VCF specification states "It is strongly encouraged that information lines describing the INFO, FILTER, and FORMAT entries used in the body of the VCF file be included in the meta-information section."

Several downstream VCF tools, including bcftools, require these INFO headers to be present in order to parse VCFs. So while they are technically optional, it seems like best practice to include them in the VCF output.

Thoughts?
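Until the headers are written by the tool itself, one workaround is to generate placeholder header lines for every INFO key that appears in the body without a declaration. The Python sketch below does that for an iterable of VCF lines. The Number=. and Type=String choices and the description text are assumptions made for illustration; the real types of the ANNOVAR annotations are not stated here. Some ANNOVAR keys contain characters such as "+" or "-" that stricter parsers may reject as INFO IDs, and the sketch does not address that.

```python
import re

_INFO_ID = re.compile(r"^##INFO=<ID=([^,>]+)")


def missing_info_headers(vcf_lines):
    """Return placeholder ##INFO lines for keys used but not declared."""
    declared = set()
    undeclared = {}  # key -> True if the key was first seen as a flag
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            match = _INFO_ID.match(line)
            if match:
                declared.add(match.group(1))
        elif line and not line.startswith("#"):
            cols = line.split("\t")
            if len(cols) < 8 or cols[7] == ".":
                continue
            for item in cols[7].split(";"):
                key, sep, _value = item.partition("=")
                if key and key not in declared and key not in undeclared:
                    undeclared[key] = (sep == "")
    headers = []
    for key, is_flag in undeclared.items():
        if is_flag:
            spec = "Number=0,Type=Flag"
        else:
            spec = "Number=.,Type=String"  # assumed, not from ANNOVAR
        headers.append(
            '##INFO=<ID=%s,%s,Description="%s (placeholder)">' % (key, spec, key)
        )
    return headers


# Small made-up VCF: DP is declared, the ANNOVAR keys are not.
demo_vcf = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Total depth">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t10469\t.\tC\tG\t2026.12\tPASS\t"
    "DP=10;ANNOVAR_DATE=2018-04-16;Func.refGene=intergenic;ALLELE_END",
]
new_headers = missing_info_headers(demo_vcf)
for header in new_headers:
    print(header)
```

The returned lines would be inserted before the #CHROM line of the annotated VCF.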

Why does ANNOVAR report a 3-bp deletion as a 2-amino-acid deletion?

For example, "chr9 139390944 TGTG T" (hg19 coordinates) is annotated as "AAChange.refGene=NOTCH1:NM_017617:exon34:c.7244_7246del:p.2415_2416del". But I think it should be annotated as "AAChange.refGene=NOTCH1:NM_017617:exon34:c.7244_7246:p.2415_2416delinsQ".
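The arithmetic behind the two notations can be checked directly. The sketch below maps 1-based coding positions to codon numbers and shows that c.7244_7246 is not codon-aligned: it removes the last two bases of codon 2415 and the first base of codon 2416. The three bases left over from those two codons then form a single codon, which is why a deletion-insertion notation is a candidate. This shows only the position arithmetic. It does not reproduce ANNOVAR's naming logic, and the helper name is illustrative.

```python
def codon_of(cds_pos):
    """Return (codon number, position within the codon, 1..3) for a
    1-based coding-sequence position."""
    index = cds_pos - 1
    return index // 3 + 1, index % 3 + 1


first_deleted = codon_of(7244)
last_deleted = codon_of(7246)
print(first_deleted)  # (2415, 2)
print(last_deleted)   # (2416, 1)

# Bases of the two affected codons that survive the deletion:
# codon 2415 spans c.7243-7245 and codon 2416 spans c.7246-7248.
affected = range(7243, 7249)
surviving = [pos for pos in affected if not 7244 <= pos <= 7246]
print(surviving)      # [7243, 7247, 7248]
```

Three surviving bases across two original codons means two residues are replaced by one. Which residue results depends on the actual sequence, which is not shown here.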

ClinVar Database Entries Missing Information

I'm not sure if this is a bug, but the expectation is that the number of entries (number of pipe-delimited values) should be the same for CLINSIG, CLNDBN, etc. However, for these cases, when we split on |, we get different numbers of fields:

% cat hg19_clinvar_20150330.txt | cut -f 6 | sed -e 's/;/    /g' | awk '{numsig=split($1,sig,"|");numacc=split($4,acc,"|"); if (numsig!=numacc) print}' | head
CLINSIG=pathogenic|pathogenic|pathogenic    CLNDBN=Paragangliomas_4|Pheochromocytoma|Hereditary_cancer-predisposing_syndrome,Phaeochromocytoma|Cowden-like_syndrome CLNREVSTAT=single|single|single,single|single   CLNACC=RCV000013623.23|RCV000013624.16|RCV000129929.2,RCV000148870.1|RCV000148871.1 CLNDSDB=GeneReviews:MedGen:OMIM:Orphanet|GeneReviews:MedGen:OMIM:Orphanet|MedGen:SNOMED_CT,MedGen|MedGen:OMIM:Orphanet  CLNDSDBID=NBK1548:C1861848:115310:ORPHA29072|NBK1548:C0031511:171300:ORPHA29072|C0027672:699346009,CN221602|C2676500:612359:ORPHA201
CLINSIG=pathogenic|pathogenic|pathogenic    CLNDBN=Paragangliomas_4|Pheochromocytoma|Hereditary_cancer-predisposing_syndrome,Phaeochromocytoma|Cowden-like_syndrome CLNREVSTAT=single|single|single,single|single   CLNACC=RCV000013623.23|RCV000013624.16|RCV000129929.2,RCV000148870.1|RCV000148871.1 CLNDSDB=GeneReviews:MedGen:OMIM:Orphanet|GeneReviews:MedGen:OMIM:Orphanet|MedGen:SNOMED_CT,MedGen|MedGen:OMIM:Orphanet  CLNDSDBID=NBK1548:C1861848:115310:ORPHA29072|NBK1548:C0031511:171300:ORPHA29072|C0027672:699346009,CN221602|C2676500:612359:ORPHA201
CLINSIG=pathogenic|pathogenic   CLNDBN=Elliptocytosis_1|Protein_4.1_lille,Elliptocytosis_1|Protein_4.1_madrid   CLNREVSTAT=single|single,single|single  CLNACC=RCV000018198.26|RCV000018199.26,RCV000018196.26|RCV000018197.22  CLNDSDB=MedGen:OMIM:Orphanet|.,MedGen:OMIM:Orphanet|.   CLNDSDBID=C2678497:611804:ORPHA288|.,C2678497:611804:ORPHA288|.
CLINSIG=pathogenic|pathogenic   CLNDBN=Elliptocytosis_1|Protein_4.1_lille,Elliptocytosis_1|Protein_4.1_madrid   CLNREVSTAT=single|single,single|single  CLNACC=RCV000018198.26|RCV000018199.26,RCV000018196.26|RCV000018197.22  CLNDSDB=MedGen:OMIM:Orphanet|.,MedGen:OMIM:Orphanet|.   CLNDSDBID=C2678497:611804:ORPHA288|.,C2678497:611804:ORPHA288|.
CLINSIG=other   CLNDBN=Epilepsy\x2c_idiopathic_generalized\x2c_susceptibility_to\x2c_12,not_provided|Glucose_transporter_type_1_deficiency_syndrome CLNREVSTAT=single,single|single CLNACC=RCV000082868.1,RCV000128117.1|RCV000147523.1 CLNDSDB=MedGen:OMIM,MedGen|GeneReviews:MedGen:OMIM:Orphanet CLNDSDBID=CN158708:614847,CN221809|NBK1430:C1847501:606777:ORPHA71277
CLINSIG=other   CLNDBN=Epilepsy\x2c_idiopathic_generalized\x2c_susceptibility_to\x2c_12,not_provided|Glucose_transporter_type_1_deficiency_syndrome CLNREVSTAT=single,single|single CLNACC=RCV000082868.1,RCV000128117.1|RCV000147523.1 CLNDSDB=MedGen:OMIM,MedGen|GeneReviews:MedGen:OMIM:Orphanet CLNDSDBID=CN158708:614847,CN221809|NBK1430:C1847501:606777:ORPHA71277
CLINSIG=pathogenic  CLNDBN=Glut1_deficiency_syndrome_1\x2c_autosomal_recessive,Glucose_transporter_type_1_deficiency_syndrome|not_provided  CLNREVSTAT=single,mult|single   CLNACC=RCV000017489.24,RCV000017491.27|RCV000081432.3   CLNDSDB=MedGen,GeneReviews:MedGen:OMIM:Orphanet|MedGen  CLNDSDBID=C3149117,NBK1430:C1847501:606777:ORPHA71277|CN221809
CLINSIG=pathogenic  CLNDBN=Glut1_deficiency_syndrome_1\x2c_autosomal_recessive,Glucose_transporter_type_1_deficiency_syndrome|not_provided  CLNREVSTAT=single,mult|single   CLNACC=RCV000017489.24,RCV000017491.27|RCV000081432.3   CLNDSDB=MedGen,GeneReviews:MedGen:OMIM:Orphanet|MedGen  CLNDSDBID=C3149117,NBK1430:C1847501:606777:ORPHA71277|CN221809
CLINSIG=pathogenic  CLNDBN=MYH-associated_polyposis,Hereditary_cancer-predisposing_syndrome|Carcinoma_of_colon  CLNREVSTAT=single,mult|single   CLNACC=RCV000123141.1,RCV000115749.4|RCV000144636.1 CLNDSDB=GeneReviews:MedGen:OMIM:Orphanet,MedGen:SNOMED_CT|MedGen:SNOMED_CT  CLNDSDBID=NBK107219:C1837991:608456:ORPHA220460,C0027672:699346009|C0699790:269533000
CLINSIG=probable-non-pathogenic|other   CLNDBN=MYH-associated_polyposis|Hereditary_cancer-predisposing_syndrome,MYH-associated_polyposis|Hereditary_cancer-predisposing_syndrome    CLNREVSTAT=single|mult,mult|single  CLNACC=RCV000119223.2|RCV000126890.3,RCV000005617.2|RCV000163049.1  CLNDSDB=GeneReviews:MedGen:OMIM:Orphanet|MedGen:SNOMED_CT,GeneReviews:MedGen:OMIM:Orphanet|MedGen:SNOMED_CT CLNDSDBID=NBK107219:C1837991:608456:ORPHA220460|C0027672:699346009,NBK107219:C1837991:608456:ORPHA220460|C0027672:699346009

If this is the expected behavior, how do we interpret these entries?
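To make the mismatch easier to inspect than with the awk one-liner, the Python sketch below counts the entries of each key twice: once splitting on "|" only, and once splitting on both "|" and ",". It assumes the annotation column uses ";" between keys, which is what the sed step above implies. It reports the counts and does not try to decide how the two delimiters should be interpreted. The function name is illustrative.

```python
import re


def entry_counts(annotation):
    """Map each key to (entries split on '|', entries split on '|' or ',')."""
    counts = {}
    for item in annotation.split(";"):
        key, _sep, value = item.partition("=")
        pipe_only = len(value.split("|"))
        pipe_or_comma = len(re.split(r"[|,]", value))
        counts[key] = (pipe_only, pipe_or_comma)
    return counts


# Three of the keys from the first record shown above.
example = (
    "CLINSIG=pathogenic|pathogenic|pathogenic;"
    "CLNDBN=Paragangliomas_4|Pheochromocytoma|"
    "Hereditary_cancer-predisposing_syndrome,Phaeochromocytoma|"
    "Cowden-like_syndrome;"
    "CLNACC=RCV000013623.23|RCV000013624.16|RCV000129929.2,"
    "RCV000148870.1|RCV000148871.1"
)
counts = entry_counts(example)
for key, pair in counts.items():
    print(key, pair)
```

For this record CLINSIG has three entries either way, while CLNDBN and CLNACC have four entries when split on "|" and five when commas are also treated as separators. Literal commas inside disease names appear as \x2c in the file, so they are not counted as separators.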

convert vcf4 to avinput

Hi developers,
I am confused by the conversion from VCF (vcf4) to avinput format.
For instance, in the output file I see a conversion like this:
Original record:
chr1 9671257 . CTTTT C,CT,CTT,CTTT,CTTTTT
Conversion:
chr1 9671259 9671261 TTT -
chr1 9671260 9671261 TT -
chr1 9671261 9671261 T -
chr1 9671261 9671261 - T
The end positions are the same.

But for some records it looks like this:
Original record:
chr1 36203712 . AAAATATATATAT A,AAT,AATAT,AATATAT
conversion:
chr1 36203713 36203724 AAATATATATAT -
chr1 36203713 36203722 AAATATATAT -
chr1 36203713 36203720 AAATATAT -
chr1 36203713 36203718 AAATAT -
However, the start positions are the same in this conversion.

Dr. Wang mentions in the documentation that ANNOVAR left-aligns both the input VCF and the database. I also compared the results with the GATK4 LeftAlignAndTrimVariants tool. Its conversions of the two records look like this:

chr1 9671257 . CTTTT C
chr1 9671257 . CTTT C
chr1 9671257 . CTT C
chr1 9671257 . CT C
chr1 9671257 . C CT

chr1 36203712 . AAAATATATATAT A
chr1 36203712 . AAAATATATAT A
chr1 36203712 . AAAATATAT A
chr1 36203712 . AAAATAT A

Regardless of how the ref and alt bases are shown, both GATK conversions keep the same POS.

Do these conversions make sense? Could you please give me a brief description of the algorithm used in the conversion from vcf4 to avinput? (I am not familiar with Perl, so it is hard for me to read the source code.)

Best regards!
xinchang zheng
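One trimming rule that reproduces every avinput row quoted in this issue is sketched below in Python. It was inferred only from those rows and is not taken from the convert2annovar.pl source, so treat it as a hypothesis about the behaviour rather than a description of it. The rule: if one allele is a prefix of the other, trim the whole shared prefix (all such rows then end at the same position); otherwise trim the shared suffix first and then the shared prefix (the rows then start at the same position).

```python
def common_prefix_len(a, b):
    """Length of the longest shared prefix of two strings."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def to_avinput(chrom, pos, ref, alt):
    """Hypothetical trimming rule inferred from the rows in this issue."""
    if ref.startswith(alt) or alt.startswith(ref):
        # Case 1: one allele is a prefix of the other.
        k = min(len(ref), len(alt))
    else:
        # Case 2: drop the shared suffix, then the shared prefix.
        s = common_prefix_len(ref[::-1], alt[::-1])
        ref, alt = ref[:len(ref) - s], alt[:len(alt) - s]
        k = common_prefix_len(ref, alt)
    ref, alt = ref[k:], alt[k:]
    if ref:
        # Deletion or substitution: span the remaining reference bases.
        start = pos + k
        end = start + len(ref) - 1
    else:
        # Pure insertion: start = end = last base before the insertion.
        start = end = pos + k - 1
    return (chrom, start, end, ref or "-", alt or "-")


rows_t = [to_avinput("chr1", 9671257, "CTTTT", alt)
          for alt in ("CT", "CTT", "CTTT", "CTTTTT")]
rows_at = [to_avinput("chr1", 36203712, "AAAATATATATAT", alt)
           for alt in ("A", "AAT", "AATAT", "AATATAT")]
for row in rows_t + rows_at:
    print(*row)
```

The quoted output lists four rows for the five ALT alleles of the first record; under this rule the remaining allele (C) would give chr1 9671258 9671261 TTTT -. Both outcomes describe the same deleted or inserted bases as the GATK records. They differ only in where a repeat-unit indel is placed inside the repeat, which is why the GATK output keeps a single POS while the avinput rows do not.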

update the gnomAD database in ANNOVAR

Recently, gnomAD released version r2.1. Is there a plan to update the gnomAD database in ANNOVAR? Alternatively, how can I build my own gnomAD database for use in ANNOVAR, as can be done for ClinVar?

Creating our own indexes

Hello,
I'm looking into creating our own in-house database of variant VCFs that we can use as an input for ANNOVAR annotation. I was wondering if there is a way to generate indexes for files to improve the speed of annotation.

The only thing I've seen online is this thread from SEQanswers:
http://seqanswers.com/forums/showthread.php?t=23535

Is there any other way or should I give this random perl script a try?

Thanks a ton,
Phil Richmond

LoFTool in table_annovar.pl

I am not sure I have the correctly formatted loftool database. I downloaded the file from the readthedocs website (LoFtool_scores.txt) and renamed it to /opt/annovar/humandb/hg19_loftool.txt but when I run table_annovar.pl with this command:

table_annovar.pl rslist.avinput ${annovar}humandb/ -buildver hg19 -out rslist -remove -protocol dbnsfp33a,exac03,gnomad_exome,clinvar_20170130,loftool -operation f,f,f,f,g -nastring "-"

I get correct outputs for every other operation but loftool:
NOTICE: Processing operation=g protocol=loftool

NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg19 -dbtype loftool -outfile rslist.loftool -exonsort rslist.avinput /opt/annovar/humandb/>
NOTICE: Output files were written to rslist.loftool.variant_function, rslist.loftool.exonic_variant_function
NOTICE: Reading gene annotation from /opt/annovar/humandb/hg19_loftool.txt ... Error: invalid record in /opt/annovar/humandb/hg19_loftool.txt (>=11 fields expected in loftool gene definition file):
Error running system command: <annotate_variation.pl -geneanno -buildver hg19 -dbtype loftool -outfile rslist.loftool -exonsort rslist.avinput /opt/annovar/humandb/>

Do I need to process the LoFtool file before running ANNOVAR? It has only two columns, far fewer than the 11 the error message expects.
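The error message states that the g (gene-based) operation expects a gene definition file with at least 11 fields per record. A quick way to see whether a file can be used that way is to count its tab-separated fields, as in the Python sketch below. The 11-field threshold is taken from the error text above. The two-column sample rows are made up to mimic a gene-plus-score table and are not real LoFtool values.

```python
def field_counts(lines, limit=5):
    """Count tab-separated fields in the first few data lines."""
    counts = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        counts.append(len(line.rstrip("\n").split("\t")))
        if len(counts) >= limit:
            break
    return counts


def looks_like_gene_definition(lines, min_fields=11):
    """True if every sampled record has at least min_fields fields."""
    counts = field_counts(lines)
    return bool(counts) and all(count >= min_fields for count in counts)


# Made-up two-column table (gene symbol, score).
two_column_table = ["GENE_A\t0.10\n", "GENE_B\t0.95\n"]
sample_counts = field_counts(two_column_table)
usable_as_gene_db = looks_like_gene_definition(two_column_table)
print(sample_counts)      # [2, 2]
print(usable_as_gene_db)  # False
```

A per-gene score table fails this check, which is consistent with the error: the file is being read as a gene definition file, and it is not one.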

annotation for plant genome

Can I annotate my plant data using ANNOVAR? I have soybean data. How do I create a database for that genome?

A question about Annovar

Dear Kai,
I am using your ANNOVAR software and have a question:
Which dbNSFP version is used to provide the SIFT scores?
Thank you very much!

Annotate using .gtk file

Hi,

I followed this link in order to annotate from a GFF3 file:

http://annovar.openbioinformatics.org/en/latest/user-guide/gene/#what-about-gff3-file-for-new-species

However, I cannot get ANNOVAR to work with this database. What command exactly should I run to annotate with these files?

AT_refGene.txt
AT_refGeneMrna.fa
input.sorted.vcf.uniqShort

I have tried something like this:
perl annotate_variation.pl -out ex1 -dbtype AT_refGene -vcfinput input.sorted.vcf.uniqShort

How could I Convert this format to avinput

Dear Dr. Wang,
I use BreakDancer to identify SVs, and the output format looks like this:
#Chr1 Pos1 Orientation1 Chr2 Pos2 Orientation2 Type Size Score num_Reads num_Reads_lib danban.bam

  • original_scaffold_197 5227 21+21- original_scaffold_197 5355 21+21- INS -360 33 21 danban.bam|21 NA
  • original_scaffold_197 6370 33+33- original_scaffold_197 6664 33+33- INS -328 43 34 danban.bam|34 NA
  • original_scaffold_197 8436 0+0- original_scaffold_197 9379 16+16- INS -327 38 16 danban.bam|16 5.85

For example, the first line means that a 360 bp insertion was detected by BreakDancer between scaffold_197:5227 and scaffold_197:5355, with 21 supporting read pairs and a confidence score of 33. The software documentation also says that "Real SV breakpoints are expected to reside within the predicted boundaries."

① How can I convert this format to avinput? May I describe this INS in the following format, as in the ANNOVAR README, and will it work in gene-based annotation?

  • original_scaffold_197 5227 5355 0 0 comments:a 360bp insertion between there

② How can I describe an inter-chromosomal translocation between different chromosomes, and an intra-chromosomal translocation?
③ I noticed that BreakDancer can also produce .bed output. I tried to convert the .bed file to .vcf but failed, and ANNOVAR does not mention support for the .bed format, so I cannot use convert2annovar.pl to convert a BED file to avinput, right?

Thanks for your time
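For question ①, the reshaping itself is mechanical. The Python sketch below turns one BreakDancer row into the region-style line proposed above (chromosome, start, end, 0, 0, comment). Whether gene-based annotation accepts such lines is exactly the open question here and is not answered by this sketch. Rows whose two breakpoints lie on different chromosomes are skipped because they cannot be written as a single interval. The function name and comment layout are illustrative.

```python
def breakdancer_to_region(row):
    """Return a tab-separated region line, or None if the row is a
    header, is malformed, or spans two different chromosomes."""
    fields = row.replace("\u2022", " ").split()  # drop list bullets
    if len(fields) < 10 or fields[0].startswith("#"):
        return None
    chr1, pos1, _ori1, chr2, pos2, _ori2, sv_type, size, score, reads = fields[:10]
    if chr1 != chr2:
        return None
    start, end = sorted((int(pos1), int(pos2)))
    comment = "comments:%s,size=%s,score=%s,reads=%s" % (sv_type, size, score, reads)
    return "\t".join([chr1, str(start), str(end), "0", "0", comment])


# First data row from the BreakDancer output shown above.
row = ("original_scaffold_197 5227 21+21- original_scaffold_197 5355 "
       "21+21- INS -360 33 21 danban.bam|21 NA")
region_line = breakdancer_to_region(row)
header_result = breakdancer_to_region(
    "#Chr1 Pos1 Orientation1 Chr2 Pos2 Orientation2 Type Size Score "
    "num_Reads num_Reads_lib danban.bam")
print(region_line)
```

Keeping the SV type, size, score and read count in the comment column preserves the BreakDancer evidence next to each interval.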

Error: Exonic variant "unknown" when set "--splicing_threshold 10"

Hi developer,
I find that some variants in exonic regions are annotated as 'unknown' when I set "--splicing_threshold 10".

Before:
Normal.xlsx

I changed the table_annovar.pl script to add "--splicing_threshold 10" and "-hgvs":

387c387
< $sc = "annotate_variation.pl -geneanno -buildver $buildver --splicing_threshold 10 -hgvs -dbtype $protocol -outfile $tempfile.$protocol -exonsort $queryfile $dbloc";
---
> $sc = "annotate_variation.pl -geneanno -buildver $buildver -dbtype $protocol -outfile $tempfile.$protocol -exonsort $queryfile $dbloc";

After:
Error.xlsx

How can I resolve this problem?

Thanks!
Wufan Hai

issue annotating GATK Haplotype Caller VCFs with table_annovar.pl

Hello,

I'm attempting to annotate a VCF produced by GATK's HaplotypeCaller. The GATK command line recorded in the VCF header is:
##GATKCommandLine=<ID=HaplotypeCaller,CommandLine="HaplotypeCaller --contamination-fraction-to-filter 0.0 --output K000049_1_lane_dupsFlagged_sm_tagged.vcf.gz --intervals /projects/trans_scratch/pedigree_calling/iTARGET_quad/vcfs/cromwell-executions/run_haplotypecaller_on_directory/509c4a23-7811-4d20-abc9-ce5bd7fe6d45/call-HaplotypeCallerGvcf_GATK4/shard-3/haplotypecaller.HaplotypeCallerGvcf_GATK4/3b9dfdc2-7cd5-4212-830c-d89263fe84fa/call-HaplotypeCaller/shard-97/inputs/1764290932/0097-scattered.intervals --input /projects/trans_scratch/pedigree_calling/iTARGET_quad/vcfs/cromwell-executions/run_haplotypecaller_on_directory/509c4a23-7811-4d20-abc9-ce5bd7fe6d45/call-HaplotypeCallerGvcf_GATK4/shard-3/haplotypecaller.HaplotypeCallerGvcf_GATK4/3b9dfdc2-7cd5-4212-830c-d89263fe84fa/call-HaplotypeCaller/shard-97/inputs/-1081452564/K000049_1_lane_dupsFlagged_sm_tagged.bam --reference /projects/trans_scratch/pedigree_calling/iTARGET_quad/vcfs/cromwell-executions/run_haplotypecaller_on_directory/509c4a23-7811-4d20-abc9-ce5bd7fe6d45/call-HaplotypeCallerGvcf_GATK4/shard-3/haplotypecaller.HaplotypeCallerGvcf_GATK4/3b9dfdc2-7cd5-4212-830c-d89263fe84fa/call-HaplotypeCaller/shard-97/inputs/-533456238/GRCh37-lite.fa --emit-ref-confidence NONE --gvcf-gq-bands 1 --gvcf-gq-bands 2 --gvcf-gq-bands 3 --gvcf-gq-bands 4 --gvcf-gq-bands 5 --gvcf-gq-bands 6 --gvcf-gq-bands 7 --gvcf-gq-bands 8 --gvcf-gq-bands 9 --gvcf-gq-bands 10 --gvcf-gq-bands 11 --gvcf-gq-bands 12 --gvcf-gq-bands 13 --gvcf-gq-bands 14 --gvcf-gq-bands 15 --gvcf-gq-bands 16 --gvcf-gq-bands 17 --gvcf-gq-bands 18 --gvcf-gq-bands 19 --gvcf-gq-bands 20 --gvcf-gq-bands 21 --gvcf-gq-bands 22 --gvcf-gq-bands 23 --gvcf-gq-bands 24 --gvcf-gq-bands 25 --gvcf-gq-bands 26 --gvcf-gq-bands 27 --gvcf-gq-bands 28 --gvcf-gq-bands 29 --gvcf-gq-bands 30 --gvcf-gq-bands 31 --gvcf-gq-bands 32 --gvcf-gq-bands 33 --gvcf-gq-bands 34 --gvcf-gq-bands 35 --gvcf-gq-bands 36 --gvcf-gq-bands 37 --gvcf-gq-bands 38 --gvcf-gq-bands 39 --gvcf-gq-bands 40 
--gvcf-gq-bands 41 --gvcf-gq-bands 42 --gvcf-gq-bands 43 --gvcf-gq-bands 44 --gvcf-gq-bands 45 --gvcf-gq-bands 46 --gvcf-gq-bands 47 --gvcf-gq-bands 48 --gvcf-gq-bands 49 --gvcf-gq-bands 50 --gvcf-gq-bands 51 --gvcf-gq-bands 52 --gvcf-gq-bands 53 --gvcf-gq-bands 54 --gvcf-gq-bands 55 --gvcf-gq-bands 56 --gvcf-gq-bands 57 --gvcf-gq-bands 58 --gvcf-gq-bands 59 --gvcf-gq-bands 60 --gvcf-gq-bands 70 --gvcf-gq-bands 80 --gvcf-gq-bands 90 --gvcf-gq-bands 99 --indel-size-to-eliminate-in-ref-model 10 --use-alleles-trigger false --disable-optimizations false --just-determine-active-regions false --dont-genotype false --max-mnp-distance 0 --dont-trim-active-regions false --max-disc-ar-extension 25 --max-gga-ar-extension 300 --padding-around-indels 150 --padding-around-snps 20 --kmer-size 10 --kmer-size 25 --dont-increase-kmer-sizes-for-cycles false --allow-non-unique-kmers-in-ref false --num-pruning-samples 1 --recover-dangling-heads false --do-not-recover-dangling-branches false --min-dangling-branch-length 4 --consensus false --max-num-haplotypes-in-population 128 --error-correct-kmers false --min-pruning 2 --debug-graph-transformations false --kmer-length-for-read-error-correction 25 --min-observations-for-kmer-to-be-solid 20 --likelihood-calculation-engine PairHMM --base-quality-score-threshold 18 --pair-hmm-gap-continuation-penalty 10 --pair-hmm-implementation FASTEST_AVAILABLE --pcr-indel-model CONSERVATIVE --phred-scaled-global-read-mismapping-rate 45 --native-pair-hmm-threads 4 --native-pair-hmm-use-double-precision false --debug false --use-filtered-reads-for-annotations false --bam-writer-type CALLED_HAPLOTYPES --dont-use-soft-clipped-bases false --capture-assembly-failure-bam false --error-correct-reads false --do-not-run-physical-phasing false --min-base-quality-score 10 --smith-waterman JAVA --use-new-qual-calculator false --annotate-with-num-discovered-alleles false --heterozygosity 0.001 --indel-heterozygosity 1.25E-4 --heterozygosity-stdev 0.01 
--standard-min-confidence-threshold-for-calling 10.0 --max-alternate-alleles 6 --max-genotype-count 1024 --sample-ploidy 2 --num-reference-samples-if-no-call 0 --genotyping-mode DISCOVERY --genotype-filtered-alleles false --output-mode EMIT_VARIANTS_ONLY --all-site-pls false --min-assembly-region-size 50 --max-assembly-region-size 300 --assembly-region-padding 100 --max-reads-per-alignment-start 50 --active-probability-threshold 0.002 --max-prob-propagation-distance 50 --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false --minimum-mapping-quality 20 --disable-tool-default-annotations false --enable-all-annotations false",Version=4.0.10.0,Date="March 5, 2019 4:08:02 PM PST">

Some of the resulting VCF records lose their chromosome column when run through ANNOVAR. As an example:
Before annovar
1 9407759 . AC . 86.73 . AN=2;DP=34;MQ=60.0 GT:AD:DP 0/0:34:34
After annovar
9407759 . AC . 86.73 . AN=2;DP=34;MQ=60.00 GT:AD:DP;ANNOVAR_DATE=2018-04-16;cosmic70=.;Func.refGene=intronic;Gene.refGene=SPSB1;GeneDetail.refGene=.;ExonicFunc.refGene=.;AAChange.refGene=.;esp6500siv2_all=.;1000g2015aug_all=.;avsnp147=.;SIFT_score=.;SIFT_converted_rankscore=.;SIFT_pred=.;Polyphen2_HDIV_score=.;Polyphen2_HDIV_rankscore=.;Polyphen2_HDIV_pred=.;Polyphen2_HVAR_score=.;Polyphen2_HVAR_rankscore=.;Polyphen2_HVAR_pred=.;LRT_score=.;LRT_converted_rankscore=.;LRT_pred=.;MutationTaster_score=.;MutationTaster_converted_rankscore=.;MutationTaster_pred=.;MutationAssessor_score=.;MutationAssessor_score_rankscore=.;MutationAssessor_pred=.;FATHMM_score=.;FATHMM_converted_rankscore=.;FATHMM_pred=.;PROVEAN_score=.;PROVEAN_converted_rankscore=.;PROVEAN_pred=.;VEST3_score=.;VEST3_rankscore=.;MetaSVM_score=.;MetaSVM_rankscore=.;MetaSVM_pred=.;MetaLR_score=.;MetaLR_rankscore=.;MetaLR_pred=.;M-CAP_score=.;M-CAP_rankscore=.;M-CAP_pred=.;CADD_raw=.;CADD_raw_rankscore=.;CADD_phred=.;DANN_score=.;DANN_rankscore=.;fathmm-MKL_coding_score=.;fathmm-MKL_coding_rankscore=.;fathmm-MKL_coding_pred=.;Eigen_coding_or_noncoding=.;Eigen-raw=.;Eigen-PC-raw=.;GenoCanyon_score=.;GenoCanyon_score_rankscore=.;integrated_fitCons_score=.;integrated_fitCons_score_rankscore=.;integrated_confidence_value=.;GERP++_RS=.;GERP++_RS_rankscore=.;phyloP100way_vertebrate=.;phyloP100way_vertebrate_rankscore=.;phyloP20way_mammalian=.;phyloP20way_mammalian_rankscore=.;phastCons100way_vertebrate=.;phastCons100way_vertebrate_rankscore=.;phastCons20way_mammalian=.;phastCons20way_mammalian_rankscore=.;SiPhy_29way_logOdds=.;SiPhy_29way_logOdds_rankscore=.;Interpro_domain=.;GTEx_V6_gene=.;GTEx_V6_tissue=.;CLINSIG=.;CLNDBN=.;CLNACC=.;CLNDSDB=.;CLNDSDBID=.;ExAC_ALL=.;ExAC_AFR=.;ExAC_AMR=.;ExAC_EAS=.;ExAC_FIN=.;ExAC_NFE=.;ExAC_OTH=.;ExAC_SAS=.;dbscSNV_ADA_SCORE=.;dbscSNV_RF_SCORE=.;Interpro_domain=.;rmsk=.;Func.ensGene=intronic;Gene.ensGene=ENSG00000171621;GeneDetail.ensGene=.;ExonicFunc.ensGene=.;AAChange.ensGene=.;
Func.knownGene=intronic;Gene.knownGene=SPSB1;GeneDetail.knownGene=.;ExonicFunc.knownGene=.;AAChange.knownGene=.;ALLELE_END 0/0:34:34

The annovar command used to generate this file is as follows
perl /projects/tvira_prj/tools/annovar/table_annovar.pl error_testing_vcf.vcf /projects/tvira_prj/tools/annovar/humandb/ -buildver hg19 -vcfinput -out /projects/trans_scratch/pedigree_calling/iTARGET_quad/pedcall/K000049/K000049_error_records.vcf_Annovar -remove -protocol cosmic70,refGene,esp6500siv2_all,1000g2015aug_all,avsnp147,dbnsfp33a,clinvar_20170905,exac03,dbscsnv11,dbnsfp31a_interpro,rmsk,ensGene,knownGene -operation f,g,f,f,f,f,f,f,f,f,f,g,g

I've attached a vcf which when annotated with the above command reproduces the error. Please advise if there is an option I am missing to handle the vcf records formatted by gatk.

K000049_error_records.txt
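The example record has "." in the ALT column, meaning no alternate allele is reported at that site. Whether that is what triggers the missing chromosome is not established in this report. To test the idea, the Python sketch below lists such records so they can be inspected or annotated separately. The function name is illustrative, and the second demo record is made up for contrast.

```python
def records_without_alt(vcf_lines):
    """Return (line number, CHROM, POS, REF) for records whose ALT is '.'."""
    hits = []
    for number, line in enumerate(vcf_lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 5 and cols[4] == ".":
            hits.append((number, cols[0], cols[1], cols[3]))
    return hits


demo_records = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
    # Record from the "Before annovar" example above.
    "1\t9407759\t.\tAC\t.\t86.73\t.\tAN=2;DP=34;MQ=60.0\tGT:AD:DP\t0/0:34:34",
    # Made-up record with a real ALT allele, for contrast.
    "1\t9407800\t.\tA\tG\t50.00\t.\tAN=2;DP=30\tGT:AD:DP\t0/1:15,15:30",
]
no_alt = records_without_alt(demo_records)
print(no_alt)  # [(2, '1', '9407759', 'AC')]
```

If every affected record in the attached file shows up in this list, that would support the hypothesis; if not, the cause lies elsewhere.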

Gene.refGene shows NM_ number of the upstream gene?

It seems ANNOVAR uses the previous Gene.refGene value to annotate a variant in the next gene. For example, NM_152486 belongs to SAMD11, but in the third record it still appears in the Gene.refGene field, where NM_015658 should be used (the gene is now NOC2L).

http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&highlight=hg19.chr1:881627-881627&position=chr1:881577-881677

1 877831 . T C 58.74 PASS DP=3;MQ=60.00;FractionInformativeReads=1.000;ANNOVAR_DATE=2016-02-01;Func.refGene=exonic;Gene.refGene=NM_152486;GeneDetail.refGene=.;ExonicFunc.refGene=nonsynonymous_SNV;AAChange.refGene=SAMD11:NM_152486:exo
n10:c.1027T>C:p.W343R
;cytoBand=1p36.33;genomicSuperDups=.;esp6500siv2_all=.;1000g2015aug_all=1;SIFT_score=1;SIFT_pred=T;Polyphen2_HDIV_score=0.0;Polyphen2_HDIV_pred=B;Polyphen2_HVAR_score=0.0;Polyphen2_HVAR_pred=B;LRT_score=0.003;LRT_pred=N;MutationTaster_score=1
.000;MutationTaster_pred=P;MutationAssessor_score=-2.085;MutationAssessor_pred=N;FATHMM_score=.;FATHMM_pred=.;RadialSVM_score=-0.980;RadialSVM_pred=T;LR_score=0.000;LR_pred=T;VEST3_score=0.421;CADD_raw=-1.112;CADD_phred=0.132;GERP++_RS=2.51;phyloP46way_placental=
0.624;phyloP100way_vertebrate=1.209;SiPhy_29way_logOdds=7.519;ExAC_ALL=0.9999;ExAC_AFR=1;ExAC_AMR=1;ExAC_EAS=1;ExAC_FIN=1;ExAC_NFE=1;ExAC_OTH=1;ExAC_SAS=0.9999;avsnp147=rs6672356;CLINSIG=.;CLNDBN=.;CLNACC=.;CLNDSDB=.;CLNDSDBID=.;gnomAD_genome_ALL=0.9998;gnomAD_ge
nome_AFR=0.9994;gnomAD_genome_AMR=1;gnomAD_genome_ASJ=1;gnomAD_genome_EAS=1;gnomAD_genome_FIN=1;gnomAD_genome_NFE=0.9999;gnomAD_genome_OTH=1;gnomAD_exome_ALL=0.9999;gnomAD_exome_AFR=0.9994;gnomAD_exome_AMR=0.9999;gnomAD_exome_ASJ=1;gnomAD_exome_EAS=1;gnomAD_exome
FIN=1;gnomAD_exome_NFE=0.9999;gnomAD_exome_OTH=1;gnomAD_exome_SAS=0.9997;ALLELE_END;ANN=C|missense_variant|MODERATE|SAMD11|SAMD11|transcript|NM_152486.2|protein_coding|10/14|c.1027T>C|p.Trp343Arg|1107/2554|1027/2046|343/681||,C|downstream_gene_variant|MODIFIER|N
OC2L|NOC2L|transcript|NM_015658.3|protein_coding||c.*2243A>G|||||1752|;CSQ=C|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000420190|protein_coding|||||||||||1|3160|1|HGNC|28706||||||,C|downstream_gene_variant|MODIFIER|NOC2L|ENSG0000018
8976|Transcript|ENST00000496938|processed_transcript|||||||||||1|2868|-1|HGNC|24517||||||0.887,C|missense_variant|MODERATE|SAMD11|ENSG00000187634|Transcript|ENST00000342066|protein_coding|10/14||ENST00000342066.3:c.1027T>C|ENSP00000342313.3:p.Trp343Arg|1110|1027|
343|W/R|Tgg/Cgg||1||1|HGNC|28706|YES||CCDS2.2|NM_152486.2||,C|downstream_gene_variant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000327044|protein_coding|||||||||||1|1753|-1|HGNC|24517|YES||CCDS3.1|NM_015658.3||0.887,C|non_coding_transcript_exon_variant&non

coding_transcript_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000464948|retained_intron|1/2||ENST00000464948.1:n.286T>C||286||||||1||1|HGNC|28706||||||,C|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|SAMD11|ENSG00000187634
|Transcript|ENST00000466827|retained_intron|2/2||ENST00000466827.1:n.191T>C||191||||||1||1|HGNC|28706||||||,C|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000474461|retained_intron|3/4||ENST0000
0474461.1:n.389T>C||389||||||1||1|HGNC|28706||||||,C|missense_variant|MODERATE|SAMD11|ENSG00000187634|Transcript|ENST00000455979|protein_coding|4/7||ENST00000455979.1:c.507T>C|ENSP00000412228.1:p.Trp170Arg|507|508|170|W/R|Tgg/Cgg||1||1|HGNC|28706||||||,C|downstre
am_gene_variant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000483767|retained_intron|||||||||||1|1753|-1|HGNC|24517||||||0.887,C|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000478729|processed_transcript|||||||||||1|278|1|HGNC|28
706||||||,C|downstream_gene_variant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000477976|retained_intron|||||||||||1|1754|-1|HGNC|24517||||||0.887,C|missense_variant|MODERATE|SAMD11|ENSG00000187634|Transcript|ENST00000341065|protein_coding|8/12||ENST00000341
065.4:c.750T>C|ENSP00000349216.4:p.Trp251Arg|750|751|251|W/R|Tgg/Cgg||1||1|HGNC|28706|||||| GT:AD:DP:GQ:PL:SB 1/1:0,3,0:3:9:91,9,0,91,9,91:0,0,2,1
1 880238 . A G 1053.77 PASS DP=52;MQ=60.00;MQRankSum=0.839;ReadPosRankSum=1.379;FractionInformativeReads=0.981;ANNOVAR_DATE=2016-02-01;Func.refGene=intronic\x3bdownstream;Gene.refGene=NM_152486;GeneDetail.refGene=.;ExonicFunc.refGene=.
;AAChange.refGene=.;cytoBand=1p36.33;genomicSuperDups=.;esp6500siv2_all=.;1000g2015aug_all=0.920927;SIFT_score=.;SIFT_pred=.;Polyphen2_HDIV_score=.;Polyphen2_HDIV_pred=.;Polyphen2_HVAR_score=.;Polyphen2_HVAR_pred=.;LRT_score=.;LRT_pred=.;MutationTaster_score=.;Mu
tationTaster_pred=.;MutationAssessor_score=.;MutationAssessor_pred=.;FATHMM_score=.;FATHMM_pred=.;RadialSVM_score=.;RadialSVM_pred=.;LR_score=.;LR_pred=.;VEST3_score=.;CADD_raw=.;CADD_phred=.;GERP++_RS=.;phyloP46way_placental=.;phyloP100way_vertebrate=.;SiPhy_29w
ay_logOdds=.;ExAC_ALL=.;ExAC_AFR=.;ExAC_AMR=.;ExAC_EAS=.;ExAC_FIN=.;ExAC_NFE=.;ExAC_OTH=.;ExAC_SAS=.;avsnp147=rs3748592;CLINSIG=.;CLNDBN=.;CLNACC=.;CLNDSDB=.;CLNDSDBID=.;gnomAD_genome_ALL=0.9351;gnomAD_genome_AFR=0.9087;gnomAD_genome_AMR=0.9487;gnomAD_genome_ASJ=
0.9238;gnomAD_genome_EAS=0.9066;gnomAD_genome_FIN=0.9614;gnomAD_genome_NFE=0.9460;gnomAD_genome_OTH=0.9490;gnomAD_exome_ALL=.;gnomAD_exome_AFR=.;gnomAD_exome_AMR=.;gnomAD_exome_ASJ=.;gnomAD_exome_EAS=.;gnomAD_exome_FIN=.;gnomAD_exome_NFE=.;gnomAD_exome_OTH=.;gnom
AD_exome_SAS=.;ALLELE_END;ANN=G|downstream_gene_variant|MODIFIER|SAMD11|SAMD11|transcript|NM_152486.2|protein_coding||c.*705A>G|||||277|,G|intron_variant|MODIFIER|NOC2L|NOC2L|transcript|NM_015658.3|protein_coding|18/18|c.2144-58T>C||||||;CSQ=G|downstream_gene_var
iant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000496938|processed_transcript|||||||||||1|461|-1|HGNC|24517||||||0.887,G|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000342066|protein_coding|||||||||||1|283|1|HGNC|28706|YES||CCDS
2.2|NM_152486.2||,G|intron_variant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000327044|protein_coding||18/18|ENST00000327044.6:c.2144-58T>C||||||||1||-1|HGNC|24517|YES||CCDS3.1|NM_015658.3||0.887,G|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Tra
nscript|ENST00000464948|retained_intron|||||||||||1|1966|1|HGNC|28706||||||,G|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000466827|retained_intron|||||||||||1|2056|1|HGNC|28706||||||,G|downstream_gene_variant|MODIFIER|SAMD11|ENSG0000
0187634|Transcript|ENST00000474461|retained_intron|||||||||||1|1864|1|HGNC|28706||||||,G|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000455979|protein_coding|||||||||||1|599|1|HGNC|28706||||||,G|intron_variant&non_coding_transcript_va
riant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000483767|retained_intron||4/4|ENST00000483767.1:n.1015-58T>C||||||||1||-1|HGNC|24517||||||0.887,G|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000478729|processed_transcript|||||||
||||1|2685|1|HGNC|28706||||||,G|intron_variant&non_coding_transcript_variant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000477976|retained_intron||16/16|ENST00000477976.1:n.3606-58T>C||||||||1||-1|HGNC|24517||||||0.887,G|downstream_gene_variant|MODIFIER|SAMD
11|ENSG00000187634|Transcript|ENST00000341065|protein_coding|||||||||||1|283|1|HGNC|28706|||||| GT:AD:DP:GQ:PL:SB 0/1:20,31,0:51:99:1082,0,546,1781,639,1781:8,12,14,17
1 881627 . G A 332.77 PASS DP=25;MQ=60.00;MQRankSum=-0.739;ReadPosRankSum=-0.192;FractionInformativeReads=1.000;ANNOVAR_DATE=2016-02-01;Func.refGene=exonic\x3bdownstream;Gene.refGene=NM_152486;GeneDetail.refGene=.;ExonicFunc.refGene=s
ynonymous_SNV;AAChange.refGene=NOC2L:NM_015658:exon16:c.1843C>T:p.L615L;cytoBand=1p36.33;genomicSuperDups=.;esp6500siv2_all=0.4748;1000g2015aug_all=0.441893;SIFT_score=.;SIFT_pred=.;Polyphen2_HDIV_score=.;Polyphen2_HDIV_pred=.;Polyphen2_HVAR_score=.;Polyphen2_HVA
R_pred=.;LRT_score=.;LRT_pred=.;MutationTaster_score=.;MutationTaster_pred=.;MutationAssessor_score=.;MutationAssessor_pred=.;FATHMM_score=.;FATHMM_pred=.;RadialSVM_score=.;RadialSVM_pred=.;LR_score=.;LR_pred=.;VEST3_score=.;CADD_raw=.;CADD_phred=.;GERP++_RS=.;ph
yloP46way_placental=.;phyloP100way_vertebrate=.;SiPhy_29way_logOdds=.;ExAC_ALL=0.5653;ExAC_AFR=0.1397;ExAC_AMR=0.4840;ExAC_EAS=0.6560;ExAC_FIN=0.6221;ExAC_NFE=0.6283;ExAC_OTH=0.5737;ExAC_SAS=0.5648;avsnp147=rs2272757;CLINSIG=.;CLNDBN=.;CLNACC=.;CLNDSDB=.;CLNDSDBI
D=.;gnomAD_genome_ALL=0.4889;gnomAD_genome_AFR=0.1429;gnomAD_genome_AMR=0.4844;gnomAD_genome_ASJ=0.5265;gnomAD_genome_EAS=0.6751;gnomAD_genome_FIN=0.6175;gnomAD_genome_NFE=0.6336;gnomAD_genome_OTH=0.5855;gnomAD_exome_ALL=0.5703;gnomAD_exome_AFR=0.1305;gnomAD_exom
e_AMR=0.4852;gnomAD_exome_ASJ=0.5441;gnomAD_exome_EAS=0.6577;gnomAD_exome_FIN=0.6262;gnomAD_exome_NFE=0.6355;gnomAD_exome_OTH=0.5652;gnomAD_exome_SAS=0.5650;ALLELE_END;ANN=A|synonymous_variant|LOW|NOC2L|NOC2L|transcript|NM_015658.3|protein_coding|16/19|c.1843C>T|
p.Leu615Leu|1902/2800|1843/2250|615/749||,A|downstream_gene_variant|MODIFIER|SAMD11|SAMD11|transcript|NM_152486.2|protein_coding||c.*2094G>A|||||1666|;CSQ=A|upstream_gene_variant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000496938|processed_transcript||||||
|||||1|685|-1|HGNC|24517||||||0.887,A|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000342066|protein_coding|||||||||||1|1672|1|HGNC|28706|YES||CCDS2.2|NM_152486.2||,A|synonymous_variant|LOW|NOC2L|ENSG00000188976|Transcript|ENST00000327
044|protein_coding|16/19||ENST00000327044.6:c.1843C>T|ENST00000327044.6:c.1843C>T(p.%3D)|1893|1843|615|L|Ctg/Ttg||1||-1|HGNC|24517|YES||CCDS3.1|NM_015658.3||0.887,A|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000464948|retained_intron
|||||||||||1|3355|1|HGNC|28706||||||,A|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000466827|retained_intron|||||||||||1|3445|1|HGNC|28706||||||,A|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000474461|reta
ined_intron|||||||||||1|3253|1|HGNC|28706||||||,A|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000455979|protein_coding|||||||||||1|1988|1|HGNC|28706||||||,A|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|NOC
2L|ENSG00000188976|Transcript|ENST00000483767|retained_intron|2/5||ENST00000483767.1:n.699C>T||699||||||1||-1|HGNC|24517||||||0.887,A|downstream_gene_variant|MODIFIER|SAMD11|ENSG00000187634|Transcript|ENST00000478729|processed_transcript|||||||||||1|4074|1|HGNC|2
8706||||||,A|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|NOC2L|ENSG00000188976|Transcript|ENST00000477976|retained_intron|14/17||ENST00000477976.1:n.3290C>T||3290||||||1||-1|HGNC|24517||||||0.887,A|downstream_gene_variant|MODIFIER|SA
MD11|ENSG00000187634|Transcript|ENST00000341065|protein_coding|||||||||||1|1672|1|HGNC|28706|||||| GT:AD:DP:GQ:PL:SB 0/1:11,14,0:25:99:361,0,319,754,360,754:9,2,10,4

Creating an avinput from avsnp file error

I am having an issue with convert2annovar.pl from an rsID list. Using the following:

convert2annovar.pl -format rsid rslist.txt -dbsnpfile ${annovar}humandb/hg19_avsnp147.txt > rslist.avinput

I get:
NOTICE: Scanning dbSNP file /opt/annovar/humandb/hg19_avsnp147.txt...
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 1.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 1.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 1.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 1.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 2.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 2.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 2.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 2.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 3.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 3.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 3.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 3.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 4.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 4.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 4.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 4.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 5.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 5.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 5.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 5.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 6.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 6.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 6.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 6.
Use of uninitialized value $class in string eq at /opt/annovar/convert2annovar.pl line 1022, line 7.

And so forth for all lines (I stopped checking after more than a million lines). Input file looks like:
rs150829393
rs112039851
rs1800562
rs200401432
rs118161496
rs1799945

It does the same with or without the rs prefix. Is there a different dbSNP file to use?
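While the conversion question is open, one possible workaround is to pull the matching rows straight out of the avsnp table. The sketch below assumes the table is tab-delimited with the columns chr, start, end, ref, alt, rsID (as in avsnp147 rows such as `7 117120149 117120149 A G rs397508328`). `lookup_rsids` is a hypothetical helper, not part of ANNOVAR, and the in-memory list stands in for the real file.

```python
from typing import Iterable, Iterator, Set


def lookup_rsids(db_lines: Iterable[str], wanted: Set[str]) -> Iterator[str]:
    """Yield avinput-style rows (chr, start, end, ref, alt, rsID) whose rsID is wanted."""
    for line in db_lines:
        fields = line.rstrip("\n").split("\t")
        # Assumed layout: chr, start, end, ref, alt, rsID
        if len(fields) >= 6 and fields[5] in wanted:
            yield "\t".join(fields[:6])


# Tiny in-memory stand-in for hg19_avsnp147.txt (illustration only)
demo_db = [
    "7\t117120149\t117120149\tA\tG\trs397508328\n",
    "7\t117120159\t117120159\tC\tA\trs397508173\n",
    "7\t117120159\t117120159\tC\tT\trs397508173\n",
]
hits = list(lookup_rsids(demo_db, {"rs397508173"}))
print("\n".join(hits))
```

For the real table, the same function could be fed an open file handle instead of `demo_db`; an rsID with several alternative alleles yields one row per allele.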

Unable to download regsnpintron

Hi,
I am unable to download the regsnpintron database. The command & the corresponding error are shown below. I was able to download other databases without any problems. Is the regsnpintron database still available?
Thanks,
Mike

./annotate_variation.pl -downdb -buildver hg19 -webfrom annovar regsnpintron humandb/
NOTICE: Web-based checking to see whether ANNOVAR new version is available ... Done
NOTICE: Downloading annotation database http://www.openbioinformatics.org/annovar/download/hg19_regsnpintron.txt.gz ... Failed
NOTICE: Downloading annotation database http://www.openbioinformatics.org/annovar/download/hg19_regsnpintron.txt.idx.gz ... Failed
WARNING: Some files cannot be downloaded, including http://www.openbioinformatics.org/annovar/download/hg19_regsnpintron.txt.idx.gz, http://www.openbioinformatics.org/annovar/download/hg19_regsnpintron.txt.gz

Annotation for transposon insertion

Hi,

I have a vcf file from transposon detection software Mobster that looks like this:

CHROM POS ID REF ALT QUAL FILTER
chr11 34288 . . INS:ME:ALU . PASS
chr11 43445 . . INS:ME:L1 . PASS
chr11 67645 . . INS:ME:SVA . PASS

I want to annotate these transposon insertion points and have used ANNOVAR with hg19 refGene. However, all of these insertion points are being treated as invalid input. Is it because there are no ref and alt bases? Is it possible to annotate them with ANNOVAR?

Thanks,

AA change conflicts with cdna change

7 55242464 . AGGAATTAAGAGAAGC A . . DP=1185;ECNT=1;POP_AF=4.06e-06;TLOD=45.87;ANNOVAR_DATE=2018-04-16;Func.refGene=exonic;Gene.refGene=EGFR;GeneDetail.refGene=.;ExonicFunc.refGene=nonframeshift_deletion;AAChange.refGene=EGFR:NM_005228:exon19:c.2235_2249del:p.745_750del;ALLELE_END

I got a nonframeshift_deletion: 15 bp in the cDNA but 6 aa in the protein.

It should be p.746_750del.
Is there anything wrong?
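The position arithmetic behind the two notations can be checked directly. This is a sketch that assumes ordinary 1-based coding positions with codon 1 covering c.1 to c.3. It does not look at the EGFR sequence, so it cannot say which residue numbering is the right one to report.

```python
def codon_number(cdna_pos: int) -> int:
    """1-based codon index for a 1-based coding-sequence position."""
    return (cdna_pos - 1) // 3 + 1


def base_in_codon(cdna_pos: int) -> int:
    """Position (1, 2 or 3) of the base inside its codon."""
    return (cdna_pos - 1) % 3 + 1


start, end = 2235, 2249              # c.2235_2249del from the record above
deleted_bases = end - start + 1      # 15
first_codon = codon_number(start)    # 745
last_codon = codon_number(end)       # 750
codons_touched = last_codon - first_codon + 1
residues_lost = deleted_bases // 3

print(deleted_bases, first_codon, base_in_codon(start),
      last_codon, base_in_codon(end), codons_touched, residues_lost)
```

The deletion removes 15 bases, which is five residues' worth, but it starts on the third base of codon 745 and ends on the second base of codon 750, so it touches six codons. That may be why a 745_750 range is printed where 746_750 is expected.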

wANNOVAR

Hi, could you give me an indication of when the wANNOVAR server is going to be up again? Will it still be accessible from http://wannovar.wglab.org/?
Thank you for your assistance.
Elena

A question about ClinVar 20180603 missing one significant position

Honored Gentlemen, I found that a significant position (chr9:136501794) does not appear in the 20180603 version of ClinVar, although it is still present in the 20170905 version. The ClinVar website has an entry for this position and shows it as a pathogenic dbSNP variant, so I'm confused!

bug in wannovar

Using the link for the anemia example (http://wannovar.usc.edu/done/1/THtp3lWHEh893pWw/index.html):

Click exome summary results/view. The results consist of 173 pages with 100 variants each.

Filtering by exonic function / stopgain SNV does reduce the list of variants, but according to the ExonicFunc column the filtered list still includes several kinds of SNVs, not only stopgain variants.

-xref annotation issues with new version

Hello,

I am trying to run table_annovar.pl with the new xref and polish options and am running into a problem. Update: it appears to be a polish problem; without that switch it runs without issue.

table_annovar.pl avinput.temp /ghi/butlerr/opt/annovar/humandb/ -buildver hg19 -out rslist -remove -protocol refGene,avsnp147,dbnsfp33a,exac03,gnomad_genome,intervar_20170202 -operation gx,f,f,f,f,f -nastring "-" -polish -xref /ghi/butlerr/opt/annovar/example/gene_fullxref.txt
-----------------------------------------------------------------
NOTICE: Processing operation=gx protocol=refGene

NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg19 -dbtype refGene -outfile rslist.refGene -exonsort avinput.temp /ghi/butlerr/opt/annovar/humandb/>
NOTICE: Output files were written to rslist.refGene.variant_function, rslist.refGene.exonic_variant_function
NOTICE: Reading gene annotation from /ghi/butlerr/opt/annovar/humandb/hg19_refGene.txt ... Done with 63481 transcripts (including 15216 without coding sequence annotation) for 27720 unique genes
NOTICE: Processing next batch with 377 unique variants in 377 input lines
NOTICE: Reading FASTA sequences from /ghi/butlerr/opt/annovar/humandb/hg19_refGeneMrna.fa ... Done with 22 sequences
WARNING: A total of 405 sequences will be ignored due to lack of correct ORF annotation

NOTICE: Running with system command <coding_change.pl rslist.refGene.exonic_variant_function.orig /ghi/butlerr/opt/annovar/humandb//hg19_refGene.txt /ghi/butlerr/opt/annovar/humandb//hg19_refGeneMrna.fa -alltranscript -out rslist.refGene.fa -newevf rslist.refGene.exonic_variant_function>
Error: invalid record found in exonic_variant_function file (exonic format error): <line2       frameshift substitution CFTR:NM_000492:exon1:c.-13_10G  7       117120135 117120158       GCGCCCGAGAGACCATGCAGAGGT        G       rs397508136> at /ghi/butlerr/opt/annovar/coding_change.pl line 51, <EVF> line 2.
Error running system command: <coding_change.pl rslist.refGene.exonic_variant_function.orig /ghi/butlerr/opt/annovar/humandb//hg19_refGene.txt /ghi/butlerr/opt/annovar/humandb//hg19_refGeneMrna.fa -alltranscript -out rslist.refGene.fa -newevf rslist.refGene.exonic_variant_function>

It can run with other avinput files, just not this one (the second line seems to be the issue). The body of the file was generated from avsnp147 lines (below):

3 15676984 15676990 GCGGCTG TCC rs80338684
7 117120135 117120158 GCGCCCGAGAGACCATGCAGAGGT G rs397508136
7 117120136 117120158 CGCCCGAGAGACCATGCAGAGGT - rs397508136
7 117120149 117120149 A G rs397508328
7 117120159 117120159 C A rs397508173
7 117120159 117120159 C T rs397508173
7 117120191 117120192 CT C rs397508742
7 117120192 117120192 T - rs397508742
7 117120202 117120202 G T rs397508746
7 117144332 117144332 G A rs397508796
7 117144332 117144332 G C rs397508796
7 117144332 117144332 G T rs397508796
7 117144368 117144368 C T rs397508168
7 117144390 117144390 C A rs151020603
7 117144390 117144390 C T rs151020603
7 117144418 117144418 G A rs397508243
7 117144418 117144418 G C rs397508243
7 117144418 117144418 G T rs397508243
7 117149087 117149087 G A rs397508249
7 117149089 117149089 G A rs397508256
7 117149093 117149093 G A rs397508279
7 117149094 117149094 G A rs121909025
7 117149097 117149097 - A rs397508294
7 117149097 117149097 T TA rs397508294
7 117149101 117149101 G A rs77284892
7 117149101 117149101 G T rs77284892
7 117149123 117149123 C T rs368505753
7 117149146 117149146 C T rs121908749
7 117149150 117149150 G GT rs397508360
7 117149150 117149150 - T rs397508360

HGVS annotation for intronic variants

Hi,

First of all, thank you for maintaining this great piece of software! I am trying to annotate some variants using this command:

perl ./annovar/table_annovar.pl --thread 30 final.vcf ./annovar/humandb/ -buildver hg19 -out final_annotation -remove -protocol refGene.mit,clinvar_20170905.mit,exac03,ALL.sites.2015_0_mod8,esp6500siv2_all,avsnp150.mit,gnomad_exome,gnomad_genome -operation g,f,f,f,f,f,f,f -nastring . -vcfinput

However, intronic variants are not annotated using the HGVS standard. How might I obtain these HGVS annotations?

Thanks.

Input file renaming

I haven't noticed this before and I'm trying to figure out what is happening. A recent file that I'm processing with ANNOVAR is being renamed (filename is truncated) during processing, resulting in an unexpected output file name.

The command being run:

time perl /media/joannaprzybyl/4/software/annovar/table_annovar.pl \
    -vcfinput NPC.HK.12PY0019T-DNA.12PY0019-ensemble.temp3.vcf \
    /media/joannaprzybyl/4/software/annovar/humandb/ \
    -buildver hg19 \
    -out NPC.HK.12PY0019T-DNA.12PY0019-ensemble.temp3.vcf \
    -protocol refGene,ensGene,clinvar_20150330,popfreq_all_20150413,cosmic70,snp129,snp132,snp138,avsift \
    -operation g,g,f,f,f,f,f,f,f \
    -otherinfo \
    &>annovar.log

This results in NPC.HK.12PY0019T-DNA.12PY0019.avinput and consequently NPC.HK.12PY0019T-DNA.12PY0019.hg19_multianno.vcf. Any idea why the -ensemble.temp3 part is removed?

Annotated VCF output botched for ChrM and ChrY

Hi,

I've been using ANNOVAR with great results for a long time now; thank you for the great work. I recently decided to switch from the *multianno.txt output format to the more cross-compatible *multianno.vcf. However, I've run into a formatting issue in the output for lines beginning with "chrM" and "chrY".

See example below:

Input line:
chrM 73 . G A 185.36 PASS AB=0;ABP=0;AC=1;AF=1;AN=1;AO=7;CIGAR=1X;DP=7;DPB=7;DPRA=0;EFF=INTERGENIC(MODIFIER||||||||||A);EPP=3.32051;EPPR=0;FS=0;GC=57;GTI=0;HRun=0;HaplotypeScore=0;LEN=1;MEANALT=1;MQ=60;MQ0=0;MQM=60;MQMR=0;NS=1;NUMALT=1;ODDS=42.6808;PAIRED=1;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=229;QD=26.48;QR=0;RO=0;RPL=6;RPP=10.7656;RPPR=0;RPR=1;RUN=1;SAF=4;SAP=3.32051;SAR=3;SRF=0;SRP=0;SRR=0;TYPE=snp;technology.illumina=1 GT:AO:DP:GQ:PL:QA:QR:RO 1:7:7:99:209,0:229:0:0

Incorrectly formatted output *multianno.vcf line:
185.36 PASS AB=0;ABP=0;AC=1;AF=1;AN=1;AO=7;CIGAR=1X;DP=7;DPB=7;DPRA=0;EFF=INTERGENIC(MODIFIER||||||||||A);EPP=3.32051;EPPR=0;FS=0;GC=57;GTI=0;HRun=0;HaplotypeScore=0;LEN=1;MEANALT=1;MQ=60;MQ0=0;MQM=60;MQMR=0;NS=1;NUMALT=1;ODDS=42.6808;PAIRED=1;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=229;QD=26.48;QR=0;RO=0;RPL=6;RPP=10.7656;RPPR=0;RPR=1;RUN=1;SAF=4;SAP=3.32051;SAR=3;SRF=0;SRP=0;SRR=0;TYPE=snp;technology.illumina=1 GT:AO:DP:GQ:PL:QA:QR:RO 1:7:7:99:209,0:229:0:0 ;ANNOVAR_DATE=2014-07-22;Func.refGene=;Gene.refGene=;GeneDetail.refGene=;ExonicFunc.refGene=;AAChange.refGene=;Func.ensGene=;Gene.ensGene=;GeneDetail.ensGene=;ExonicFunc.ensGene=;AAChange.ensGene=;clinvar_20150330=;PopFreqMax=;1000G_ALL=;1000G_AFR=;1000G_AMR=;1000G_EAS=;1000G_EUR=;1000G_SAS=;ExAC_ALL=;ExAC_AFR=;ExAC_AMR=;ExAC_EAS=;ExAC_FIN=;ExAC_NFE=;ExAC_OTH=;ExAC_SAS=;ESP6500siv2_ALL=;ESP6500siv2_AA=;ESP6500siv2_EA=;CG46=;cosmic70=;snp129=;snp132=;snp138=;avsift=;ALLELE_END

Note: the *multianno.txt file is also malformed but in a different way:
chrM . 185.36 chrM 73 . G A 185.36 PASS AB=0;ABP=0;AC=1;AF=1;AN=1;AO=7;CIGAR=1X;DP=7;DPB=7;DPRA=0;EFF=INTERGENIC(MODIFIER||||||||||A);EPP=3.32051;EPPR=0;FS=0;GC=57;GTI=0;HRun=0;HaplotypeScore=0;LEN=1;MEANALT=1;MQ=60;MQ0=0;MQM=60;MQMR=0;NS=1;NUMALT=1;ODDS=42.6808;PAIRED=1;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=229;QD=26.48;QR=0;RO=0;RPL=6;RPP=10.7656;RPPR=0;RPR=1;RUN=1;SAF=4;SAP=3.32051;SAR=3;SRF=0;SRP=0;SRR=0;TYPE=snp;technology.illumina=1 GT:AO:DP:GQ:PL:QA:QR:RO 1:7:7:99:209,0:229:0:0

I am manually skipping these lines for now, but it would be helpful to figure out the root of this problem. Let me know if you have any ideas, and I'll keep troubleshooting as well.

Thanks.

frameshift insertion HGVS difference between Annovar and VEP

Dear Developers!
Version: annovar_2018-04-16
When annotating my VCF, I found an insertion variant whose HGVS description differs between ANNOVAR and VEP.
The insertion variant is shown below:
chr6 49457714 49457714 - AA

Annovar Result:

HGVS: p.Asp244Leufs*38

VEP online Result:

HGVS: p.Asp244LeufsTer39

Difference: ANNOVAR reports Ter38 but VEP reports Ter39.
When I checked articles that report this variant, most of them recorded the HGVS as Ter39.

What causes this difference of protein HGVS? Which one is correct?
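An off-by-one of this kind would arise if the two tools count the distance to the new stop codon from different starting points. HGVS nomenclature counts the first changed residue as position 1 and the stop codon as the last position. The sketch below shows both counting schemes on a made-up protein tail (not the real sequence of this variant). Whether this is the actual cause of the difference between the two tools is an assumption, not something confirmed in this thread.

```python
def stop_offset_counts(mutant_tail: str) -> dict:
    """mutant_tail: mutant protein from the first changed residue through '*'."""
    stop_index = mutant_tail.index("*")        # 0-based offset of the stop codon
    return {
        "first_changed_is_1": stop_index + 1,  # HGVS-style count
        "first_changed_is_0": stop_index,      # count that excludes the first residue
    }


# Hypothetical tail: a changed Leu, 37 further residues, then the new stop
toy_tail = "L" + "A" * 37 + "*"
counts = stop_offset_counts(toy_tail)
print(counts)
```

With this toy tail the two schemes give 39 and 38, the same pair of numbers as in the two annotations above.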

Downloading annotation database hg38_dbnsfp33a.txt.gz ... Failed

perl annotate_variation.pl -buildver hg38 -downdb -webfrom annovar dbnsfp33a humandb/
NOTICE: Web-based checking to see whether ANNOVAR new version is available ... Done
NOTICE: Downloading annotation database http://www.openbioinformatics.org/annovar/download/hg38_dbnsfp33a.txt.gz ... Failed
NOTICE: Downloading annotation database http://www.openbioinformatics.org/annovar/download/hg38_dbnsfp33a.txt.idx.gz ... OK
NOTICE: Uncompressing downloaded files
NOTICE: Finished downloading annotation files for hg38 build version, with files saved at the 'humandb' directory
WARNING: Some files cannot be downloaded, including http://www.openbioinformatics.org/annovar/download/hg38_dbnsfp33a.txt.gz

Any alternatives to download dbnsfp33a?

Died at /work/Software/Download/Variant_Package/annovar/coding_change.pl line 553, <FASTA> line 149454.

Hi Developer!
I tested the latest ANNOVAR and got an error.

The log is recorded below:

$ table_annovar.pl $Sample.combined.vcf $Anno_db --vcfinput -buildver hg38 -out $Sample --checkfile --otherinfo -remove -polish -protocol cytoBand,refGeneWithVer,ensGene,knownGene -operation r,g,g,g -nastring .

NOTICE: Running with system command <convert2annovar.pl -includeinfo -allsample -withfreq -format vcf4 N0202G2.combined.vcf > N0202G2.avinput>
NOTICE: Finished reading 365921 lines from VCF file
NOTICE: A total of 362523 locus in VCF file passed QC threshold, representing 346476 SNPs (239967 transitions and 106509 transversions) and 16547 indels/substitutions
NOTICE: Finished writing allele frequencies based on 346476 SNP genotypes (239967 transitions and 106509 transversions) and 16547 indels/substitutions for 1 samples

NOTICE: Running with system command </work/Software/Download/Variant_Package/annovar/table_annovar.pl N0202G2.avinput /work/Database/Annovar_db/hg38_20180130 -buildver hg38 -outfile N0202G2 --checkfile --otherinfo -remove -polish -protocol cytoBand,refGeneWithVer,ensGene,knownGene -operation r,g,g,g -nastring . -otherinfo>

NOTICE: Processing operation=r protocol=cytoBand

NOTICE: Running with system command <annotate_variation.pl -regionanno -dbtype cytoBand -buildver hg38 -outfile N0202G2 N0202G2.avinput /work/Database/Annovar_db/hg38_20180130>
NOTICE: Output file is written to N0202G2.hg38_cytoBand
NOTICE: Reading annotation database /work/Database/Annovar_db/hg38_20180130/hg38_cytoBand.txt ... Done with 1293 regions
NOTICE: Finished region-based annotation on 363002 genetic variants
NOTICE: Variants with invalid input format were written to N0202G2.invalid_input

NOTICE: Processing operation=g protocol=refGeneWithVer

NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg38 -dbtype refGeneWithVer -outfile N0202G2.refGeneWithVer -exonsort N0202G2.avinput /work/Database/Annovar_db/hg38_20180130>
NOTICE: Output files were written to N0202G2.refGeneWithVer.variant_function, N0202G2.refGeneWithVer.exonic_variant_function
NOTICE: Reading gene annotation from /work/Database/Annovar_db/hg38_20180130/hg38_refGeneWithVer.txt ... Done with 74727 transcripts (including 18443 without coding sequence annotation) for 28059 unique genes
NOTICE: Processing next batch with 363002 unique variants in 363002 input lines
NOTICE: Reading FASTA sequences from /work/Database/Annovar_db/hg38_20180130/hg38_refGeneWithVerMrna.fa ... Done with 21138 sequences
WARNING: A total of 526 sequences will be ignored due to lack of correct ORF annotation
NOTICE: Variants with invalid input format were written to N0202G2.refGeneWithVer.invalid_input

NOTICE: Running with system command <coding_change.pl N0202G2.refGeneWithVer.exonic_variant_function.orig /work/Database/Annovar_db/hg38_20180130/hg38_refGeneWithVer.txt /work/Database/Annovar_db/hg38_20180130/hg38_refGeneWithVerMrna.fa -alltranscript -out N0202G2.refGeneWithVer.fa -newevf N0202G2.refGeneWithVer.exonic_variant_function>
Died at /work/Software/Download/Variant_Package/annovar/coding_change.pl line 553, <FASTA> line 149454.
Error running system command: <coding_change.pl N0202G2.refGeneWithVer.exonic_variant_function.orig /work/Database/Annovar_db/hg38_20180130/hg38_refGeneWithVer.txt /work/Database/Annovar_db/hg38_20180130/hg38_refGeneWithVerMrna.fa -alltranscript -out N0202G2.refGeneWithVer.fa -newevf N0202G2.refGeneWithVer.exonic_variant_function>
Error running system command: </work/Software/Download/Variant_Package/annovar/table_annovar.pl N0202G2.avinput /work/Database/Annovar_db/hg38_20180130 -buildver hg38 -outfile N0202G2 --checkfile --otherinfo -remove -polish -protocol cytoBand,refGeneWithVer,ensGene,knownGene -operation r,g,g,g -nastring . -otherinfo>

Then I checked the temporary file $Sample.refGeneWithVer.fa and found:

$ tail $Sample.refGeneWithVer.fa
LNLGIFASRLYYHWCKPQQKGLRLLCGSQVPVEVMGFPEFADCWENFVDHEKPLSFNPYKMLEELDKNSRAIKRRLERIKQS*

line343937 NM_004711.4 WILDTYPE
MEGGAYGAGKAGGAFDPYTLVRQPHTILRVVSWLFSIVVFGSIVNEGYLNSASEGEEFCIYNRNPNACSYGVAVGVLAFLTCLLYLALDVYFPQISSVKD
RKKAVLSDIGVSAFWAFLWFVGFCYLANQWQVSKPKDNPLNEGTDAARAAIAFSFFSIFTWAGQAVLAFQRYQIGADSALFSQDYMDPSQDSSMPYAPYV
EPTGPDPAGMGGTYQQPANTFDTEPQGYQSQGY*
line343937 NM_004711.4 c.605_606insCAA p.P202_T203insN protein-altering (position 202-203 has insertion N)
MEGGAYGAGKAGGAFDPYTLVRQPHTILRVVSWLFSIVVFGSIVNEGYLNSASEGEEFCIYNRNPNACSYGVAVGVLAFLTCLLYLALDVYFPQISSVKD
RKKAVLSDIGVSAFWAFLWFVGFCYLANQWQVSKPKDNPLNEGTDAARAAIAFSFFSIFTWAGQAVLAFQRYQIGADSALFSQDYMDPSQDSSMPYAPYV
EPNTGPDPAGMGGTYQQPANTFDTEPQGYQSQGY*
WARNING: invalid triplets found in DNA sequence to be translated: in

Then I pulled the line343937 record from the $Sample.refGeneWithVer.exonic_variant_function.orig file, but did not find any problem with it.

$ grep "NM_004711" $Sample.refGeneWithVer.exonic_variant_function.orig
line343937 nonframeshift insertion SYNGR1:NM_004711.4:exon4:c.605_606insCAA:p.P202delinsPN chr22 39381817 39381817 - CAA 1 9966.73 223 chr22 39381817 rs149306472 C CCAA 9966.73 PASS AC=2;AF=1.00;AN=2;DB;DP=230;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.000;MQ=60.00;QD=44.69;SOR=1.179;set=variant2 GT:AD:DP:GQ:PL 1/1:0,223:223:99:10004,673,0

How to handle SV annotation of big DEL or DUP?

When trying to annotate a SV vcf file, big DUP, DEL or INV are not annotated or just the first matching gene in the region is annotated. I have run the command shown below:

perl ${path}/annovar/table_annovar.pl file.vcf ${path}/annovar/humandb/ -buildver hg19 --regionanno -out final_annotation -remove -protocol refGene,clinvar_20170905,exac03,ALL.sites.2015_0_mod8,esp6500siv2_all,avsnp150 -operation g,f,f,f,f,f -nastring . -vcfinput

The VCF does not contain sequence information in the ALT column (region alteration) when the region is large; in that case the column is filled with DEL, DUP or INV. Example of the VCF file:

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
chr1 1044050 - TCACCACAGCCACCATGTC TC 65 PASS END=; GT:GQ:PR:SR 0/1:65:7,0:13,4
chr1 1431164 - G DEL 56 PASS END=1469606 GT:GQ:PR:SR 0/1:56:73,4:62,17

Thinking that this might be the problem, I ran the command below, using as input a file that contains the start and end positions of each region so that ANNOVAR could know its length. The problem appears again: big DUP, DEL or INV are not annotated.

perl ${path}/annovar/table_annovar.pl file.avinput ${path}/annovar/humandb/ -buildver hg19 -out test -remove -protocol refGene,clinvar_20170905,exac03,ALL.sites.2015_0_mod8,esp6500siv2_all,avsnp150 -operation g,f,f,f,f,f -nastring .

Example of the avinput file:
1 537588 537647 TTCTCTCCATCCCCCCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCATCCC T
1 1431164 1469606 G <DEL>

I know that one solution may be to fill in the ALT column of the VCF file (as we know the length of the region), but this column is empty for a reason: these regions are very big (some have a size of 217,860,463), so filling it in would create a big file that is computationally expensive to process.

I would like to know if there is a way to handle annotation of big SVs.

Thank you.
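One way to avoid writing out the full sequence is to turn each symbolic record into an avinput row that carries only coordinates. The sketch below makes two assumptions that should be verified against the ANNOVAR input documentation: that `0` is accepted as a placeholder reference allele, and that `-` (for DEL) or `0` (for other types) is accepted as the alternative allele. It also follows the VCF convention that POS of a symbolic allele is the base before the event. `sv_to_avinput` is a hypothetical helper, not an ANNOVAR function.

```python
def sv_to_avinput(chrom: str, pos: int, svtype: str, info: str) -> str:
    """Build a coordinate-only avinput row from a symbolic SV record."""
    # Parse key=value pairs of the INFO column; flags without '=' are skipped
    fields = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)
    end = int(fields["END"])
    start = pos + 1  # VCF POS is the base preceding a symbolic SV
    # Assumption: "0" as placeholder ref; "-" marks a deletion, "0" anything else
    alt = "-" if svtype.strip("<>") == "DEL" else "0"
    return "\t".join([chrom, str(start), str(end), "0", alt])


row = sv_to_avinput("chr1", 1431164, "DEL", "END=1469606")
print(row)
```

The row stays a few dozen bytes long regardless of the size of the event, so a 217,860,463 bp region costs no more than a small one.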

Exonic variant function values not updated in documentation

Some values are missing from the ANNOVAR documentation for the exonic variant function: http://annovar.openbioinformatics.org/en/latest/user-guide/gene/

These are the values that I found by looking through annotate_variation.pl (I am not sure whether the list is complete):

  • frameshift insertion
  • frameshift deletion
  • frameshift substitution
  • stopgain
  • stoploss
  • nonframeshift insertion
  • nonframeshift deletion
  • nonframeshift substitution
  • nonsynonymous SNV
  • synonymous SNV
  • unknown
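A list like this can double as a quick sanity check on annotation output. The sketch below flags any label that is not in the list above. It assumes that labels taken from VCF output use underscores in place of spaces (as in `synonymous_SNV` and `nonframeshift_deletion`). `unexpected_labels` and the sample input are hypothetical.

```python
KNOWN_EXONIC_FUNCS = {
    "frameshift insertion", "frameshift deletion", "frameshift substitution",
    "stopgain", "stoploss",
    "nonframeshift insertion", "nonframeshift deletion",
    "nonframeshift substitution",
    "nonsynonymous SNV", "synonymous SNV", "unknown",
}


def unexpected_labels(labels):
    """Return labels (underscores normalised to spaces) missing from the known set."""
    normalised = {label.replace("_", " ") for label in labels}
    return sorted(normalised - KNOWN_EXONIC_FUNCS)


# Two labels seen in VCF output plus one deliberately invalid label
observed = ["synonymous_SNV", "nonframeshift_deletion", "made_up_label"]
print(unexpected_labels(observed))
```

Any label the check reports is either a typo in the input or a value that the list (and possibly the documentation) does not yet cover.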

index file out of date

When I run the annotation, I get a warning:
WARNING: Your index file hg19_clinvar_20170130.txt.idx is out of date and will not be used. ANNOVAR can still generate correct results without index file.

I would like to know why this warning appears and how ANNOVAR determines that the index file is out of date. One last question: if the index file is out of date, what is the effect? Will annotation be slower?
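One plausible mechanism, stated here as an assumption rather than confirmed behaviour: the first line of the `.idx` file records the size in bytes of the database file it was built from, in the form `#BIN<TAB>binsize<TAB>filesize`, and the warning is printed when that number no longer matches the file on disk. The sketch below implements that comparison on throwaway files; `index_matches_db` and the demo file names are hypothetical.

```python
import os
import re
import tempfile


def index_matches_db(db_path: str, idx_path: str) -> bool:
    """True if the size recorded in the index header equals the database file size."""
    with open(idx_path) as handle:
        header = handle.readline()
    # Assumed header layout: #BIN <binsize> <filesize>, tab-separated
    match = re.match(r"#BIN\t(\d+)\t(\d+)", header)
    if match is None:
        return False
    return int(match.group(2)) == os.path.getsize(db_path)


with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "demo_db.txt")
    idx = db + ".idx"
    with open(db, "w", newline="") as out:
        out.write("1\t10352\t10352\tT\tTA\trs555500075\n")
    with open(idx, "w", newline="") as out:
        out.write("#BIN\t1000\t%d\n" % os.path.getsize(db))
    fresh = index_matches_db(db, idx)
    # Editing the database after indexing makes the recorded size stale
    with open(db, "a", newline="") as out:
        out.write("1\t10353\t10353\tA\tAC\trs145072688\n")
    stale = not index_matches_db(db, idx)

print(fresh, stale)
```

If this is how the check works, any edit, re-download or recompression that changes the size of the database file would trigger the warning. As the warning itself says, results stay correct without the index.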

Many thanks.

Annovar & Bioconda

Dear Mr. Wang,

Would it be possible to come to some kind of licensing agreement that would allow ANNOVAR to be made available as a package in the conda package manager?

Conda is being widely used throughout the scientific community, and adding your excellent software to this system would be of great advantage to everyone. It would also greatly improve the visibility and ease of use of your software.

Thank you for your time
Matthias De Smet and all of the bioconda community

PS: check us out at https://bioconda.github.io

The same mutation has two different rsIDs

  1. The same mutation has two different rsIDs in hg19_avsnp150.txt. For example:
    1 10352 10352 T TA rs145072688
    1 10352 10352 T TA rs555500075
  2. I searched for rs145072688 and rs555500075 in the dbSNP database:
    rs555500075 means 1:10352:T:TA
    rs145072688 means 1:10353:A:AC
  3. dbSNP's own VCF file ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606_b150_GRCh37p13/VCF/All_20170710.vcf.gz has this problem, but ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606_b151_GRCh37p13/All_20180423.vcf.gz does not. Could you please update avsnp to v151?
    Thank you very much!
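To see how widespread the duplication is, the table can be scanned for variant keys that carry more than one rsID. This is a minimal sketch that assumes whitespace-separated rows in the order chr, start, end, ref, alt, rsID, as in the example above. `rsids_per_variant` is a hypothetical helper.

```python
from collections import defaultdict


def rsids_per_variant(rows):
    """Map (chr, start, end, ref, alt) to its rsIDs, keeping only keys with more than one."""
    seen = defaultdict(set)
    for row in rows:
        chrom, start, end, ref, alt, rsid = row.split()[:6]
        seen[(chrom, start, end, ref, alt)].add(rsid)
    return {key: sorted(ids) for key, ids in seen.items() if len(ids) > 1}


rows = [
    "1 10352 10352 T TA rs145072688",
    "1 10352 10352 T TA rs555500075",
]
dups = rsids_per_variant(rows)
print(dups)
```

Holding every key in memory may be too costly for the full avsnp table; because the table is sorted by position, comparing each row with the previous few rows would be a lighter alternative.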

Database warning

When I use ANNOVAR, I get this warning:

WARNING: Your index file hg19_gnomad_exome.txt.idx is out of date and will not be used. ANNOVAR can still generate correct results without index file.

What should I do?

rs snp allele frequency deficiency

The exac03 dataset has no allele frequency for SNP rs2504779, although the VCF file from the official ExAC website lists one (AF=0.178). The site is thought to be a common variant that should be filtered out, so the missing AF causes problems for my analysis.
I also think that many allele frequencies are missing from the exac03 dataset. How can I deal with this problem?
