
Sanity check failed (viralgenie, closed)

Joon-Klaps avatar Joon-Klaps commented on September 22, 2024
Sanity check failed

from viralgenie.

Comments (3)

Joon-Klaps avatar Joon-Klaps commented on September 22, 2024

The problem lies within the custom script ivar_variants_to_vcf.py from the viralrecon pipeline.
The script converts the original variant table (TSV output from iVar):

REGION	POS	REF	ALT	REF_DP	REF_RV	REF_QUAL	ALT_DP	ALT_RV	ALT_QUAL	ALT_FREQ	TOTAL_DP	PVAL	PASS	GFF_FEATURE	REF_CODON	REF_AA	ALT_CODON	ALT_AA	POS_AA
LVE00096_cl5_it1.consensus_bcftools	10324	A	-NNNNNNNNNNNNNNNNNNNNNNN	149	97	33	147	0	20	0.986577	149	3.98648e-32	TRUE	NA	NA	NA	NA	NA	NA
LVE00096_cl5_it1.consensus_bcftools	10354	G	-NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN	188	119	33	147	0	20	0.777778	189	5.86768e-34	TRUE	NA	NA	NA	NA	NA	NA

into:

CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	LVE00096_cl5_it1
LVE00096_cl5_it1.consensus_bcftools	10324	.	ANNNNNNNNNNNNNNNNNNNNNNN	A	.	PASS	DP=149	GT:REF_DP:REF_RV:REF_QUAL:ALT_DP:ALT_RV:ALT_QUAL:ALT_FREQ	1:149:97:33:147:0:20:0.986577
LVE00096_cl5_it1.consensus_bcftools	10354	.	GNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN	G	.	PASS	DP=189	GT:REF_DP:REF_RV:REF_QUAL:ALT_DP:ALT_RV:ALT_QUAL:ALT_FREQ	1:188:119:33:147:0:20:0.777778
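The allele rewrite visible in the two tables above can be sketched roughly as follows. This is a minimal illustration, not the actual code of ivar_variants_to_vcf.py: the function name `ivar_to_vcf_alleles` is hypothetical, and it assumes iVar's convention of prefixing deleted bases with `-` and inserted bases with `+` in the ALT column.

```python
from typing import Tuple


def ivar_to_vcf_alleles(ref: str, alt: str) -> Tuple[str, str]:
    """Translate iVar REF/ALT notation into VCF-style REF/ALT alleles.

    Hypothetical helper for illustration only; the real script may differ.
    """
    if alt.startswith("-"):
        # Deletion: the deleted bases follow the '-' sign. In VCF they are
        # appended to REF, and ALT keeps only the anchoring reference base.
        return ref + alt[1:], ref
    if alt.startswith("+"):
        # Insertion: the inserted bases follow the '+' sign. In VCF they are
        # appended to the anchoring reference base in ALT.
        return ref, ref + alt[1:]
    # Plain substitution: alleles are used unchanged.
    return ref, alt


# First row of the table above: REF "A", ALT "-" followed by 23 N's.
vcf_ref, vcf_alt = ivar_to_vcf_alleles("A", "-" + "N" * 23)
print(vcf_ref, vcf_alt)
```

With the first row of the iVar table as input, this yields the same REF/ALT pair as the first VCF record: the reference base followed by a run of N's, and the bare reference base.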


Joon-Klaps avatar Joon-Klaps commented on September 22, 2024

Update: the problem isn't the script. Instead, samtools mpileup intermittently ignores the reference. It runs as if no reference were supplied even though one is given, which results in:

SRR11140748_MT192765.1	1	N	3	^]G^]G^]G	FHG
SRR11140748_MT192765.1	2	N	5	TTT^]T^]t	FHHHG
SRR11140748_MT192765.1	3	N	5	TTTTt	GGHHH
SRR11140748_MT192765.1	4	N	5	TTTTt	HHHHH
SRR11140748_MT192765.1	5	N	5	AAAAa	HHHHH
SRR11140748_MT192765.1	6	N	5	TTTTt	HHHHH
SRR11140748_MT192765.1	7	N	5	AAAAa	HGHHH
SRR11140748_MT192765.1	8	N	5	CCCCc	HHHHG
SRR11140748_MT192765.1	9	N	5	CCCCc	HHHHG
SRR11140748_MT192765.1	10	N	5	TTTTt	HHHHH

Running exactly the same task again (literally just bash .command.run) gives:

SRR11140748_MT192765.1	1	G	3	^].^].^].	FHG
SRR11140748_MT192765.1	2	T	5	...^].^],	FHHHG
SRR11140748_MT192765.1	3	T	5	....,	GGHHH
SRR11140748_MT192765.1	4	T	5	....,	HHHHH
SRR11140748_MT192765.1	5	A	5	....,	HHHHH
SRR11140748_MT192765.1	6	T	5	....,	HHHHH
SRR11140748_MT192765.1	7	A	5	....,	HGHHH
SRR11140748_MT192765.1	8	C	5	....,	HHHHG
SRR11140748_MT192765.1	9	C	5	....,	HHHHG
SRR11140748_MT192765.1	10	T	5	....,	HHHHH
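One way to catch the failing case before it reaches variant calling would be to inspect the reference column of the pileup. The sketch below is only an illustration and not part of viralgenie: the function name `reference_missing` and the 0.99 threshold are assumptions, and it relies solely on the tab-separated layout shown above, where the third column holds the reference base.

```python
from typing import Iterable


def reference_missing(pileup_lines: Iterable[str], threshold: float = 0.99) -> bool:
    """Return True when (almost) every pileup position reports 'N' as reference.

    Hypothetical sanity check; the threshold is an arbitrary assumption.
    """
    total = 0
    n_ref = 0
    for line in pileup_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            # Skip blank or malformed lines.
            continue
        total += 1
        if fields[2].upper() == "N":
            n_ref += 1
    return total > 0 and n_ref / total >= threshold


# Lines copied from the two runs shown above.
bad_run = [
    "SRR11140748_MT192765.1\t3\tN\t5\tTTTTt\tGGHHH",
    "SRR11140748_MT192765.1\t4\tN\t5\tTTTTt\tHHHHH",
]
good_run = [
    "SRR11140748_MT192765.1\t3\tT\t5\t....,\tGGHHH",
    "SRR11140748_MT192765.1\t4\tT\t5\t....,\tHHHHH",
]
print(reference_missing(bad_run), reference_missing(good_run))
```

A check like this does not explain why the reference is dropped, but it would let the process fail fast (and be retried) instead of passing a reference-less pileup downstream.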

I'm uncertain how to avoid this behaviour. I'll remove -B (which disables the base alignment quality (BAQ) recalculation) from the arguments.


Joon-Klaps avatar Joon-Klaps commented on September 22, 2024

The iVar documentation suggests including -B:

Note: Please use the -B options with samtools mpileup to call variants and generate consensus. When a reference sequence is supplied, the quality of the reference base is reduced to 0 (ASCII: !) in the mpileup output. Disabling BAQ with -B seems to fix this. This was tested in samtools 1.7 and 1.8.

It has also been shown that BAQ vastly reduces the number of false positives (FP) but increases the number of false negatives (FN). For our hyperdiverse populations, I think we cannot afford the increase in FN, so I'll keep -B in the default settings.

