GenoClone: a tool for subclone inference and studying tumor heterogeneity.

This tutorial will help you get started with GenoClone by demonstrating how to perform subclone inference from exome or whole genome sequencing. If you experience any problems following these steps, please don't hesitate to contact us by email: [email protected].

Requirements

GenoClone aims to provide a useful tool for inferring tumor subclones and studying tumor heterogeneity. If any of the following requirements conflict with your lab's setup, please contact us.

System Requirements: Linux operating system
Software Requirements: Python 2.7+, MATLAB 2014+, samtools 1.3.1, VarScan 2.4.2

Note: Python, MATLAB, and samtools must be available through your PATH environment variable.
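
A quick way to confirm the PATH requirement is a small shell check like the sketch below. The executable names (`python`, `matlab`, `samtools`, `java`) are assumptions about how the tools are installed on your system; `java` is included only because the VarScan example in this tutorial is run with `java -jar`.

```shell
#!/usr/bin/env bash
# Print the name of every tool that cannot be found on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Executable names are assumptions; adjust them to match your installation.
check_tools python matlab samtools java \
    || echo "Add the tools listed above to PATH before running GenoClone." >&2
```

The check only confirms that each executable is reachable; it does not verify the version numbers listed in the software requirements.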

Alignment files: BAM alignments - you need the tumor and matched normal BAM files from exome or whole genome sequencing.
Reference file: --reference - genome FASTA file (input for VarScan to detect the mutations)

Download

You need to download: GenoClone-0.1.tar.gz.

Uncompress and extract the directory in a location of your choice to get a directory containing GenoClone executable files:

    $ tar -xzvf GenoClone-0.1.tar.gz
    $ ls GenoClone/bin/
    GenoClone

Then check the GenoClone command-line help:

    $ GenoClone/bin/GenoClone -h
    usage: GenoClone [-h] -o OUTPUT
                     [--tempdir TEMPDIR | --specific_tempdir SPECIFIC_TEMPDIR]
                     varscan bam

    Detect the linkage information between somatic mutation(SNV) and germline
    mutation(SNP)

    positional arguments:
      varscan               REQUIRED Input the output file from Varscan
      bam                   REQUIRED Input the alignment tumor bam file

    optional arguments:
      -h, --help            show this help message and exit
      -o OUTPUT, --output OUTPUT
                            REQUIRED Output filename, totally it produce two
                            files, one '.csv' file for the composition of
                            subclone and the other '.pdf' file for the
                            evaluation of different number of subclones
                            (default: None)
      --tempdir TEMPDIR     The temporary directory is made and destroyed here.
                            (default: /tmp)
      --specific_tempdir SPECIFIC_TEMPDIR
                            This temporary directory will be used, but will
                            remain after executing. (default: None)

Input data:

GenoClone depends on the output of VarScan. Together, the two programs require:

Reference.fasta: the indexed reference genome FASTA file.
Tumor.bam and Normal.bam: the binary sequence alignment/map (BAM) files from the tumor and matched normal DNA samples.
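
Before starting, it can help to confirm that these inputs are in place. The sketch below assumes that "indexed" means a samtools faidx index stored next to the FASTA as `Reference.fasta.fai`; the function name `missing_inputs` is hypothetical and is not part of GenoClone.

```shell
#!/usr/bin/env bash
# Print every expected input file that is missing; print nothing when all exist.
# Arguments: the reference FASTA, followed by one or more BAM files.
missing_inputs() {
    local ref=$1
    shift
    [ -e "$ref" ] || echo "$ref"
    # Assumption: the reference index is the .fai file written by samtools faidx.
    [ -e "$ref.fai" ] || echo "$ref.fai"
    local bam
    for bam in "$@"; do
        [ -e "$bam" ] || echo "$bam"
    done
    return 0
}

missing_inputs Reference.fasta Tumor.bam Normal.bam
```

If `Reference.fasta.fai` is reported as missing, `samtools faidx Reference.fasta` creates it.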

VarScan needs the tumor and matched normal BAM files and the reference genome to call all mutations (germline and somatic). GenoClone then uses the tumor BAM file and the VarScan calls to infer the subclones.

The BAM files for tumor and matched normal samples, and the corresponding Reference.fasta, can be downloaded from the National Cancer Institute: https://gdc.cancer.gov/about-data/data-harmonization-and-generation/gdc-reference-files.

Example commands:

Step 1. Use VarScan to obtain the total mutations:

    $ samtools mpileup -q 1 -f Reference.fasta Tumor.bam > tumor.pileup
    $ samtools mpileup -q 1 -f Reference.fasta Normal.bam > normal.pileup
    $ java -jar VarScan.jar somatic normal.pileup tumor.pileup --output-vcf 1 --output-snp total_mutations.vcf

Step 2. Use GenoClone to infer subclones:

    $ GenoClone/bin/GenoClone total_mutations.vcf Tumor.bam -o Tumor
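
When several tumor/normal pairs need processing, the two steps can be generated per sample. The following is a minimal sketch that only prints the commands rather than running them. `VARSCAN_JAR`, `GENOCLONE`, and `pipeline_commands` are hypothetical names introduced for illustration, and the defaults (`VarScan.jar`, `GenoClone/bin/GenoClone`) are assumptions about where the tools live on your system.

```shell
#!/usr/bin/env bash
# Hypothetical settings: point these at your own VarScan jar and GenoClone executable.
VARSCAN_JAR=${VARSCAN_JAR:-VarScan.jar}
GENOCLONE=${GENOCLONE:-GenoClone/bin/GenoClone}

# Print the pileup, VarScan, and GenoClone commands for one tumor/normal pair.
# Arguments: reference FASTA, tumor BAM, normal BAM, output prefix.
pipeline_commands() {
    local ref=$1 tumor=$2 normal=$3 prefix=$4
    echo "samtools mpileup -q 1 -f $ref $tumor > $prefix.tumor.pileup"
    echo "samtools mpileup -q 1 -f $ref $normal > $prefix.normal.pileup"
    echo "java -jar $VARSCAN_JAR somatic $prefix.normal.pileup $prefix.tumor.pileup --output-vcf 1 --output-snp $prefix.total_mutations.vcf"
    echo "$GENOCLONE $prefix.total_mutations.vcf $tumor -o $prefix"
}

pipeline_commands Reference.fasta Tumor.bam Normal.bam Tumor
```

Printing first makes it easy to review the commands before anything is executed. For file names without spaces, piping the output to `sh` runs the steps in order.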

Output files:

GenoClone outputs two files:

Tumor.csv: the number of subclones, the composition of each subclone, and how the goodness value varies with the number of subclones.

Tumor.pdf: the difference between the true and observed variant allele frequency (VAF), and how the goodness value varies with the number of subclones.
