Vt
Introduction
vt is a tool set for calling, genotyping and filtering short variants. It also profiles variants to aid quality control (QC).
Discovery
Discovery is performed at the per-sample level. The evidence site lists for each sample are then merged, and site discovery statistics are computed. The user then chooses cutoffs to create an initial site list.
Generate a site list with INFO fields E and N.
vt discover -i NA12878.bam -o NA12878.sites.vcf -g hs37d5.fa
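Because discovery runs once per sample, the command above is typically repeated for every BAM file. The following is a minimal sketch, not part of vt itself: `discover_commands` is a hypothetical helper that only prints one `vt discover` command per sample, reusing the flags and the `hs37d5.fa` reference shown above. It assumes each sample's BAM is named `<sample>.bam`.

```shell
# Hypothetical helper: print one `vt discover` command per sample name.
# It only prints the commands (a dry run); nothing is executed.
discover_commands() {
  for sample in "$@"; do
    printf 'vt discover -i %s.bam -o %s.sites.vcf -g hs37d5.fa\n' "$sample" "$sample"
  done
}

# Print the commands for three samples.
discover_commands NA12878 NA12879 NA12880
```

Piping the output to `sh` would run the commands, provided vt is installed and the BAM files exist.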
Left align variants.
vt left_align -i NA12878.sites.vcf -o NA12878.leftaligned.sites.vcf -g hs37d5.fa
Evidence site lists are combined across samples and split by sites.
vt merge_and_split_sample_vcf -i NA12878.sites.vcf,NA12879.sites.vcf,NA12880.sites.vcf -o 1-1000000.sites.vcf
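The merge step takes its inputs as a single comma-separated list, which becomes tedious to type for many samples. As a rough sketch, the hypothetical helper `join_site_lists` below builds that list from sample names, assuming each per-sample file is named `<sample>.sites.vcf` as in the commands above. The final command is only printed, not run.

```shell
# Hypothetical helper: join per-sample site lists into the
# comma-separated form passed to -i.
join_site_lists() {
  list=""
  for sample in "$@"; do
    # Prepend a comma only when the list already has an entry.
    list="${list:+$list,}$sample.sites.vcf"
  done
  printf '%s\n' "$list"
}

inputs=$(join_site_lists NA12878 NA12879 NA12880)
# Dry run: show the merge command that would be issued.
echo "vt merge_and_split_sample_vcf -i $inputs -o 1-1000000.sites.vcf"
```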
Discovery statistics are computed.
vt compute_discovery_stats -i 1-1000000.sites.vcf -o 1-1000000.annotated.sites.vcf
A calling pipeline implemented as a Makefile is available here.
Genotyping
Each individual is genotyped at a set of sites.
vt genotype -i NA12878.bam -o NA12878.sites.vcf -g hs37d5.fa
Genotype sample VCFs are combined across samples and split by sites.
vt merge_and_split_sample_vcf -i NA12878.sites.vcf,NA12879.sites.vcf,NA12880.sites.vcf -o 1-1000000.sites.vcf
Features are computed.
vt compute_features -i 1-1000000.sites.vcf -o 1-1000000.annotated.sites.vcf
A genotyping pipeline implemented as a Makefile is available here.
Filtering
Filtering requires a set of features.
vt svm NA12878.bam -i NA12878.sites.vcf -o NA12878.svm.sites.vcf --pos positive.sites.vcf --neg negative.sites.vcf
A filtering pipeline implemented as a Makefile is available here.
Left Alignment
Left align indel-type variants in a VCF file.
vt leftalign -i mills.vcf -o mills.leftaligned.vcf
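To illustrate what left alignment means, here is a small sketch of a common procedure: drop the shared trailing base of both alleles, extend them one base to the left whenever an allele becomes empty, and repeat until nothing changes. Whether vt implements exactly these steps is an assumption. The function `left_align` and the toy reference sequence are hypothetical and for illustration only. Positions are 1-based, as in VCF.

```shell
# Hypothetical sketch: left align one biallelic variant.
# Usage: left_align REFERENCE_SEQUENCE POS REF ALT
# The body runs in a subshell, so its variables stay local.
left_align() (
  genome=$1 pos=$2 ref=$3 alt=$4
  changed=1
  while [ "$changed" -eq 1 ]; do
    changed=0
    # Last base of each allele (empty if the allele is empty).
    ref_last=${ref#"${ref%?}"}
    alt_last=${alt#"${alt%?}"}
    # Drop the shared trailing base.
    if [ -n "$ref" ] && [ -n "$alt" ] && [ "$ref_last" = "$alt_last" ]; then
      ref=${ref%?}; alt=${alt%?}; changed=1
    fi
    # If an allele became empty, extend both by one base to the left.
    if { [ -z "$ref" ] || [ -z "$alt" ]; } && [ "$pos" -gt 1 ]; then
      pos=$((pos - 1))
      base=$(printf '%s' "$genome" | cut -c "$pos")
      ref=$base$ref; alt=$base$alt; changed=1
    fi
  done
  # Trim shared leading bases while both alleles keep at least one base.
  while [ "${#ref}" -ge 2 ] && [ "${#alt}" -ge 2 ]; do
    ref_first=${ref%"${ref#?}"}
    alt_first=${alt%"${alt#?}"}
    [ "$ref_first" = "$alt_first" ] || break
    ref=${ref#?}; alt=${alt#?}; pos=$((pos + 1))
  done
  printf '%s %s %s\n' "$pos" "$ref" "$alt"
)

# A CA deletion written at the right end of a CA repeat (toy sequence).
left_align GGGCACACACAGGG 9 ACA A
```

For the toy sequence, the deletion written at position 9 (ACA to A) is reported as `3 GCA G`, its leftmost equivalent representation. A variant that reaches position 1 with an empty allele is left as-is by this sketch.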