Difference between revisions of "Vt"
(Created page with '=== Introduction === vt is a tool set that calls, genotypes and filters short variants. It provides profiling of variants to aid in QC. === Discovery === Discovery is perform…') |
Revision as of 16:18, 12 March 2013
Introduction
vt is a tool set that calls, genotypes and filters short variants. It provides profiling of variants to aid in QC.
Discovery
Discovery is performed at the per-sample level. The evidence site lists for each sample are then merged, and site discovery statistics are computed. The user then decides on the cutoffs to apply in order to create an initial site list.
Generate a site list with INFO fields E and N.
vt discover -i NA12878.bam -o NA12878.sites.vcf -g hs37d5.fa
Left align variants.
vt left_align -i NA12878.bam -o NA12878.leftaligned.sites.vcf -g hs37d5.fa
Evidence site lists are combined across samples and split by sites.
vt merge_and_split_sample_vcf -i NA12878.sites.vcf,NA12879.sites.vcf,NA12880.sites.vcf -o 1-1000000.sites.vcf
Discovery statistics are computed.
vt compute_discovery_stats -i 1-1000000.sites.vcf -o 1-1000000.annotated.sites.vcf
A calling pipeline implemented in a make file is available here.
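The discovery steps above can be strung together for several samples. The following is a minimal sketch and is not the makefile pipeline mentioned above: it only prints the vt commands shown in this section (a dry run), so the order of the steps is visible. The helper name discovery_commands and the fixed 1-1000000 output names are illustrative assumptions.

```shell
#!/bin/sh
# Dry-run sketch: print the discovery commands for a list of samples.
# discovery_commands is a hypothetical helper, not part of vt.
discovery_commands() (
    ref=$1
    shift
    sites=""
    for sample in "$@"; do
        # Per-sample discovery and left alignment, as shown in this section.
        echo "vt discover -i ${sample}.bam -o ${sample}.sites.vcf -g ${ref}"
        echo "vt left_align -i ${sample}.bam -o ${sample}.leftaligned.sites.vcf -g ${ref}"
        # Accumulate the comma-separated input list for the merge step.
        sites="${sites:+${sites},}${sample}.sites.vcf"
    done
    echo "vt merge_and_split_sample_vcf -i ${sites} -o 1-1000000.sites.vcf"
    echo "vt compute_discovery_stats -i 1-1000000.sites.vcf -o 1-1000000.annotated.sites.vcf"
)

discovery_commands hs37d5.fa NA12878 NA12879 NA12880
```

Printing the commands rather than running them keeps the sketch safe to try without BAM files; piping the output to sh would execute the steps in order.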
Genotyping
Each individual is genotyped at a set of sites.
vt genotype -i NA12878.bam -o NA12878.sites.vcf -g hs37d5.fa
Genotype sample VCFs are combined across samples and split by sites.
vt merge_and_split_sample_vcf -i NA12878.sites.vcf,NA12879.sites.vcf,NA12880.sites.vcf -o 1-1000000.sites.vcf
Features are computed.
vt compute_features -i 1-1000000.sites.vcf -o 1-1000000.annotated.sites.vcf
A genotyping pipeline implemented in a make file is available here.
Filtering
Filtering requires a set of features.
vt svm NA12878.bam -i NA12878.sites.vcf -o NA12878.svm.sites.vcf --pos positive.sites.vcf --neg negative.sites.vcf
A filtering pipeline implemented in a make file is available here.
Left Alignment
Left align indel type variants in a VCF file.
vt leftalign -i mills.vcf -o mills.leftaligned.vcf
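Left alignment moves an indel that lies in a repeat to its leftmost equivalent position, so that the same variant is always written the same way. The following is a minimal sketch of the idea for a single biallelic variant. It assumes the reference sequence is passed as a plain string with 1-based positions; left_align_variant is a hypothetical helper and does not describe how vt is implemented.

```shell
#!/bin/sh
# Usage: left_align_variant REFSEQ POS REF ALT
# Prints "POS REF ALT" after shifting the variant as far left as possible.
# Hypothetical helper for illustration; variants at position 1 are not handled.
left_align_variant() (
    awk -v seq="$1" -v pos="$2" -v r="$3" -v a="$4" 'BEGIN {
        changed = 1
        while (changed) {
            changed = 0
            # Drop the last base while both alleles end in the same base.
            if (length(r) > 0 && length(a) > 0 &&
                substr(r, length(r)) == substr(a, length(a))) {
                r = substr(r, 1, length(r) - 1)
                a = substr(a, 1, length(a) - 1)
                changed = 1
            }
            # If an allele became empty, extend both alleles by one
            # reference base on the left.
            if ((length(r) == 0 || length(a) == 0) && pos > 1) {
                pos--
                b = substr(seq, pos, 1)
                r = b r
                a = b a
                changed = 1
            }
        }
        # Trim shared leading bases, keeping at least one base per allele.
        while (length(r) > 1 && length(a) > 1 &&
               substr(r, 1, 1) == substr(a, 1, 1)) {
            r = substr(r, 2)
            a = substr(a, 2)
            pos++
        }
        print pos, r, a
    }'
)

# A deletion of "CA" from TGCACACAG written at position 6 moves to position 2.
left_align_variant TGCACACAG 6 ACA A
```

In this example the call prints 2 GCA G: the deletion describes the same resulting sequence, but is anchored at the start of the CA repeat.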
Profile SNPs
Profile SNPs.
vt profile_snps -i mills.snps.sites.vcf
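One summary commonly used when checking SNP call sets for QC is the transition/transversion (ts/tv) ratio. This page does not state what profile_snps reports, so the following is only a sketch of that one statistic computed directly from a sites VCF. The helper name ts_tv_ratio is hypothetical, and only biallelic single-base records are counted.

```shell
#!/bin/sh
# Usage: ts_tv_ratio FILE.vcf
# Prints the ts/tv ratio of biallelic single-base records, or NA if there
# are no transversions. Hypothetical helper, not part of vt.
ts_tv_ratio() (
    awk -F'\t' '
        /^#/ { next }                       # skip header lines
        length($4) == 1 && length($5) == 1 {
            pair = toupper($4 $5)
            # Transitions are A<->G and C<->T; all other changes are transversions.
            if (pair == "AG" || pair == "GA" || pair == "CT" || pair == "TC")
                ts++
            else
                tv++
        }
        END {
            if (tv > 0) printf "%.2f\n", ts / tv
            else print "NA"
        }
    ' "$1"
)

# Example call, assuming the file from the command above exists:
# ts_tv_ratio mills.snps.sites.vcf
```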
Profile Indels
Profile indels.
vt profile_indels -i mills.indels.sites.vcf
Profile MNPs
Profile MNPs.
vt profile_mnps -i mills.mnps.sites.vcf
sort
Sort variants according to the order of the contigs listed in the header.
vt sort -i mills.sites.vcf
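To illustrate what sorting by header contig order means, here is a minimal sketch using standard tools. It assumes an uncompressed, tab-delimited VCF whose header contains ##contig=<ID=...> lines; sort_vcf_by_header is a hypothetical helper and is not how vt sort is implemented. Records on contigs missing from the header sort first.

```shell
#!/bin/sh
# Usage: sort_vcf_by_header FILE.vcf
# Prints the header unchanged, then the records ordered by the position of
# their contig in the header and by POS within each contig.
# Hypothetical helper for illustration only.
sort_vcf_by_header() (
    vcf=$1
    tab=$(printf '\t')
    grep '^#' "$vcf"
    awk -F'\t' '
        /^##contig=<ID=/ {
            # Remember the order in which contigs appear in the header.
            id = $0
            sub(/^##contig=<ID=/, "", id)
            sub(/[,>].*$/, "", id)
            rank[id] = ++n
            next
        }
        /^#/ { next }
        { print rank[$1] "\t" $0 }          # prefix each record with its contig rank
    ' "$vcf" |
        sort -t "$tab" -k1,1n -k3,3n |      # rank first, then POS (now column 3)
        cut -f2-                            # drop the temporary rank column
)

# Example call, assuming the file from the command above exists:
# sort_vcf_by_header mills.sites.vcf
```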
split_by_variant
Split VCF files by variant type.
vt split_by_variant -i mills.sites.vcf
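Splitting by variant type requires deciding the type of each record. The following sketch classifies a biallelic record from the lengths of its REF and ALT alleles, using the three types profiled on this page (SNPs, MNPs and indels). The rules and the helper name classify_variant are assumptions for illustration; the rules vt applies may differ.

```shell
#!/bin/sh
# Usage: classify_variant REF ALT
# Prints snp, mnp or indel based on allele lengths.
# Hypothetical helper; multiallelic records are not handled.
classify_variant() (
    ref=$1
    alt=$2
    if [ "${#ref}" -eq 1 ] && [ "${#alt}" -eq 1 ]; then
        echo snp        # single base replaced by a single base
    elif [ "${#ref}" -eq "${#alt}" ]; then
        echo mnp        # same length, more than one base
    else
        echo indel      # lengths differ: insertion or deletion
    fi
)

classify_variant A G
classify_variant AT GC
classify_variant A ATG
```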
compute_<feature>
Compute a feature of each variant.
vt compute_feature -i mills.vcf
Resource Files
dbSNP
OMNI 1000G
Mills
HAPMAP