Difference between revisions of "Vt"

From Genome Analysis Wiki

Revision as of 16:21, 4 December 2013

Introduction

vt is a variant tool set that discovers short variants from Next Generation Sequencing data. Features are being released on GitHub incrementally while a major rewrite is under way.

Installation

The source code is hosted on GitHub.

To install, perform the following steps:

 #this will create a directory named vt in the directory where you cloned the repository
 1. git clone https://github.com/atks/vt.git 

 #change directory to vt
 2. cd vt

 #run make; note that the compiler needs to support the C++0x standard
 3. make

Building has been tested on Linux and Mac systems with gcc 4.3 and above and with clang 3.4.

Common options

   -i   multiple intervals in <seq>:<start>-<end> format delimited by commas.
   -I   multiple intervals in <seq>:<start>-<end> format listed in a text file line by line.
   -o   output file; defaults to STDOUT.
        Output sent to STDOUT can be switched to the binary version of the format.
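
The interval syntax accepted by -i and -I can be illustrated with a short Python sketch. This is not vt's own parser: the names Interval, parse_interval, parse_interval_list and parse_interval_lines are hypothetical, and the handling of a bare sequence name (as in the -i 20 example on this page) and of blank lines is an assumption.

```python
from typing import Iterable, List, NamedTuple, Optional


class Interval(NamedTuple):
    seq: str
    start: Optional[int]  # None: no start given (bare sequence name)
    end: Optional[int]    # None: no end given (bare sequence name)


def parse_interval(text: str) -> Interval:
    """Parse '<seq>:<start>-<end>'; a bare '<seq>' is taken as the whole sequence."""
    text = text.strip()
    if not text:
        raise ValueError("empty interval")
    seq, colon, span = text.rpartition(":")
    if not colon:
        # No ':' present, e.g. "20": assumed to select the entire sequence.
        return Interval(text, None, None)
    start, dash, end = span.partition("-")
    if not (seq and dash and start.isdigit() and end.isdigit()):
        raise ValueError(f"malformed interval: {text!r}")
    if int(start) > int(end):
        raise ValueError(f"start is greater than end: {text!r}")
    return Interval(seq, int(start), int(end))


def parse_interval_list(arg: str) -> List[Interval]:
    """Value given to -i: intervals delimited by commas."""
    return [parse_interval(part) for part in arg.split(",") if part.strip()]


def parse_interval_lines(lines: Iterable[str]) -> List[Interval]:
    """Content of the file given to -I: one interval per line (blank lines skipped)."""
    return [parse_interval(line) for line in lines if line.strip()]
```

For example, parse_interval_list("20:1-1000,X:5-10") yields two intervals, and passing an open text file to parse_interval_lines reads one interval per line.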

Programs

Normalization

Normalize variants in a VCF file.

Normalized variants may have their positions changed; in such cases, the normalized variants are reordered and output in an ordered fashion. The local reordering takes place over a window of 10000 base pairs.
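
The local reordering described above can be pictured as a small buffer that holds records until no later record can still precede them. The Python sketch below is only an illustration under stated assumptions, not vt's implementation: records are (position, payload) pairs from a single sequence, normalization only moves a position to the left, and never by more than the window. The names reorder and WINDOW are hypothetical.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

WINDOW = 10000  # base pairs, the window size quoted above


def reorder(records: Iterable[Tuple[int, str]],
            window: int = WINDOW) -> Iterator[Tuple[int, str]]:
    """Yield records sorted by position using a bounded look-behind buffer."""
    heap: List[Tuple[int, int, str]] = []
    for serial, (pos, payload) in enumerate(records):
        # serial keeps the input order for records that share a position
        heapq.heappush(heap, (pos, serial, payload))
        # Under the assumptions above, no later record can land more than
        # `window` bases before the current one, so older records are final.
        while heap and heap[0][0] < pos - window:
            p, _, out = heapq.heappop(heap)
            yield p, out
    while heap:  # flush whatever is still buffered at the end of input
        p, _, out = heapq.heappop(heap)
        yield p, out
```

A record that moved left past its neighbour (for instance position 20010 arriving after 20050) is emitted in the correct order, while memory use stays proportional to the number of records inside one window.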

  #normalize variants and write out to mills.normalized.vcf
  vt normalize mills.vcf -r seq.fa -o mills.normalized.vcf
  #normalize variants, send to standard out and remove duplicates.
  vt normalize mills.vcf -r seq.fa | vt mergedups - -o mills.normalized.merged.vcf
  usage : vt normalize [options] <in.vcf>
  options : -o  output VCF file [-]
            -I  file containing list of intervals []
            -i  intervals []
            -r  reference sequence fasta file []
            --  ignores the rest of the labeled arguments following this flag
            -h  displays help
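
This page does not spell out what normalizing a single variant involves. A common definition is that a normalized variant is left-aligned and uses the shortest possible alleles; the Python sketch below implements that definition for one biallelic variant. It is an assumption that vt follows the same procedure, and the function name normalize and its arguments are hypothetical.

```python
from typing import Tuple


def normalize(pos: int, ref: str, alt: str, chrom_seq: str) -> Tuple[int, str, str]:
    """Left-align and trim one biallelic variant.

    pos is 1-based; chrom_seq is the full reference sequence of the chromosome.
    """
    ref, alt = ref.upper(), alt.upper()
    if not ref or not alt or ref == alt:
        raise ValueError("REF and ALT must be non-empty and different")
    if chrom_seq[pos - 1:pos - 1 + len(ref)].upper() != ref:
        raise ValueError("REF does not match the reference sequence")

    changed = True
    while changed:
        changed = False
        # Drop a base shared at the right end of both alleles.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both one reference base to the left.
        if not ref or not alt:
            if pos == 1:
                raise ValueError("cannot extend left of the first base")
            pos -= 1
            base = chrom_seq[pos - 1].upper()
            ref, alt = base + ref, base + alt
            changed = True

    # Drop bases shared at the left end, keeping at least one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt
```

With the reference GGGCACACACAGGG, the deletion written as position 8, CAC to C, becomes position 3, GCA to G. This shift is why normalized variants may need to be reordered.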

Merge duplicate variants

Merges duplicate variants by position, with the option of also considering alleles. The duplicate that appears later in the VCF file is simply discarded.

  #merge duplicate variants and save output in mills.merged.vcf
  vt mergedups mills.vcf -o mills.merged.vcf
  usage : vt mergedups [options] <in.vcf>
  options : -o  output VCF file [-]
            -p  merge by position [false]
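
The "keep the first, discard the later duplicate" behaviour described above can be sketched in Python as follows. This is a hedged illustration, not vt's code: the function name merge_duplicates and the record layout (chrom, pos, ref, alt) are hypothetical, and it is assumed that without -p two records are duplicates only when their alleles also match.

```python
from typing import Iterable, List, Tuple

Record = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


def merge_duplicates(records: Iterable[Record], by_position: bool = False) -> List[Record]:
    """Keep the first record for each key and drop later duplicates."""
    seen = set()
    kept: List[Record] = []
    for chrom, pos, ref, alt in records:
        # by_position mirrors -p: alleles are ignored when comparing records.
        key = (chrom, pos) if by_position else (chrom, pos, ref, alt)
        if key in seen:
            continue
        seen.add(key)
        kept.append((chrom, pos, ref, alt))
    return kept
```

With by_position=True, two different alleles at the same position collapse to whichever record came first in the file.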

Discover

Discovers variants from reads in a BAM file.

  #discover variants from NA12878.bam and write to stdout
  vt discover -b NA12878.bam -s NA12878 -r hs37d5.fa -i 20 -v snps,indels,mnps
 usage : vt discover [options] 
 options : -b  input BAM file
           -v  variant types [snps,mnps,indels]
           -f  fractional evidence cutoff for candidate allele [0.1]
           -e  evidence count cutoff for candidate allele [2]
           -q  base quality cutoff for bases [13]
           -m  MAPQ cutoff for alignments [20]
           -s  sample ID
           -r  reference sequence fasta file []
           -o  output VCF file [-]
           -I  file containing list of intervals []
           -i  intervals []
           --  ignores the rest of the labeled arguments following this flag
           -h  displays help
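
How the four cutoffs above might interact at a single site can be sketched in Python. The page gives only the option names and defaults, so everything else here is an assumption: that cutoffs are inclusive, that -e and -f must both be met, and that the fraction is computed over the observations that pass the -q and -m filters. The names Obs and candidate_alleles are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class Obs(NamedTuple):
    allele: str     # allele observed in one read at the site
    base_qual: int  # base quality of that observation
    mapq: int       # mapping quality of the read


def candidate_alleles(observations: Iterable[Obs], ref_allele: str,
                      min_fraction: float = 0.1,  # default of -f
                      min_count: int = 2,         # default of -e
                      min_base_qual: int = 13,    # default of -q
                      min_mapq: int = 20          # default of -m
                      ) -> List[str]:
    """Return non-reference alleles with enough supporting evidence."""
    usable = [o for o in observations
              if o.mapq >= min_mapq and o.base_qual >= min_base_qual]
    if not usable:
        return []
    depth = len(usable)  # assumed denominator for the fractional cutoff
    counts = Counter(o.allele for o in usable if o.allele != ref_allele)
    return sorted(allele for allele, n in counts.items()
                  if n >= min_count and n / depth >= min_fraction)
```

Under these assumptions, an allele seen twice among 11 usable observations is a candidate, while the same two observations among 32 fall below the 0.1 fraction and are not.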

Merge candidate variants


Genotype


View

Views a VCF, BCF or VCF.GZ file. Support for filters and for subsetting of samples is planned.

  #views mills.bcf and outputs to standard out
  vt view mills.bcf 
 usage : vt view [options] <in.vcf>
 options : -o  output VCF/VCF.GZ/BCF file [-]
           -v  variant type []
           -p  print options and summary []
           -I  file containing list of intervals []
           -i  intervals []
           --  ignores the rest of the labeled arguments following this flag
           -h  displays help
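
Selecting records by variant type, as the -v option suggests, needs some rule for classifying a record. The Python sketch below uses a simple length-based rule with the type names snps, mnps and indels that appear on this page. The rule itself, the function names variant_type and select, and the treatment of an empty selection as "keep everything" are assumptions; vt may classify variants differently.

```python
from typing import Iterable, List, Tuple

Record = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


def variant_type(ref: str, alt: str) -> str:
    """Classify a biallelic variant by comparing allele lengths."""
    if len(ref) == len(alt):
        return "snps" if len(ref) == 1 else "mnps"
    return "indels"


def select(records: Iterable[Record], wanted: str = "") -> List[Record]:
    """Keep records whose type is in a comma-delimited list such as 'snps,indels'."""
    types = {t.strip() for t in wanted.split(",") if t.strip()}
    if not types:
        return list(records)  # assumed: an empty selection applies no filter
    return [r for r in records if variant_type(r[2], r[3]) in types]
```

Because the rule looks only at allele lengths, a padded substitution such as ACA to AGA is reported as mnps unless the record is normalized first.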

Maintained by

This page is maintained by Adrian