Difference between revisions of "Vt"

From Genome Analysis Wiki

Revision as of 11:36, 5 November 2013

Introduction

vt is a tool set for discovering short variants from next-generation sequencing data. Features are being released to GitHub incrementally while a major rewrite is under way.

Location

The source code is hosted on GitHub.

 git clone https://github.com/atks/vt.git

Common options

   -i   multiple intervals in <seq>:<start>-<end> format, delimited by commas.
   -I   multiple intervals in <seq>:<start>-<end> format, listed one per line in a text file.
   -o   output file; defaults to STDOUT.
        STDOUT output may also be set to the binary version of the format.
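The interval syntax accepted by -i and -I can be illustrated with a short parser. This is a minimal Python sketch of the <seq>:<start>-<end> format as described above, not vt's own parsing code; the names Interval, parse_intervals and read_interval_file are hypothetical, and the checks (start must not exceed end, sequence names may contain a colon) are assumptions rather than documented vt behaviour.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    seq: str
    start: int
    end: int


def parse_intervals(spec: str) -> List[Interval]:
    """Parse a comma-delimited list of <seq>:<start>-<end> intervals."""
    intervals = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        # rpartition tolerates sequence names that themselves contain ':'
        seq, _, span = item.rpartition(":")
        start_text, _, end_text = span.partition("-")
        if not seq or not start_text or not end_text:
            raise ValueError(f"malformed interval: {item!r}")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"start exceeds end in interval: {item!r}")
        intervals.append(Interval(seq, start, end))
    return intervals


def read_interval_file(path: str) -> List[Interval]:
    """Read intervals listed line by line, as the -I option describes."""
    with open(path) as handle:
        return [iv for line in handle for iv in parse_intervals(line)]
```

For instance, parse_intervals("20:1000-2000,X:5-10") yields two intervals, one on sequence 20 and one on sequence X.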

Programs

Normalization

Normalize variants in a VCF file.

Normalized variants may have their positions changed; in such cases, the normalized variants are reordered and output in an ordered fashion. The local reordering takes place over a window of 10000 base pairs.
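The local reordering described above can be pictured as a bounded buffer that holds records until no later record can precede them. The following Python sketch shows one way to do this with a heap; it is an illustration under stated assumptions, not vt's implementation. It assumes a single sequence, that the input was sorted before normalization, and that normalization moves a variant by at most the window size. The Variant tuple and the function name reorder_within_window are hypothetical.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# (position, ref allele, alt allele) on a single sequence; illustrative only
Variant = Tuple[int, str, str]


def reorder_within_window(
    variants: Iterable[Variant], window: int = 10000
) -> Iterator[Variant]:
    """Yield variants in position order, assuming no record is displaced
    by more than `window` base pairs from its sorted place."""
    heap: List[Variant] = []
    for variant in variants:
        heapq.heappush(heap, variant)
        newest_pos = variant[0]
        # A buffered record more than `window` bp behind the newest one
        # cannot be overtaken by any later record, so it is safe to emit.
        while heap and heap[0][0] < newest_pos - window:
            yield heapq.heappop(heap)
    # Flush whatever is still buffered at end of input.
    while heap:
        yield heapq.heappop(heap)
```

Only the records inside the current window are kept in memory, which is why a bounded window lets the output stay ordered without sorting the whole file.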

  #normalize variants and write to a file.
  vt normalize mills.vcf -r seq.fa -o mills.normalized.vcf
  #normalize variants, send to standard out and remove duplicates.
  vt normalize mills.vcf -r seq.fa -o - | vt merge_duplicate_variants - -o mills.normalized_merged.vcf
  usage : vt normalize [options] <in.vcf>
  options : -o  output VCF file [-]
            -I  file containing list of intervals []
            -i  intervals []
            -r  reference sequence fasta file []
            --  ignores the rest of the labeled arguments following this flag
            -h  displays help

Merge duplicate variants

Merges duplicate variants by position, optionally taking alleles into account. Merging here simply discards the duplicate variant that appears later in the VCF file.

  Options:
  -i,  --input-vcf <string>  : Input VCF file
  -o,  --output-vcf <string> : Output VCF file [-]
  -p,  --merge-by-position   : Merge by position [false]
  Examples:
  vt merge_duplicate_variants -i 8904indels.dups.genotypes.vcf -o out.vcf
  vt merge_duplicate_variants -p -i 8904indels.dups.genotypes.vcf -o out.vcf
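The "discard the later duplicate" behaviour can be sketched in a few lines of Python. This is an illustration, not vt's code. It assumes that without -p two records are duplicates when sequence, position and both alleles match, and that with -p only sequence and position are compared; that reading of the option is an assumption. The Variant tuple and the function name drop_later_duplicates are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# (sequence, position, ref allele, alt allele); illustrative only
Variant = Tuple[str, int, str, str]


def drop_later_duplicates(
    variants: Iterable[Variant], by_position_only: bool = False
) -> Iterator[Variant]:
    """Keep the first record for each key and discard later duplicates."""
    seen = set()
    for variant in variants:
        seq, pos, ref, alt = variant
        # Assumed meaning of -p: ignore alleles when comparing records.
        key = (seq, pos) if by_position_only else (seq, pos, ref, alt)
        if key in seen:
            continue  # a record with this key was already emitted
        seen.add(key)
        yield variant
```

With by_position_only=True, two different alleles at the same site collapse to the first record seen, so that mode drops more records than the allele-aware comparison.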

Maintained by

This page is maintained by Adrian