Changes

From Genome Analysis Wiki
2,340 bytes added, 11:31, 26 September 2014
Line 198:
   #normalize variants, send to standard out and remove duplicates.
   vt normalize dbsnp.vcf -r seq.fa | vt mergedups - -o dbsnp.normalized.merged.vcf
 +
 +
  #variants that are normalized will be annotated with an OLD_VARIANT info tag.
 +
  #CHROM  POS      ID  REF          ALT  QUAL  FILTER  INFO
 +
  19   29238772 . C            G    .    PASS VT=SNP;OLD_VARIANT=19:29238771:TC/TG
 +
  20   60674709 . GCCCAGCCCCAC  G    .    PASS VT=INDEL;OLD_VARIANT=20:60674718:CACCCCAGCCCC/C
 +
 +
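
The OLD_VARIANT tag in the sample records packs the pre-normalization chromosome, position and alleles into one string. A minimal Python sketch for unpacking it, assuming the <code>CHROM:POS:REF/ALT</code> layout seen in the two records above; <code>parse_info</code> and <code>parse_old_variant</code> are hypothetical helpers for illustration and are not part of vt.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-style entries map to True."""
    fields = {}
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


def parse_old_variant(info):
    """Return (chrom, pos, ref, alt) from OLD_VARIANT, or None if the tag is absent.

    Assumes the CHROM:POS:REF/ALT layout of the sample records.
    """
    old = parse_info(info).get("OLD_VARIANT")
    if old is None:
        return None
    # rsplit keeps any ':' inside the chromosome name intact
    chrom, pos, alleles = old.rsplit(":", 2)
    ref, alt = alleles.split("/", 1)
    return chrom, int(pos), ref, alt


print(parse_old_variant("VT=SNP;OLD_VARIANT=19:29238771:TC/TG"))
# ('19', 29238771, 'TC', 'TG')
```

Comparing the returned position and alleles with the POS, REF and ALT columns of the same record shows how far the variant moved during normalization.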
  #sample summary output: the normalization operations applied are counted
 +
  #in 5 categories each for biallelic and multiallelic variants. <br>
 +
  stats: biallelic
 +
          no. left trimmed                      : 156908
 +
          no. right trimmed                     : 323
 +
          no. left and right trimmed            : 33
 +
          no. right trimmed and left aligned    : 7
 +
          no. left aligned                      : 12360 <br>
 +
      total no. biallelic normalized            : 169631 <br> <br>
 +
      multiallelic
 +
          no. left trimmed                      : 627189
 +
          no. right trimmed                     : 2509
 +
          no. left and right trimmed            : 1498
 +
          no. right trimmed and left aligned    : 212
 +
          no. left aligned                      : 1783 <br>
 +
      total no. multiallelic normalized         : 633191 <br>
 +
      total no. variants normalized             : 802822
 +
      total no. variants observed               : 88052639
 +
 +
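
The five categories in each group of the sample summary add up to that group's total, and the two group totals add up to the overall number of normalized variants. A short Python sketch that re-derives these sums from the figures printed above; the dictionary and variable names are illustrative only.

```python
# Counts copied from the sample summary output above.
biallelic = {
    "left trimmed": 156908,
    "right trimmed": 323,
    "left and right trimmed": 33,
    "right trimmed and left aligned": 7,
    "left aligned": 12360,
}
multiallelic = {
    "left trimmed": 627189,
    "right trimmed": 2509,
    "left and right trimmed": 1498,
    "right trimmed and left aligned": 212,
    "left aligned": 1783,
}
observed = 88052639

total_biallelic = sum(biallelic.values())
total_multiallelic = sum(multiallelic.values())
total_normalized = total_biallelic + total_multiallelic

print(total_biallelic)     # 169631
print(total_multiallelic)  # 633191
print(total_normalized)    # 802822
print(f"fraction of observed variants normalized: {total_normalized / observed:.2%}")
```

In this sample run, a little under 1% of the observed variants needed normalization.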
<div class="mw-collapsible-content">
 +
  usage : vt normalize [options] <in.vcf>
 +
 +
  options : -o  output VCF file [-]
 +
            -I  file containing list of intervals []
 +
            -i  intervals []
 +
            -r  reference sequence fasta file []
 +
            --  ignores the rest of the labeled arguments following this flag
 +
            -h  displays help
 +
</div>
 +
</div>
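
When vt normalize is driven from a script, the options listed in the usage text can be assembled into an argument list. A minimal Python sketch, assuming only the flags shown above (<code>-r</code>, <code>-o</code>, <code>-i</code>, <code>-I</code>); the helper <code>vt_normalize_cmd</code> and the output file name are hypothetical.

```python
import shlex


def vt_normalize_cmd(in_vcf, ref_fasta, out_vcf="-", intervals=None, interval_file=None):
    """Build the argv list for `vt normalize` from the documented options.

    out_vcf defaults to "-", the default shown in the usage text.
    """
    cmd = ["vt", "normalize", in_vcf, "-r", ref_fasta, "-o", out_vcf]
    if intervals:
        cmd += ["-i", intervals]        # intervals given directly
    if interval_file:
        cmd += ["-I", interval_file]    # file containing a list of intervals
    return cmd


cmd = vt_normalize_cmd("dbsnp.vcf", "seq.fa", out_vcf="dbsnp.normalized.vcf")
print(shlex.join(cmd))
# vt normalize dbsnp.vcf -r seq.fa -o dbsnp.normalized.vcf
```

If vt is installed and on the PATH, the list can be passed to <code>subprocess.run(cmd, check=True)</code>; the sketch itself only builds the command and does not run it.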
 +
 +
=== Decompose ===
 +
 +
<div>
 +
Decompose multiallelic variants in a [http://www.1000genomes.org/wiki/analysis/variant-call-format/vcf-variant-call-format-version-42 VCF] file into biallelic variants.
 +
</div>
 +
 +
<div class=" mw-collapsible mw-collapsed">
 +
  #decomposes multiallelic variants into biallelic variants and writes the output to gatk.decomposed.vcf
 +
  vt decompose gatk.vcf -o gatk.decomposed.vcf
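
To illustrate what splitting a multiallelic variant into biallelic variants means, here is a minimal Python sketch for a sites-only VCF line. It is not vt's implementation: it assumes a record without genotype columns and copies every other column unchanged. The function name <code>decompose_site</code> and the example record are made up for illustration.

```python
def decompose_site(line):
    """Split one tab-separated, sites-only VCF record into one record per ALT allele.

    Columns other than ALT (index 4) are copied as-is.
    """
    cols = line.rstrip("\n").split("\t")
    alts = cols[4].split(",")
    return ["\t".join(cols[:4] + [alt] + cols[5:]) for alt in alts]


# Illustrative record with two alternate alleles (not taken from a real call set).
record = "1\t1000\t.\tA\tC,T\t.\tPASS\t."
for out in decompose_site(record):
    print(out)
```

A record that is already biallelic passes through as a single unchanged line.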
    
   #variants that are normalized will be annotated with an OLD_VARIANT info tag.
 