Changes

From Genome Analysis Wiki
490 bytes removed, 11:47, 26 September 2014
Line 238: Line 238:     
<div>
 
<div>
−
[http://genome.sph.umich.edu/wiki/Variant_Normalization Normalize] variants in a [http://www.1000genomes.org/wiki/analysis/variant-call-format/vcf-variant-call-format-version-42 VCF] file. Normalized variants may have their positions changed; in such cases, the normalized variants are reordered and output in an ordered fashion. The local reordering takes place over a window of 10000 base pairs.
+
Decompose multiallelic variants in a [http://www.1000genomes.org/wiki/analysis/variant-call-format/vcf-variant-call-format-version-42 VCF] file. If the VCF file has genotype fields GT, PL or GL, they are modified to reflect the change in alleles. All other genotype fields are removed.
   
</div>
 
</div>
   Line 247: Line 246:  
   vt decompose gatk.vcf -o gatk.decomposed.vcf
 
   vt decompose gatk.vcf -o gatk.decomposed.vcf
−
   #variants that are normalized will be annotated with an OLD_VARIANT info tag.
−
   #CHROM  POS       ID  REF           ALT  QUAL  FILTER  INFO
−
   19      29238772  .   C             G    .     PASS    VT=SNP;OLD_VARIANT=19:29238771:TC/TG
−
   20      60674709  .   GCCCAGCCCCAC  G    .     PASS    VT=INDEL;OLD_VARIANT=20:60674718:CACCCCAGCCCC/C
−
   #this shows a sample output with the normalization operations that were used
−
   #categorized into 5 categories each for biallelic and multiallelic variants. <br>
−
   stats: biallelic
−
          no. left trimmed                      : 156908
−
          no. right trimmed                     : 323
−
          no. left and right trimmed            : 33
−
          no. right trimmed and left aligned    : 7
−
          no. left aligned                      : 12360 <br>
−
       total no. biallelic normalized           : 169631 <br> <br>
−
       multiallelic
−
          no. left trimmed                      : 627189
−
          no. right trimmed                     : 2509
−
          no. left and right trimmed            : 1498
−
          no. right trimmed and left aligned    : 212
−
          no. left aligned                      : 1783 <br>
−
       total no. multiallelic normalized        : 633191 <br>
−
       total no. variants normalized            : 802822
−
       total no. variants observed              : 88052639
+
   #before decomposition
+
   #CHROM  POS      ID  REF  ALT         QUAL  FILTER  INFO                  FORMAT    S1                                      S2
+
   1       3759889  .   TA   TAA,TAAA,T  .     PASS    AF=0.342,0.173,0.037  GT:DP:PL  1/2:81:281,5,9,58,0,115,338,46,116,809  0/0:86:0,30,323,31,365,483,38,291,325,567
+
   #after decomposition
+
   #CHROM  POS      ID  REF  ALT   QUAL  FILTER  INFO                                                            FORMAT  S1              S2
+
   1       3759889  .   TA   TAA   .     PASS    AF=0.342,0.173,0.037;OLD_MULTIALLELIC=1:3759889:TA/TAA/TAAA/T   GT:PL   1/.:281,5,9     0/0:0,30,323
+
   1       3759889  .   TA   TAAA  .     .       OLD_MULTIALLELIC=1:3759889:TA/TAA/TAAA/T                        GT:PL   ./1:281,58,115  0/0:0,31,483
+
   1       3759889  .   TA   T     .     .       OLD_MULTIALLELIC=1:3759889:TA/TAA/TAAA/T                        GT:PL   ./.:281,338,809 0/0:0,38,567
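The GT and PL values in the decomposition example follow the standard VCF ordering of diploid genotype likelihoods, where genotype a/b (with a <= b) sits at index b(b+1)/2 + a. The sketch below reproduces the S1 and S2 values from that example. It is an illustration only, not vt's actual implementation; the function names `pl_index`, `decompose_pl` and `decompose_gt` are hypothetical, and it assumes diploid genotypes.

```python
from typing import List


def pl_index(a: int, b: int) -> int:
    """Index of the unordered diploid genotype a/b in a VCF PL or GL array."""
    lo, hi = sorted((a, b))
    return hi * (hi + 1) // 2 + lo


def decompose_pl(pl: List[int], n_alt: int) -> List[List[int]]:
    """Split a multiallelic diploid PL array into one biallelic PL per ALT allele.

    For ALT allele k, keep the likelihoods of genotypes 0/0, 0/k and k/k.
    """
    expected = (n_alt + 1) * (n_alt + 2) // 2
    if len(pl) != expected:
        raise ValueError(f"expected {expected} PL values, got {len(pl)}")
    return [
        [pl[pl_index(0, 0)], pl[pl_index(0, k)], pl[pl_index(k, k)]]
        for k in range(1, n_alt + 1)
    ]


def decompose_gt(gt: str, n_alt: int) -> List[str]:
    """Recode a genotype for each ALT allele k: 0 stays 0, k becomes 1,
    and any other allele (or an already missing one) becomes '.'."""
    sep = "|" if "|" in gt else "/"
    alleles = gt.split(sep)
    recoded = []
    for k in range(1, n_alt + 1):
        calls = []
        for allele in alleles:
            if allele == "0":
                calls.append("0")
            elif allele == str(k):
                calls.append("1")
            else:
                calls.append(".")
        recoded.append(sep.join(calls))
    return recoded


if __name__ == "__main__":
    # Sample S1 from the example record 1:3759889 TA -> TAA,TAAA,T
    s1_pl = [281, 5, 9, 58, 0, 115, 338, 46, 116, 809]
    for gt, pl in zip(decompose_gt("1/2", 3), decompose_pl(s1_pl, 3)):
        print(gt + ":" + ",".join(map(str, pl)))
```

Running the script prints one GT:PL value per decomposed record for sample S1, which can be compared against the S1 column of the "after decomposition" rows.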
      
<div class="mw-collapsible-content">
 
<div class="mw-collapsible-content">