Changes

From Genome Analysis Wiki
Line 434:
To normalize and remove duplicate variants:
−  ${GC}/bin/vt normalize mills.genotypes.bcf -r hs37d5.fa | ${GC}/bin/vt mergedups - -o mills.normalized.genotypes.bcf
+  ${GC}/bin/vt normalize ${VTREF}/mills_indels_hg19.sites.bcf -r ${VTREF}/hs37d5.fa | ${GC}/bin/vt mergedups - -o ${OUT}/mills.normalized.genotypes.bcf
    
and you will observe that 3994 variants had to be left aligned and 1092 variants were removed.
Line 455:
         Total number of unique variants    8904 <br>
+ Let's look for a variant that was normalized.
+ ${GC}/bin/vt view ${OUT}/mills.normalized.genotypes.bcf | grep OLD_VARIANT | head -1
+
+ Results:
+ 1 18293097 . T TCTC . PASS VC=INDEL;AC=263;AF=0.84;AN=314;OLD_VARIANT=1:18293100:C/CCTC
+
+ The position has changed - it was:
+ * 18293100 (as seen after OLD_VARIANT)
+ Now it is
+ * 18293097
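
The shift between the old and new positions can also be computed directly from a record. The following is a minimal sketch, not part of vt: `normalization_shift` is a hypothetical helper, and it assumes tab-separated VCF text on standard input with a single `OLD_VARIANT` value of the form `chrom:pos:ref/alt`, as in the record shown above.

```shell
# The example record from above, written with tab separators as in VCF text.
record=$(printf '1\t18293097\t.\tT\tTCTC\t.\tPASS\tVC=INDEL;AC=263;AF=0.84;AN=314;OLD_VARIANT=1:18293100:C/CCTC')

# Hypothetical helper (not a vt command): prints chrom, old position,
# new position, and the number of bases the variant moved to the left.
normalization_shift() {
  awk -F'\t' '
    {
      n = split($8, info, ";")                 # INFO is the 8th VCF column
      for (i = 1; i <= n; i++) {
        if (info[i] ~ /^OLD_VARIANT=/) {
          sub(/^OLD_VARIANT=/, "", info[i])
          split(info[i], old, ":")             # chrom : pos : ref/alt
          printf "%s\t%d\t%d\t%d\n", $1, old[2], $2, old[2] - $2
        }
      }
    }'
}

shift_line=$(printf '%s\n' "$record" | normalization_shift)
printf '%s\n' "$shift_line"
```

For the record above, the last column is 3: the variant moved three bases to the left, from 18293100 to 18293097. On a real file, the same function could presumably be fed from `${GC}/bin/vt view ${OUT}/mills.normalized.genotypes.bcf | grep OLD_VARIANT`, the pipeline already used above.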
    
UMICH's algorithm for normalization has been adopted by Petr Danecek in bcftools and is also used in GKNO.
