Changes

From Genome Analysis Wiki
Line 32: Line 32:  
== How to observe that a variant is not left aligned or parsimonious on the right side? ==
 
   −
     If each allele ends with the same nucleotide, it is not left aligned or not right parsimonious.
+
     All alleles end with the same nucleotide if and only if the variant is not left aligned or not right parsimonious.
   −
We prove the contrapositive:  
+
For the => direction, we prove the contrapositive:
    
* If an indel is left aligned and right parsimonious, then the alleles do not all end with the same nucleotide.
   −
We first assume an indel is already left aligned and right parsimonious.  Suppose all alleles have a length greater than 1, since the indel is right parsimonious, clearly, each allele do not end with the same type of nucleotide.  Now, suppose that there exists an allele of length 1 and that all the alleles end with a particular nucleotide say 'A'.  This is still considered right parsimonious as there are no superfluous nucleotides to remove without resulting in an empty allele.  It is possible to extend all the alleles one position to the left by copying from a nucleotide on the reference genome, so now we have a superfluous nucleotide on the right side and can remove that nucleotide resulting in a new representation that shifts the Indel to the left by one position where one of the alleles is of length one.  This is left aligning the Indel and thus there is a contradiction, so each allele cannot end with the same type of nucleotide.  This completes the proof.
+
Assume an indel is left aligned and right parsimonious. If every allele has length greater than 1, then right parsimony immediately implies that the alleles do not all end with the same nucleotide, since a shared rightmost nucleotide could otherwise be removed. Now suppose some allele has length 1 and all the alleles end with the same nucleotide, say 'A'. The representation still counts as right parsimonious, because no nucleotide can be removed without producing an empty allele. However, every allele can be extended one position to the left by copying the preceding nucleotide from the reference genome. The shared rightmost nucleotide then becomes superfluous and can be removed, giving a new representation that is shifted one position to the left and again contains an allele of length 1. This shift is a left alignment step, which contradicts the assumption that the indel is already left aligned, so the alleles cannot all end with the same nucleotide.
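
The shift used in this argument (extend every allele one position to the left, then drop the shared rightmost nucleotide) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the normalization algorithm of this page: the function name `shift_left_once`, the 0-based `pos` convention, and passing the reference as a plain string are all hypothetical choices.

```python
def shift_left_once(ref_seq, pos, alleles):
    """Shift a variant one position to the left (hypothetical helper).

    ref_seq : reference sequence as a string (assumed representation)
    pos     : 0-based index in ref_seq of the first base of the alleles
    alleles : list of non-empty strings, reference allele first

    Returns the new position and the new list of alleles.
    """
    if pos == 0:
        raise ValueError("no reference base to the left of the variant")
    if len({a[-1] for a in alleles}) != 1:
        raise ValueError("alleles do not all end with the same nucleotide")
    # Copy the preceding reference base onto the left of every allele,
    # then remove the rightmost base, which is shared by all alleles.
    left_base = ref_seq[pos - 1]
    return pos - 1, [left_base + a[:-1] for a in alleles]
```

For example, with the reference `GCACAT` and the alleles `CA`/`A` at 0-based position 3, `shift_left_once("GCACAT", 3, ["CA", "A"])` returns `(2, ["AC", "A"])`. Every allele keeps its length, and the new alleles no longer end with the same nucleotide.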
 +
 
 +
For the <= direction:
 +
 
 +
* If an indel is not left aligned or not right parsimonious, then all the alleles end with the same nucleotide.
 +
 
 +
Suppose a variant is not left aligned. Then it must be possible to extend the alleles one nucleotide to the left and remove one nucleotide from the right, so that each allele keeps its length. Removing the rightmost nucleotide is only possible if all the alleles end with the same nucleotide.
 +
 
 +
Suppose a variant is not right parsimonious. Then all the alleles have length greater than one and, by definition, the rightmost nucleotide is the same for all alleles and can be removed.
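
Taken together, the two directions give a simple test on the alleles alone. The following minimal Python sketch shows one way to write it; the function name `share_last_nucleotide` and the list-of-strings representation (reference allele followed by the alternate alleles) are assumptions made for illustration.

```python
def share_last_nucleotide(alleles):
    """Return True if every allele ends with the same nucleotide.

    By the statement above, True means the variant is not left aligned
    or not right parsimonious; False means it is both.
    `alleles` is a list of non-empty strings (assumed representation).
    """
    if not alleles or any(len(a) == 0 for a in alleles):
        raise ValueError("alleles must be a non-empty list of non-empty strings")
    # Compare case-insensitively so that soft-masked bases still match.
    last = alleles[0][-1].upper()
    return all(a[-1].upper() == last for a in alleles)
```

With this sketch, `share_last_nucleotide(["CA", "A"])` is `True`, so that representation can still be shifted or trimmed, while `share_last_nucleotide(["G", "GCA"])` is `False`.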
    
= Algorithm for Normalization =
 