Changes

From Genome Analysis Wiki
Line 12:
== Parsimony ==
 
Parsimony means doing something in the simplest and most economical way. In the context of variant representation, this means representing a variant in as few nucleotides as possible without reducing the length of any allele to 0. In the example below, the Multi Nucleotide Polymorphism (MNP) is represented superfluously in the first 3 representations and parsimoniously in the 4th. When a variant has superfluous nucleotides on its left side, it is not left parsimonious, and it needs to be left trimmed. The concept is symmetric for right parsimony and right trimming. Parsimony applies to Indels too, which we shall demonstrate in the left alignment section.
    
[[Image:normalization_mnp.png|none|700px|This figure shows multiple representations of a MNP. The left shows 4 possible representations differentiated by color. The right shows the corresponding representation in VCF. The last representation represents the parsimonious representation of the MNP.]]  
 
 
This figure shows multiple representations of an MNP. The left shows 4 possible representations differentiated by color. The right shows the corresponding representation in VCF. The last representation is the parsimonious representation of the MNP.
 
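The trimming described above can be sketched in a few lines of Python. This is a minimal illustration, not the wiki's reference implementation: the function name `trim`, the example alleles, and the position are hypothetical, and the choice to right trim before left trim is an assumption that makes no difference for this MNP.

```python
def trim(ref: str, alts: list[str], pos: int) -> tuple[int, str, list[str]]:
    """Trim shared bases from both ends without emptying any allele.

    Hypothetical helper: right trims first, then left trims (assumed order).
    """
    alleles = [ref, *alts]

    # Right trim: drop the last base while every allele ends with the same
    # base and every allele would still keep at least one nucleotide.
    while min(len(a) for a in alleles) > 1 and len({a[-1] for a in alleles}) == 1:
        alleles = [a[:-1] for a in alleles]

    # Left trim: same idea on the left side; each removed base moves the
    # variant's start position one step to the right.
    while min(len(a) for a in alleles) > 1 and len({a[0] for a in alleles}) == 1:
        alleles = [a[1:] for a in alleles]
        pos += 1

    return pos, alleles[0], alleles[1:]


# Made-up MNP with one superfluous base on the left and two on the right.
print(trim("GATCA", ["GTCCA"], 10))
```

With these made-up inputs, the two trailing bases `CA` and the leading base `G` are removed, so the position advances by one and the alleles shrink to two nucleotides each.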
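Based on the definition of parsimony, it is easy to see that:

If a variant is non-parsimonious, all of its alleles must have length greater than 1.

However, the converse is not true: a variant whose alleles are all longer than one nucleotide can still be parsimonious, for example an MNP with REF AT and ALT TC, which shares no base at either end.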
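The statement above and the failure of its converse can be checked with a small predicate. This is only a sketch: `is_parsimonious` is a hypothetical name and the alleles are made up for illustration.

```python
def is_parsimonious(alleles: list[str]) -> bool:
    """Return True when no base can be trimmed from either end.

    Hypothetical helper; `alleles` holds REF followed by the ALT alleles.
    """
    # If any allele has length 1, trimming would reduce it to length 0,
    # so the variant is already parsimonious.
    if min(len(a) for a in alleles) == 1:
        return True
    shared_left = len({a[0] for a in alleles}) == 1
    shared_right = len({a[-1] for a in alleles}) == 1
    return not (shared_left or shared_right)


# Non-parsimonious variant: every allele is longer than 1.
print(is_parsimonious(["GATCA", "GTCCA"]))
# Converse fails: every allele is longer than 1, yet nothing can be trimmed.
print(is_parsimonious(["AT", "TC"]))
# An allele of length 1 rules out any trimming.
print(is_parsimonious(["A", "AT"]))
```

The early return encodes the observation directly: a variant can only be non-parsimonious when all of its alleles are longer than one nucleotide.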
    
== Left alignment ==
 