Changes

From Genome Analysis Wiki
1,713 bytes added, 23:32, 26 March 2015
Line 104:
Thus A and B have to be at the same position and have the same length and variant normalization is unique.
 
= I can find an example where the normalization algorithm fails =

Hi Terry,

Thanks for the report. This is an interesting example.

Before I begin, I would like to distinguish between normalization and decomposition of variants (as we define them).

Decomposition of variants breaks a variant record down into multiple records. It may be done vertically, as when a multiallelic record is split into biallelic records, or horizontally, as when a cluster of indels and SNPs represented as a complex variant is split into several records. Horizontal decompositions in general do not have a unique solution.

Normalization ensures that the representation of a variant record is left aligned and parsimonious; it does not increase or decrease the number of records representing that variant. Normalization can be applied to biallelic or multiallelic variants. The problem of normalization is solvable: there exists a unique representation that is left aligned and parsimonious. A mathematical proof has been published. [http://bioinformatics.oxfordjournals.org/content/early/2015/03/22/bioinformatics.btv112]

Supporting haplotype reconstruction is not actually the goal of vt's normalization. It is meant to allow one to compare the alleles of variant call sets from different variant callers, possibly applied to multiple samples.

The notion of normalization that you described involves reconstruction of haplotypes, and you are right to say that there should be some inbuilt collision detection mechanism. It was a really nice example, and in the context of a single sample, under the assumption that the alternate alleles must occur on the same haplotype, it is correct.
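
To make the idea of a left aligned and parsimonious representation concrete, here is a minimal Python sketch of record-level normalization. It is an illustration only and is not vt's actual code: the function name <code>normalize</code>, the 0-based position, and the use of a plain string for the reference are assumptions made for the example. The number of records is unchanged; only the position and the allele strings of one record are rewritten.

```python
def normalize(reference, pos, alleles):
    """Return (pos, alleles) left aligned and parsimonious.

    reference -- reference sequence as a plain string (assumption for this sketch)
    pos       -- 0-based position of the first base of the alleles (assumption)
    alleles   -- list of allele strings, reference allele first
    """
    alleles = [a.upper() for a in alleles]
    reference = reference.upper()
    if len(set(alleles)) < 2:
        raise ValueError("need at least two distinct alleles")
    if reference[pos:pos + len(alleles[0])] != alleles[0]:
        raise ValueError("reference allele does not match the reference sequence")

    changed = True
    while changed:
        changed = False
        # Right trim: drop the last base while every allele ends with the same base.
        if all(alleles) and len({a[-1] for a in alleles}) == 1:
            alleles = [a[:-1] for a in alleles]
            changed = True
        # Left extend: if any allele became empty, prepend the preceding reference base.
        if any(a == "" for a in alleles):
            if pos == 0:
                raise ValueError("cannot extend past the start of the reference")
            pos -= 1
            alleles = [reference[pos] + a for a in alleles]
            changed = True

    # Left trim: drop a shared leading base while every allele keeps at least one base.
    while min(len(a) for a in alleles) >= 2 and len({a[0] for a in alleles}) == 1:
        alleles = [a[1:] for a in alleles]
        pos += 1

    return pos, alleles


# Two different representations of the same CA deletion in a CA repeat.
REFERENCE = "GGGCACACACAGGG"
right_aligned = normalize(REFERENCE, 9, ["CAG", "G"])
padded = normalize(REFERENCE, 2, ["GCACA", "GCA"])
print(right_aligned, padded)
```

Both calls describe the same deletion, and both should reduce to the same position and allele pair, which is what makes allele comparison across call sets possible. Note that the sketch treats each record independently, so it does nothing to detect collisions between neighbouring records on the same haplotype.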
    
= Implementation =
 