From Genome Analysis Wiki
623 bytes added
, 10:25, 5 September 2014
Line 7: |
Line 7: |
| = Definitions = | | = Definitions = |
| | | |
− | The normalization of a variant representation in VCF consists of two parts: parsimony and left alignment pertaining to the nature of a variant's length and position respectively. | + | The definition of a variant is based on the definition of each allele with respect to the reference sequence. We consider 5 major types as follows. |
| + | |
| + | ;1. SNP |
| + | : The reference and alternate sequences are both of length 1 and their nucleotides differ. |
| + | ;2. MNP |
| + | : a. The reference and alternate sequences have the same length, which must be greater than 1, and they differ at every position. |
| + | : OR |
| + | : b. All reference and alternate sequences have the same length. |
| + | ;3. INDEL |
| + | : a. The reference and alternate sequences are not the same length. |
| + | : AND |
| + | : b. Removing a subsequence from the longer sequence reduces it to the shorter sequence. |
| + | ;4. CLUMPED |
| + | : |
| + | ;5. SV |
| + | : The alternate sequence is represented by an angle-bracket tag, for example <DEL>. |
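The definitions above can be sketched as a small classifier for a single reference/alternate pair. This is only an illustrative sketch, not the wiki's or any tool's implementation: the function names `classify_allele` and `is_single_deletion` are hypothetical, the removed subsequence in the INDEL rule is assumed to be one contiguous run, only rule (a) of the MNP definition is applied, and CLUMPED is not handled because its definition is left empty above. Anything that matches none of the implemented rules is reported as `UNCLASSIFIED`.

```python
def is_single_deletion(longer: str, shorter: str) -> bool:
    """True if deleting one contiguous run from `longer` yields `shorter`.

    Assumption: the "subsequence" in the INDEL rule is contiguous.
    """
    d = len(longer) - len(shorter)
    # Try every position at which a run of length d could be removed.
    return any(longer[:i] + longer[i + d:] == shorter
               for i in range(len(shorter) + 1))


def classify_allele(ref: str, alt: str) -> str:
    """Classify one REF/ALT pair using the definitions above (hypothetical helper)."""
    ref, alt = ref.upper(), alt.upper()

    # SV: the alternate allele is a symbolic tag such as <DEL>.
    if alt.startswith("<") and alt.endswith(">"):
        return "SV"

    if len(ref) == len(alt):
        # SNP: both alleles have length 1 and the bases differ.
        if len(ref) == 1 and ref != alt:
            return "SNP"
        # MNP, rule (a) only: same length > 1 and every position differs.
        if len(ref) > 1 and all(r != a for r, a in zip(ref, alt)):
            return "MNP"
        return "UNCLASSIFIED"

    # INDEL: lengths differ and one deletion turns the longer allele
    # into the shorter one.
    longer, shorter = (ref, alt) if len(ref) > len(alt) else (alt, ref)
    if is_single_deletion(longer, shorter):
        return "INDEL"
    return "UNCLASSIFIED"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("AT", "GC"), ("ATG", "A"), ("A", "<DEL>")]:
        print(ref, alt, classify_allele(ref, alt))
```

The SV check comes first because a symbolic tag should not be compared to the reference base by base. Pairs such as `ATG`/`CC`, where the lengths differ but no single deletion relates the two alleles, fall through to `UNCLASSIFIED` in this sketch.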
| | | |
| = Example = | | = Example = |