From Genome Analysis Wiki
63 bytes added, 09:21, 22 May 2014
Line 1: |
Line 1: |
| = Introduction = | | = Introduction = |
| | | |
− | The Variant Call Format (VCF) is a flexible file format specification that allows us to represent many different variant types, ranging from SNPs and indels to copy number variations. However, variant representation in VCF is non-unique for SNPs and indels. A failure to recognize this will frequently result in inaccurate analyses. | + | The Variant Call Format (VCF) is a flexible file format specification that allows us to represent many different variant types, ranging from SNPs and indels to copy number variations. However, variant representation in VCF is non-unique for variants that have their reference and alternate sequences expressed explicitly. A failure to recognize this will frequently result in inaccurate analyses. |
| | | |
| On this wiki page, we describe a variant normalization procedure that is well defined for biallelic as well as multiallelic variants. We then provide a formal proof of the procedure's correctness. | | On this wiki page, we describe a variant normalization procedure that is well defined for biallelic as well as multiallelic variants. We then provide a formal proof of the procedure's correctness. |
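
To make the non-uniqueness concrete, the sketch below applies three different (position, REF, ALT) records to a toy reference sequence and shows that they all produce the same alternate haplotype. It then reduces each record to a single left-aligned, trimmed form. The reference string and the function names `apply_variant` and `normalize` are illustrative assumptions, not part of this page, and the trimming loop follows the commonly used left-align-and-trim approach, which may differ in detail from the procedure defined later on this page.

```python
def apply_variant(reference, pos, ref, alt):
    """Return the haplotype obtained by replacing ref with alt at 1-based pos."""
    start = pos - 1
    if reference[start:start + len(ref)] != ref:
        raise ValueError("REF allele does not match the reference sequence")
    return reference[:start] + alt + reference[start + len(ref):]


def normalize(reference, pos, alleles):
    """Left-align and trim a variant (assumed procedure, see lead-in).

    alleles[0] is REF, the rest are ALT alleles; pos is 1-based.
    Works for biallelic and multiallelic records.
    """
    alleles = list(alleles)
    if len(set(alleles)) < 2:
        raise ValueError("need at least two distinct alleles")
    changed = True
    while changed:
        changed = False
        # Drop the last base when every allele ends with the same base.
        if all(alleles) and len({a[-1] for a in alleles}) == 1:
            alleles = [a[:-1] for a in alleles]
            changed = True
        # If any allele is now empty, extend all alleles one base to the left.
        if any(len(a) == 0 for a in alleles):
            if pos == 1:
                raise ValueError("cannot extend past the start of the reference")
            pos -= 1
            alleles = [reference[pos - 1] + a for a in alleles]
            changed = True
    # Drop shared leading bases while every allele keeps at least one base.
    while all(len(a) >= 2 for a in alleles) and len({a[0] for a in alleles}) == 1:
        alleles = [a[1:] for a in alleles]
        pos += 1
    return pos, alleles


# Toy reference (hypothetical); the variant deletes one "CA" repeat unit.
REFERENCE = "GGGCACACACAGGG"
records = [(3, "GCA", "G"), (6, "CAC", "C"), (10, "CAG", "G")]

haplotypes = {apply_variant(REFERENCE, p, r, a) for p, r, a in records}
normalized = {(p2, tuple(al)) for p2, al in
              (normalize(REFERENCE, p, [r, a]) for p, r, a in records)}

print(haplotypes)   # a single haplotype for all three records
print(normalized)   # a single normalized record
```

The three records look different when compared field by field, which is why comparing call sets without normalizing first can miss matching variants.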