Line 39: |
Line 39: |
| #####if not all nucleotides differ, add CLUMPED classification | | #####if not all nucleotides differ, add CLUMPED classification |
| #Variant classification is the union of the classifications of each allele present in the variant. | | #Variant classification is the union of the classifications of each allele present in the variant. |
− | #If all alleles are the same length, add MNP MNP classification. | + | #If all alleles are the same length, add MNP classification. |
| | | |
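The two rules visible above (union of the per-allele classifications, plus MNP when all alleles share one length) can be sketched as follows. This is a minimal illustration, not the tool's implementation: the function name `classify_variant` and the idea of passing precomputed per-allele classes are assumptions, and the earlier rules that produce those per-allele classes are not reproduced here. Whether the length rule is meant to apply to single-base alleles is not stated in this section, so the sketch applies it literally.

```python
def classify_variant(ref, alts, allele_classes):
    """Combine per-allele classifications into a variant classification.

    ref            -- reference allele string
    alts           -- list of alternative allele strings
    allele_classes -- one set of labels per alternative allele
                      (hypothetical input; computed by the earlier rules)
    """
    classes = set()
    # Rule: the variant classification is the union over all alleles.
    for labels in allele_classes:
        classes |= set(labels)
    # Rule: if every allele has the same length as REF, add MNP.
    if all(len(alt) == len(ref) for alt in alts):
        classes.add("MNP")
    return classes


# REF ATTTG with ALT GTTTC (CLUMPED) and ALT ATTTC (SNP)
print(sorted(classify_variant("ATTTG", ["GTTTC", "ATTTC"],
                              [{"CLUMPED"}, {"SNP"}])))
```

With the inputs from the SNP|MNP|CLUMPED example, the combined set contains all three labels named in that example's heading.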
| = Examples = | | = Examples = |
Line 60: |
Line 60: |
| MNP<br> | | MNP<br> |
| REF AT | | REF AT |
− | ALT GC #MPN, 2 ts | + | ALT GC #MNP, 2 ts |
| | | |
| INDEL<br> | | INDEL<br> |
Line 69: |
Line 69: |
| REF AT | | REF AT |
| ALT T #INDEL, 1 del | | ALT T #INDEL, 1 del |
− | | + | #Note that although the padding base differs (A vs T), this is still a simple indel because it is just the deletion of an A base. |
− | INDEL<br>
| + | #If you right align this instead of left aligning, then the padding will be T on both the reference and alternative alleles. |
− | REF AT
| + | #Simple Indel classification should be invariant whether it is left or right aligned. |
− | ALT G #SNP,INDEL, 1 ts, 1 del
| |
| | | |
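The note on the REF AT / ALT T example says the indel classification should not depend on which side the padding base sits. One way to see this is to trim the bases shared by both alleles before classifying. The sketch below is an illustration under assumptions: the helper name `trim_alleles` is hypothetical, and the second input `TA`/`T` is a form chosen here to stand in for a representation where the padding is T on both alleles.

```python
def trim_alleles(ref, alt):
    """Remove bases shared by REF and ALT, first from the right, then from the left."""
    # Strip the common suffix.
    while ref and alt and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Strip the common prefix.
    while ref and alt and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
    return ref, alt


# Padding base on the right: REF AT, ALT T
print(trim_alleles("AT", "T"))
# Padding base on the left (illustrative form): REF TA, ALT T
print(trim_alleles("TA", "T"))
```

Both calls reduce to the same pair, an `A` on the reference side and an empty alternative, that is, a one-base deletion of A regardless of where the padding base was placed.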
| SV<br> | | SV<br> |
Line 83: |
Line 82: |
| REF AT | | REF AT |
| ALT G #SNP, INDEL, 1 ts | | ALT G #SNP, INDEL, 1 ts |
| + | #Note that it is ambiguous which pairing should be treated as the SNP, so the transition or transversion contribution is |
| + | #not well defined. Treating it as an A/G SNP gives a transition, but treating it as a T/G SNP gives a transversion. |
| + | #In such ambiguous cases, we simply take the aligned bases after left alignment to get the transition |
| + | #and transversion contribution. Be aware that this remains an ambiguous case; it is better to treat it simply |
| + | #as a complex variant. |
| | | |
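The ts/tv counts in these examples follow the usual definition: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, anything else is a transversion. A minimal sketch, with the hypothetical helper name `ts_or_tv`, makes the ambiguity of the REF AT / ALT G case concrete:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_or_tv(a, b):
    """Return 'ts' for a transition and 'tv' for a transversion between two bases."""
    if a == b:
        raise ValueError("bases are identical, not a substitution")
    pair = {a, b}
    if not pair <= (PURINES | PYRIMIDINES):
        raise ValueError("expected bases from A, C, G, T")
    # Same chemical class on both sides means a transition.
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "ts" if same_class else "tv"


# REF AT, ALT G: the result depends on which reference base is paired with G.
print(ts_or_tv("A", "G"))  # pairing with A
print(ts_or_tv("T", "G"))  # pairing with T
```

Pairing G with A gives a transition and pairing it with T gives a transversion, which is why the count for this record depends on the alignment convention.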
| MNP|INDEL<br> | | MNP|INDEL<br> |
Line 90: |
Line 94: |
| MNP|CLUMPED<br> | | MNP|CLUMPED<br> |
| REF ATTTT | | REF ATTTT |
− | ALT GTTTC #MNP, CLUMEPD, 2 ts | + | ALT GTTTC #MNP, CLUMPED, 2 ts |
− | #since all the alleles are of the sample length, classified as MNP too.
| + | #since all the alleles are of the same length, classified as MNP too. |
| | | |
| INDEL|CLUMPED<br> | | INDEL|CLUMPED<br> |
Line 120: |
Line 124: |
| ALT GT #SNP, 1 ts | | ALT GT #SNP, 1 ts |
| ALT AC #SNP, 1 ts | | ALT AC #SNP, 1 ts |
− | #since all the alleles are of the sample length, classified as MNP too.
| + | #since all the alleles are of the same length, classified as MNP too. |
| | | |
| SNP|MNP|CLUMPED<br> | | SNP|MNP|CLUMPED<br> |
| REF ATTTG | | REF ATTTG |
| ALT GTTTC #CLUMPED, 1 ts, 1 tv | | ALT GTTTC #CLUMPED, 1 ts, 1 tv |
− | ALT ATTTC #SNP, 1 tv | + | ALT ATTTC #SNP, 1 tv, note that the SNP appears after trimming the shared leading bases ATTT, which reveals a G/C transversion |
− | #since all the alleles are of the sample length, classified as MNP too.
| + | #since all the alleles are of the same length, classified as MNP too. |
| | | |
| SNP|MNP|INDEL<br> | | SNP|MNP|INDEL<br> |
Line 139: |
Line 143: |
| ALT AG #MNP, INDEL, 1 ts, 1 tv | | ALT AG #MNP, INDEL, 1 ts, 1 tv |
| ALT GTGTG #SNP, INDEL, CLUMPED, 1 tv, 1 ins | | ALT GTGTG #SNP, INDEL, CLUMPED, 1 tv, 1 ins |
− |
| |
− | == Weird Examples ==
| |
| | | |
| == Structured Variants Examples == | | == Structured Variants Examples == |
Line 152: |
Line 154: |
| ALT <CN4> #SV | | ALT <CN4> #SV |
| ALT <CN12> #SV | | ALT <CN12> #SV |
| + | |
| + | = Interesting Variant Types = |
| + | |
| + | Adjacent Tandem Repeats from lobSTR's tandem repeat finder panel. <br> |
| + | |
| + | |
| + | 20 9538655 <span style="color:#FF0000">ATTTATTTATTTATTTATTTATTTATTTATTTATTTATT</span><span style="color:#0000FF">CATTCATTCATTCATTCATTCATTC </span> <STR> |
| + | |
| + | This can be represented as |
| + | |
| + | one record considering only the ATTT repeats |
| + | 20 9538655 <span style="color:#FF0000">ATTTATTTATTT </span> <span style="color:#FF0000">ATTT </span> |
| + | |
| + | one record with CATT repeats |
| + | 20 9538695 <span style="color:#0000FF">CATTCATT </span> <span style="color:#0000FF">CATT </span> |
| + | |
| + | one record with a mix of both repeat types |
| + | 20 9538695 <span style="color:#FF0000">TATT</span><span style="color:#0000FF">CATTCATT </span> <span style="color:#0000FF">CATT </span> |
| + | |
| + | = Representation of close by variants = |
| + | |
| + | 1:124001690 |
| + | TTTCTTT--CAAAAAAAGATAAAAAGGTATTTCATGG |
| + | TTTCTTTAAAAAAAAAAGATAAAAAGGAATTTCATGG |
| + | |
| + | a single complex variant |
| + | CHROM POS REF ALT |
| + | 1 124001690 C AAA |
| + | |
| + | an Indel and SNP adjacent to one another |
| + | CHROM POS REF ALT |
| + | 1 124001689 T TAA |
| + | 1 124001690 C A |
| + | |
| + | Representing it as a single complex variant enforces that both "indel" and "SNP" are always together. |
| + | Representing it as 2 separate variants allows both alleles to segregate independently. |
| | | |
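When both alleles of the two-record form are present together, the two representations above describe the same alternative haplotype. The sketch below checks this by applying each set of records to a reference fragment. It is an illustration under assumptions: the function name `apply_variants` is hypothetical, the fragment is the first part of the reference row of the alignment with gaps removed, and its start coordinate 124001683 is inferred from the records' positions rather than stated in the text.

```python
def apply_variants(ref_seq, start, variants):
    """Apply (pos, ref, alt) records to ref_seq, whose first base is at coordinate `start`."""
    out = ref_seq
    # Work from right to left so that earlier coordinates stay valid.
    for pos, ref, alt in sorted(variants, reverse=True):
        i = pos - start
        if out[i:i + len(ref)] != ref:
            raise ValueError(f"REF {ref} does not match sequence at {pos}")
        out = out[:i] + alt + out[i + len(ref):]
    return out


REF_SEQ = "TTTCTTTCAAAAAAAG"  # gap-free start of the reference row (assumed fragment)
START = 124001683             # assumed coordinate of the first base

complex_form = [(124001690, "C", "AAA")]
split_form = [(124001689, "T", "TAA"), (124001690, "C", "A")]

print(apply_variants(REF_SEQ, START, complex_form))
print(apply_variants(REF_SEQ, START, split_form))
```

Both calls print the same sequence. The difference between the representations lies only in what they permit: the split form also allows a haplotype carrying just the insertion or just the SNP.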
| = Output = | | = Output = |