Changes

2,453 bytes added, 21:44, 25 February 2016
Line 39: Line 39:  
#####if not all nucleotides differ, add CLUMPED classification
 
#####if not all nucleotides differ, add CLUMPED classification
 
#Variant classification is the union of the classifications of each allele present in the variant.
 
#Variant classification is the union of the classifications of each allele present in the variant.
#If all alleles are the same length, add MNP MNP classification.
+
#If all alleles are the same length, add MNP classification.
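
The last two rules can be sketched in a few lines of Python. This is only an illustration, not the tool's implementation: the function name `classify_variant` and the idea of passing in per-allele classifications as sets are assumptions, and the same-length rule is applied literally as written here (whether single-base alleles are exempted by the earlier rules is not modeled).

```python
def classify_variant(alleles, allele_classes):
    """Combine per-allele classifications into one variant classification.

    alleles        -- [REF, ALT1, ALT2, ...] as strings (hypothetical input shape)
    allele_classes -- one set of labels per ALT allele, e.g. {"SNP"}
    """
    classes = set()
    for labels in allele_classes:
        classes |= set(labels)          # union over all alleles present

    # Literal reading of the final rule: all alleles share one length -> add MNP.
    if len({len(a) for a in alleles}) == 1:
        classes.add("MNP")
    return classes


# REF AT with ALT GT and ALT AC: each ALT is a SNP, all alleles have length 2.
print(sorted(classify_variant(["AT", "GT", "AC"], [{"SNP"}, {"SNP"}])))
# A deletion: allele lengths differ, so no MNP is added.
print(sorted(classify_variant(["AT", "T"], [{"INDEL"}])))
```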
    
= Examples =
 
= Examples =
   −
We present the following examples to explain the concepts explained earlier.
+
We present the following examples to explain the classification described.
    
== Legend for examples ==
 
== Legend for examples ==
Line 60: Line 60:  
     MNP<br>
 
     MNP<br>
 
     REF  AT     
 
     REF  AT     
     ALT  GC    #MPN, 2 ts
+
     ALT  GC    #MNP, 2 ts
    
     INDEL<br>
 
     INDEL<br>
Line 69: Line 69:  
     REF  AT       
 
     REF  AT       
 
     ALT  T    #INDEL, 1 del
 
     ALT  T    #INDEL, 1 del
 
+
              #Note that although the padding base differs (A vs T), this is actually a simple indel because it is simply the deletion of an A base.  
    INDEL<br>
+
              #If you right align this instead of left aligning, then the padding will be T on both the reference and alternative alleles.
    REF AT     
+
              #Simple Indel classification should be invariant to whether the variant is left or right aligned.
    ALT  G    #SNP,INDEL, 1 ts, 1 del
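
One way to make the simple-indel check independent of the padding base is to ask whether the shorter allele can be obtained from the longer one by removing a single contiguous block of bases. The sketch below is an assumption about how such a check could look (the name `is_simple_indel` is hypothetical); it is not taken from the tool itself.

```python
def is_simple_indel(ref, alt):
    """Return True if one allele is the other with one contiguous block removed."""
    if len(ref) == len(alt):
        return False                      # same length: not an indel at all
    short, long_ = sorted((ref, alt), key=len)

    # Length of the prefix shared by both alleles.
    p = 0
    while p < len(short) and short[p] == long_[p]:
        p += 1

    # Length of the shared suffix, using only bases not consumed by the prefix.
    s = 0
    while s < len(short) - p and short[-1 - s] == long_[-1 - s]:
        s += 1

    # Every base of the shorter allele is accounted for by the shared ends.
    return p + s == len(short)


print(is_simple_indel("AT", "T"))   # True: deletion of A, whatever the padding
print(is_simple_indel("AT", "G"))   # False: G matches neither end of AT
```

Because the prefix and the suffix are both examined, the answer does not depend on whether the record was left or right aligned.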
      
     SV<br>
 
     SV<br>
Line 83: Line 82:  
     REF  AT           
 
     REF  AT           
 
     ALT  G          #SNP, INDEL, 1 ts
 
     ALT  G          #SNP, INDEL, 1 ts
 +
                    #Note that it is ambiguous which pairing should be the SNP, so the transition or transversion contribution is actually
 +
                    #not defined.  In this case, assuming it is an A/G SNP, we get a transition, but we may also consider this a T/G SNP, which
 +
                    #is a transversion.  In such ambiguous cases, we simply consider the aligned bases after left alignment to get the transition
 +
                    #and transversion contribution.  Please be aware that this is an ambiguous case; it is better to consider it simply
 +
                    #as a complex variant.
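
The ts and tv counts in these examples follow the usual definition: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, anything else is a transversion. A minimal sketch, with hypothetical helper names `ts_or_tv` and `count_ts_tv`, for alleles that are already aligned base by base:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_or_tv(ref_base, alt_base):
    """Label a single-base substitution as 'ts' or 'tv'."""
    pair = {ref_base, alt_base}
    if len(pair) != 2 or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "ts" if same_class else "tv"


def count_ts_tv(ref, alt):
    """Count transitions and transversions between equal-length alleles."""
    if len(ref) != len(alt):
        raise ValueError("alleles must be the same length")
    labels = [ts_or_tv(r, a) for r, a in zip(ref, alt) if r != a]
    return labels.count("ts"), labels.count("tv")


print(ts_or_tv("A", "G"))             # ts
print(ts_or_tv("T", "G"))             # tv
print(count_ts_tv("AT", "GC"))        # (2, 0)
print(count_ts_tv("ATTTG", "GTTTC"))  # (1, 1)
```

For REF AT and ALT G the two possible pairings give different labels, which is exactly the ambiguity described in the note.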
    
     MNP|INDEL<br>
 
     MNP|INDEL<br>
Line 90: Line 94:  
     MNP|CLUMPED<br>
 
     MNP|CLUMPED<br>
 
     REF  ATTTT         
 
     REF  ATTTT         
     ALT  GTTTC      #MNP, CLUMEPD, 2 ts
+
     ALT  GTTTC      #MNP, CLUMPED, 2 ts
    #since all the alleles are of the sample length, classified as MNP too.
+
                    #since all the alleles are of the same length, classified as MNP too.
    
     INDEL|CLUMPED<br>
 
     INDEL|CLUMPED<br>
Line 120: Line 124:  
     ALT  GT          #SNP, 1 ts
 
     ALT  GT          #SNP, 1 ts
 
     ALT  AC          #SNP, 1 ts
 
     ALT  AC          #SNP, 1 ts
    #since all the alleles are of the sample length, classified as MNP too.
+
                    #since all the alleles are of the same length, classified as MNP too.
    
     SNP|MNP|CLUMPED<br>
 
     SNP|MNP|CLUMPED<br>
 
     REF  ATTTG     
 
     REF  ATTTG     
 
     ALT  GTTTC      #CLUMPED, 1 ts, 1 tv
 
     ALT  GTTTC      #CLUMPED, 1 ts, 1 tv
     ALT  ATTTC      #SNP, 1 tv
+
     ALT  ATTTC      #SNP, 1 tv; note that we get the SNP after truncating the shared leading bases ATTT to reveal a G/C transversion SNP
    #since all the alleles are of the sample length, classified as MNP too.
+
                    #since all the alleles are of the same length, classified as MNP too.
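
The truncation mentioned for ALT ATTTC can be illustrated with a small trimming helper. This is a sketch under assumptions: the name `trim_alleles` is hypothetical, and the order used here (shared suffix first, then shared prefix, always leaving at least one base) is one common convention rather than a statement of what the tool does.

```python
def trim_alleles(ref, alt):
    """Strip bases shared by REF and ALT, keeping at least one base in each.

    Returns (trimmed_ref, trimmed_alt, offset), where offset is the number of
    leading bases removed.
    """
    # Remove the shared suffix.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]

    # Remove the shared prefix.
    offset = 0
    while (len(ref) - offset > 1 and len(alt) - offset > 1
           and ref[offset] == alt[offset]):
        offset += 1
    return ref[offset:], alt[offset:], offset


print(trim_alleles("ATTTG", "ATTTC"))  # ('G', 'C', 4): a G/C SNP
print(trim_alleles("ATTTG", "GTTTC"))  # ('ATTTG', 'GTTTC', 0): nothing to trim
```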
    
     SNP|MNP|INDEL<br>
 
     SNP|MNP|INDEL<br>
Line 139: Line 143:  
     ALT  AG          #MNP, INDEL, 1 ts, 1 tv
 
     ALT  AG          #MNP, INDEL, 1 ts, 1 tv
 
     ALT  GTGTG      #SNP, INDEL, CLUMPED, 1 tv, 1 ins
 
     ALT  GTGTG      #SNP, INDEL, CLUMPED, 1 tv, 1 ins
  −
== Weird Examples ==
      
== Structural Variants Examples ==
 
== Structural Variants Examples ==
Line 152: Line 154:  
     ALT &lt;CN4&gt;            #SV
 
     ALT &lt;CN4&gt;            #SV
 
     ALT &lt;CN12&gt;            #SV
 
     ALT &lt;CN12&gt;            #SV
 +
 +
= Interesting Variant Types =
 +
 +
    Adjacent Tandem Repeats from lobSTR's tandem repeat finder panel. <br>
 +
   
 +
 +
    20 9538655 <span style="color:#FF0000">ATTTATTTATTTATTTATTTATTTATTTATTTATTTATT</span><span style="color:#0000FF">CATTCATTCATTCATTCATTCATTC </span> <STR>
 +
 +
    This can be induced as
 +
   
 +
    one record considering only the ATTT repeats
 +
    20 9538655 <span style="color:#FF0000">ATTTATTTATTT </span> <span style="color:#FF0000">ATTT </span>
 +
 +
    one record with CATT repeats
 +
    20 9538695 <span style="color:#0000FF">CATTCATT </span> <span style="color:#0000FF">CATT </span>
 +
 +
    one record with a mix of both repeat types
 +
    20 9538695 <span style="color:#FF0000">TATT</span><span style="color:#0000FF">CATTCATT </span> <span style="color:#0000FF">CATT </span>
 +
 +
= Representation of close by variants =
 +
 +
    1:124001690
 +
    TTTCTTT--CAAAAAAAGATAAAAAGGTATTTCATGG
 +
    TTTCTTTAAAAAAAAAAGATAAAAAGGAATTTCATGG
 +
 +
    a single complex variant
 +
    CHROM POS        REF  ALT
 +
    1    124001690  C    AAA
 +
 +
    an Indel and SNP adjacent to one another
 +
    CHROM POS        REF  ALT
 +
    1    124001689  T    TAA
 +
    1    124001690  C    A
 +
 +
Representing it as a single complex variant enforces that both "indel" and "SNP" are always together.
 +
Representing it as 2 separate variants allows both alleles to segregate independently.
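
The equivalence of the two representations can be checked by applying the records to a short reference context. The sketch below uses a shortened sequence modeled on the alignment above; the helper `apply_variants`, the tuple layout, and the chosen start coordinate are assumptions made for illustration, and records are assumed not to overlap.

```python
def apply_variants(ref_seq, seq_start, variants):
    """Apply non-overlapping (pos, ref, alt) records to a reference string.

    seq_start is the 1-based coordinate of ref_seq[0]. Records are applied
    from right to left so that earlier coordinates stay valid.
    """
    haplotype = ref_seq
    for pos, ref, alt in sorted(variants, reverse=True):
        i = pos - seq_start
        if haplotype[i:i + len(ref)] != ref:
            raise ValueError(f"REF {ref!r} does not match the sequence at {pos}")
        haplotype = haplotype[:i] + alt + haplotype[i + len(ref):]
    return haplotype


# Shortened context: the C sits at 124001690, the T before it at 124001689.
REF_SEQ = "TTTCTTT" + "C" + "A" * 7 + "G"
SEQ_START = 124001683

complex_record = [(124001690, "C", "AAA")]
split_records = [(124001689, "T", "TAA"), (124001690, "C", "A")]

print(apply_variants(REF_SEQ, SEQ_START, complex_record))
print(apply_variants(REF_SEQ, SEQ_START, split_records))
# With two records, each one can also be applied on its own:
print(apply_variants(REF_SEQ, SEQ_START, split_records[:1]))  # insertion only
print(apply_variants(REF_SEQ, SEQ_START, split_records[1:]))  # SNP only
```

Both representations yield the same haplotype when every ALT is applied; only the split form can also express the insertion-only and SNP-only haplotypes.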
    
= Output  =
 
= Output  =
Line 169: Line 207:  
           3 alleles                      :            273 (0.89) [537/601]
 
           3 alleles                      :            273 (0.89) [537/601]
 
           4 alleles                      :              3 (1.00) [9/9] <br>
 
           4 alleles                      :              3 (1.00) [9/9] <br>
       no. of Indel                      :    6600770
+
       no. of Indel                      :    6600770   #also referred to as simple Indels
 
           2 alleles                      :        6285861 (0.88) [2937096/3348765] #ins/del ratio and the respective counts
 
           2 alleles                      :        6285861 (0.88) [2937096/3348765] #ins/del ratio and the respective counts
 
           3 alleles                      :          280892 (8.72) [503977/57807]
 
           3 alleles                      :          280892 (8.72) [503977/57807]
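
The figure in parentheses on the Indel lines is the first bracketed count divided by the second (insertions over deletions), shown to two decimals. A small sketch that reproduces that field from the counts; the function name `ratio_field` is hypothetical and the formatting is only inferred from the lines above.

```python
def ratio_field(numerator, denominator):
    """Format '(ratio) [numerator/denominator]' with a two-decimal ratio."""
    ratio = numerator / denominator
    return f"({ratio:.2f}) [{numerator}/{denominator}]"


print(ratio_field(2937096, 3348765))  # (0.88) [2937096/3348765]
print(ratio_field(503977, 57807))     # (8.72) [503977/57807]
```

In the figures above, the two counts for biallelic indels sum to the number of variants (2937096 + 3348765 = 6285861), and for triallelic indels they sum to twice the number of variants, which is consistent with one count per alternative allele.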