Changes

From Genome Analysis Wiki

Variant classification

2,453 bytes added, 21:44, 25 February 2016
Representation of close by variants
#####if not all nucleotides differ, add CLUMPED classification
#Variant classification is the union of the classifications of each allele present in the variant.
#If all alleles are the same length, add MNP classification.
= Examples =
We present the following examples to explain the classification described.
== Legend for examples ==
MNP<br>
REF AT
ALT GC #MNP, 2 ts
INDEL<br>
REF AT
ALT T #INDEL, 1 del
#Note that although the padding base differs (A vs T), this is actually a simple indel because it is simply a deletion of an A base.
#If you right align this instead of left aligning, then the padding will be T on both the reference and alternative alleles.
#Simple Indel classification should be invariant whether it is left or right aligned.
SV<br>
REF AT
ALT G #SNP, INDEL, 1 ts
#Note that it is ambiguous which pairing should be treated as the SNP, so the transition or transversion contribution is
#not well defined. Treating this as an A/G SNP gives a transition, but it may equally be treated as a T/G SNP, which
#is a transversion. In such ambiguous cases, we simply use the bases aligned after left alignment to obtain the transition
#and transversion contribution. Please be clear that this is an ambiguous case; it is better to consider this simply
#as a complex variant.
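
The transition/transversion ambiguity noted above can be illustrated with a minimal Python sketch. It assumes that "aligned bases after left alignment" means pairing REF and ALT bases position by position from the left; the function names are hypothetical and are not taken from any tool.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def is_transition(ref_base, alt_base):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {ref_base.upper(), alt_base.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)

def ts_tv_left_aligned(ref, alt):
    """Count (ts, tv) over bases paired from the left (assumed pairing rule)."""
    ts = tv = 0
    for r, a in zip(ref, alt):  # zip stops at the end of the shorter allele
        if r == a:
            continue
        if is_transition(r, a):
            ts += 1
        else:
            tv += 1
    return ts, tv

# REF AT / ALT G: the left-aligned pairing is A/G, a transition.
print(ts_tv_left_aligned("AT", "G"))
# Pairing T with G instead would give a transversion, hence the ambiguity.
print(ts_tv_left_aligned("T", "G"))
```

Under this pairing rule the MNP example REF AT / ALT GC yields two transitions, in agreement with the legend.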
MNP|INDEL<br>
MNP|CLUMPED<br>
REF ATTTT
ALT GTTTC #MNP, CLUMPED, 2 ts
#since all the alleles are of the same length, classified as MNP too.
INDEL|CLUMPED<br>
ALT GT #SNP, 1 ts
ALT AC #SNP, 1 ts
#since all the alleles are of the same length, classified as MNP too.
SNP|MNP|CLUMPED<br>
REF ATTTG
ALT GTTTC #CLUMPED, 1 ts, 1 tv
ALT ATTTC #SNP, 1 tv, note that we get the SNP after truncating the bases ATTT to reveal a G/C transversion SNP
#since all the alleles are of the same length, classified as MNP too.
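
The truncation of shared bases mentioned for the ATTTG/ATTTC allele can be sketched as follows. This is an illustrative Python helper with a hypothetical name, assuming that shared trailing bases are removed first, then shared leading bases, always keeping at least one base in each allele.

```python
def trim_shared_bases(ref, alt):
    """Strip bases shared by REF and ALT, keeping at least one base per allele.

    Returns (trimmed_ref, trimmed_alt, offset), where offset is the number of
    leading bases removed, i.e. the shift to apply to the variant position.
    """
    # Remove shared trailing bases.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Remove shared leading bases and track the position shift.
    offset = 0
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        offset += 1
    return ref, alt, offset

# ATTTG vs ATTTC: the shared ATTT prefix is removed, leaving a G/C SNP.
print(trim_shared_bases("ATTTG", "ATTTC"))
# ATTTG vs GTTTC: the first and last bases both differ, so nothing is removed.
print(trim_shared_bases("ATTTG", "GTTTC"))
```

The second call leaves the allele untouched, which is why that allele is CLUMPED rather than a SNP: the differing bases are separated by bases that match.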
SNP|MNP|INDEL<br>
ALT AG #MNP, INDEL, 1 ts, 1 tv
ALT GTGTG #SNP, INDEL, CLUMPED, 1 tv, 1 ins
 
== Weird Examples ==
== Structured Variants Examples ==
ALT &lt;CN4&gt; #SV
ALT &lt;CN12&gt; #SV
 
= Interesting Variant Types =
 
Adjacent Tandem Repeats from lobSTR's tandem repeat finder panel. <br>
 
20 9538655 <span style="color:#FF0000">ATTTATTTATTTATTTATTTATTTATTTATTTATTTATT</span><span style="color:#0000FF">CATTCATTCATTCATTCATTCATTC </span> <STR>
 
This can be represented as
one record considering only the ATTT repeats
20 9538655 <span style="color:#FF0000">ATTTATTTATTT </span> <span style="color:#FF0000">ATTT </span>
 
one record with CATT repeats
20 9538695 <span style="color:#0000FF">CATTCATT </span> <span style="color:#0000FF">CATT </span>
 
one record with a mix of both repeat types
20 9538695 <span style="color:#FF0000">TATT</span><span style="color:#0000FF">CATTCATT </span> <span style="color:#0000FF">CATT </span>
 
= Representation of close by variants =
 
1:124001690
TTTCTTT--CAAAAAAAGATAAAAAGGTATTTCATGG
TTTCTTTAAAAAAAAAAGATAAAAAGGAATTTCATGG
 
a single complex variant
CHROM POS REF ALT
1 124001690 C AAA
 
an Indel and SNP adjacent to one another
CHROM POS REF ALT
1 124001689 T TAA
1 124001690 C A
 
Representing it as a single complex variant enforces that both "indel" and "SNP" are always together.
Representing it as 2 separate variants allows both alleles to segregate independently.
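
When both alternative alleles are carried on the same haplotype, the two representations describe the same sequence. The following Python sketch checks this on the reference window shown above. It assumes that the C in the alignment sits at position 124001690, so the window starts at 124001683; `apply_variants` is a hypothetical helper, not part of any tool.

```python
def apply_variants(ref_seq, start, variants):
    """Apply (pos, ref, alt) records to a reference window beginning at `start`.

    Records are applied from right to left so that earlier edits do not shift
    the coordinates of records that are still to be applied.
    """
    seq = ref_seq
    for pos, ref, alt in sorted(variants, reverse=True):
        i = pos - start
        if seq[i:i + len(ref)] != ref:
            raise ValueError(f"REF mismatch at {pos}")
        seq = seq[:i] + alt + seq[i + len(ref):]
    return seq

# Reference row of the alignment with the gap characters removed.
REF_WINDOW = "TTTCTTTCAAAAAAAGATAAAAAGGTATTTCATGG"
START = 124001683  # assumed: places the C at 124001690

complex_variant = [(124001690, "C", "AAA")]
indel_plus_snp = [(124001689, "T", "TAA"), (124001690, "C", "A")]

hap_complex = apply_variants(REF_WINDOW, START, complex_variant)
hap_split = apply_variants(REF_WINDOW, START, indel_plus_snp)
print(hap_complex == hap_split)
```

The difference between the representations is therefore not the resulting sequence but the genotypes they permit: only the two-record form can describe a haplotype that carries the indel without the SNP, or the reverse.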
= Output =
3 alleles : 273 (0.89) [537/601]
4 alleles : 3 (1.00) [9/9] <br>
no. of Indel : 6600770 #also referred to as simple Indels
2 alleles : 6285861 (0.88) [2937096/3348765] #ins/del ratio and the respective counts
3 alleles : 280892 (8.72) [503977/57807]
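
The value in parentheses on the Indel lines is the ins/del ratio of the bracketed counts. A minimal Python sketch for checking this, assuming the line layout shown above (the regular expression and function name are illustrative only):

```python
import re

# Layout assumed from the lines above: "<label> : <count> (<ratio>) [<ins>/<del>]"
LINE_RE = re.compile(r":\s*(\d+)\s*\((\d+\.\d+)\)\s*\[(\d+)/(\d+)\]")

def parse_ratio_line(line):
    """Return (count, reported_ratio, numerator, denominator) from a summary line."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("unrecognised summary line")
    count, ratio, num, den = m.groups()
    return int(count), ratio, int(num), int(den)

lines = [
    "2 alleles : 6285861 (0.88) [2937096/3348765]",
    "3 alleles : 280892 (8.72) [503977/57807]",
]
for line in lines:
    count, reported, ins, dels = parse_ratio_line(line)
    computed = f"{ins / dels:.2f}"
    print(count, reported, computed, reported == computed)
```

In these two lines the insertion and deletion counts also add up to the number of records multiplied by the number of alternative alleles (one for biallelic records, two for triallelic ones).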