Changes

From Genome Analysis Wiki
no edit summary
Line 12: Line 12:  
 
* Only works on one chromosome at a time
 
 
* Repeats are single-base only
 +
** An unrepeated base has repeat count = 0 ('A')
 +
** A base repeated once has repeat count = 1 ('AA')
 
 
* Skips reference positions with base 'N'
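
The repeat count convention in the list above can be sketched in a few lines of Python. This is only an illustration, not the tool's own code: the function name `repeat_runs` and the choice to report one entry per run of identical bases are assumptions made for the example.

```python
from itertools import groupby


def repeat_runs(reference):
    """Yield (start, base, repeat_count) for each single-base run.

    Hypothetical helper: repeat_count is the run length minus one, so 'A'
    gives 0 and 'AA' gives 1. Runs of 'N' are skipped.
    """
    pos = 0
    for base, group in groupby(reference):
        length = sum(1 for _ in group)
        if base != "N":
            yield pos, base, length - 1
        pos += length


runs = list(repeat_runs("ACCNNGGG"))
```

Here `runs` holds one tuple per run of 'A', 'CC', and 'GGG'; the 'NN' stretch produces no entry.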
   Line 20: Line 22:  
 
A position is considered to have an Insertion Discordance if at least 1 read has an insertion following this position AND at least 1 read does not AND there are at least the "minimum depth" reads at this position.
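
As a minimal sketch, the Insertion Discordance rule above reduces to three conditions on per-position read counts. The function and parameter names are hypothetical, and the sketch assumes that depth counts every read at the position, so reads without the insertion are depth minus reads with it.

```python
def is_insertion_discordant(reads_with_insertion, depth, min_depth):
    """Return True if the position meets the Insertion Discordance rule.

    Assumption: depth counts all reads at the position, so the reads
    without an insertion are depth - reads_with_insertion.
    """
    reads_without_insertion = depth - reads_with_insertion
    return (
        depth >= min_depth
        and reads_with_insertion >= 1
        and reads_without_insertion >= 1
    )
```

For example, with a minimum depth of 5, a position with 10 reads of which 1 carries an insertion is discordant, while a position where all 10 reads carry it is not.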
   −
==
+
== Error Rate Algorithm ==
 +
 
 +
The weighted average error rate, weighted average deletion error rate, and weighted average insertion error rate are calculated for each repeat count. 
 +
 
 +
The error rate is weighted by the depth, so the discordant counts are
 +
 
 +
 +
For each Repeat Count in the File
 +
  sumErrorRates = 0
 +
  numErrorRates = 0
 +
  For each Depth at this Repeat Count in the file
 +
    if (depth > MaxAllowedDepth)
 +
      skip calculating error rate for this depth
 +
    else
 +
      // Note: numDiscordant is not equivalent to numDeleteDiscordant + numInsertDiscordant, since a position can have both types of discordance
 +
      numDiscordant = number of discordant positions with this repeat count and depth
 +
      numDeleteDiscordant = number of deletion discordant positions with this repeat count and depth
 +
      numInsertDiscordant = number of insertion discordant positions with this repeat count and depth
 +
      count = number of positions with this repeat count and depth
 +
 +
      <math>errorRate = 1 - (\tfrac{numDiscordant}{count})^{({\tfrac{1}{depth}})}</math>
 +
      sumErrorRates += errorRate * count * (depth-1)
 +
      numErrorRates += count * (depth-1)
 +
  Repeat Count Weighted Error Rate = sumErrorRates/numErrorRates
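
The loop above, for a single repeat count, might look like the following in Python. This is a sketch under stated assumptions rather than the tool's implementation: the function name, the `depth_stats` dictionary layout, the guard against empty bins, and the `None` return are all hypothetical. The per-depth error rate uses the formula exactly as it is written above.

```python
def weighted_error_rate(depth_stats, max_allowed_depth):
    """Weighted average error rate for one repeat count.

    depth_stats maps depth -> (num_discordant, count); this layout is an
    assumption made for the example.
    """
    sum_error_rates = 0.0
    num_error_rates = 0
    for depth, (num_discordant, count) in depth_stats.items():
        # Skip depths above the cap; also skip empty bins (an added guard,
        # not part of the pseudocode) to avoid dividing by zero.
        if depth > max_allowed_depth or count == 0:
            continue
        # Formula as written in the text above.
        error_rate = 1 - (num_discordant / count) ** (1 / depth)
        weight = count * (depth - 1)
        sum_error_rates += error_rate * weight
        num_error_rates += weight
    if num_error_rates == 0:
        return None  # no usable depths; returning None is an assumption
    return sum_error_rates / num_error_rates


rate = weighted_error_rate({4: (1, 16), 100: (5, 10)}, max_allowed_depth=50)
```

In this call the depth 100 bin is skipped because it exceeds the cap, so only the depth 4 bin contributes, with weight 16 * (4 - 1) = 48. The same function could be called with the deletion or insertion discordant counts in place of `num_discordant`, if those rates follow the same formula.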
 +
 
    
 
= Usage =
