From Genome Analysis Wiki
1,944 bytes added
, 16:10, 23 February 2012
Line 4: |
Line 4: |
| | | |
| = Overview of the <code>indelDiscordance</code> function of <code>bamUtil</code> = | | = Overview of the <code>indelDiscordance</code> function of <code>bamUtil</code> = |
− | The <code>indelDiscordance</code> option on the [[bamUtil]] looks at discordance at sites on the male X chromosome. | + | The <code>indelDiscordance</code> option of [[bamUtil]] reports insertion/deletion discordance. |
| + | |
| + | By default, it examines only the non-pseudoautosomal region of the X chromosome. |
| + | |
| | | |
| == ASSUMPTIONS/RESTRICTIONS == | | == ASSUMPTIONS/RESTRICTIONS == |
| + | * Only works on one chromosome at a time |
| + | * Repeats are single-base only |
| + | ** An unrepeated base has repeat count = 0 ('A') |
| + | ** A base repeated once has repeat count = 1 ('AA') |
| + | * Skips reference positions with base 'N' |
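The repeat-count convention above (run length minus one, single-base repeats only, 'N' skipped) can be illustrated with a minimal Python sketch. The function name `homopolymer_runs`, its return layout, and the upper-casing of the input are assumptions made for illustration; how bamUtil assigns repeat counts to individual positions internally is not stated here.

```python
from itertools import groupby
from typing import List, Tuple


def homopolymer_runs(reference: str) -> List[Tuple[int, str, int]]:
    """Return (start, base, repeat_count) for each single-base run.

    repeat_count follows the convention above: 'A' -> 0, 'AA' -> 1.
    Runs of 'N' are skipped. Upper-casing the input is an assumption.
    """
    runs = []
    pos = 0
    for base, group in groupby(reference.upper()):
        length = len(list(group))
        if base != "N":
            # A run of `length` identical bases is repeated (length - 1) times.
            runs.append((pos, base, length - 1))
        pos += length
    return runs
```

For example, `homopolymer_runs("AACNNGGG")` yields one run of 'A' with repeat count 1, one unrepeated 'C', and one run of 'G' with repeat count 2; the two 'N' positions produce no entry.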
| + | |
| + | == What is a Discordance == |
| + | |
| + | A position is considered to have a Deletion Discordance if at least 1 read has a match/mismatch there AND at least 1 read has a deletion there AND the position is covered by at least "minimum depth" reads. |
| + | |
| + | A position is considered to have an Insertion Discordance if at least 1 read has an insertion following this position AND at least 1 read does not AND the position is covered by at least "minimum depth" reads. |
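The two definitions above can be written as simple predicates. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the per-position read counts are assumed to have been tallied already, and which reads count toward `depth` is left to the caller because it is not specified above.

```python
def is_deletion_discordant(num_match_or_mismatch: int, num_deletion: int,
                           depth: int, min_depth: int) -> bool:
    """Deletion discordance: at least one read matches/mismatches, at least
    one read has a deletion, and the depth reaches the minimum depth."""
    return num_match_or_mismatch >= 1 and num_deletion >= 1 and depth >= min_depth


def is_insertion_discordant(num_with_insertion: int, num_without_insertion: int,
                            depth: int, min_depth: int) -> bool:
    """Insertion discordance: at least one read has an insertion following
    the position, at least one does not, and the depth reaches the minimum."""
    return num_with_insertion >= 1 and num_without_insertion >= 1 and depth >= min_depth
```

A single position can satisfy both predicates at once, so the two kinds of discordance are not mutually exclusive.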
| + | |
| + | == Error Rate Algorithm == |
| + | |
| + | The weighted average error rate, weighted average deletion error rate, and weighted average insertion error rate are calculated for each repeat count. |
| + | |
| + | The error rate is weighted by the depth: each per-depth error rate contributes with weight count * (depth-1). |
| + | |
| + | |
| + | For each Repeat Count in the File
| + |     For each Depth at this Repeat Count in the file
| + |         if (depth > MaxAllowedDepth)
| + |             skip calculating error rate for this depth
| + |         else
| + |             // Note: numDiscordant is not equivalent to numDeleteDiscordant + numInsertDiscordant, since a position can have both types of discordance
| + |             numDiscordant = number of discordant positions with this repeat count and depth
| + |             numDeleteDiscordant = number of deletion discordant positions with this repeat count and depth
| + |             numInsertDiscordant = number of insertion discordant positions with this repeat count and depth
| + |             count = number of positions with this repeat count and depth
| + |
| + |             <math>errorRate = 1 - (\tfrac{numDiscordant}{count})^{({\tfrac{1}{depth}})}</math>
| + |             sumErrorRates += errorRate * count * (depth-1)
| + |             numErrorRates += count * (depth-1)
| + |     Repeat Count Weighted Error Rate = sumErrorRates/numErrorRates
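The pseudocode above can be transcribed into Python roughly as follows. This is a sketch, not bamUtil's implementation: the input layout (`{repeat_count: {depth: (count, num_discordant)}}`), the function name, and the handling of empty or zero-weight cases are assumptions. The per-depth formula is copied as written above and has not been independently verified.

```python
from typing import Dict, Optional, Tuple

# Hypothetical input layout: {repeat_count: {depth: (count, num_discordant)}}
Stats = Dict[int, Dict[int, Tuple[int, int]]]


def weighted_error_rates(stats: Stats,
                         max_allowed_depth: int) -> Dict[int, Optional[float]]:
    """Weighted average error rate per repeat count, following the pseudocode."""
    result: Dict[int, Optional[float]] = {}
    for repeat_count, by_depth in stats.items():
        sum_error_rates = 0.0
        num_error_rates = 0
        for depth, (count, num_discordant) in by_depth.items():
            if depth > max_allowed_depth:
                continue  # skip depths above the allowed maximum
            if depth < 1 or count == 0:
                continue  # assumption: ignore entries the formula cannot handle
            # Formula transcribed literally from the pseudocode above.
            error_rate = 1 - (num_discordant / count) ** (1 / depth)
            weight = count * (depth - 1)
            sum_error_rates += error_rate * weight
            num_error_rates += weight
        # Assumption: report None when every weight is zero (e.g. only depth 1).
        result[repeat_count] = (
            sum_error_rates / num_error_rates if num_error_rates else None
        )
    return result
```

Because the weight is `count * (depth - 1)`, positions with depth 1 contribute nothing to the average. The same loop could be run with the deletion or insertion discordant counts in place of `num_discordant` to obtain the other two weighted rates.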
| | | |
| | | |