Line 13: |
Line 13: |
| # Build Recalibration Table | | # Build Recalibration Table |
| # Apply Recalibration Table | | # Apply Recalibration Table |
| + | |
| | | |
| The Recalibration Table groups bases based on a set of covariates: | | The Recalibration Table groups bases based on a set of covariates: |
| * Read Group | | * Read Group |
− | * Quality (either from the quality string or from a tag) | + | * Quality (either from the quality string or [[#Read the quality from a tag (--qualField)|from a tag]]) |
| * Cycle (reverse complement for reverse strands) | | * Cycle (reverse complement for reverse strands) |
| * 1st/2nd read in pair | | * 1st/2nd read in pair |
Line 23: |
Line 24: |
| | | |
| The Recalibration Table tracks the number of matches/mismatches for each set of covariates. | | The Recalibration Table tracks the number of matches/mismatches for each set of covariates. |
| + | |
| | | |
| Only bases meeting all of the following criteria are used to Build the Recalibration Table: | | Only bases meeting all of the following criteria are used to Build the Recalibration Table: |
Line 32: |
Line 34: |
| * Base criteria | | * Base criteria |
| ** match/mismatch (not an insertion/deletion/skip/clip) | | ** match/mismatch (not an insertion/deletion/skip/clip) |
− | ** not a dbSNP position | + | ** not a [[#DBSNP File (--dbsnp)|dbSNP position]] |
− | ** base quality > minBaseQual (5 by default) | + | ** base quality > [[#Minimum Recalibration Base Quality (--minBaseQual)|minBaseQual (5 by default)]] |
| * Additional criteria for cycle != 1 (can be turned off via flags) | | * Additional criteria for cycle != 1 (can be turned off via flags) |
| ** previous base is a CIGAR Match/Mismatch | | ** previous base is a CIGAR Match/Mismatch |
− | ** previous base position is not a dbSNP position | + | ** previous base position is not a [[#DBSNP File (--dbsnp)|dbSNP position]] |
| + | |
| + | |
| + | The Recalibration Table is applied to all bases meeting all of the following criteria (even if they were not used for creating the table): |
| + | * base quality > [[#Minimum Recalibration Base Quality (--minBaseQual)|minBaseQual (5 by default)]] |
| + | * at least 1 match or mismatch for the set of covariates |
| + | |
| + | |
| + | Recalibrated Quality is: <math>-10 * \log \frac{mismatches + 1}{mismatches + matches + 1}</math> |
| | | |
− | The Recalibration Table is applied to all bases meeting all of the following criteria:
| |
− | * base quality > minBaseQual (5 by default)
| |
| | | |
− | The Recalibrated Quality is calculated using: <math>-10 * \log \frac{mismatches + 1}{mismatches + matches + 1}</math>
| + | If the Recalibrated Quality is greater than [[#Maximum Recalibration Base Quality (--maxBaseQual)|maxBaseQual]], the updated quality is set to maxBaseQual. |
| | | |
− | If the Recalibration Table has no matches & no mismatches for a set of covariates, the original base quality is kept.
| |
| | | |
− | If the Recalibrated Quality is greater than maxBaseQual, the updated quality is set to maxBaseQual.
| + | Optionally, the previous quality can be [[#Store the original quality (--storeQualTag)|stored in a tag]]. |
| | | |
− | Optionally, the previous quality can be stored in a tag.
| |
| | | |
| The current recalibration logic was designed for recalibrating ILLUMINA data. | | The current recalibration logic was designed for recalibrating ILLUMINA data. |
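
The application rules in the new revision (minBaseQual gate, at least one observation for the covariate set, the Recalibrated Quality formula, and the maxBaseQual cap) can be sketched as below. This is only an illustrative sketch, not the tool's actual code: the function name and parameter names are hypothetical, the logarithm is assumed to be base 10 (the usual Phred convention; the formula above writes only <code>\log</code>), and no rounding to an integer quality is applied because the text does not say how rounding is done.

```python
import math


def recalibrated_quality(matches, mismatches, original_qual,
                         min_base_qual, max_base_qual):
    """Return the updated quality for one base (hypothetical helper).

    matches / mismatches are the counts stored in the Recalibration
    Table for this base's set of covariates.
    """
    # Bases at or below minBaseQual are not recalibrated.
    if original_qual <= min_base_qual:
        return original_qual

    # With no matches and no mismatches for the covariate set,
    # the original quality is kept.
    if matches + mismatches == 0:
        return original_qual

    # -10 * log((mismatches + 1) / (mismatches + matches + 1)),
    # assuming a base-10 logarithm.
    error_rate = (mismatches + 1) / (mismatches + matches + 1)
    quality = -10 * math.log10(error_rate)

    # Cap the result at maxBaseQual.
    return min(quality, max_base_qual)
```

For example, with 999 matches and 0 mismatches the ratio is 1/1000, which gives a quality of about 30 under the base-10 assumption. The cap value passed as <code>max_base_qual</code> is whatever is configured via --maxBaseQual; no default is assumed here.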