From Genome Analysis Wiki
# Build Recalibration Table
# Apply Recalibration Table
The Recalibration Table groups bases based on a set of covariates:
* Read Group
* Quality (either from the quality string or [[#Read the quality from a tag(--qualField)|from a tag]])
* Cycle (reverse complement for reverse strands)
* 1st/2nd read in pair
The Recalibration Table tracks the number of matches/mismatches for each set of covariates.
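As a rough illustration of the bookkeeping described above, the table can be thought of as a map from a covariate set to a pair of counters. The sketch below is not the tool's actual implementation; the names <code>Covariates</code>, <code>Counts</code>, <code>new_table</code> and <code>record</code> are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Covariates(NamedTuple):
    """One set of covariates (hypothetical layout for illustration)."""
    read_group: str
    quality: int         # from the quality string or from a tag
    cycle: int           # position in the read, strand-adjusted
    first_in_pair: bool  # True for 1st read in pair, False for 2nd


class Counts:
    """Match/mismatch tallies for one set of covariates."""

    def __init__(self) -> None:
        self.matches = 0
        self.mismatches = 0


def new_table() -> defaultdict:
    # Unseen covariate sets start with zero matches and zero mismatches.
    return defaultdict(Counts)


def record(table, key: Covariates, is_match: bool) -> None:
    """Tally one eligible base under its covariate set."""
    counts = table[key]
    if is_match:
        counts.matches += 1
    else:
        counts.mismatches += 1


table = new_table()
key = Covariates(read_group="RG1", quality=30, cycle=12, first_in_pair=True)
record(table, key, is_match=True)
record(table, key, is_match=False)
```

Bases that share the same read group, quality, cycle and pair position land in the same entry, so each entry accumulates its own match/mismatch totals.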
Only bases meeting all of the following criteria are used to Build the Recalibration Table:
* Base criteria
** match/mismatch (not an insertion/deletion/skip/clip)
** not a [[#DBSNP File (--dbsnp)|dbSNP position]]
** base quality > [[#Minimum Recalibration Base Quality (--minBaseQual)|minBaseQual (5 by default)]]
* Additional criteria for cycle != 1 (can be turned off via flags)
** previous base is a CIGAR Match/Mismatch
** previous base position is not a [[#DBSNP File (--dbsnp)|dbSNP position]]
The Recalibration Table is applied to all bases meeting all of the following criteria (even if they were not used for creating the table):
* base quality > [[#Minimum Recalibration Base Quality (--minBaseQual)|minBaseQual (5 by default)]]
* at least 1 match or mismatch for the set of covariates
Recalibrated Quality is: <math>-10 * \log \frac{mismatches + 1}{mismatches + matches + 1}</math>
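The formula and the two apply criteria can be sketched as follows. This is a minimal illustration, not the tool's code: it assumes the logarithm is base 10 (the usual Phred convention) and that the result is rounded to the nearest integer, neither of which is stated above. The function names are hypothetical.

```python
import math


def recalibrated_quality(matches: int, mismatches: int) -> float:
    """Quality from the formula above; base-10 log is an assumption."""
    return -10.0 * math.log10((mismatches + 1) / (mismatches + matches + 1))


def apply_recalibration(quality: int, matches: int, mismatches: int,
                        min_base_qual: int = 5) -> int:
    """Return the new quality, or the original if the base is not eligible."""
    # Criterion 1: base quality must be strictly greater than minBaseQual.
    if quality <= min_base_qual:
        return quality
    # Criterion 2: the covariate set needs at least 1 match or mismatch.
    if matches + mismatches < 1:
        return quality
    # Rounding to the nearest integer is an assumption of this sketch.
    return round(recalibrated_quality(matches, mismatches))
```

For example, a covariate set with 999 matches and 0 mismatches gives a ratio of 1/1000, so the recalibrated quality is about 30. The +1 terms keep the ratio above zero, so a set with no observed mismatches still gets a finite quality.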
The current recalibration logic was designed for recalibrating ILLUMINA data.