Changes

From Genome Analysis Wiki
292 bytes added, 16:59, 18 September 2012
no edit summary
Line 13: Line 13:  
# Build Recalibration Table
 
# Build Recalibration Table
 
# Apply Recalibration Table
 
# Apply Recalibration Table
 +
    
The Recalibration Table groups bases based on a set of covariates:
 
The Recalibration Table groups bases based on a set of covariates:
 
* Read Group
 
* Read Group
* Quality (either from the quality string or from a tag)
+
* Quality (either from the quality string or [[#Read the quality from a tag (--qualField)|from a tag]])
 
* Cycle (reverse complement for reverse strands)
 
* Cycle (reverse complement for reverse strands)
 
* 1st/2nd read in pair
 
* 1st/2nd read in pair
Line 23: Line 24:     
The Recalibration Table tracks the number of matches/mismatches for each set of covariates.
 
The Recalibration Table tracks the number of matches/mismatches for each set of covariates.
 +
    
Only bases meeting all of the following criteria are used to Build the Recalibration Table:
 
Only bases meeting all of the following criteria are used to Build the Recalibration Table:
Line 32: Line 34:  
* Base criteria
 
* Base criteria
 
** match/mismatch (not an insertion/deletion/skip/clip)
 
** match/mismatch (not an insertion/deletion/skip/clip)
** not a dbSNP position
+
** not a [[#DBSNP File (--dbsnp)|dbSNP position]]
** base quality > minBaseQual (5 by default)
+
** base quality > [[#Minimum Recalibration Base Quality (--minBaseQual)|minBaseQual (5 by default)]]
 
* Additional criteria for cycle != 1 (can be turned off via flags)
 
* Additional criteria for cycle != 1 (can be turned off via flags)
 
** previous base is a CIGAR Match/Mismatch
 
** previous base is a CIGAR Match/Mismatch
** previous base position is not a dbSNP position
+
** previous base position is not a [[#DBSNP File (--dbsnp)|dbSNP position]]
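
The criteria listed above for deciding whether a base contributes to the Recalibration Table can be sketched as a single predicate. This is only an illustrative sketch, not the tool's actual code: the `Base` record, the `MATCH_OPS` set, and the two `require_prev_*` switches are hypothetical names standing in for the unnamed flags mentioned in the list, and cycles are assumed to be numbered from 1.

```python
from dataclasses import dataclass
from typing import Optional

# CIGAR operations treated as "match/mismatch" in this sketch (assumption).
MATCH_OPS = {"M", "=", "X"}


@dataclass
class Base:
    """Hypothetical per-base record; field names are illustrative only."""
    cigar_op: str                        # CIGAR operation covering this base
    quality: int                         # original base quality
    cycle: int                           # 1-based cycle within the read
    is_dbsnp: bool                       # True if the position is in dbSNP
    prev_cigar_op: Optional[str] = None  # CIGAR operation of the previous base
    prev_is_dbsnp: bool = False          # True if the previous position is in dbSNP


def usable_for_table(base, min_base_qual=5,
                     require_prev_match=True,
                     require_prev_non_dbsnp=True):
    """Return True if the base meets every criterion for building the table."""
    # Base criteria: a match/mismatch, not in dbSNP, quality above the minimum.
    if base.cigar_op not in MATCH_OPS:
        return False
    if base.is_dbsnp:
        return False
    if base.quality <= min_base_qual:
        return False
    # Additional criteria apply only when this is not the first cycle.
    if base.cycle != 1:
        if require_prev_match and base.prev_cigar_op not in MATCH_OPS:
            return False
        if require_prev_non_dbsnp and base.prev_is_dbsnp:
            return False
    return True
```

Note that the quality test is strict: a base whose quality equals minBaseQual is excluded.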
 +
 
 +
 
 +
The Recalibration Table is applied to all bases meeting all of the following criteria (even if they were not used for creating the table):
 +
* base quality > [[#Minimum Recalibration Base Quality (--minBaseQual)|minBaseQual (5 by default)]]
 +
* at least 1 match or mismatch for the set of covariates
 +
 
 +
 
 +
Recalibrated Quality is: <math>-10 * \log \frac{mismatches + 1}{mismatches + matches + 1}</math>
   −
The Recalibration Table is applied to all bases meeting all of the following criteria:
  −
* base quality > minBaseQual (5 by default)
     −
The Recalibrated Quality is calculated using: <math>-10 * \log \frac{mismatches + 1}{mismatches + matches + 1}</math>
+
If the Recalibrated Quality is greater than [[#Maximum Recalibration Base Quality (--maxBaseQual)|maxBaseQual]], the updated quality is set to maxBaseQual.
   −
If the Recalibration Table has no matches & no mismatches for a set of covariates, the original base quality is kept.
     −
If the Recalibrated Quality is greater than maxBaseQual, the updated quality is set to maxBaseQual.
+
Optionally, the previous quality can be [[#Store the original quality (--storeQualTag)|stored in a tag]].
   −
Optionally, the previous quality can be stored in a tag.
      
The current recalibration logic was designed for recalibrating ILLUMINA data.
 
The current recalibration logic was designed for recalibrating ILLUMINA data.
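
As a worked illustration of the updated rules in this revision, the sketch below computes the Recalibrated Quality from the match/mismatch counts and caps it at maxBaseQual. It rests on several assumptions: the logarithm is taken to be base 10 (the usual Phred convention; the formula only says `\log`), the function and parameter names are hypothetical, and no rounding is applied because the page does not say how the value is converted to an integer quality. Bases that fail the application criteria are assumed to keep their original quality.

```python
import math


def recalibrated_quality(matches, mismatches, max_base_qual):
    """-10 * log10((mismatches + 1) / (mismatches + matches + 1)), capped."""
    ratio = (mismatches + 1) / (mismatches + matches + 1)
    quality = -10 * math.log10(ratio)  # base-10 log is an assumption
    # Cap the result at the maximum recalibration base quality.
    return min(quality, max_base_qual)


def apply_recalibration(orig_qual, matches, mismatches,
                        max_base_qual, min_base_qual=5):
    """Return the updated quality for one base, or the original if not eligible."""
    # Only bases with quality strictly above minBaseQual are recalibrated.
    if orig_qual <= min_base_qual:
        return orig_qual
    # The covariate set must have at least one match or mismatch recorded.
    if matches + mismatches < 1:
        return orig_qual
    return recalibrated_quality(matches, mismatches, max_base_qual)
```

For example, a covariate set with 99 matches and 0 mismatches gives a ratio of 1/100, so the uncapped value is about 20; with `max_base_qual=15` the same counts would yield 15.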
