## BamUtil: recab

18 September 2012
= Overview of the <code>recab</code> function of <code>[[bamUtil]]</code> =
The <code>recab</code> option of [[bamUtil]] recalibrates a SAM/BAM file.

Recalibration can also be called as an option of [[bamUtil: dedup]]. This will perform the recalibration and the deduping in the same set of steps, increasing processing speed.
==Handling Recalibration/Implementation Notes==

Reads in the following categories are not used to build the Recalibration Table:
* Duplicates
* Unmapped
* Mapping Quality = 0
* Mapping Quality = 255
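
As a rough illustration, the four read categories above can be checked from a read's SAM flag and mapping quality. This is a minimal sketch, not bamUtil code: the function name `is_excluded_read` is hypothetical, and the flag bits (0x4 for unmapped, 0x400 for duplicate) are taken from the SAM specification.

```python
# Hypothetical helper, not part of bamUtil: reports whether a read falls into
# one of the four categories listed above.
FLAG_UNMAPPED = 0x4     # SAM flag bit: segment unmapped
FLAG_DUPLICATE = 0x400  # SAM flag bit: PCR or optical duplicate


def is_excluded_read(flag: int, mapq: int) -> bool:
    """Return True for duplicates, unmapped reads, and reads with MAPQ 0 or 255."""
    if flag & FLAG_DUPLICATE:
        return True
    if flag & FLAG_UNMAPPED:
        return True
    # In SAM, a mapping quality of 255 means the value is unavailable.
    return mapq in (0, 255)
```

For example, a mapped, non-duplicate read with mapping quality 60 is not excluded, while the same read with mapping quality 0 is.
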
Recalibration is a 2-step process that loops through the file twice:
# Build Recalibration Table
# Apply Recalibration Table
The Recalibration Table groups bases based on a set of covariates:
* Previous Cycle's Base (reverse complement for reverse strands)
* This Cycle's Base (reverse complement for reverse strands)

The Recalibration Table tracks the number of matches/mismatches for each set of covariates.

Only bases meeting all of the following criteria are used to Build the Recalibration Table:
* Read criteria
** not a duplicate
** mapped
** mapping quality != 0
** mapping quality != 255
* Base criteria
** match/mismatch (not an insertion/deletion/skip/clip)
** not a dbSNP position
** base quality > minBaseQual (5 by default)
* Additional criteria for cycle != 1 (can be turned off via flags)
** previous base is a CIGAR Match/Mismatch
** previous base position is not a dbSNP position

The Recalibration Table is applied to all bases meeting all of the following criteria:
* base quality > minBaseQual (5 by default)

The Recalibrated Quality is calculated using:

$-10 * \log_{10} \frac{mismatches + 1}{mismatches + matches + 1}$

If the Recalibration Table has no matches and no mismatches for a set of covariates, the original base quality is kept.
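
The counting and the quality formula can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not bamUtil's implementation: the names `table`, `record`, and `recalibrated_quality` are hypothetical, the covariate key is simply a tuple of (previous base, current base), and the result is left as a floating-point value because this page does not say how it is rounded.

```python
import math
from collections import defaultdict

# Hypothetical structure, not bamUtil internals.
# Key: tuple of covariate values; value: [matches, mismatches].
table = defaultdict(lambda: [0, 0])


def record(covariates: tuple, is_match: bool) -> None:
    """Build step: count one match or mismatch for this covariate set."""
    counts = table[covariates]
    counts[0 if is_match else 1] += 1


def recalibrated_quality(covariates: tuple, original_quality: int) -> float:
    """Apply step: -10 * log10((mismatches + 1) / (mismatches + matches + 1))."""
    # .get() avoids creating an empty entry for covariates never seen.
    matches, mismatches = table.get(covariates, (0, 0))
    if matches == 0 and mismatches == 0:
        # No observations for this covariate set: keep the original quality.
        return float(original_quality)
    error_rate = (mismatches + 1) / (mismatches + matches + 1)
    return -10 * math.log10(error_rate)


# Worked example: 999 matches and 0 mismatches give an error rate of
# 1 / 1000, which corresponds to a quality of approximately 30.
for _ in range(999):
    record(("A", "C"), True)
quality = recalibrated_quality(("A", "C"), 20)
```

The `+ 1` terms in the numerator and denominator keep the result finite when a covariate set has no mismatches.
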