From Genome Analysis Wiki
506 bytes added, 13:08, 18 June 2012
Line 8: |
Line 8: |
| The <code>recab</code> option of [[bamUtil]] recalibrates a SAM/BAM file. | | The <code>recab</code> option of [[bamUtil]] recalibrates a SAM/BAM file. |
| | | |
− | ==Handling Recalibration== | + | ==Handling Recalibration/Implementation Notes== |
| | | |
| Reads Not Recalibrated: | | Reads Not Recalibrated: |
Line 16: |
Line 16: |
| * Mapping Quality = 255 | | * Mapping Quality = 255 |
| | | |
− | | + | Recalibration is a 2-step process that loops through the file twice: |
− | === Covariates Notes ===
| + | # Build Recalibration Table |
| + | # Apply Recalibration Table |
| | | |
| Recalibration is done by grouping bases based on a set of covariates: | | Recalibration is done by grouping bases based on a set of covariates: |
Line 27: |
Line 28: |
| | | |
| For Reverse Strands, the reverse complement of the SAM/BAM is used for the cycle, previous cycle's base, and current cycle's base. | | For Reverse Strands, the reverse complement of the SAM/BAM is used for the cycle, previous cycle's base, and current cycle's base. |
| + | |
| + | Not all bases are used for building the Recalibration table. Only bases meeting the following criteria are used: |
| + | * Base is a CIGAR Match/Mismatch |
| + | * Previous base is a CIGAR Match/Mismatch or it is the first cycle |
| + | |
| + | The Recalibration Table is applied on all bases in the read sequence (ignoring the alignment/CIGAR). If the base or the pre-base is an 'N', average the 4 alternatives. |
| | | |
| | | |
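The table-building criteria added in this revision (the base is a CIGAR Match/Mismatch, and the previous base is a CIGAR Match/Mismatch or it is the first cycle) can be illustrated with a minimal sketch. This is not bamUtil's code: the function names are hypothetical, the CIGAR is assumed to be already parsed into <code>(op, length)</code> pairs, only forward-strand reads are handled, and "previous base" is assumed to mean the preceding base in the read, so a deletion between two matched bases does not disqualify the second one.

```python
# Illustrative sketch only; names are hypothetical and not part of bamUtil.
# Assumes a forward-strand read and a CIGAR parsed into (op, length) pairs.

QUERY_CONSUMING = set("MIS=X")  # CIGAR ops that consume read bases (SAM spec)
ALIGNED = set("M=X")            # ops treated here as Match/Mismatch


def per_base_ops(cigar):
    """Expand [(op, length), ...] into one CIGAR op per read base."""
    ops = []
    for op, length in cigar:
        if op in QUERY_CONSUMING:
            ops.extend(op * length)
    return ops


def table_building_cycles(cigar):
    """Return 0-based read positions that would be used to build the table."""
    ops = per_base_ops(cigar)
    eligible = []
    for i, op in enumerate(ops):
        if op not in ALIGNED:
            continue  # the base itself must be a Match/Mismatch
        if i == 0 or ops[i - 1] in ALIGNED:
            eligible.append(i)  # first cycle, or previous base also aligned
    return eligible
```

For a read with CIGAR <code>2S3M1I2M</code>, this sketch skips the two soft-clipped bases, the first matched base (its previous base is soft-clipped), the inserted base, and the matched base right after the insertion. Applying the table, by contrast, covers every base in the read regardless of CIGAR, as stated above.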