Genome Analysis Wiki β

Changes

SAM

142 bytes added, 14:13, 29 July 2010
Both SAM & BAM files contain a header section and an alignment section.
The header section may contain information about the entire file as well as additional information for alignments. Each alignment can then be associated with specific header information.
 
The alignment section contains the information for each sequence about where/how it aligns to the reference genome.
=== What Information Does SAM/BAM Have for an Alignment ===
* leftmost position where this alignment maps to the reference, POS. For SAM, the reference starts at 1, so this value is 1-based, while for BAM the reference starts at 0, so this value is 0-based. Always use the correct base when referencing positions.
* mapping quality, MAPQ, which contains the "phred-scaled posterior probability that the mapping position" is wrong. (from SAM-1.pdf)
* string describing how the sequence aligns to the reference, including clipped bases, [[SAM#What is a CIGAR?|CIGAR]]
* the reference sequence name of the next alignment in this group, MRNM or RNEXT. In paired alignments, it is the mate's reference sequence name. (A group is alignments with the same query name.)
* leftmost position where the next alignment in this group maps to the reference, MPOS or PNEXT. For SAM, the reference starts at 1, so this value is 1-based, while for BAM the reference starts at 0, so this value is 0-based. Always use the correct base when referencing positions.
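
As an illustration of the 1-based versus 0-based positions and the phred-scaled MAPQ described above, here is a minimal Python sketch. The function names are hypothetical and are not part of any SAM/BAM library. It assumes that MAPQ follows the usual phred convention, MAPQ = -10 * log10(probability the mapping position is wrong), and that a MAPQ of 255 means the mapping quality is not available.

```python
from typing import Optional

# Assumption: 255 is the value used when mapping quality is not available.
MAPQ_UNAVAILABLE = 255


def sam_to_bam_pos(sam_pos: int) -> int:
    """Convert a 1-based SAM position (POS/PNEXT) to its 0-based BAM value."""
    return sam_pos - 1


def bam_to_sam_pos(bam_pos: int) -> int:
    """Convert a 0-based BAM position to its 1-based SAM value."""
    return bam_pos + 1


def mapq_to_error_probability(mapq: int) -> Optional[float]:
    """Return the probability that the mapping position is wrong.

    Inverts the phred scaling: P(wrong) = 10 ** (-MAPQ / 10).
    Returns None when the mapping quality is marked as unavailable.
    """
    if mapq == MAPQ_UNAVAILABLE:
        return None
    return 10 ** (-mapq / 10)
```

For example, a SAM position of 100 corresponds to a BAM position of 99, and a MAPQ of 20 corresponds to roughly a 1 in 100 chance that the mapping position is wrong.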