Changes

From Genome Analysis Wiki
142 bytes added ,  14:13, 29 July 2010
no edit summary
Line 12: Line 12:  
Both SAM & BAM files contain a header section and an alignment section.
 
 
The header section may contain information about the entire file and additional information for alignments.  The alignments then associate themselves with specific header information.
 
 +
 +
The alignment section contains the information for each sequence about where/how it aligns to the reference genome.
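As a rough illustration of the two sections, the sketch below splits SAM text into header lines and alignment lines. It assumes the usual SAM convention that header lines begin with `@`; the function name `split_sam_sections` and the example records are made up for illustration and do not come from any particular library.

```python
def split_sam_sections(sam_text):
    """Split SAM text into (header_lines, alignment_lines).

    Assumption: header lines start with '@' and every other
    non-empty line is an alignment record.
    """
    header, alignments = [], []
    for line in sam_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("@"):
            header.append(line)
        else:
            alignments.append(line)
    return header, alignments


# Hypothetical two-line header followed by one alignment record.
example = "\n".join([
    "@HD\tVN:1.0\tSO:coordinate",
    "@SQ\tSN:1\tLN:1000",
    "read1\t0\t1\t100\t30\t5M\t*\t0\t0\tACGTA\tIIIII",
])
header, alignments = split_sam_sections(example)
```

Here `header` holds the two `@` lines and `alignments` holds the single tab-delimited record, whose fourth field is the 1-based POS.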
    
=== What Information Does SAM/BAM Have for an Alignment ===
 
Line 33: Line 35:  
* leftmost position of where this alignment maps to the reference, POS.  In SAM, reference positions start at 1, so this value is 1-based; in BAM, they start at 0, so this value is 0-based.  Take care to use the correct base when referencing positions.
 
* mapping quality, MAPQ, which contains the "phred-scaled posterior probability that the mapping position" is wrong. (from SAM-1.pdf)
 
* string indicating alignment information that allows the storing of clipped, CIGAR
+
* string indicating alignment information that allows the storing of clipped, [[SAM#What is a CIGAR?|CIGAR]]
 
* the reference sequence name of the next alignment in this group, MRNM or RNEXT.  In paired alignments, it is the mate's reference sequence name. (A group is alignments with the same query name.)
 
 
* leftmost position of where the next alignment in this group maps to the reference, MPOS or PNEXT.  In SAM, reference positions start at 1, so this value is 1-based; in BAM, they start at 0, so this value is 0-based.  Take care to use the correct base when referencing positions.
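
The position and mapping quality fields above can be made concrete with a small sketch. It converts between the 1-based SAM and 0-based BAM coordinates described for POS and MPOS/PNEXT, and turns a phred-scaled MAPQ into a probability using the standard phred relation Q = -10 log10(P). The function names are hypothetical helpers, not part of any SAM/BAM library.

```python
def sam_pos_to_bam_pos(pos_1_based):
    """Convert a 1-based SAM position to a 0-based BAM position."""
    return pos_1_based - 1


def bam_pos_to_sam_pos(pos_0_based):
    """Convert a 0-based BAM position to a 1-based SAM position."""
    return pos_0_based + 1


def mapq_to_error_probability(mapq):
    """Probability that the mapping position is wrong, given a phred-scaled MAPQ.

    Uses P = 10 ** (-Q / 10). Special MAPQ values that a file may use to
    mean "not available" are not handled here.
    """
    return 10 ** (-mapq / 10)


print(sam_pos_to_bam_pos(100))        # SAM position 100 as a BAM position
print(bam_pos_to_sam_pos(99))         # and back again
print(mapq_to_error_probability(30))  # roughly a 1 in 1000 chance of a wrong position
```

Keeping the conversion in one pair of named helpers is a simple way to avoid off-by-one mistakes when the same code handles both SAM and BAM records.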
