SAM Validation Criteria
From Genome Analysis Wiki
SAM Header Validation Rules
TODO
SAM Alignment Validation
Validation Criteria | Implemented | Tested |
---|---|---|
QNAME.Length() > 0 and <= 254 | ||
QNAME does not contain [ \t\n\r] | ||
FLAG is an integer [0-9]+ | ||
FLAG is [0, (2^16)-1] (or possibly < 2048; unconfirmed) | ||
RNAME does not contain [ \t\n\r@=] | ||
POS is an integer [0-9]+ | ||
POS is [0, (2^29)-1] | ||
MAPQ is an integer [0-9]+ | ||
MAPQ is [0, (2^8)-1] | ||
CIGAR ([0-9]+[MIDNSHP])+|\* | ||
MRNM does not contain [ \t\n\r@] ('=' means it is the same as RNAME) | ||
If SQ is in the header, RNAME and MRNM (if not “=”) must be in SQ. | ||
MPOS is an integer [0-9]+ | ||
MPOS is [0, (2^29)-1] | ||
ISIZE is an integer -?[0-9]+ | ||
ISIZE is [-(2^29), 2^29] | ||
SEQ is [acgtnACGTN.=]+|\* | ||
If SEQ is * then QUAL is * | ||
QUAL is [!-~]+|\* (ASCII decimal 33 to 126; “*” is decimal 42, which falls within that range; for BAM, the value is in [0, 93]) | ||
If QUAL is not “*” it is the same length as SEQ. | ||
TAG is [A-Z][A-Z0-9] | ||
A TAG only appears once per alignment | ||
VTYPE is [AifZH] for SAM and [AcCsSiIfZH] for BAM | ||
VALUE does NOT contain [\t\n\r] | ||
For VTYPE = “A”, VALUE is a printable character | ||
For VTYPE = “i”, VALUE is a signed 32-bit integer. | ||
For VTYPE = “f”, VALUE is a single-precision float. | ||
For VTYPE = “Z”, VALUE is a printable string. | ||
For VTYPE = “H”, VALUE is a Hex string. | ||
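The mandatory-field rows of the table can be sketched as a single check over one alignment line. This is a minimal illustration in Python, not the validator described on this page: the names validate_alignment, check_int, and sq_names are hypothetical, the FLAG upper bound uses the wider of the two ranges the table mentions, and treating “*” as exempt from the SQ lookup is an assumption.

```python
import re

# Patterns copied from the table above; all names here are illustrative.
WHITESPACE = re.compile(r"[ \t\n\r]")
RNAME_BAD = re.compile(r"[ \t\n\r@=]")
MRNM_BAD = re.compile(r"[ \t\n\r@]")
CIGAR_RE = re.compile(r"([0-9]+[MIDNSHP])+|\*")
SEQ_RE = re.compile(r"[acgtnACGTN.=]+|\*")
QUAL_RE = re.compile(r"[!-~]+|\*")


def check_int(name, text, low, high, errors, signed=False):
    """Append an error unless text is an integer within [low, high]."""
    pattern = r"-?[0-9]+" if signed else r"[0-9]+"
    if not re.fullmatch(pattern, text):
        errors.append(f"{name} is not an integer: {text!r}")
    elif not low <= int(text) <= high:
        errors.append(f"{name} {text} is outside [{low}, {high}]")


def validate_alignment(line, sq_names=None):
    """Return a list of error strings for one tab-separated alignment line."""
    errors = []
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return [f"expected at least 11 fields, found {len(fields)}"]
    qname, flag, rname, pos, mapq, cigar, mrnm, mpos, isize, seq, qual = fields[:11]

    if not 0 < len(qname) <= 254:
        errors.append("QNAME length is outside [1, 254]")
    if WHITESPACE.search(qname):
        errors.append("QNAME contains whitespace")
    # The table is unsure about the FLAG upper bound; the wider range is used.
    check_int("FLAG", flag, 0, 2**16 - 1, errors)
    if RNAME_BAD.search(rname):
        errors.append("RNAME contains a forbidden character")
    check_int("POS", pos, 0, 2**29 - 1, errors)
    check_int("MAPQ", mapq, 0, 2**8 - 1, errors)
    if not CIGAR_RE.fullmatch(cigar):
        errors.append("CIGAR is malformed")
    if MRNM_BAD.search(mrnm):
        errors.append("MRNM contains a forbidden character")
    check_int("MPOS", mpos, 0, 2**29 - 1, errors)
    check_int("ISIZE", isize, -(2**29), 2**29, errors, signed=True)
    if not SEQ_RE.fullmatch(seq):
        errors.append("SEQ is malformed")
    if not QUAL_RE.fullmatch(qual):
        errors.append("QUAL is malformed")
    if seq == "*" and qual != "*":
        errors.append("SEQ is * but QUAL is not")
    if qual != "*" and seq != "*" and len(qual) != len(seq):
        errors.append("QUAL and SEQ lengths differ")

    if sq_names is not None:
        # Assumption: "*" (no reference) is exempt from the SQ lookup.
        if rname != "*" and rname not in sq_names:
            errors.append("RNAME is not in SQ")
        if mrnm not in ("*", "=") and mrnm not in sq_names:
            errors.append("MRNM is not in SQ")
    return errors
```

Returning a list of messages rather than raising on the first failure lets one pass report every rule a record breaks, which suits the Implemented/Tested bookkeeping in the table.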
NOTE: There are other TAG Validations that can be done. They will come later.
NOTE: There are other BAM Validations that can be done. They will come later.
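The TAG, VTYPE, and VALUE rows listed in the table (not the further validations deferred by the notes) could be checked along these lines. This is a sketch under stated assumptions: validate_tags is a hypothetical name, optional fields are assumed to arrive as TAG:VTYPE:VALUE strings, “printable” is taken to mean ASCII 33 to 126 (plus space for “Z”), and the float and hex patterns are illustrative choices that the table does not specify. The single-precision range of “f” values is not checked.

```python
import re

TAG_RE = re.compile(r"[A-Z][A-Z0-9]")

# One pattern per SAM VTYPE. None of them admits tab, newline, or carriage
# return, so the "VALUE does NOT contain [\t\n\r]" rule is covered as well.
VALUE_CHECKS = {
    "A": re.compile(r"[!-~]"),                                   # one printable character
    "i": re.compile(r"[-+]?[0-9]+"),                             # integer, range checked below
    "f": re.compile(r"[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?"),  # assumed float syntax
    "Z": re.compile(r"[ !-~]+"),                                 # printable string
    "H": re.compile(r"[0-9A-Fa-f]+"),                            # assumed hex syntax
}


def validate_tags(optional_fields):
    """Return a list of error strings for the optional fields of one alignment."""
    errors = []
    seen = set()
    for field in optional_fields:
        # Split at most twice so that a "Z" value may itself contain ":".
        parts = field.split(":", 2)
        if len(parts) != 3:
            errors.append(f"{field!r} is not TAG:VTYPE:VALUE")
            continue
        tag, vtype, value = parts
        if not TAG_RE.fullmatch(tag):
            errors.append(f"TAG {tag!r} does not match [A-Z][A-Z0-9]")
        if tag in seen:
            errors.append(f"TAG {tag} appears more than once")
        seen.add(tag)
        pattern = VALUE_CHECKS.get(vtype)
        if pattern is None:
            errors.append(f"VTYPE {vtype!r} is not one of AifZH")
        elif not pattern.fullmatch(value):
            errors.append(f"VALUE {value!r} is not valid for VTYPE {vtype}")
        elif vtype == "i" and not -(2**31) <= int(value) <= 2**31 - 1:
            errors.append(f"VALUE {value} does not fit in a signed 32-bit integer")
    return errors
```

For example, validate_tags(["NM:i:3", "NM:i:4"]) reports that NM appears more than once, since the second occurrence is found in the set of tags already seen for that alignment.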
SAM Questions
- Comment says: “If the mapping position of the query is not available, RNAME and CIGAR are set as “*”, and POS and MAPQ as 0.” Is it all or nothing? Can some be set to “*”/0 but not all?
- Same question for MRNM = “*” and MPOS & ISIZE = 0
- Comment says: “The name of a pair/read is required to be unique in the SAM file, but one pair/read may appear multiple times in different alignment records, representing multiple or split hits.” Is there anything here that needs to be validated?
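Until the first two questions are answered, a validator could at least report which case a record falls into. The sketch below only classifies a record; it does not decide whether a partial (“mixed”) record is an error, since that is exactly what is still open. The names QUERY_UNSET, MATE_UNSET, unset_fields, and classify are hypothetical, and the record is assumed to be a dict whose numeric fields have already been parsed to integers.

```python
# "Not available" values quoted in the questions above.
QUERY_UNSET = {"RNAME": "*", "CIGAR": "*", "POS": 0, "MAPQ": 0}
MATE_UNSET = {"MRNM": "*", "MPOS": 0, "ISIZE": 0}


def unset_fields(record, unset_values):
    """List the fields of record that hold their 'not available' value."""
    return [name for name, unset in unset_values.items() if record[name] == unset]


def classify(record, unset_values):
    """Return 'all', 'none', or 'mixed' for the given group of fields."""
    count = len(unset_fields(record, unset_values))
    if count == 0:
        return "none"
    if count == len(unset_values):
        return "all"
    return "mixed"
```

Counting how often “mixed” occurs in real files would give some evidence on whether the all-or-nothing reading is workable in practice.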