C++ Class: SamFile
Latest revision as of 11:03, 2 February 2017
Reading/Writing SAM/BAM Files In Your Program
The SamFile class allows a user to easily read/write a SAM/BAM file.
The SamFile class also lets a user read specific sections of sorted and indexed BAM files. To take advantage of this capability, the index file must be read before the read section is set. Reading a section avoids scanning the entire file by using the seeking capability of BGZF files.
Future Enhancements: Add the ability to read alignments that match a given start and end position for a specific reference sequence.
This class is part of C++ Library: libStatGen.
Class Documentation
See: http://csg.sph.umich.edu//mktrost/doxygen/current/classSamFile.html
Child Classes
SamFileReader
http://csg.sph.umich.edu//mktrost/doxygen/current/classSamFileReader.html
SamFileWriter
http://csg.sph.umich.edu//mktrost/doxygen/current/classSamFileWriter.html
Statistics
Statistic Generation
The following statistics can optionally be recorded while reading a SamFile by calling SamFile::GenerateStatistics() and displayed with SamFile::PrintStatistics().
The statistics only reflect alignments that were successfully read from the BAM file. Alignments that failed to parse from the file are not reflected in the statistics, but alignments that are invalid for other reasons may show up in the statistics.
Statistic | Description |
---|---|
TotalReads | Total number of alignments that were successfully read from the file. |
MappedReads | Total number of alignments that were successfully read from the file with FLAG bit 0x004 set to 0 (not unmapped). |
PairedReads | Total number of alignments that were successfully read from the file with FLAG bit 0x001 set to 1 (paired). |
ProperPair | Total number of alignments that were successfully read from the file with FLAG bits 0x001 (paired) AND 0x002 (proper pair) both set to 1. |
DuplicateReads | Total number of alignments that were successfully read from the file with FLAG bit 0x400 set to 1 (PCR or optical duplicate). |
QCFailureReads | Total number of alignments that were successfully read from the file with FLAG bit 0x200 set to 1 (failed quality checks). |
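The FLAG tests in the table above can be illustrated with a minimal, self-contained sketch. It is not the libStatGen implementation: the `ReadCounts` struct, the `countReads` function, and the constant names are hypothetical and exist only for this example. The input is assumed to be the FLAG values of alignments that were already read successfully.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// SAM FLAG bits referenced by the read-count statistics.
constexpr std::uint16_t kPaired     = 0x001;
constexpr std::uint16_t kProperPair = 0x002;
constexpr std::uint16_t kUnmapped   = 0x004;
constexpr std::uint16_t kQcFail     = 0x200;
constexpr std::uint16_t kDuplicate  = 0x400;

// Hypothetical container for the counts; not a libStatGen type.
struct ReadCounts {
    std::uint64_t totalReads = 0;
    std::uint64_t mappedReads = 0;
    std::uint64_t pairedReads = 0;
    std::uint64_t properPair = 0;
    std::uint64_t duplicateReads = 0;
    std::uint64_t qcFailureReads = 0;
};

// Tally the statistics from the FLAG value of each successfully read alignment.
ReadCounts countReads(const std::vector<std::uint16_t>& flags) {
    ReadCounts c;
    for (std::uint16_t flag : flags) {
        ++c.totalReads;
        if ((flag & kUnmapped) == 0) ++c.mappedReads;  // 0x004 set to 0
        if (flag & kPaired) {
            ++c.pairedReads;                           // 0x001 set to 1
            if (flag & kProperPair) ++c.properPair;    // 0x001 AND 0x002
        }
        if (flag & kDuplicate) ++c.duplicateReads;     // 0x400 set to 1
        if (flag & kQcFail) ++c.qcFailureReads;        // 0x200 set to 1
    }
    // A proper pair is counted only when the paired bit is also set.
    assert(c.properPair <= c.pairedReads);
    return c;
}
```

For example, `countReads({0x63, 0x4})` counts two total reads, one mapped read, one paired read, and one proper pair.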
Statistic | Description |
---|---|
MappingRate(%) | 100 * MappedReads/TotalReads |
PairedReads(%) | 100 * PairedReads/TotalReads |
ProperPair(%) | 100 * ProperPair/TotalReads |
DupRate(%) | 100 * DuplicateReads/TotalReads |
QCFailRate(%) | 100 * QCFailureReads/TotalReads |
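Each percentage above is the same ratio applied to a different count. The sketch below shows that arithmetic. The helper name `percentOf` is hypothetical, and returning 0 when TotalReads is 0 is an assumption made here to avoid dividing by zero. The source does not say how the library handles that case.

```cpp
#include <cassert>
#include <cstdint>

// Returns 100 * part / total as a percentage.
// Assumption for this sketch: an empty file (total == 0) yields 0.0.
double percentOf(std::uint64_t part, std::uint64_t total) {
    assert(part <= total);  // a count can never exceed TotalReads
    if (total == 0) {
        return 0.0;
    }
    return 100.0 * static_cast<double>(part) / static_cast<double>(total);
}
```

With this helper, MappingRate(%) would be `percentOf(mappedReads, totalReads)` and DupRate(%) would be `percentOf(duplicateReads, totalReads)`.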
Statistic | Description |
---|---|
TotalBases | Sum of the SEQ lengths for all alignments that were successfully read from the file. NOTE: Includes bases that are 'N'. |
BasesInMappedReads | Sum of the SEQ lengths for all alignments that were successfully read from the file with FLAG bit 0x004 set to 0 (not unmapped). NOTE: Includes bases that are 'N'. |
NOTE: If TotalReads is greater than 10^6, then the Read Counts and Base Counts are reported as the total counts divided by 10^6. This is indicated in the output by an (e6) suffix appended to the field name.
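The scaling rule in the note can be sketched as follows. This is an illustration, not the library's code: the function name `formatCount`, the tab separator, and the two-decimal formatting are assumptions, the last one inferred from the example output.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats one count field. When totalReads exceeds 10^6, the count is divided
// by 10^6 and "(e6)" is appended to the field name; otherwise the raw count is
// printed. The decision depends on totalReads, not on the individual count.
std::string formatCount(const std::string& name, std::uint64_t count,
                        std::uint64_t totalReads) {
    if (totalReads > 1000000) {
        char buf[64];
        int written = std::snprintf(buf, sizeof buf, "%.2f",
                                    static_cast<double>(count) / 1e6);
        assert(written > 0);
        return name + "(e6)\t" + buf;
    }
    return name + "\t" + std::to_string(count);
}
```

Under these assumptions, 18,900,000 total reads would be reported in the TotalReads(e6) field as 18.90, while a file with 500 reads would report TotalReads as 500.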
Example Statistics Output
TotalReads(e6) 18.90
MappedReads(e6) 14.77
PairedReads(e6) 18.90
ProperPair(e6) 11.28
DuplicateReads(e6) 0.00
QCFailureReads(e6) 0.00
MappingRate(%) 78.17
PairedReads(%) 100.00
ProperPair(%) 59.68
DupRate(%) 0.00
QCFailRate(%) 0.00
TotalBases(e6) 699.30
BasesInMappedReads(e6) 546.67