Difference between revisions of "LibStatGen: BAM"
Revision as of 14:51, 16 March 2010
SAM/BAM File
Reading/Writing SAM/BAM Files
The SamFile class allows a user to easily read and write SAM/BAM files. The methods found in this class are:
Method Name | Description |
---|---|
bool OpenForRead(const char* filename) | Opens the specified file for reading. Determines whether it is a BAM or SAM file by reading the beginning of the file. Returns true if successfully opened for reading, false if not. |
bool OpenForWrite(const char* filename) | Opens the specified file for writing. Opens as a BAM file if the specified filename ends in .bam; otherwise it is opened as a SAM file. Returns true if successfully opened for writing, false if not. |
bool ReadHeader(SamFileHeader& header) | Reads the header section from the file and stores it in the passed in header. Returns true if successfully read, false if not. |
bool WriteHeader(const SamFileHeader& header) | Writes the specified header into the file. Returns true if successfully written, false if not. |
bool ReadRecord(SamFileHeader& header, SamRecord& record) | Reads the next record from the file and stores it in the passed in record. Returns true if successfully read, false if not. |
bool WriteRecord(SamFileHeader& header, SamRecord& record) | Writes the specified record into the file. Returns true if successfully written, false if not. |
Usage Example
The following example reads in a sam/bam file and writes it out as a sam/bam file. The file format of the input sam/bam is determined by the SamFile class based on reading the type from the file. The file format of the output sam/bam file is determined by the SamFile class based on the extension of the output file. A ".bam" extension indicates a BAM file. All other extensions indicate SAM files.
    #include <stdio.h>
    #include <stdlib.h>
    #include "SamFile.h"   // assumed header name; adjust to match your installation

    int main(int argc, char ** argv)
    {
        if(argc != 3)
        {
            printf("./bam <inputFile> <outputFile.sam/bam>\n");
            exit(-1);
        }

        SamFile samIn;
        samIn.OpenForRead(argv[1]);

        SamFile samOut;
        samOut.OpenForWrite(argv[2]);

        // Read the sam header and copy it to the output file.
        SamFileHeader samHeader;
        samIn.ReadHeader(samHeader);
        samOut.WriteHeader(samHeader);

        // Record object that is reused for every read.
        SamRecord samRecord;

        // Keep reading records until it fails.
        int recordCount = 0;
        while (samIn.ReadRecord(samHeader, samRecord) == true)
        {
            recordCount++;
            samOut.WriteRecord(samHeader, samRecord);
        }

        printf("RecordCount = %d\n", recordCount);
        return 0;
    }
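The extension rule used for output files (a ".bam" extension means BAM, anything else means SAM) can be illustrated with a small stand-alone helper. This is only a sketch of the rule as described here, not SamFile's actual implementation: `isBamFilename` is a hypothetical name, and the case-sensitive comparison is an assumption.

```cpp
#include <string>

// Hypothetical helper, not part of the SamFile class.
// Returns true when the filename ends in ".bam", mirroring the rule
// that OpenForWrite is described as applying to output filenames.
// Assumption: the comparison is case-sensitive ("out.BAM" is not BAM).
bool isBamFilename(const std::string& filename)
{
    const std::string ext = ".bam";
    return filename.size() >= ext.size() &&
           filename.compare(filename.size() - ext.size(), ext.size(), ext) == 0;
}
```

Under this sketch, `isBamFilename("out.bam")` is true, while `isBamFilename("out.sam")` and `isBamFilename("out.bam.txt")` are false, so the latter two would be written as SAM.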
Setting fields in a SAM/BAM Record
The SamRecord class contains accessors to set the fields of a SAM/BAM record. They are used to create a record that is not read from a SAM/BAM file. Once a record has been set up with these set methods, its fields can be read back using the get accessors, or the record can later be written as either a SAM or a BAM record. The methods found in the SamRecord class for setting fields are:
Method Name | Description |
---|---|
bool setReadName(const char* readName) | Sets QNAME to the passed in name. Returns true if successfully set, false if not. |
bool setFlag(int flag) | Sets the bitwise FLAG to the passed in value. Returns true if successfully set, false if not. |
bool setReferenceID(int referenceID) | Sets the reference sequence id. The reference name is not currently stored; it has to be looked up in the header (which is done when writing a SAM file). This is an opportunity for improvement. Returns true if successfully set, false if not. |
bool set1BasedPosition(int position) | Sets the leftmost position. The value passed in is 1-based (SAM formatted). Internal processing handles switching between SAM/BAM formats when read/written. Returns true if successfully set, false if not. |
bool set0BasedPosition(int position) | Sets the leftmost position. The value passed in is 0-based (BAM formatted). Internal processing handles switching between SAM/BAM formats when read/written. Returns true if successfully set, false if not. |
bool setMapQuality(int mapQuality) | Sets the mapping quality. Returns true if successfully set, false if not. |
bool setCigar(const char* cigar) | Sets the CIGAR string to the passed in CIGAR. This is a SAM formatted CIGAR string. Internal processing handles switching between SAM/BAM formats when read/written. Returns true if successfully set, false if not. |
bool setMateReferenceID(int mateReferenceID) | Sets the mate reference sequence id. The mate reference name is not currently stored; it has to be looked up in the header (which is done when writing a SAM file). This is an opportunity for improvement. Returns true if successfully set, false if not. |
bool set1BasedMatePosition(int matePosition) | Sets the leftmost mate position. The value passed in is 1-based (SAM formatted). Internal processing handles switching between SAM/BAM formats when read/written. Returns true if successfully set, false if not. |
bool set0BasedMatePosition(int matePosition) | Sets the leftmost mate position. The value passed in is 0-based (BAM formatted). Internal processing handles switching between SAM/BAM formats when read/written. Returns true if successfully set, false if not. |
bool setInsertSize(int insertSize) | Sets the inferred insert size. Returns true if successfully set, false if not. |
bool setSequence(const char* seq) | Sets the sequence string to the passed in string. This is a SAM formatted sequence string. Internal processing handles switching between SAM/BAM formats when read/written. Returns true if successfully set, false if not. |
bool setQuality(const char* quality) | Sets the quality string to the passed in string. This is a SAM formatted quality string. Internal processing handles switching between SAM/BAM formats when read/written. Returns true if successfully set, false if not. |
bool addTag(const char* tag, char vtype, const char* value) | Adds a tag to the record with the specified tag, vtype, and value. The vtype can be either a SAM or a BAM vtype. Internal processing handles switching between SAM/BAM vtypes when read/written. Returns true if successfully set, false if not. |
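Two of the setters above take values whose encoding is easy to get wrong: the position (1-based in SAM, 0-based in BAM) and the bitwise FLAG. The following stand-alone sketch shows the arithmetic involved. It does not use SamRecord itself; the function names and the `SamFlag` enum are hypothetical, and the bit values are taken from the SAM format specification rather than from this library.

```cpp
#include <cstdint>

// Hypothetical helpers, not part of the SamRecord class.
// SAM text stores positions 1-based; BAM stores the same position 0-based.
inline int32_t toZeroBased(int32_t oneBasedPos) { return oneBasedPos - 1; }
inline int32_t toOneBased(int32_t zeroBasedPos) { return zeroBasedPos + 1; }

// FLAG bits as defined in the SAM format specification.
enum SamFlag : uint16_t
{
    PAIRED        = 0x1,   // read is paired in sequencing
    PROPER_PAIR   = 0x2,   // read is mapped in a proper pair
    UNMAPPED      = 0x4,   // read itself is unmapped
    MATE_UNMAPPED = 0x8,   // mate is unmapped
    REVERSE       = 0x10,  // read is on the reverse strand
    MATE_REVERSE  = 0x20,  // mate is on the reverse strand
    FIRST_IN_PAIR = 0x40,  // read is the first in the pair
    SECOND_IN_PAIR = 0x80  // read is the second in the pair
};

// Combine individual bits into the integer that a FLAG setter expects.
inline int makeFlag(bool paired, bool reverse, bool firstInPair)
{
    int flag = 0;
    if (paired)      flag |= PAIRED;
    if (reverse)     flag |= REVERSE;
    if (firstInPair) flag |= FIRST_IN_PAIR;
    return flag;
}
```

Going by the table's descriptions, a read at SAM position 100 could be set either with `set1BasedPosition(100)` or with `set0BasedPosition(toZeroBased(100))`, and the value returned by `makeFlag` is the kind of integer that `setFlag` takes.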
Retrieving fields from a SAM/BAM Record
The SamRecord class contains accessors to access the fields of a SAM/BAM record. They assume that the class has already been populated, either by using the set commands or by calling SamFile::ReadRecord.