Difference between revisions of "C++ Class: SamRecord"

From Genome Analysis Wiki
Changes in this revision:

SamRecord::getReferenceName(), SamRecord::getMateReferenceName(), and SamRecord::getMateReferenceNameOrEqual() no longer take a SamFileHeader& header parameter.

The description of SamRecord::getMateReferenceNameOrEqual() now states that "*" is returned when the name is "*".

Revision as of 16:06, 12 April 2010

This class is part of libbam.

Setting fields in a SAM/BAM Record

The SamRecord class contains accessors to set the fields of a SAM/BAM record. They are used to create a record that is not read from a SAM/BAM file. Once a record has been set up with these set methods, its fields can be read back with the get accessors, or the record can later be written out as either a SAM or a BAM record. The methods found in the SamRecord class for setting fields are:

| Method Name | Description |
|---|---|
| `void SamRecord::resetRecord()` | Resets the record to be an empty record. This is not necessary when reading a SAM/BAM file, but when setting fields it is a good idea to clear a record before reusing it. After clearing, you do not have to explicitly set the fields that should stay empty. |
| `bool SamRecord::setReadName(const char* readName)` | Sets QNAME to the passed in name. Returns true if successfully set, false if not. |
| `bool SamRecord::setFlag(int16_t flag)` | Sets the bitwise FLAG to the passed in value. Returns true if successfully set, false if not. |
| `bool SamRecord::setReferenceName(SamFileHeader& header, const char* referenceName)` | Sets the reference sequence name. The reference id is calculated using the header. Returns true if successfully set, false if not. |
| `bool SamRecord::set1BasedPosition(int32_t position)` | Sets the leftmost position. The value passed in is 1-based (SAM formatted). Internal processing handles switching between SAM/BAM formats when read/written. Returns true if successfully set, false if not. |
| `bool SamRecord::set0BasedPosition(int32_t position)` | Sets the leftmost position. The value passed in is 0-based (BAM formatted). Internal processing handles switching between SAM/BAM formats when read/written. Returns true if successfully set, false if not. |
| `bool SamRecord::setMapQuality(int8_t mapQuality)` | Sets the mapping quality. Returns true if successfully set, false if not. |
| `bool SamRecord::setCigar(const char* cigar)` | Sets the CIGAR string to the passed in CIGAR. This is a SAM formatted CIGAR string. Internal processing handles switching between SAM/BAM formats when read/written. Returns true if successfully set, false if not. |
| `bool SamRecord::setMateReferenceName(SamFileHeader& header, const char* referenceName)` | Sets the mate reference sequence name. The mate reference id is calculated using the header. Returns true if successfully set, false if not. |
| `bool SamRecord::set1BasedMatePosition(int32_t matePosition)` | Sets the leftmost mate position. The value passed in is 1-based (SAM formatted). Internal processing handles switching between SAM/BAM formats when read/written. Returns true if successfully set, false if not. |
| `bool SamRecord::set0BasedMatePosition(int32_t matePosition)` | Sets the leftmost mate position. The value passed in is 0-based (BAM formatted). Internal processing handles switching between SAM/BAM formats when read/written. Returns true if successfully set, false if not. |
| `bool SamRecord::setInsertSize(int32_t insertSize)` | Sets the inferred insert size. Returns true if successfully set, false if not. |
| `bool SamRecord::setSequence(const char* seq)` | Sets the sequence string to the passed in string. This is a SAM formatted sequence string. Internal processing handles switching between SAM/BAM formats when read/written. Returns true if successfully set, false if not. |
| `bool SamRecord::setQuality(const char* quality)` | Sets the quality string to the passed in string. This is a SAM formatted quality string. Internal processing handles switching between SAM/BAM formats when read/written. Returns true if successfully set, false if not. |
| `bool SamRecord::addTag(const char* tag, char vtype, const char* value)` | Adds a tag to the record with the specified tag, vtype, and value. The vtype can be either a SAM or a BAM vtype. Internal processing handles switching between SAM/BAM vtypes when read/written. Returns true if successfully set, false if not. |

When set, SAM fields are validated against: SAM Validation Criteria
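
The 1-based and 0-based position setters describe the same coordinate in two conventions. The sketch below illustrates the conversion only; PositionSketch is a hypothetical stand-in, and storing the value 0-based internally is an assumption, not a statement about how SamRecord is implemented.

```cpp
#include <cstdint>

// Hypothetical holder (not the SamRecord implementation): it keeps the
// leftmost position 0-based, as BAM does, and converts on the way in and out.
class PositionSketch
{
public:
    // SAM convention: first base is 1, and 0 means "no position".
    void set1BasedPosition(int32_t position) { my0BasedPos = position - 1; }
    // BAM convention: first base is 0, and -1 means "no position".
    void set0BasedPosition(int32_t position) { my0BasedPos = position; }

    int32_t get1BasedPosition() const { return my0BasedPos + 1; }
    int32_t get0BasedPosition() const { return my0BasedPos; }

private:
    int32_t my0BasedPos = -1;  // start out with "no position"
};
```

Whichever setter is used, both getters stay consistent: setting 100 through the 1-based setter yields 99 from the 0-based getter.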


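As an illustration of the SAM/BAM switching mentioned for setCigar, the BAM specification stores each CIGAR operation as a 32-bit value, (length << 4) | opCode, where opCode is the index of the operation character in "MIDNSHP=X". The function below is a minimal sketch of that packing; packCigar is a hypothetical helper name, and whether SamRecord performs the conversion this way is an assumption.

```cpp
#include <cctype>
#include <cstdint>
#include <string>
#include <vector>

// Packs a SAM CIGAR string into BAM-style values: (length << 4) | opCode.
// Returns false (and leaves packed empty) if the string is malformed.
// "*" means "no CIGAR" and yields zero operations.
bool packCigar(const std::string& cigar, std::vector<uint32_t>& packed)
{
    static const std::string ops = "MIDNSHP=X";
    packed.clear();
    if (cigar == "*") return true;

    uint32_t length = 0;
    bool haveDigits = false;
    for (char c : cigar)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            length = length * 10 + static_cast<uint32_t>(c - '0');
            haveDigits = true;
            continue;
        }
        const std::string::size_type op = ops.find(c);
        if (op == std::string::npos || !haveDigits)
        {
            packed.clear();
            return false;  // unknown operation, or operation without a length
        }
        packed.push_back((length << 4) | static_cast<uint32_t>(op));
        length = 0;
        haveDigits = false;
    }
    if (haveDigits)
    {
        packed.clear();
        return false;  // trailing length without an operation
    }
    return true;
}
```

For "3S10M" this produces two values, one per operation, which is the sense in which a BAM CIGAR has a length counted in operations rather than characters.
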
Retrieving fields from a SAM/BAM Record

The SamRecord class contains accessors to retrieve the fields of a SAM/BAM record. They assume that the record has already been populated, either by using the set methods or by calling SamFile::ReadRecord. Not all of the values that can be retrieved with these get accessors have corresponding set methods, because some values are calculated internally when they are not read from a file.
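
The BAM bin returned by getBin() is one value that has a get accessor but no set method. The BAM specification gives a reg2bin routine that derives the bin from a 0-based start and a 0-based exclusive end; the sketch below restates it. Whether SamRecord computes its bin with exactly this routine is an assumption.

```cpp
#include <cstdint>

// Bin calculation as given in the SAM/BAM specification.
// beg is 0-based inclusive, end is 0-based exclusive.
// The result is the smallest bin that fully contains [beg, end).
uint16_t reg2bin(int32_t beg, int32_t end)
{
    --end;  // make the end inclusive
    if (beg >> 14 == end >> 14) return static_cast<uint16_t>(((1 << 15) - 1) / 7 + (beg >> 14));
    if (beg >> 17 == end >> 17) return static_cast<uint16_t>(((1 << 12) - 1) / 7 + (beg >> 17));
    if (beg >> 20 == end >> 20) return static_cast<uint16_t>(((1 << 9) - 1) / 7 + (beg >> 20));
    if (beg >> 23 == end >> 23) return static_cast<uint16_t>(((1 << 6) - 1) / 7 + (beg >> 23));
    if (beg >> 26 == end >> 26) return static_cast<uint16_t>(((1 << 3) - 1) / 7 + (beg >> 26));
    return 0;  // the region spans the top-level bin
}
```

A read covering only the first base falls in bin 4681, the first of the 16 kbp bins; a region that crosses a 16 kbp boundary moves up to a larger bin.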

The methods found in the SamRecord class for retrieving fields are:

| Method Name | Description |
|---|---|
| `bool SamRecord::isValid(SamFileHeader& header)` | Returns true if the record is valid. TODO: the method exists, but it does not yet perform any validation checks, so it always returns true. |
| `int32_t SamRecord::getBlockSize()` | Returns the BAM block size of the record. |
| `const char* SamRecord::getReferenceName()` | Returns the reference sequence name (SAM format). |
| `int32_t SamRecord::getReferenceID()` | Returns the reference sequence ID (BAM format). |
| `int32_t SamRecord::get1BasedPosition()` | Returns the 1-based (SAM formatted) leftmost position. |
| `int32_t SamRecord::get0BasedPosition()` | Returns the 0-based (BAM formatted) leftmost position. |
| `int8_t SamRecord::getReadNameLength()` | Returns the length of the ReadName (QNAME). |
| `int8_t SamRecord::getMapQuality()` | Returns the map quality. |
| `int16_t SamRecord::getBin()` | Returns the BAM bin for the record. |
| `int16_t SamRecord::getCigarLength()` | Returns the length of the CIGAR in BAM format. |
| `int16_t SamRecord::getFlag()` | Returns the flag. |
| `int32_t SamRecord::getReadLength()` | Returns the length of the read. |
| `const char* SamRecord::getMateReferenceName()` | Returns the mate reference sequence name (SAM format). The name is returned even if it is the same as the reference sequence name. |
| `const char* SamRecord::getMateReferenceNameOrEqual()` | Returns the mate reference sequence name (SAM format), with two exceptions: if the name is "*", "*" is returned; otherwise, if it is the same as the reference sequence name, "=" is returned. |
| `int32_t SamRecord::getMateReferenceID()` | Returns the mate reference sequence ID (BAM format). |
| `int32_t SamRecord::get1BasedMatePosition()` | Returns the 1-based (SAM formatted) mate leftmost position. |
| `int32_t SamRecord::get0BasedMatePosition()` | Returns the 0-based (BAM formatted) mate leftmost position. |
| `int32_t SamRecord::getInsertSize()` | Returns the insert size. |
| `int32_t SamRecord::get0BasedAlignmentEnd()` | Returns the 0-based inclusive right-most position of the clipped sequence. |
| `int32_t SamRecord::get1BasedAlignmentEnd()` | Returns the 1-based inclusive right-most position of the clipped sequence. |
| `int32_t SamRecord::get0BasedUnclippedStart()` | Returns the 0-based inclusive left-most position adjusted for clipped bases. |
| `int32_t SamRecord::get1BasedUnclippedStart()` | Returns the 1-based inclusive left-most position adjusted for clipped bases. |
| `int32_t SamRecord::get0BasedUnclippedEnd()` | Returns the 0-based inclusive right-most position adjusted for clipped bases. |
| `int32_t SamRecord::get1BasedUnclippedEnd()` | Returns the 1-based inclusive right-most position adjusted for clipped bases. |
| `const char* SamRecord::getReadName()` | Returns the SAM formatted Read Name (QNAME). |
| `const char* SamRecord::getCigar()` | Returns the SAM formatted CIGAR string. |
| `const char* SamRecord::getSequence()` | Returns the SAM formatted Sequence string. |
| `const char* SamRecord::getQuality()` | Returns the SAM formatted Quality string. |
| `bool SamRecord::getNextSamTag(char* tag, char& vtype, void** value)` | Returns true if a tag was read, false if there are no more tags. On a true return, tag is set to the name of the tag, vtype is set to the vtype of the tag, and value points to the value of the tag. You then need to branch on vtype to cast value to int, double, char, or String. |
| `bool SamRecord::isIntegerType(char vtype)` | Returns true if the passed in vtype is an integer type ('c', 'C', 's', 'S', 'i', 'I'). |
| `bool SamRecord::isDoubleType(char vtype)` | Returns true if the passed in vtype is the double type ('f'). |
| `bool SamRecord::isCharType(char vtype)` | Returns true if the passed in vtype is the char type ('A'). |
| `bool SamRecord::isStringType(char vtype)` | Returns true if the passed in vtype is the String type ('Z'). |

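The alignment end and unclipped start/end accessors can all be derived from the leftmost position and the CIGAR. The sketch below shows one way to do that, under these assumptions: the CIGAR is well formed and not "*", both soft (S) and hard (H) clips count as clipped bases, and the free functions and the CigarOps type are hypothetical names rather than the SamRecord implementation.

```cpp
#include <cctype>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using CigarOps = std::vector<std::pair<char, int32_t>>;  // (operation, length)

// Splits a well-formed SAM CIGAR string such as "3S10M" into operations.
CigarOps parseCigar(const std::string& cigar)
{
    CigarOps ops;
    int32_t length = 0;
    for (char c : cigar)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
            length = length * 10 + (c - '0');
        else
        {
            ops.emplace_back(c, length);
            length = 0;
        }
    }
    return ops;
}

// True for operations that advance along the reference.
bool consumesReference(char op)
{
    return op == 'M' || op == 'D' || op == 'N' || op == '=' || op == 'X';
}

bool isClip(char op) { return op == 'S' || op == 'H'; }

// 0-based inclusive right-most aligned position.
int32_t alignmentEnd0(int32_t start0, const CigarOps& ops)
{
    int32_t refLength = 0;
    for (const auto& op : ops)
        if (consumesReference(op.first)) refLength += op.second;
    return start0 + refLength - 1;
}

// 0-based left-most position extended by the leading clipped bases.
int32_t unclippedStart0(int32_t start0, const CigarOps& ops)
{
    int32_t clipped = 0;
    for (auto it = ops.begin(); it != ops.end() && isClip(it->first); ++it)
        clipped += it->second;
    return start0 - clipped;
}

// 0-based right-most position extended by the trailing clipped bases.
int32_t unclippedEnd0(int32_t start0, const CigarOps& ops)
{
    int32_t clipped = 0;
    for (auto it = ops.rbegin(); it != ops.rend() && isClip(it->first); ++it)
        clipped += it->second;
    return alignmentEnd0(start0, ops) + clipped;
}
```

For a 0-based start of 100 and the CIGAR 3S10M2D5M4S, the aligned bases cover 17 reference positions, so the alignment end is 116, the unclipped start is 97, and the unclipped end is 120. The 1-based variants are each one larger.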

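The rule described for getMateReferenceNameOrEqual() can be written as a small standalone function. This is a sketch of the documented behavior only; mateReferenceNameOrEqual is a hypothetical free function that takes both names as arguments, whereas the real accessor reads them from the record.

```cpp
#include <string>

// Applies the documented rule: "*" stays "*", a mate name equal to the
// reference name becomes "=", and any other name is returned unchanged.
std::string mateReferenceNameOrEqual(const std::string& referenceName,
                                     const std::string& mateReferenceName)
{
    if (mateReferenceName == "*") return "*";
    if (mateReferenceName == referenceName) return "=";
    return mateReferenceName;
}
```

Checking for "*" first matters: when both names are "*", the result is "*" rather than "=".
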
Example of using getNextSamTag:

   // record is a previously set up SamRecord.
   String recordString = "";
   char tag[3];   // room for a two-character tag plus a terminating null
   char vtype;
   void* value;

   // While there are more tags, append each one to recordString
   // as a tab followed by TAG:VTYPE:VALUE.
   while(record.getNextSamTag(tag, vtype, &value))
   {
      recordString += "\t";
      recordString += tag;
      recordString += ":";
      recordString += vtype;
      recordString += ":";

      // value is untyped, so cast it according to the vtype.
      if(record.isIntegerType(vtype))
      {
         recordString += *(int*)value;
      }
      else if(record.isDoubleType(vtype))
      {
         recordString += *(double*)value;
      }
      else if(record.isCharType(vtype))
      {
         recordString += *(char*)value;
      }
      else
      {
         // String type.
         recordString += *(String*)value;
      }
   }

   recordString += "\n";
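
The cast-by-vtype step in the example above can be tried without the library. The sketch below is a standalone analogue under stated assumptions: the free functions are hypothetical stand-ins for the SamRecord vtype checks, std::string replaces the library's String class, and the pointed-to types (int, double, char, string) follow the casts used in the example.

```cpp
#include <sstream>
#include <string>

// Stand-ins for the SamRecord vtype checks described above.
bool isIntegerType(char vtype)
{
    return std::string("cCsSiI").find(vtype) != std::string::npos;
}
bool isDoubleType(char vtype) { return vtype == 'f'; }
bool isCharType(char vtype) { return vtype == 'A'; }

// Formats one tag as a tab followed by TAG:VTYPE:VALUE, casting the
// untyped value pointer according to the vtype.
std::string formatTag(const char* tag, char vtype, const void* value)
{
    std::ostringstream out;
    out << '\t' << tag << ':' << vtype << ':';
    if (isIntegerType(vtype))
        out << *static_cast<const int*>(value);
    else if (isDoubleType(vtype))
        out << *static_cast<const double*>(value);
    else if (isCharType(vtype))
        out << *static_cast<const char*>(value);
    else
        out << *static_cast<const std::string*>(value);  // string type ('Z')
    return out.str();
}
```

Passing a pointer whose real type does not match the vtype is undefined behavior, which is why the vtype has to be checked before every cast.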