Changes

From Genome Analysis Wiki
7,609 bytes removed ,  11:02, 2 February 2017
Line 1: Line 1: −
== Setting fields in a SAM/BAM Record ==
+
[[Category:C++]]
The '''SamRecord''' class contains accessors to set the fields of a SAM/BAM record.  They are used for creating a record that is not read from a SAM/BAM file.  Once a record has been set up with these set methods, its fields can be read back with the get accessors, or the record can later be written as either a SAM or BAM record.
+
[[Category:libStatGen]]
The methods found in the '''SamRecord''' class for setting fields are:
+
[[Category:libStatGen BAM]]
{| style="margin: 1em 1em 1em 0; background-color: #f9f9f9; border: 1px #aaa solid; border-collapse: collapse;" border="1"
  −
|-style="background: #f2f2f2; text-align: center;"  '''SamRecord Class Methods'''
  −
! Method Name !!  Description
  −
|-
  −
| <code>void resetRecord()</code>
  −
| Resets the record to be an empty record.  This is not necessary when you are reading a Sam/Bam file, but if you are setting fields, it is a good idea to clean out a record before reusing it.  Clearing it allows you to not have to set any empty fields.
  −
|-
  −
| <code>bool setReadName(const char* readName)</code>
  −
| Sets QNAME to the passed in name.
  −
Returns true if successfully set, false if not.
  −
|-
  −
| <code>bool setFlag(int16_t flag)</code>
  −
| Sets the bitwise FLAG to the passed in value.
  −
Returns true if successfully set, false if not.
  −
|-
  −
| <code>bool setReferenceName(SamFileHeader& header, const char* referenceName)</code>
  −
| Sets the reference sequence name.  The reference id is calculated using the header.
  −
Returns true if successfully set, false if not.
  −
|-
  −
| <code>bool set1BasedPosition(int32_t position)</code>
  −
| Sets the leftmost position.  The value passed in is 1-based (SAM formatted).  Internal processing handles switching between SAM/BAM formats when read/written.
  −
Returns true if successfully set, false if not.
  −
|-
  −
| <code>bool set0BasedPosition(int32_t position)</code>
  −
| Sets the leftmost position.  The value passed in is 0-based (BAM formatted).  Internal processing handles switching between SAM/BAM formats when read/written.
  −
Returns true if successfully set, false if not.
  −
|-
  −
| <code>bool setMapQuality(int8_t mapQuality)</code>
  −
| Sets the mapping quality.
  −
Returns true if successfully set, false if not.
  −
|-
  −
| <code>bool setCigar(const char* cigar)</code>
  −
| Sets the cigar string to the passed in CIGAR.  This is a SAM formatted CIGAR string.  Internal processing handles switching between SAM/BAM formats when read/written.
  −
Returns true if successfully set, false if not.
  −
|-
  −
| <code>bool setMateReferenceName(SamFileHeader& header, const char* referenceName)</code>
  −
| Sets the mate reference sequence name.  The mate reference id is calculated using the header.
  −
Returns true if successfully set, false if not.
  −
|-
  −
| <code>bool set1BasedMatePosition(int32_t matePosition)</code>
  −
| Sets the leftmost mate position.  The value passed in is 1-based (SAM formatted).  Internal processing handles switching between SAM/BAM formats when read/written.
  −
Returns true if successfully set, false if not.
  −
|-
  −
| <code>bool set0BasedMatePosition(int32_t matePosition)</code>
  −
| Sets the leftmost mate position.  The value passed in is 0-based (BAM formatted).  Internal processing handles switching between SAM/BAM formats when read/written.
  −
Returns true if successfully set, false if not.
  −
|-
  −
| <code>bool setInsertSize(int32_t insertSize)</code>
  −
| Sets the inferred insert size.
  −
Returns true if successfully set, false if not.
  −
|-
  −
| <code>bool setSequence(const char* seq)</code>
  −
| Sets the sequence string to the passed in string.  This is a SAM formatted sequence string.  Internal processing handles switching between SAM/BAM formats when read/written.
  −
Returns true if successfully set, false if not.
  −
|-
  −
| <code>bool setQuality(const char* quality)</code>
  −
| Sets the quality string to the passed in string.  This is a SAM formatted quality string.  Internal processing handles switching between SAM/BAM formats when read/written.
  −
Returns true if successfully set, false if not.
  −
|-
  −
| <code>bool addTag(const char* tag, char vtype, const char* value)</code>
  −
| Adds a tag to the record with the specified tag, vtype, and value.  Vtype can be SAM/BAM vtype.  Internal processing handles switching between SAM/BAM vtypes when read/written.
  −
Returns true if successfully set, false if not.
  −
|}
     −
When set, SAM fields are validated against: [[SAM Validation Criteria]]
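The 1-based and 0-based setters in the table differ only in the coordinate convention of the value passed in. The following is a minimal sketch of that relationship, assuming the usual offset of one between SAM (1-based) and BAM (0-based) positions. The helpers <code>toZeroBased</code> and <code>toOneBased</code> are hypothetical names used for illustration and are not part of '''SamRecord'''.

```cpp
#include <cstdint>

// Hypothetical helpers (NOT part of SamRecord) illustrating the two
// coordinate conventions named in the table of set methods.

// Convert a 1-based (SAM formatted) position to 0-based (BAM formatted).
inline int32_t toZeroBased(int32_t oneBasedPosition)
{
    return oneBasedPosition - 1;
}

// Convert a 0-based (BAM formatted) position to 1-based (SAM formatted).
inline int32_t toOneBased(int32_t zeroBasedPosition)
{
    return zeroBasedPosition + 1;
}
```

Under this assumption, passing 100 to <code>set1BasedPosition</code> would be expected to describe the same location as passing 99 to <code>set0BasedPosition</code>, and a SAM position of 0 (unmapped) corresponds to -1 in 0-based form.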
+
This class is part of [[C++ Library: libStatGen]].
    +
== Getting/Setting fields in a SAM/BAM Record ==
 +
The '''SamRecord''' class contains accessors to "set" and "get" the fields of a SAM/BAM record. 
   −
== Retrieving fields from a SAM/BAM Record ==
+
The "set" accessors are used for creating a record that is not read from a SAM/BAM file.  Once a record has been set up with these set methods, its fields can be read back with the get accessors, or the record can later be written as either a SAM or BAM record.
The '''SamRecord''' class contains accessors to access the fields of a SAM/BAM record.  They assume that the class has already been populated, either by using the set commands or by calling SamFile::ReadRecord.  Not all of the values that can be retrieved using these get accessors have set methods.  That is because they are internally calculated values if they were not read from a file.
     −
The methods found in the SamRecord class for setting fields are:
+
The "get" accessors assume that the class has already been populated, either by using the set commands or by calling SamFile::ReadRecord.  Not all of the values that can be retrieved using these get accessors have set methods.  That is because they are either read from the file or are internally calculated values.
{| class="wikitable" style="width:100%" border="1"
  −
|+ style="font-size:150%"|'''SamRecord Class Get Methods'''
  −
!  width=""|Method Name
  −
!  width=""|Description
  −
|-
  −
| bool isValid(SamFileHeader& header)
  −
| Returns true if the record is valid.  This performs validation steps.  TODO: the method exists, but it does not yet perform any checks, so it just returns true.
  −
|-
  −
| int32_t getBlockSize()
  −
| Returns the BAM block size of the record.
  −
|-
  −
| const char* getReferenceName(SamFileHeader& header)
  −
| Returns the reference sequence name (SAM format).
  −
|-
  −
| int32_t getReferenceID()
  −
| Returns the reference sequence ID (BAM format).
  −
|-
  −
| int32_t get1BasedPosition()
  −
| Returns the 1-based (SAM formatted) leftmost position.
  −
|-
  −
| int32_t get0BasedPosition()
  −
| Returns the 0-based (BAM formatted) leftmost position.
  −
|-
  −
| int8_t getReadNameLength()
  −
| Returns the length of the ReadName (QNAME).
  −
|-
  −
| int8_t getMapQuality()
  −
| Returns the map quality.
  −
|-
  −
| int16_t getBin()
  −
| Returns the BAM bin for the record.
  −
|-
  −
| int16_t getCigarLength()
  −
| Returns the length of the CIGAR in BAM format.
  −
|-
  −
| int16_t getFlag()
  −
| Returns the flag.
  −
|-
  −
| int32_t getReadLength()
  −
| Returns the length of the read.
  −
|-
  −
| const char* getMateReferenceName(SamFileHeader& header)
  −
| Returns the mate reference sequence name (SAM format).  Returns the mate reference sequence name even if it is the same as the reference sequence name.
  −
|-
  −
| const char* getMateReferenceNameOrEqual(SamFileHeader& header)
  −
| Returns the mate reference sequence name (SAM format), unless it is the same as the reference sequence name, in which case "=" is returned.
  −
|-
  −
| int32_t getMateReferenceID()
  −
| Returns the mate reference sequence id (BAM format).
  −
|-
  −
| int32_t get1BasedMatePosition()
  −
| Returns the 1-based (SAM formatted) mate leftmost position.
  −
|-
  −
| int32_t get0BasedMatePosition()
  −
| Returns the 0-based (BAM formatted) mate leftmost position.
  −
|-
  −
| int32_t getInsertSize()
  −
| Returns the insert size.
  −
|-
  −
| int32_t get0BasedAlignmentEnd();
  −
| Returns the 0-based inclusive right-most position of the clipped sequence.
  −
|-
  −
| int32_t get1BasedAlignmentEnd();
  −
| Returns the 1-based inclusive right-most position of the clipped sequence.
  −
|-
  −
| int32_t get0BasedUnclippedStart();
  −
| Returns the 0-based inclusive left-most position adjusted for clipped bases.
  −
|-
  −
| int32_t get1BasedUnclippedStart();
  −
| Returns the 1-based inclusive left-most position adjusted for clipped bases.
  −
|-
  −
| int32_t get0BasedUnclippedEnd();
  −
| Returns the 0-based inclusive right-most position adjusted for clipped bases.
  −
|-
  −
| int32_t get1BasedUnclippedEnd();
  −
| Returns the 1-based inclusive right-most position adjusted for clipped bases.
  −
|-
  −
| const char* getReadName()
  −
| Returns the SAM formatted Read Name (QNAME).
  −
|-
  −
| const char* getCigar()
  −
| Returns the SAM formatted CIGAR string.
  −
|-
  −
| const char* getSequence()
  −
| Returns the SAM formatted Sequence string.
  −
|-
  −
| const char* getQuality()
  −
| Returns the SAM formatted Quality string.
  −
|-
  −
| bool getNextSamTag(char* tag, char& vtype, void** value)
  −
| Returns true if a tag was read, false if there are no more tags.
  −
For a true return value, tag is set to the name of the tag, vtype is set to the vtype of the tag, and value is a pointer to the value of the tag.  You will then need to use a switch on vtype to cast value to int, double, char, or String.
  −
|-
  −
| bool isIntegerType(char vtype)
  −
| Returns true if the passed in vtype is of integer ('c', 'C', 's', 'S', 'i', 'I') type.
  −
|-
  −
| bool isDoubleType(char vtype)
  −
| Returns true if the passed in vtype is of double ('f') type.
  −
|-
  −
| bool isCharType(char vtype)
  −
| Returns true if the passed in vtype is of char ('A') type.
  −
|-
  −
| bool isStringType(char vtype)
  −
| Returns true if the passed in vtype is of String ('Z') type.
  −
|-
  −
|-
  −
|}
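The vtype checks listed in the table can be mirrored in a single switch, which is one way to decide how to cast the value returned by <code>getNextSamTag</code>. The following is a standalone sketch, not library code: <code>TagKind</code> and <code>classifyVtype</code> are hypothetical names, and only the vtype characters documented in the table are handled.

```cpp
// Hypothetical classification (NOT part of SamRecord) that mirrors the
// documented isIntegerType / isDoubleType / isCharType / isStringType checks.
enum class TagKind { Integer, Double, Char, String, Unknown };

// Map a vtype character to the kind of value it denotes.
inline TagKind classifyVtype(char vtype)
{
    switch (vtype)
    {
        case 'c': case 'C':
        case 's': case 'S':
        case 'i': case 'I':
            return TagKind::Integer;   // integer vtypes
        case 'f':
            return TagKind::Double;    // the table lists 'f' as the double type
        case 'A':
            return TagKind::Char;      // single character
        case 'Z':
            return TagKind::String;    // string
        default:
            return TagKind::Unknown;   // anything not listed in the table
    }
}
```

A caller could switch on the result of <code>classifyVtype(vtype)</code> before casting the <code>value</code> pointer; the exact type to cast to for each kind should be checked against the library documentation.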
      +
See: http://csg.sph.umich.edu//mktrost/doxygen/current/classSamFileHeader.html for documentation.
   −
Example of using getNextSamTag:
+
==Example of using getNextSamTag==
 
<source lang="cpp">
 
<source lang="cpp">
 
   // record is a previously setup SamRecord.
 