Line 1: |
Line 1: |
− | = CigarRoller=
| + | [[Category:C++]] |
− | This class is part of [[C++ Library: libcsg|libcsg]].
| + | [[Category:libStatGen]] |
| + | [[Category:libStatGen general]] |
| | | |
− | This purpose of this class is to provide utilities for creating and processing CIGAR strings. | + | = Cigar= |
| + | This class is part of [[libStatGen: general]]. |
| | | |
− | == Public Methods ==
| + | The purpose of this class is to provide utilities for processing CIGARs. It has read-only operators that do not allow modification of the object other than for lazy evaluation. |
− | {| style="margin: 1em 1em 1em 0; background-color: #f9f9f9; border: 1px #aaa solid; border-collapse: collapse;" border="1"
| |
− | |-style="background: #f2f2f2; text-align: center;"
| |
− | ! Method Name !! Description
| |
− | |-
| |
− | | <code>CigarRoller::CigarRoller()</code>
| |
− | | Default constructor initializes as a CIGAR with no operations.
| |
− | |-
| |
− | | <code>CigarRoller::CigarRoller(const char *cigarString)</code>
| |
− | | Constructor that initializes the object with the specified cigarString.
| |
− | |-
| |
− | | <code>CigarRoller & CigarRoller::operator += (CigarRoller &rhs)</code>
| |
− | | Add the contents of the specified CigarRoller to this object.
| |
− | |-
| |
− | | <code>CigarRoller & CigarRoller::operator += (CigarOperator &rhs)</code>
| |
− | | Append the specified cigar operation to this object.
| |
− | |-
| |
− | | <code>void CigarRoller::Add(Operation operation, int count)</code>
| |
− | | Adds the specified operation with the specified count to this object.
| |
− | |-
| |
− | | <code>void CigarRoller::Add(const char *cigarString)</code>
| |
− | | Adds the specified cigarString to this object.
| |
− | |-
| |
− | | <code>void CigarRoller::Set(const char *cigarString)</code>
| |
− | | Sets this object to the specified cigarString.
| |
− | |-
| |
− | | ''' DEPRECATED''' <code>int CigarRoller::getMatchPositionOffset()</code>
| |
− | | DO NOT USE.
| |
− | |-
| |
− | | <code>const char * CigarRoller::getString()</code>
| |
− | | Returns the string representation of this CIGAR object.
| |
− | |-
| |
− | | <code>void CigarRoller::getExpandedString(std::string &s)</code>
| |
− | | Sets the specified string to a string of characters that represent this cigar with no digits (a CIGAR of "3M" would return "MMM")
| |
− | |-
| |
− | | <code>void CigarRoller::clear()</code>
| |
− | | Clear this object so that it has 0 Cigar Operations.
| |
− | |-
| |
− | | <code>CigarOperator & CigarRoller::operator [] (int i)</code>
| |
− | | Return the Cigar Operation at the specified index (starting at 0).
| |
− | |-
| |
− | | <code>bool CigarRoller::operator == (CigarRoller &rhs)</code>
| |
− | | Returns true if two Cigar Rollers are the same (the same operations of the same sizes)
| |
− | |-
| |
− | | <code>int CigarRoller::size()</code>
| |
− | | Return the number of cigar operations in this object.
| |
− | |-
| |
− | | <code>void CigarRoller::Dump()</code>
| |
− | | Write this object as a string to cout.
| |
− | |-
| |
− | | <code>int CigarRoller::getExpectedQueryBaseCount()</code>
| |
− | | Returns the expected read length
| |
− | |-
| |
− | | <code>int CigarRoller::getExpectedReferenceBaseCount()</code>
| |
− | | Return how many bases in the reference are spanned by the given CIGAR string
| |
− | |-
| |
− | | <code>int32_t CigarRoller::getRefOffset(int32_t queryIndex)</code>
| |
− | |Return the reference offset associated with the specified query index or INDEX_NA based on this cigar.
| |
− | See [[C++ Class: CigarRoller#Mapping Between Reference and Read/Query|Mapping Between Reference and Read/Query]] for a more detailed explanation with examples as to how it works.
| |
− | |-
| |
− | | <code>int32_t CigarRoller::getQueryIndex(int32_t refOffset)</code>
| |
− | | Return the query index associated with the specified reference offset or INDEX_NA based on this cigar.
| |
− | See [[C++ Class: CigarRoller#Mapping Between Reference and Read/Query|Mapping Between Reference and Read/Query]] for a more detailed explanation with examples as to how it works.
| |
− | |-
| |
− | | <code>int32_t CigarRoller::getRefPosition(int32_t queryIndex, int32_t queryStartPos)</code>
| |
− | |Return the reference position associated with the specified query index or INDEX_NA based on this cigar and the specified queryStartPos.
| |
− | queryStartPops is the leftmost mapping position of the first matching base in the query.
| |
| | | |
− | See [[C++ Class: CigarRoller#Mapping Between Reference and Read/Query|Mapping Between Reference and Read/Query]] for a more detailed explanation with examples as to how it works. | + | See: http://csg.sph.umich.edu//mktrost/doxygen/current/classCigar.html for documentation. |
− | |-
| |
− | | <code>int32_t CigarRoller::getQueryIndex(int32_t refPosition, int32_t queryStartPos)</code>
| |
− | | Return the query index associated with the specified reference position and queryStartPos or INDEX_NA based on this cigar.
| |
− | queryStartPops is the leftmost mapping position of the first matching base in the query.
| |
| | | |
− | See [[C++ Class: CigarRoller#Mapping Between Reference and Read/Query|Mapping Between Reference and Read/Query]] for a more detailed explanation with examples as to how it works.
| + | The static methods are helpful for determining information about the operator. |
− | |-
| |
− | | <code>uint32_t getNumOverlaps(int32_t start, int32_t end, int32_t queryStartPos)</code>
| |
− | | Return the number of bases that overlap the reference and the read associated with this cigar that falls within the specified region.
| |
− | start : inclusive start position (reference position) of the region to check for overlaps in. (-1 indicates to start at the beginning of the reference.)
| |
− |
| |
− | end : exclusive end position (reference position) of the region to check for overlaps in. (-1 indicates to go to the end of the reference.)
| |
| | | |
− | queryStartPos : leftmost mapping position of the first matching base in the query.
| + | See [[C++ Class: CigarRoller#Mapping Between Reference and Read/Query|Mapping Between Reference and Read/Query]] for a more detailed explanation, with examples, of how the mapping between the reference and the read/query works. |
| | | |
− | NOTE: ensure that start, end, and queryStartPos are all in the same base (0 or 1).
| + | See [[C++ Class: CigarRoller#Determining the Number of Reference and Read/Query Overlaps|Determining the Number of Reference and Read/Query Overlaps]] for a more detailed explanation, with examples, of how overlaps are determined. |
| | | |
− | See [[C++ Class: CigarRoller#Determining the Number of Reference and Read/Query Overlaps|Determining the Number of Reference and Read/Query Overlaps]] for a more detailed explanation with examples as to how it works.
| + | = CigarRoller= |
− | | + | This class is part of [[libStatGen: general]]. |
− | |}
| |
− | | |
− | | |
− | == Overloaded Streaming Operators ==
| |
− | {| style="margin: 1em 1em 1em 0; background-color: #f9f9f9; border: 1px #aaa solid; border-collapse: collapse;" border="1"
| |
− | |-style="background: #f2f2f2; text-align: center;"
| |
− | ! Method Name !! Description
| |
− | |-
| |
− | | <code> std::ostream &operator << (std::ostream &stream, const CigarRoller& roller)</code>
| |
− | | Writes all of the cigar operations contained in this roller to the passed in stream.
| |
− | |-
| |
− | | <code> std::ostream &operator << (std::ostream &stream, const CigarRoller::CigarOperator& o)</code>
| |
− | | Writes the specified cigar operation to the specified stream as <count><char> (3M).
| |
− | |}
| |
− | | |
− | | |
− | | |
− | == Public Enums ==
| |
− | {| style="margin: 1em 1em 1em 0; background-color: #f9f9f9; border: 1px #aaa solid; border-collapse: collapse;" border="1"
| |
− | |-style="background: #f2f2f2; text-align: center;"
| |
− | ! colspan="2"| enum SPACE_TYPE
| |
− | |-
| |
− | ! Enum Value !! Description
| |
− | |-
| |
− | | none
| |
− | | No operation has been specified
| |
− | |-
| |
− | | match
| |
− | | The query sequence and the reference sequence bases are the same for the bases associated with this cigar operation.
| |
− | Both <code>match</code> and <code>mismatch</code> are associated with CIGAR Operation "M"
| |
− | |-
| |
− | | mismatch
| |
− | | The query sequence and the reference sequence bases are different for the bases associated with this cigar operation, but bases exist in both the query and the reference.
| |
− | Both <code>match</code> and <code>mismatch</code> are associated with CIGAR Operation "M"
| |
− | |-
| |
− | | insert
| |
− | | Insertion to the reference (the query sequence contains bases that have no corresponding base in the reference).
| |
− | Associated with CIGAR Operation "I"
| |
− | |-
| |
− | | del
| |
− | |Deletion from the reference (the reference contains bases that have no corresponding base in the query sequence).
| |
− | Associated with CIGAR Operation "D"
| |
− | |-
| |
− | | skip
| |
− | | Skipped region from the reference (the reference contains bases that have no corresponding base in the query sequence).
| |
− | Associated with CIGAR Operation "N"
| |
− | |-
| |
− | | softClip
| |
− | | Soft clip on the read (clipped sequence present in the query sequence)
| |
− | Associated with CIGAR Operation "S"
| |
− | |-
| |
− | | hardClip
| |
− | | Hard clip on the read (clipped sequence not present in the query sequence)
| |
− | Associated with CIGAR Operation "H"
| |
− | |-
| |
− | |pad
| |
− | | Padding (silent deletion from the padded reference sequence)
| |
− | Associated with CIGAR Operation "P"
| |
− | |}
| |
− | | |
− | | |
− | == Public Constants ==
| |
− | {| style="margin: 1em 1em 1em 0; background-color: #f9f9f9; border: 1px #aaa solid; border-collapse: collapse;" border="1"
| |
− | |-style="background: #f2f2f2; text-align: center;"
| |
− | ! Constant !! Value !! Description
| |
− | |-
| |
− | | INDEX_NA
| |
− | | -1
| |
− | | Value associated with an index that is not applicable/does not exist.
| |
− | Used for converting between query and reference indexes/offsets when an associated index/offset does not exist.
| |
− | |}
| |
− | | |
− | | |
− | == Nested Class ==
| |
− | | |
− | === CigarOperation ===
| |
| | | |
− | ==== Public Methods ====
| + | The purpose of this class is to provide accessors for setting, updating, and modifying the CIGAR object. It is a child class of Cigar. |
− | {| style="margin: 1em 1em 1em 0; background-color: #f9f9f9; border: 1px #aaa solid; border-collapse: collapse;" border="1"
| |
− | |-style="background: #f2f2f2; text-align: center;"
| |
− | ! Method Name !! Description
| |
− | |-
| |
− | | <code>CigarOperator::CigarOperator(Operation operation, uint32_t count)</code>
| |
− | | Set the cigar operator with the specified operation and count length.
| |
− | |-
| |
− | | <code>char CigarOperator::getChar()</code>
| |
− | | Returns the character code (M, I, D, N, S, H, or P) associated with this operation.
| |
− | |-
| |
− | | <code>bool CigarOperator::operator == (CigarOperator &rhs)</code>
| |
− | | Returns true if the passed in operator is the same as this operator, false if not.
| |
− | |-
| |
− | | <code>bool CigarOperator::operator != (CigarOperator &rhs)</code>
| |
− | | Returns true if the passed in operator is not the same as this operator, false if they are the same.
| |
− | |}
| |
| | | |
| + | See: http://csg.sph.umich.edu//mktrost/doxygen/current/classCigarRoller.html for documentation. |
| | | |
− | == Mapping Between Reference and Read/Query ==
| + | = Mapping Between Reference and Read/Query = |
− | <code>int32_t CigarRoller::getRefOffset(int32_t queryIndex)</code> and <code>int32_t CigarRoller::getQueryIndex(int32_t refOffset)</code> are used to map between the reference and the read. | + | <code>int32_t Cigar::getRefOffset(int32_t queryIndex)</code> and <code>int32_t Cigar::getQueryIndex(int32_t refOffset)</code> are used to map between the reference and the read. |
| | | |
| The queryIndex is the index in the read - from 0 to (read length - 1). | | The queryIndex is the index in the read - from 0 to (read length - 1). |
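The query-to-reference mapping can be illustrated with a small standalone sketch. It does not use libStatGen and is not the library's implementation: the names `queryToRefOffsets`, `getRefOffsetSketch`, and `kIndexNA` are hypothetical, and the CIGAR `4M10N4M3I2M4D3M` used in the usage note is an assumed example rather than one taken from this page. The sketch also assumes that soft-clipped bases occupy query indexes but have no reference offset.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the INDEX_NA constant (-1) described on this page.
constexpr int32_t kIndexNA = -1;

// Build a table whose entry i is the reference offset of query index i,
// or kIndexNA when that query base has no corresponding reference base.
std::vector<int32_t> queryToRefOffsets(const std::string& cigar) {
    std::vector<int32_t> queryToRef;
    int32_t refOffset = 0;  // offset from the first aligned reference base
    int32_t count = 0;      // length of the operation currently being parsed
    for (char c : cigar) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            count = count * 10 + (c - '0');
            continue;
        }
        switch (c) {
            case 'M':  // bases exist in both the query and the reference
                for (int32_t i = 0; i < count; ++i) {
                    queryToRef.push_back(refOffset++);
                }
                break;
            case 'I':  // insertion: query only
            case 'S':  // soft clip: query only (assumption, see lead-in)
                for (int32_t i = 0; i < count; ++i) {
                    queryToRef.push_back(kIndexNA);
                }
                break;
            case 'D':  // deletion: reference only
            case 'N':  // skipped region: reference only
                refOffset += count;
                break;
            default:   // H and P consume neither query nor reference
                break;
        }
        count = 0;
    }
    return queryToRef;
}

// Look up one query index, returning kIndexNA when it is out of range.
int32_t getRefOffsetSketch(const std::vector<int32_t>& queryToRef,
                           int32_t queryIndex) {
    if (queryIndex < 0 ||
        queryIndex >= static_cast<int32_t>(queryToRef.size())) {
        return kIndexNA;
    }
    return queryToRef[queryIndex];
}
```

With the assumed CIGAR, query index 4 is the first base after the 10-base skip, so `getRefOffsetSketch(queryToRefOffsets("4M10N4M3I2M4D3M"), 4)` yields 14, while query index 8 falls in the insertion and yields `kIndexNA`.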
Line 233: |
Line 68: |
| | | |
| | | |
− | === Determining the Number of Reference and Read/Query Overlaps ===
| + | == Determining the Number of Reference and Read/Query Overlaps == |
| | | |
| A useful concept is determining the number of bases that overlap between the reference and the read in a given region. | | A useful concept is determining the number of bases that overlap between the reference and the read in a given region. |
Line 257: |
Line 92: |
| getNumOverlaps(0,5,5) = 0 - outside of read | | getNumOverlaps(0,5,5) = 0 - outside of read |
| getNumOverlaps(32,40,5) = 0 - outside of read | | getNumOverlaps(32,40,5) = 0 - outside of read |
| + | getNumOverlaps(0,5,1) = 4 - with a different start position, this range overlaps the read with 4 bases |
| + | getNumOverlaps(32,40,32) = 4 - with a different start position, this range overlaps the read with 4 bases |
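The overlap counts above can be reproduced with a minimal standalone sketch. It is not the libStatGen implementation: `countOverlaps` is a hypothetical free function, and because the CIGAR behind these examples is not shown in this diff, the sketch assumes `4M10N4M3I2M4D3M` for illustration. It follows the parameter description of `getNumOverlaps`: `start` is inclusive, `end` is exclusive, and -1 leaves that side of the region unbounded.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>

// Count the bases present in both the read and the reference (M operations)
// whose reference position lies in [start, end).
// queryStartPos is the reference position of the first matching base.
uint32_t countOverlaps(const std::string& cigar, int32_t start, int32_t end,
                       int32_t queryStartPos) {
    uint32_t overlaps = 0;
    int32_t refPos = queryStartPos;  // reference position of the next base
    int32_t count = 0;               // length of the operation being parsed
    for (char c : cigar) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            count = count * 10 + (c - '0');
            continue;
        }
        if (c == 'M') {
            // Each matched/mismatched base occupies one reference position.
            for (int32_t i = 0; i < count; ++i, ++refPos) {
                const bool afterStart = (start == -1) || (refPos >= start);
                const bool beforeEnd = (end == -1) || (refPos < end);
                if (afterStart && beforeEnd) {
                    ++overlaps;
                }
            }
        } else if (c == 'D' || c == 'N') {
            // Reference-only operations advance the position without overlap.
            refPos += count;
        }
        // I, S, H, and P do not advance the reference position.
        count = 0;
    }
    return overlaps;
}
```

With the assumed CIGAR, `countOverlaps(cigar, 0, 5, 1)` and `countOverlaps(cigar, 32, 40, 32)` both give 4, and the two "outside of read" cases give 0, matching the values listed above. As the method description notes, `start`, `end`, and `queryStartPos` must all use the same base (0 or 1).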