Difference between revisions of "SAM/BAM Convert Sequence"

From Genome Analysis Wiki
Jump to navigationJump to search
(Created page with '== SAM/BAM support conversion between '=' and the base in a sequence == '''Request:''' To be able to call SamRecord::getSequence and have it return the sequence with '=' or with…')
 
Line 13: Line 13:
 
=== Proposed Solution ===
 
=== Proposed Solution ===
 
....More details coming soon!
 
....More details coming soon!
 +
==== USER API UPDATES ====
  
Added a class, <code>SamQuerySeqWithRef</code>, with methods <code>seqWithEquals</code> and <code>seqWithoutEquals</code>
+
==== Possibly useful to a User ====
 +
Added a class, <code>SamQuerySeqWithRef</code>, with static methods <code>seqWithEquals</code> and <code>seqWithoutEquals</code>.  SamRecord uses these methods to
 
<p>
 
<p>
 +
<source lang="cpp">
 +
// Gets the sequence with '=' in any position where the sequence matches the reference.
 +
// NOTE: 'N' in both the sequence and the reference is not considered a match.
 +
// Parameters:
 +
//    currentSeq sequence that should be converted
 +
//    seq0BasedPos 0 based start position of currentSeq on the reference.
 +
//  cigar cigar string for currentSeq (used for determining how the sequence aligns to the reference)
 +
//    referenceName reference name associated with this sequence
 +
//    refSequence reference sequence object
 +
//    updatedSeq return parameter that this method sets to the current sequence, replacing any matches to the reference with '='.
 +
void SamQuerySeqWithRef::seqWithEquals(const char * currentSeq,
 +
                                      int32_t seq0BasedPos,
 +
                                      Cigar &  cigar,
 +
                                      const char *  referenceName,
 +
                                      const GenomeSequence &  refSequence,
 +
                                      std::string &  updatedSeq)
 +
 +
// Gets the sequence converting '=' to the appropriate base using the reference.
 +
// Parameters:
 +
//    currentSeq sequence that should be converted
 +
//    seq0BasedPos 0 based start position of currentSeq on the reference.
 +
//  cigar cigar string for currentSeq (used for determining how the sequence aligns to the reference)
 +
//    referenceName reference name associated with this sequence
 +
//    refSequence reference sequence object
 +
//    updatedSeq return parameter that this method sets to the current sequence, replacing any "=" with the base from the reference.
 +
void SamQuerySeqWithRef::seqWithoutEquals(const char * currentSeq,
 +
                                      int32_t seq0BasedPos,
 +
                                      Cigar &  cigar,
 +
                                      const char *  referenceName,
 +
                                      const GenomeSequence &  refSequence,
 +
                                      std::string &  updatedSeq)
 +
</source>
 +
 +
==== INTERNAL MODIFICATIONS ====
 +
* Updated GenericSamInterface, BamInterface, and SamInterface to take TranslationType as a parameter to <code>writeRecord</code>.
 +
** <code>SamInterface::writeRecord</code> passes the translation parameter to <code>SamRecord::getSequence</code> when it obtains the sequence to write into the SAM file.

Revision as of 18:48, 7 December 2010

SAM/BAM support conversion between '=' and the base in a sequence

Request: To be able to call SamRecord::getSequence and have it return the sequence with '=' or with the correct bases.

Requester: Goncalo Abecasis & Hyun Min Kang

Date Requested: Nov 23, 2010?

Notes

It was requested to have this handled within the getSequence call so the user doesn't have to convert the sequence after calling getSequence.

Proposed Solution

....More details coming soon!

USER API UPDATES

Possibly useful to a User

Added a class, SamQuerySeqWithRef, with static methods seqWithEquals and seqWithoutEquals. SamRecord uses these methods to

// Gets the sequence with '=' in any position where the sequence matches the reference.
// NOTE: 'N' in both the sequence and the reference is not considered a match.
// Parameters:
//    	currentSeq 	sequence that should be converted
//    	seq0BasedPos 	0 based start position of currentSeq on the reference.
//   	cigar 	cigar string for currentSeq (used for determining how the sequence aligns to the reference)
//    	referenceName 	reference name associated with this sequence
//    	refSequence 	reference sequence object
//    	updatedSeq 	return parameter that this method sets to the current sequence, replacing any matches to the reference with '='.
void SamQuerySeqWithRef::seqWithEquals(const char * 		currentSeq,
                                       int32_t 	 		seq0BasedPos,
                                       Cigar &  		cigar,
                                       const char *  		referenceName,
                                       const GenomeSequence &  	refSequence,
                                       std::string &  		updatedSeq)

// Gets the sequence converting '=' to the appropriate base using the reference.
// Parameters:
//    	currentSeq 	sequence that should be converted
//    	seq0BasedPos 	0 based start position of currentSeq on the reference.
//   	cigar 	cigar string for currentSeq (used for determining how the sequence aligns to the reference)
//    	referenceName 	reference name associated with this sequence
//    	refSequence 	reference sequence object
//    	updatedSeq 	return parameter that this method sets to the current sequence, replacing any "=" with the base from the reference.
void SamQuerySeqWithRef::seqWithoutEquals(const char * 		currentSeq,
                                       int32_t 	 		seq0BasedPos,
                                       Cigar &  		cigar,
                                       const char *  		referenceName,
                                       const GenomeSequence &  	refSequence,
                                       std::string &  		updatedSeq)

INTERNAL MODIFICATIONS

  • Updated GenericSamInterface, BamInterface, and SamInterface to take TranslationType as a parameter to writeRecord.
    • SamInterface::writeRecord passes the translation parameter to SamRecord::getSequence when it obtains the sequence to write into the SAM file.