Changes

From Genome Analysis Wiki
Line 9: Line 9:     
=== Notes ===
 
 +
The SAM/BAM format allows '=' in the sequence wherever a base matches the reference base.  Using '=' can make the sequence compress better in BAM format, but a user analyzing the file may want to know what the actual bases are.  This library enhancement performs the conversion between these two representations.
 +
 
It was requested that this be handled within the <code>getSequence</code> call, so that the user does not have to convert the sequence after calling <code>getSequence</code>.
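The conversion described in these notes can be sketched as follows. This is a minimal illustration, not the library's implementation: the function names <code>basesToEquals</code> and <code>equalsToBases</code> are hypothetical, and the sketch assumes the read and the reference segment have the same length and line up position by position (no insertions, deletions, or clipping). A real conversion would have to walk the CIGAR string to find the reference position of each base.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: replace every read base that matches the reference
// with '='. Assumes read and ref are aligned position by position.
std::string basesToEquals(const std::string& read, const std::string& ref)
{
    assert(read.size() == ref.size());
    std::string out(read);
    for (std::size_t i = 0; i < read.size(); ++i)
    {
        // Compare case-insensitively; the cast keeps toupper well defined.
        if (std::toupper(static_cast<unsigned char>(read[i])) ==
            std::toupper(static_cast<unsigned char>(ref[i])))
        {
            out[i] = '=';
        }
    }
    return out;
}

// Hypothetical helper: replace every '=' in the read with the reference base
// at the same position. Same alignment assumption as above.
std::string equalsToBases(const std::string& read, const std::string& ref)
{
    assert(read.size() == ref.size());
    std::string out(read);
    for (std::size_t i = 0; i < read.size(); ++i)
    {
        if (read[i] == '=')
        {
            out[i] = ref[i];
        }
    }
    return out;
}
```

Under these assumptions, a read <code>ACGT</code> aligned to the reference <code>ACCT</code> would be stored as <code>==G=</code>, and converting it back with the same reference restores <code>ACGT</code>.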
   Line 17: Line 19:  
Just scroll through the files - the diffs are highlighted, but the entire file is included so you have context.
 
 
<p>
 
<p>
[http://www.sph.umich.edu/csg/mktrost/codeReviews/2010/12_08/ Sequence Translation Modifications]
+
[http://csg.sph.umich.edu//mktrost/codeReviews/2010/12_08/ Sequence Translation Modifications]
 
<p>
 
<p>
 
The doxygen output for the current git repository is:
 
[http://www.sph.umich.edu/csg/mktrost/dox/current/html/ Current Doxygen]
+
[http://csg.sph.umich.edu//mktrost/dox/current/html/ Current Doxygen]
      Line 165: Line 167:  
** <code>BamInterface::writeRecord</code> passes the translation parameter to <code>SamRecord::writeRecordBuffer</code>
 
 
** <code>SamInterface::writeRecord</code> passes the translation parameter to <code>SamRecord::getSequence</code> when it obtains the sequence to write into the SAM file.
 
 +
 +
=== FAQ ===
 +
'''Q: How do you know what the actual base was?  Doesn't that depend on the specific reference used to do the alignment?'''
 +
 +
A: Yes.  The user needs to create a <code>GenomeSequence</code> object, specifying the FASTA file for the reference.  That object is used to look up the reference bases.
 +
 +
'''Q: What if I don't know what my reference file is?'''
 +
 +
A: Then I don't either.  That said, additional validation will be added to check the specified reference against the SQ fields in the header: [[BAM Convert Sequences#STILL NEEDS WORK|Additional Validation Coming Soon]]
 +
 +
=== STILL NEEDS WORK ===
 +
*Additional validation will be added to check the specified <code>GenomeSequence</code> against the header from the <code>SamFile</code>.  It will check things like sequence length.  An error will be reported if the reference sequence length does not match the SQ LN field for that reference name.  This will indicate whether you are using the wrong reference file.
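
The planned length check could look roughly like the sketch below. It is only an illustration of the idea and does not use the library's classes: <code>findReferenceMismatches</code> is a hypothetical name, and both the header (SQ name to LN) and the reference (sequence name to length) are represented as plain maps. Treating a header sequence that is missing from the reference as a mismatch is an assumption of this sketch, not something stated above.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical check: return the names of header sequences whose SQ LN value
// disagrees with the length found in the reference. A name that is absent
// from the reference is also reported (an assumption of this sketch).
std::vector<std::string> findReferenceMismatches(
    const std::map<std::string, long>& headerLengths,
    const std::map<std::string, long>& referenceLengths)
{
    std::vector<std::string> problems;
    for (const auto& entry : headerLengths)
    {
        auto ref = referenceLengths.find(entry.first);
        if (ref == referenceLengths.end() || ref->second != entry.second)
        {
            problems.push_back(entry.first);
        }
    }
    return problems;
}
```

A non-empty result would suggest that the reference file is not the one used for the alignment.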