Line 1: |
Line 1: |
| === Review Sept 17th === | | === Review Sept 17th === |
| ==== Review Discussion Topics ==== | | ==== Review Discussion Topics ==== |
| + | http://genome.sph.umich.edu/wiki/SAM/BAM_Library_FAQs |
| + | http://www.sph.umich.edu/csg/mktrost/doxygen/html/ |
| + | |
| + | Example of using the library to set values: http://www.sph.umich.edu/csg/mktrost/doxygen/html/WriteFiles_8cpp-source.html |
| ===== Return Statuses ===== | | ===== Return Statuses ===== |
| Currently, any time you perform an operation on a SAM/BAM file, you have to check the status for failure: | | Currently, any time you perform an operation on a SAM/BAM file, you have to check the status for failure: |
Line 57: |
Line 61: |
| Aborted | | Aborted |
| </pre> | | </pre> |
| + | |
| + | |
| + | ===== Accessing String Values ===== |
| + | SAM/BAM files contain string fields that users will want to read. |
| + | How should the interface expose them? |
| + | Currently the library mixes two styles. Some methods return a const char*, like: |
| + | <source lang="cpp"> |
| + | const char* SamRecord::getSequence() |
| + | { |
| + | myStatus = SamStatus::SUCCESS; |
| + | if(mySequence.Length() == 0) |
| + | { |
| + | // A length of 0 means the sequence is still in the buffer and has not |
| + | // yet been synced to the string, so do the sync. |
| + | setSequenceAndQualityFromBuffer(); |
| + | } |
| + | return mySequence.c_str(); |
| + | } |
| + | </source> |
| + | while others take a reference to a caller-owned string and fill it in, like: |
| + | <source lang="cpp"> |
| + | // Set the passed in string to the header line at the specified index. |
| + | // It does NOT clear the current contents of header. |
| + | // NOTE: some indexes will return blank if the entry was deleted. |
| + | bool SamFileHeader::getHeaderLine(unsigned int index, std::string& header) const |
| + | { |
| + | // Check to see if the index is in range of the header records vector. |
| + | if(index < myHeaderRecords.size()) |
| + | { |
| + | // In range of the header records vector, so get the string for |
| + | // that record. |
| + | SamHeaderRecord* hdrRec = myHeaderRecords[index]; |
| + | hdrRec->appendString(header); |
| + | return(true); |
| + | } |
| + | else |
| + | { |
| + | unsigned int commentIndex = index - myHeaderRecords.size(); |
| + | // Check to see if it is in range of the comments. |
| + | if(commentIndex < myComments.size()) |
| + | { |
| + | // It is in range of the comments, so add the type. |
| + | header += "@CO\t"; |
| + | // Add the comment. |
| + | header += myComments[commentIndex]; |
| + | // Add the new line. |
| + | header += "\n"; |
| + | return(true); |
| + | } |
| + | } |
| + | // Invalid index. |
| + | return(false); |
| + | } |
| + | </source> |
| + | |
| + | http://www.sph.umich.edu/csg/mktrost/doxygen/html/SamRecord_8h-source.html |
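To make the trade-off between the two styles concrete, here is a minimal sketch. ExampleRecord is a hypothetical stand-in, not the library's SamRecord, and the overload taking a std::string& is an assumption added for comparison. Unlike getHeaderLine, which appends, this sketch replaces the contents of the passed-in string.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a record class; NOT the library's SamRecord.
class ExampleRecord
{
public:
    explicit ExampleRecord(const std::string& sequence)
        : mySequence(sequence) {}

    // Style 1: return a pointer into internal storage.
    // Cheap, but the pointer is only valid until the record is
    // modified or destroyed.
    const char* getSequence() const
    {
        return mySequence.c_str();
    }

    // Style 2: copy into a caller-owned string and report success.
    // This sketch replaces the contents rather than appending.
    bool getSequence(std::string& sequence) const
    {
        sequence = mySequence;
        return true;
    }

    // Modifying the record may invalidate pointers returned by Style 1.
    void setSequence(const std::string& sequence)
    {
        mySequence = sequence;
    }

private:
    std::string mySequence;
};
```

With the first style the caller must copy the value before the record changes. With the second, the caller's copy stays valid after the record changes, at the cost of a string copy on every call.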
| | | |
| ===== SamFileHeader ===== | | ===== SamFileHeader ===== |
| *Should this be renamed to SamHeader? | | *Should this be renamed to SamHeader? |
| *Do we like the classes being named starting with Sam? Should it be Bam? | | *Do we like the classes being named starting with Sam? Should it be Bam? |
| + | |
| + | Should we add the following to SamFileHeader: |
| + | <source lang="cpp"> |
| + | ////////////////////////////////// |
| + | // Set methods for header fields. |
| + | bool setVersion(const char* version); |
| + | bool setSortOrder(const char* sortOrder); |
| + | bool addSequenceName(const char* sequenceName); |
| + | bool setSequenceLength(const char* keyID, int sequenceLength); |
| + | bool setGenomeAssemblyId(const char* keyID, const char* genomeAssemblyId); |
| + | bool setMD5Checksum(const char* keyID, const char* md5sum); |
| + | bool setURI(const char* keyID, const char* uri); |
| + | bool setSpecies(const char* keyID, const char* species); |
| + | bool addReadGroupID(const char* readGroupID); |
| + | bool setSample(const char* keyID, const char* sample); |
| + | bool setLibrary(const char* keyID, const char* library); |
| + | bool setDescription(const char* keyID, const char* description); |
| + | bool setPlatformUnit(const char* keyID, const char* platform); |
| + | bool setPredictedMedianInsertSize(const char* keyID, const char* isize); |
| + | bool setSequencingCenter(const char* keyID, const char* center); |
| + | bool setRunDate(const char* keyID, const char* runDate); |
| + | bool setTechnology(const char* keyID, const char* technology); |
| + | bool addProgram(const char* programID); |
| + | bool setProgramVersion(const char* keyID, const char* version); |
| + | bool setCommandLine(const char* keyID, const char* commandLine); |
| + | |
| + | /////////////////////////////////// |
| + | // Get methods for header fields. |
| + | // Returns the number of SQ entries in the header. |
| + | int32_t getSequenceDictionaryCount(); |
| + | // Return the Sort Order value that is set in the Header. |
| + | // If this field does not exist, "" is returned. |
| + | const char* getSortOrder(); |
| + | /// Additional gets for the rest of the fields. |
| + | </source> |
| + | Should these also be added to SamHeaderRG, SamHeaderSQ, etc., as appropriate? |
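As a rough illustration of how the keyed set methods could behave, here is a sketch for the SQ fields only. ExampleHeader, SequenceInfo, and getSequenceLength are hypothetical names, not library code. The choices to return false for an unknown keyID and to reject duplicate sequence names are assumptions about semantics that the proposal leaves open.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical sketch; method names mirror the proposal, but this is
// NOT the library's SamFileHeader.
class ExampleHeader
{
public:
    // Add a new SQ entry keyed by sequence name.
    // Assumption: returns false if the name already exists.
    bool addSequenceName(const char* sequenceName)
    {
        return mySequences.insert(
            std::make_pair(std::string(sequenceName), SequenceInfo())).second;
    }

    // Set the length on an existing SQ entry.
    // Assumption: returns false if keyID was never added.
    bool setSequenceLength(const char* keyID, int sequenceLength)
    {
        std::map<std::string, SequenceInfo>::iterator it =
            mySequences.find(keyID);
        if(it == mySequences.end())
        {
            return false;
        }
        it->second.length = sequenceLength;
        return true;
    }

    // Returns the number of SQ entries in the header.
    int32_t getSequenceDictionaryCount() const
    {
        return static_cast<int32_t>(mySequences.size());
    }

    // Hypothetical getter: returns -1 if keyID is unknown.
    int getSequenceLength(const char* keyID) const
    {
        std::map<std::string, SequenceInfo>::const_iterator it =
            mySequences.find(keyID);
        return (it == mySequences.end()) ? -1 : it->second.length;
    }

private:
    struct SequenceInfo
    {
        SequenceInfo() : length(0) {}
        int length;
    };

    std::map<std::string, SequenceInfo> mySequences;
};
```

Keying every setter on the ID passed to the matching add method keeps the interface flat, but each call pays for a lookup. Putting the setters on SamHeaderSQ and SamHeaderRG instead would avoid the repeated keyID argument.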
| | | |
| === Review June 7th === | | === Review June 7th === |