BAM to FASTQ

From Genome Analysis Wiki
Revision as of 16:32, 7 December 2010 by Mktrost

BAM to FASTQ

Request: Software to convert from BAM to FASTQ using the OQ for the quality.

Requester: Hyun Min Kang & Goo Jun

Date Requested: December 7, 2010

Date Needed: Soon

Current Status: On hold per direction from Hyun Min Kang (12/7/2010)

  • On hold until it is determined whether updating Bingshan's tool to use OQ (which would also require converting it to the new BAM Library) is worthwhile, or whether using OQ provides little benefit.
  • Even so, a tool that works quickly and efficiently on unsorted BAMs may be useful.

Notes

  • Bingshan's existing code, Bam2FastQ, already converts from BAM to FASTQ, but it does not use OQ for the quality.

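The code needs to determine the strand of each read and reverse complement the reads on the reverse strand, because a BAM stores those reads as the reverse complement of the sequence that was originally read.

Reverse complementing means:

  • if the sequence in the BAM is ACTG, the reverse complement is CAGT
  • if the quality in OQ is 1234, the reversed quality is 4321 (quality strings are only reversed, not complemented).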
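
The strand handling described above can be sketched in a few lines of C++. This is only an illustration, not Bam2FastQ's code or the BAM Library's API: the names `reverseComplement`, `reverseQuality`, `FastqFields` and `restoreOriginalOrientation` are hypothetical, and the sketch assumes the caller has already extracted the flag, the sequence and the OQ string from the record. The only outside fact relied on is that SAM flag bit 0x10 marks a read stored as its reverse complement.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// SAM flag bit 0x10: the read is stored reverse complemented (reverse strand).
constexpr std::uint16_t kReverseStrandFlag = 0x10;

// Complement a single base. Anything other than A/C/G/T (for example N)
// is returned unchanged.
inline char complementBase(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        case 'a': return 't';
        case 'c': return 'g';
        case 'g': return 'c';
        case 't': return 'a';
        default:  return base;
    }
}

// Reverse the sequence, then complement every base.
inline std::string reverseComplement(const std::string& sequence) {
    std::string result(sequence.rbegin(), sequence.rend());
    std::transform(result.begin(), result.end(), result.begin(), complementBase);
    return result;
}

// Quality strings are only reversed; there is nothing to complement.
inline std::string reverseQuality(const std::string& quality) {
    return std::string(quality.rbegin(), quality.rend());
}

// Hypothetical holder for the two FASTQ fields affected by the strand.
struct FastqFields {
    std::string sequence;
    std::string quality;
};

// Return the sequence and OQ quality in their original read orientation.
inline FastqFields restoreOriginalOrientation(std::uint16_t flag,
                                              const std::string& sequence,
                                              const std::string& originalQuality) {
    if (flag & kReverseStrandFlag) {
        return {reverseComplement(sequence), reverseQuality(originalQuality)};
    }
    return {sequence, originalQuality};
}
```

With these helpers, a reverse-strand record carrying sequence ACTG and OQ 1234 would be written to the FASTQ as CAGT with quality 4321, while a forward-strand record is written unchanged.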

  • The initial release would require the files to be sorted by ReadName (producing an error if they are not already sorted).
    • The library performs this check for you once SamFile::setSortedValidation(SamFile::QUERY_NAME) has been called; ReadRecord then fails on any record that is out of order.
    • It would also produce an error if a read is not paired.
  • Later Release: work on unsorted BAM files.
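
The pairing requirement for the initial release could be checked roughly as follows. This is a sketch under stated assumptions, not the proposed implementation: `ReadInfo`, `isPaired` and `areMates` are hypothetical names, the read name and flag are assumed to have been pulled from two consecutive records of a ReadName-sorted file, and secondary or supplementary alignments are ignored. The flag bits used (0x1 paired, 0x40 first in pair, 0x80 second in pair) are the standard SAM flag values.

```cpp
#include <cstdint>
#include <string>

// Standard SAM flag bits relevant to pairing.
constexpr std::uint16_t kPairedFlag       = 0x1;
constexpr std::uint16_t kFirstInPairFlag  = 0x40;
constexpr std::uint16_t kSecondInPairFlag = 0x80;

// Hypothetical minimal view of a record: just the fields this check needs.
struct ReadInfo {
    std::string name;
    std::uint16_t flag;
};

// True if the record is flagged as part of a pair.
inline bool isPaired(const ReadInfo& read) {
    return (read.flag & kPairedFlag) != 0;
}

// True if two consecutive records form a mate pair: both are flagged as
// paired, they share a read name, and one is first and the other second.
inline bool areMates(const ReadInfo& a, const ReadInfo& b) {
    if (!isPaired(a) || !isPaired(b)) {
        return false;
    }
    if (a.name != b.name) {
        return false;
    }
    const bool aFirst  = (a.flag & kFirstInPairFlag) != 0;
    const bool aSecond = (a.flag & kSecondInPairFlag) != 0;
    const bool bFirst  = (b.flag & kFirstInPairFlag) != 0;
    const bool bSecond = (b.flag & kSecondInPairFlag) != 0;
    return (aFirst && bSecond) || (aSecond && bFirst);
}
```

A converter built this way would read records two at a time and report an error whenever `areMates` returns false. Handling unsorted files, as planned for the later release, would instead need some way to hold a read until its mate appears.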

Proposed Solution