Changes

BAM to FASTQ

Revision as of 17:19, 7 December 2010

1,283 bytes added, 17:19, 7 December 2010

no edit summary

Line 18: Line 18:

<p>

The code needs to determine the strand of each read and reverse complement the reads on the reverse strand.
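A minimal sketch of the strand check and reverse complement, assuming the strand is read from bit 0x10 of the SAM FLAG field and that the quality string is reversed along with the sequence. The names `complementBase`, `reverseComplement`, `FastqEntry` and `toOriginalOrientation` are hypothetical and are not part of the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>

// SAM FLAG bit 0x10: SEQ is stored reverse complemented in the record.
constexpr std::uint16_t kReverseStrand = 0x10;

// Complement of a single base. Anything other than A/C/G/T becomes 'N'
// (an assumption made for this sketch).
char complementBase(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default:  return 'N';
    }
}

// Reverse the sequence, then complement every base.
std::string reverseComplement(const std::string& seq) {
    std::string out(seq.rbegin(), seq.rend());
    std::transform(out.begin(), out.end(), out.begin(), complementBase);
    return out;
}

// Hypothetical holder for the two strings written to a FASTQ record.
struct FastqEntry {
    std::string sequence;
    std::string quality;
};

// Return the read in its original orientation: forward-strand reads are
// passed through unchanged; reverse-strand reads get their sequence
// reverse complemented and their quality string reversed.
FastqEntry toOriginalOrientation(std::uint16_t flag,
                                 const std::string& seq,
                                 const std::string& qual) {
    assert(seq.size() == qual.size());
    if ((flag & kReverseStrand) == 0) {
        return {seq, qual};
    }
    return {reverseComplement(seq), std::string(qual.rbegin(), qual.rend())};
}
```

For example, `toOriginalOrientation(0x10, "AACG", "!#%&")` yields the sequence `CGTT` with quality `&%#!`.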

+

<p>

+

Write one file for the first read in each pair and one file for the second read; the order of the reads in the two files must match.

<p>

Reverse complementing means:

Line 27: Line 29:

** It would also error if the read is not paired.

*Later Release: work on unsorted BAM files.

+

** Prefer to sort while writing the FASTQ files rather than sorting in one step and writing the FASTQs in a second step.

+

** It would be useful to have something implemented within the library (useful for deduping, etc.), but it might be tricky to implement as an API because the two reads of a pair may sometimes be far apart.

+

*** Maybe something like SamFile::getNextReadPair or SamFileHelper::getNextReadPair. Because of the bookkeeping, it may be useful to separate it out from SamFile. Either would handle the logic and return a pair of records.

+

*** At some point it may have to start writing to a file.

+

*** Could attempt to store just the read name and file position and use random access to jump back when the mate is found (but that would be inefficient if the mates are close together). Depending on how big the file is, even the read name and file position may be too much information to store.

+

*** A two-scan approach on the original BAM may be the best.

+

*Separate suggestion: implement a smart pileup, which retains a clone of each SamRecord until its mate is seen

+

** Useful in the deduper, the variant caller, etc., but we probably need to discuss it if we decide to implement it.
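The idea of holding a record until its mate appears can be sketched as below. This is only an illustration under stated assumptions: `Read` and `MateBuffer` are hypothetical stand-ins, not the library's SamRecord or any existing API, everything is kept in memory (so it does not address the far-apart-mates concern above), and secondary or supplementary alignments that share a read name are not handled. Pairing status and first-in-pair are taken from SAM FLAG bits 0x1 and 0x40.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical, simplified stand-in for a SAM/BAM record.
struct Read {
    std::string name;
    std::uint16_t flag;
    std::string sequence;
    std::string quality;
};

constexpr std::uint16_t kPaired = 0x1;        // SAM FLAG: read is paired
constexpr std::uint16_t kFirstInPair = 0x40;  // SAM FLAG: first read in pair

// Holds each read, keyed by read name, until its mate is added.
class MateBuffer {
public:
    // Returns the completed pair (first in pair, second in pair) when the
    // mate of `read` is already buffered; otherwise stores `read` and
    // returns an empty optional. Throws if the read is not paired.
    std::optional<std::pair<Read, Read>> add(Read read) {
        if ((read.flag & kPaired) == 0) {
            throw std::invalid_argument("read is not paired: " + read.name);
        }
        auto it = pending_.find(read.name);
        if (it == pending_.end()) {
            std::string key = read.name;  // copy the key before moving the read
            pending_.emplace(std::move(key), std::move(read));
            return std::nullopt;
        }
        Read mate = std::move(it->second);
        pending_.erase(it);
        if ((read.flag & kFirstInPair) != 0) {
            return std::make_pair(std::move(read), std::move(mate));
        }
        return std::make_pair(std::move(mate), std::move(read));
    }

    // Number of reads still waiting for their mate.
    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::unordered_map<std::string, Read> pending_;
};
```

A caller would pass every record from the input to `add` and, whenever a pair comes back, write its first element to the first FASTQ file and its second element to the second, which keeps the two files in matching order. Reads left in the buffer at the end of the input have no mate in the file.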

+

=== Proposed Solution ===

Mktrost

Administrators

3,045

edits
