From Genome Analysis Wiki
1,283 bytes added
, 17:19, 7 December 2010
Line 18: |
Line 18: |
| <p> | | <p> |
| The code needs to figure out the strand and reverse complement the reverse strands. | | The code needs to figure out the strand and reverse complement the reverse strands. |
| + | <p> |
| + | Write one file for the first read in each pair and one file for the second read in each pair - the order of reads in the two files must match. |
| <p> | | <p> |
| Reverse complementing means: | | Reverse complementing means: |
Line 27: |
Line 29: |
| ** It would also error if the read is not paired. | | ** It would also error if the read is not paired. |
| *Later Release: work on unsorted BAM files. | | *Later Release: work on unsorted BAM files. |
| + | ** Prefer to sort at the same time as writing the FASTQ files rather than 1 step to sort and a 2nd step to write the FASTQs. |
| + | ** It would be useful to have something implemented within the library (it would also be useful for deduplication, etc.), but it might be tricky to implement as an API because the two reads of a pair may sometimes be far apart. |
| + | *** Maybe something like SamFile::getNextReadPair or SamFileHelper::getNextReadPair. Because of the bookkeeping involved, it may be useful to separate it out from SamFile. Either would handle the logic and return a pair of records. |
| + | *** At some point it may have to start writing to a file. |
| + | *** Could attempt to store just the read name and file position, and use random access to jump around when a pair is found (but that would be inefficient if the two reads are close together). Whether the read name and file position would still be too much information to store depends on how big the file is. |
| + | *** A two-scan approach on the original BAM may be best. |
| + | *Separate suggestion: implement a smart pileup that retains a clone of the SamRecord until its mate is seen |
| + | ** Useful in the dedupper, the variant caller, etc., but we probably need to discuss whether to implement it |
| + | |
| | | |
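The pairing ideas in the notes above (hold a record until its mate is seen, then emit the pair with reverse-strand reads restored to their original orientation) can be sketched as follows. This is only an illustration under stated assumptions, not the library's API: `Read`, `MateBuffer`, `reverseComplement` and `toFastqOrientation` are hypothetical names, the record is a simplified stand-in for SamRecord, only uppercase bases are complemented, and everything is held in memory with no spilling to a file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical, simplified stand-in for a SAM/BAM record (not SamRecord).
struct Read {
    std::string name;    // read name shared by both mates
    std::string bases;   // sequence as stored in the BAM
    std::string quals;   // base qualities, same length as bases
    bool reverseStrand;  // true if the read was aligned to the reverse strand
    bool firstInPair;    // true for the first read of the pair
};

// Reverse the sequence and swap A<->T and C<->G; other characters
// (for example N) are left unchanged.
std::string reverseComplement(const std::string& bases) {
    std::string out(bases.rbegin(), bases.rend());
    for (char& c : out) {
        switch (c) {
            case 'A': c = 'T'; break;
            case 'T': c = 'A'; break;
            case 'C': c = 'G'; break;
            case 'G': c = 'C'; break;
            default: break;
        }
    }
    return out;
}

// Undo the reverse-strand representation so the read matches what the
// sequencer originally produced: reverse complement the bases and
// reverse the qualities.
Read toFastqOrientation(Read r) {
    assert(r.bases.size() == r.quals.size());
    if (r.reverseStrand) {
        r.bases = reverseComplement(r.bases);
        std::reverse(r.quals.begin(), r.quals.end());
        r.reverseStrand = false;
    }
    return r;
}

// Holds each read until its mate arrives (assumption: all pending reads
// fit in memory, which may not hold when mates are far apart).
class MateBuffer {
public:
    // Returns true and fills first/second when r completes a pair;
    // otherwise stores r and returns false.
    bool add(const Read& r, Read& first, Read& second) {
        auto it = pending_.find(r.name);
        if (it == pending_.end()) {
            pending_.emplace(r.name, r);
            return false;
        }
        Read mate = std::move(it->second);
        pending_.erase(it);
        Read a = toFastqOrientation(r);
        Read b = toFastqOrientation(std::move(mate));
        if (a.firstInPair) {
            first = std::move(a);
            second = std::move(b);
        } else {
            first = std::move(b);
            second = std::move(a);
        }
        return true;
    }

    // Number of reads still waiting for a mate.
    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::unordered_map<std::string, Read> pending_;
};
```

Each completed pair would then be written as one record to the first-in-pair FASTQ file and one to the second-in-pair file, which keeps the two files in matching order. Reads still pending at end of input have no mate in the file and correspond to the error case noted above.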
| === Proposed Solution === | | === Proposed Solution === |