From Genome Analysis Wiki
Update clipOverlap
--storeOrig : Store the original cigar in the specified tag.
--readName : Original file is sorted by Read Name instead of coordinate.
--noeof : Do not expect an EOF block on a bam file.
--params : Print the parameter settings
Clipping By Coordinate Optional Parameters:
--poolSize : Maximum number of records the program is allowed to allocate
for clipping on Coordinate sorted files. (Default: 5000)
--poolSkipClip : Skip clipping reads to free of usable records when the
poolSize is hit. The default action is to just clip the
first read in a pair to free up the record.
</pre>
== Set the SAM/BAM record buffer size (<code>--poolSize</code>) ==
To handle coordinate-sorted files, SAM/BAM records are buffered until it is known that all following records will have a later start position. To prevent the program from running away with memory, a limit is set on the number of records that can be buffered (default: 5000). If the pool is exhausted, the code writes the earliest record awaiting its overlapping mate, along with any earlier records that are being buffered. Depending on whether <code>--poolSkipClip</code> is set, it either clips neither read or clips the end of that read at the position where its mate is supposed to start. An error message is written to stderr to indicate that one of these has happened, and the program returns an unsuccessful value (2: NO_MORE_RECS). The resulting file is still sorted by coordinate.

== Skip Clipping Coordinate Sorted Files When Out of Records (<code>--poolSkipClip</code>) ==
When clipping coordinate-sorted SAM/BAM files, we can run out of records available in the pool (<code>--poolSize</code>). When we run out of pooled records, we can no longer read in new records, so instead we release some of the stored records. We do this by releasing the first record that is being held awaiting its mate.
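
The buffering and pool-exhaustion behavior described above can be illustrated with a small Python sketch. This is not the clipOverlap implementation (which is part of bamUtil), only a simplified model under stated assumptions: the <code>Rec</code> class, the <code>clip_overlaps</code> function, and its <code>pool_size</code> / <code>pool_skip_clip</code> arguments are hypothetical names that mirror the command-line options, records are reduced to a name and start/end coordinates, and only the case where the mate starts strictly inside the first read is handled.

```python
from collections import deque
from dataclasses import dataclass

NO_MORE_RECS = 2  # return value named in the text above


@dataclass
class Rec:
    """Hypothetical, simplified stand-in for a SAM/BAM record."""
    name: str
    start: int       # alignment start
    end: int         # alignment end (exclusive)
    mate_start: int  # start position of the mate


def clip_overlaps(records, pool_size=5000, pool_skip_clip=False):
    """Model of clipping a coordinate-sorted stream with a bounded pool."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    pool = deque()  # buffered records, in coordinate order
    waiting = {}    # read name -> buffered record awaiting its mate
    out = []
    status = 0

    def flush():
        # Write buffered records until the front one is still awaiting its mate.
        while pool and waiting.get(pool[0].name) is not pool[0]:
            out.append(pool.popleft())

    for rec in records:
        first = waiting.pop(rec.name, None)
        if first is not None:
            # The mate arrived: clip the overlap off the end of the first read.
            first.end = min(first.end, rec.start)
            flush()
        if len(pool) >= pool_size:
            # Pool exhausted: stop holding the earliest record awaiting its mate.
            oldest = pool[0]
            del waiting[oldest.name]
            if not pool_skip_clip:
                # Default: clip where the mate is supposed to start.
                oldest.end = min(oldest.end, oldest.mate_start)
            status = NO_MORE_RECS
            flush()
        if first is None and rec.start < rec.mate_start < rec.end:
            # The mate starts inside this read and has not been seen yet.
            waiting[rec.name] = rec
        pool.append(rec)
        flush()

    # End of input: nothing further can arrive, so write whatever is left.
    waiting.clear()
    flush()
    return out, status
```

In this model, a call such as <code>clip_overlaps(records, pool_size=1)</code> forces the pool-exhausted path on small inputs, and passing <code>pool_skip_clip=True</code> leaves both reads of the affected pair unclipped. In both cases the output stays in coordinate order and the returned status is 2.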