From Genome Analysis Wiki
The <code>clipOverlap</code> option on the [[bamUtil]] executable clips overlapping read pairs.
The input file and the resulting output file are sorted by coordinate (or by read name if <code>--readName</code> is specified in the options).
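As a rough illustration of what "overlapping" means for a read pair, the sketch below counts the reference positions covered by both mates. The function name and the 1-based inclusive interval representation are hypothetical choices made for this example; it is not taken from bamUtil's source.

```python
def overlap_length(start1, end1, start2, end2):
    """Number of reference positions covered by both reads (0 if none).

    Positions are 1-based and inclusive, as in SAM.
    """
    return max(0, min(end1, end2) - max(start1, start2) + 1)


print(overlap_length(100, 199, 150, 249))  # 50: positions 150..199 are shared
print(overlap_length(100, 149, 150, 249))  # 0: the reads are adjacent, not overlapping
```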
When a read is clipped from the front:
= Usage =
./bam clipOverlap --in <inputFile> --out <outputFile> [--storeOrig <tag>] [--readName] [--poolSize <numRecords allowed to allocate>] [--noeof] [--params]
= Parameters =
--in : the SAM/BAM file to clip overlapping read pairs for
--out : the SAM/BAM file to be written
--storeOrig : Store the original cigar in the specified tag.
--readName : Original file is sorted by Read Name instead of coordinate.
--stats : Print some statistics on the overlaps.
--noeof : Do not expect an EOF block on a bam file.
--params : Print the parameter settings
Clipping By Coordinate Optional Parameters:
--poolSize : Maximum number of records the program is allowed to allocate for clipping on Coordinate sorted files. (Default: 1000000)
--poolSkipClip : Skip clipping reads to free up usable records when the
poolSize is hit. The default action is to just clip the
== Store the original cigar string in a tag (<code>--storeOrig</code>) ==
Use <code>--storeOrig</code> followed by the two character TAG to store the original CIGAR.
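The effect on a single record can be pictured with a small sketch. It assumes a record is a plain dictionary with a <code>cigar</code> field and a <code>tags</code> mapping, and it uses <code>XC</code> as an example tag name; the record layout, the function name, and the tag are all hypothetical and only illustrate the idea of keeping the old CIGAR next to the new one.

```python
def clip_and_store(record, new_cigar, tag="XC"):
    """Return a copy of record with new_cigar, keeping the old CIGAR under tag."""
    if len(tag) != 2:
        raise ValueError("SAM tags are exactly two characters")
    clipped = dict(record)
    # Copy the tag mapping so the caller's record is left untouched.
    clipped["tags"] = dict(record.get("tags", {}))
    clipped["tags"][tag] = record["cigar"]
    clipped["cigar"] = new_cigar
    return clipped


read = {"name": "r1", "cigar": "100M", "tags": {}}
clipped = clip_and_store(read, "80M20S")
print(clipped["cigar"], clipped["tags"])
```

Keeping the original CIGAR in a tag means the clipping can be inspected later without going back to the input file.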
== Work on SAM/BAMs sorted by Read Name instead of by coordinate (<code>--readName</code>) ==
If your file is sorted by read name rather than by coordinate, specify <code>--readName</code>. The resulting file will still be sorted by read name.
== Print Overlap Statistics (<code>--stats</code>) ==
Print some basic overlap statistics to stderr.
** reads that are only clipped due to orientation are not counted in the other stats
=== Example Output ===
== Set the SAM/BAMs record buffer size (<code>--poolSize</code>) ==
To handle coordinate sorted files, SAM/BAM records are buffered until it is known that all following records will have a later start position. To prevent the program from running away with memory, a limit is placed on the number of records that can be buffered (default: 1000000); use <code>--poolSize</code> to change it.
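The buffering idea can be sketched as follows. This is a simplified model, not bamUtil's implementation: it assumes that clipping the front of a read can only move its start position to the right, and that the new start is already known when the record is read. The function name and the tuple layout are hypothetical. A buffered record is released once the incoming record's original start has reached it, because no later record can then sort before it.

```python
import heapq


def reorder_after_clipping(records):
    """Re-sort records whose start may have moved right after clipping.

    records: iterable of (orig_start, new_start, name), sorted by orig_start,
    with new_start >= orig_start.
    Returns (names ordered by new_start, peak number of buffered records).
    """
    heap = []  # entries are (new_start, arrival_order, name)
    out = []
    peak = 0
    for seq, (orig_start, new_start, name) in enumerate(records):
        # Every later record starts at or after orig_start, so anything
        # buffered at or before that position is safe to write now.
        while heap and heap[0][0] <= orig_start:
            out.append(heapq.heappop(heap)[2])
        heapq.heappush(heap, (new_start, seq, name))
        peak = max(peak, len(heap))
    # End of input: flush whatever is still buffered, in order.
    while heap:
        out.append(heapq.heappop(heap)[2])
    return out, peak


reads = [(100, 150, "a"), (120, 120, "b"), (130, 130, "c"), (200, 200, "d")]
print(reorder_after_clipping(reads))  # (['b', 'c', 'a', 'd'], 2)
```

The peak value shows why a cap is useful: the number of records held at once depends on how far the records waiting in the buffer extend past the current input position.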
== Skip Clipping Coordinate Sorted Files When Out of Records (<code>--poolSkipClip</code>) ==
When clipping coordinate sorted SAM/BAM files, we can run out of buffers available in the pool (<code>--poolSize</code>).
With either option, the resulting file will still be sorted by coordinate.
= Return Value =