From Genome Analysis Wiki

BamUtil: clipOverlap

785 bytes added, 09:56, 6 January 2014
The <code>clipOverlap</code> option on the [[bamUtil]] executable clips overlapping read pairs.
The input file and the resulting output file are sorted by coordinate (or by read name if <code>--readName</code> is specified in the options).
When a read is clipped from the front:
= Usage =
./bam clipOverlap --in <inputFile> --out <outputFile> [--storeOrig <tag>] [--readName] [--stats] [--overlapsOnly] [--excludeFlags <flag>] [--poolSize <numRecords allowed to allocate>] [--poolSkipOverlap] [--noeof] [--params] 
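As an illustration of how the usage line above maps onto a concrete invocation, the following minimal sketch assembles an argument list from the documented options. The helper name <code>build_clip_overlap_args</code>, the file names, and the tag <code>XC</code> are hypothetical and only serve the example; the sketch builds the command but does not run it.

```python
def build_clip_overlap_args(in_file, out_file, store_orig=None, read_name=False,
                            stats=False, overlaps_only=False,
                            exclude_flags=None, pool_size=None):
    """Hypothetical helper: build an argument list for ./bam clipOverlap.

    Only options listed in the usage line are emitted. exclude_flags is
    passed through verbatim as the string the user would type.
    """
    args = ["./bam", "clipOverlap", "--in", in_file, "--out", out_file]
    if store_orig is not None:
        args += ["--storeOrig", store_orig]
    if read_name:
        args.append("--readName")
    if stats:
        args.append("--stats")
    if overlaps_only:
        args.append("--overlapsOnly")
    if exclude_flags is not None:
        args += ["--excludeFlags", str(exclude_flags)]
    if pool_size is not None:
        args += ["--poolSize", str(pool_size)]
    return args


# File names and the tag "XC" are placeholders for this example.
example = build_clip_overlap_args("input.bam", "clipped.bam",
                                  store_orig="XC", stats=True)
print(" ".join(example))
# ./bam clipOverlap --in input.bam --out clipped.bam --storeOrig XC --stats
```

The resulting list could be handed to Python's <code>subprocess.run</code> if the <code>bam</code> executable is available in the working directory.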
= Parameters =
Required Parameters:
--in : the SAM/BAM file to clip overlapping read pairs for
--out : the SAM/BAM file to be written
Optional Parameters:
--storeOrig : Store the original cigar in the specified tag.
--readName : Original file is sorted by Read Name instead of coordinate.
--stats : Print some statistics on the overlaps.
--overlapsOnly : Only output overlapping read pairs
--excludeFlags : Skip records with any of the specified flags set, default 0x70C
--noeof : Do not expect an EOF block on a bam file.
--params : Print the parameter settings to stderr
Clipping By Coordinate Optional Parameters:
--poolSize : Maximum number of records the program is allowed to allocate for clipping on Coordinate sorted files. (Default: 1000000)
--poolSkipClip : Skip clipping reads to free up usable records when the
poolSize is hit. The default action is to just clip the
== Required Parameters ==
== Optional Parameters ==
=== Store the original cigar string in a tag (<code>--storeOrig</code>) ===
Use <code>--storeOrig</code> followed by a two-character tag name to store the original CIGAR string in that tag.
=== Work on SAM/BAMs sorted by Read Name instead of by coordinate (<code>--readName</code>) ===
If your file is sorted by read name rather than by coordinate, specify <code>--readName</code>. The resulting file will still be sorted by read name.
=== Print Overlap Statistics (<code>--stats</code>)===
Print some basic overlap statistics to stderr.
** reads that are only clipped due to orientation are not counted in the other stats
==== Example Output ====
Overlap Statistics:
=== Print Only Overlapping Reads (<code>--overlapsOnly</code>) ===
Only output Read Pairs that overlap. Drop all other records.
=== Skip Records with any of the Specified Flags (<code>--excludeFlags</code>)===
Skip records that have any of the specified flags set. The default is 0x70C.
By default, reads with any of the following flags set are skipped:
* unmapped
* mate unmapped
* secondary alignment
* fails QC checks
* duplicate
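The default value 0x70C is the bitwise OR of the five flags listed above, using the bit values from the SAM specification. The sketch below shows the arithmetic and how an "any of these flags" test works. The function names are illustrative only and are not part of bamUtil.

```python
# SAM flag bits as defined in the SAM specification.
SAM_FLAGS = {
    "unmapped": 0x4,
    "mate unmapped": 0x8,
    "secondary alignment": 0x100,
    "fails QC checks": 0x200,
    "duplicate": 0x400,
}


def combine_flags(names):
    """OR together the bit values of the named flags."""
    mask = 0
    for name in names:
        mask |= SAM_FLAGS[name]
    return mask


def is_excluded(record_flag, exclude_mask):
    """True if the record has at least one of the excluded flags set."""
    return (record_flag & exclude_mask) != 0


default_mask = combine_flags(SAM_FLAGS)
print(hex(default_mask))                   # 0x70c
print(is_excluded(0x401, default_mask))    # True: the duplicate bit is set
print(is_excluded(0x63, default_mask))     # False: none of the five bits are set
```

A custom mask for <code>--excludeFlags</code> can be worked out the same way, by OR-ing only the flags that should cause a record to be skipped.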
== Clipping By Coordinate Optional Parameters ==
=== Set the SAM/BAMs record buffer size (<code>--poolSize</code>) ===
To handle coordinate-sorted files, SAM/BAM records are buffered until it is known that all following records will have a later start position. To prevent the program from running away with memory, a limit is set on the number of records that can be buffered (defaults to 1000000).
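The buffering idea can be pictured with a small, simplified sketch: records are held in a bounded pool and released in start-position order once the input has advanced past them. This is not bamUtil's implementation. The class <code>RecordPool</code> and its methods are hypothetical, and the sketch releases records whose start is no later than the current input position rather than modelling mate tracking.

```python
import heapq


class RecordPool:
    """Hypothetical bounded buffer that releases records in start order."""

    def __init__(self, pool_size=1000000):
        self.pool_size = pool_size
        self._heap = []
        self._counter = 0  # tie-breaker so equal starts keep arrival order

    def is_full(self):
        return len(self._heap) >= self.pool_size

    def push(self, start, record):
        """Buffer a record; return False if the pool limit is reached."""
        if self.is_full():
            return False
        heapq.heappush(self._heap, (start, self._counter, record))
        self._counter += 1
        return True

    def release_up_to(self, position):
        """Pop, in sorted order, every record starting at or before position."""
        released = []
        while self._heap and self._heap[0][0] <= position:
            start, _, record = heapq.heappop(self._heap)
            released.append((start, record))
        return released


pool = RecordPool(pool_size=3)
pool.push(100, "readA")
pool.push(150, "readB")   # e.g. a start position moved later by clipping
pool.push(120, "readC")
print(pool.push(130, "readD"))    # False: the pool limit of 3 is reached
print(pool.release_up_to(125))    # [(100, 'readA'), (120, 'readC')]
```

When <code>push</code> reports that the pool is full, the caller has to decide how to free a record, which is the situation the pool-related options address.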
=== Skip Clipping Coordinate Sorted Files When Out of Records (<code>--poolSkipClip</code>) ===
When clipping coordinate sorted SAM/BAM files, we can run out of buffers available in the pool (<code>--poolSize</code>).
With either option, the resulting file will still be sorted by coordinate.
= Return Value =
