Changes

From Genome Analysis Wiki
785 bytes added, 09:56, 6 January 2014
no edit summary
Line 6: Line 6:  
The <code>clipOverlap</code> option on the [[bamUtil]] executable clips overlapping read pairs.
 
The <code>clipOverlap</code> option on the [[bamUtil]] executable clips overlapping read pairs.
   −
The input file and resulting output file is sorted by coordinate (or readName is specified in the options).
+
The input file and the resulting output file are sorted by coordinate (or by readName if specified in the options).
    
When a read is clipped from the front:
 
When a read is clipped from the front:
Line 84: Line 84:     
= Usage =
 
= Usage =
  ./bam clipOverlap --in <inputFile> --out <outputFile> [--storeOrig <tag>] [--readName] [--poolSize <numRecords allowed to allocate>] [--noeof] [--params]
+
  ./bam clipOverlap --in <inputFile> --out <outputFile> [--storeOrig <tag>] [--readName] [--stats] [--overlapsOnly] [--excludeFlags <flag>] [--poolSize <numRecords allowed to allocate>] [--poolSkipOverlap] [--noeof] [--params]
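As an illustration of the usage line above, the following minimal sketch assembles a <code>clipOverlap</code> command line from Python. The helper name <code>build_clip_overlap_cmd</code>, the file names, and the tag <code>XC</code> are hypothetical choices made for this example; only the flags themselves come from the usage line. The sketch builds the argument list and does not run the executable.

```python
import shlex
from typing import List, Optional


def build_clip_overlap_cmd(
    in_file: str,
    out_file: str,
    store_orig_tag: Optional[str] = None,
    read_name: bool = False,
    stats: bool = False,
    bam_exe: str = "./bam",  # assumed location of the bamUtil executable
) -> List[str]:
    """Build an argument list for `bam clipOverlap` (hypothetical helper)."""
    cmd = [bam_exe, "clipOverlap", "--in", in_file, "--out", out_file]
    if store_orig_tag is not None:
        # --storeOrig takes a two character TAG
        if len(store_orig_tag) != 2:
            raise ValueError("storeOrig tag must be exactly two characters")
        cmd += ["--storeOrig", store_orig_tag]
    if read_name:
        # input is sorted by read name instead of by coordinate
        cmd.append("--readName")
    if stats:
        # print overlap statistics
        cmd.append("--stats")
    return cmd


# Example with made-up file names; pass the list to subprocess.run to execute it.
example = build_clip_overlap_cmd("in.bam", "out.bam", store_orig_tag="XC", stats=True)
print(shlex.join(example))
```

Building the arguments as a list rather than as a single string avoids shell quoting problems when file names contain spaces.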
 +
 
    
= Parameters =
 
= Parameters =
 
<pre>
 
<pre>
 
Required Parameters:
 
Required Parameters:
--in : the SAM/BAM file to clip overlapping read pairs for
+
--in           : the SAM/BAM file to clip overlapping read pairs for
--out       : the SAM/BAM file to be written
+
--out         : the SAM/BAM file to be written
 
Optional Parameters:
 
Optional Parameters:
--storeOrig   : Store the original cigar in the specified tag.
+
--storeOrig   : Store the original cigar in the specified tag.
--readName   : Original file is sorted by Read Name instead of coordinate.
+
--readName     : Original file is sorted by Read Name instead of coordinate.
--stats       : Print some statistics on the overlaps.
+
--stats       : Print some statistics on the overlaps.
--noeof       : Do not expect an EOF block on a bam file.
+
--overlapsOnly : Only output overlapping read pairs
--params     : Print the parameter settings
+
--excludeFlags : Skip records with any of the specified flags set, default 0x70C
 +
--noeof       : Do not expect an EOF block on a bam file.
 +
--params       : Print the parameter settings to stderr
 
Clipping By Coordinate Optional Parameters:
 
Clipping By Coordinate Optional Parameters:
--poolSize   : Maximum number of records the program is allowed to allocate
+
--poolSize     : Maximum number of records the program is allowed to allocate
                for clipping on Coordinate sorted files. (Default: 1000000)
+
                for clipping on Coordinate sorted files. (Default: 1000000)
 
--poolSkipClip : Skip clipping reads to free of usable records when the
 
--poolSkipClip : Skip clipping reads to free of usable records when the
 
                poolSize is hit. The default action is to just clip the
 
                poolSize is hit. The default action is to just clip the
Line 105: Line 108:  
</pre>
 
</pre>
   −
 
+
== Required Parameters==
 
{{inBAMInputFile}}
 
{{inBAMInputFile}}
 
{{outBAMOutputFile}}
 
{{outBAMOutputFile}}
{{noeofBGZFParameter}}
  −
{{paramsParameter}}
     −
== Store the original cigar string in a tag (<code>--storeOrig</code>) ==
+
== Optional Parameters ==
 +
=== Store the original cigar string in a tag (<code>--storeOrig</code>) ===
    
Use <code>--storeOrig</code> followed by the two character TAG to store the original CIGAR.
 
Use <code>--storeOrig</code> followed by the two character TAG to store the original CIGAR.
Line 118: Line 120:       −
== Work on SAM/BAMs sorted by Read Name instead of by coordinate (<code>--readName</code>) ==
+
=== Work on SAM/BAMs sorted by Read Name instead of by coordinate (<code>--readName</code>) ===
    
If your file is sorted by read name rather than by coordinate, specify <code>--readName</code>.  The resulting file will still be sorted by read name.
 
If your file is sorted by read name rather than by coordinate, specify <code>--readName</code>.  The resulting file will still be sorted by read name.
      −
== Print Overlap Statistics (<code>--stats</code>)==
+
=== Print Overlap Statistics (<code>--stats</code>)===
 
Print some basic overlap statistics to stderr.
 
Print some basic overlap statistics to stderr.
   Line 135: Line 137:  
** reads that are only clipped due to orientation are not counted in the other stats
 
** reads that are only clipped due to orientation are not counted in the other stats
   −
 
+
==== Example Output ====
=== Example Output ===
   
<pre>
 
<pre>
 
Overlap Statistics:
 
Overlap Statistics:
Line 147: Line 148:  
</pre>
 
</pre>
    +
=== Print Only Overlapping Reads (<code>--overlapsOnly</code>)===
 +
Only output Read Pairs that overlap.  Drop all other records.
 +
 +
=== Skip Records with any of the Specified Flags (<code>--excludeFlags</code>)===
 +
Skip records that have any of the specified flags set (default: 0x70C).
 +
 +
By default, reads with any of the following flags set are skipped:
 +
* unmapped
 +
* mate unmapped
 +
* secondary alignment
 +
* fails QC checks
 +
* duplicate
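
The default mask 0x70C is the bitwise OR of the five SAM flag bits listed above (0x4, 0x8, 0x100, 0x200, 0x400). The following minimal sketch decodes a mask and checks whether a record's flag would be skipped. The function names are hypothetical and chosen for illustration; the sketch follows the "any of the specified flags set" rule described above and is not taken from the bamUtil source.

```python
# SAM flag bits that make up the default --excludeFlags mask
EXCLUDE_BITS = {
    0x4: "unmapped",
    0x8: "mate unmapped",
    0x100: "secondary alignment",
    0x200: "fails QC checks",
    0x400: "duplicate",
}

DEFAULT_EXCLUDE_FLAGS = 0x70C


def decode_mask(mask: int) -> list:
    """Return the names of the known flag bits set in mask (hypothetical helper)."""
    return [name for bit, name in sorted(EXCLUDE_BITS.items()) if mask & bit]


def is_skipped(record_flag: int, exclude_flags: int = DEFAULT_EXCLUDE_FLAGS) -> bool:
    """A record is skipped if ANY excluded bit is set in its flag."""
    return (record_flag & exclude_flags) != 0


print(decode_mask(DEFAULT_EXCLUDE_FLAGS))
print(is_skipped(99))          # paired, proper pair, mate reverse, first in pair
print(is_skipped(99 | 0x400))  # same record, marked as duplicate
```

A different mask can be checked the same way by passing another value for <code>exclude_flags</code>.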
 +
 +
{{noeofBGZFParameter}}
 +
{{paramsParameter}}
   −
== Set the SAM/BAMs record buffer size (<code>--poolSize</code>) ==
+
==Clipping By Coordinate Optional Parameters==
 +
=== Set the SAM/BAMs record buffer size (<code>--poolSize</code>) ===
    
To handle coordinate sorted files, SAM/BAM records are buffered until it is known that all following records will have a later start position.  To prevent the program from running away with memory, a limit is set on the number of records that can be buffered (defaults to 1000000).
 
To handle coordinate sorted files, SAM/BAM records are buffered until it is known that all following records will have a later start position.  To prevent the program from running away with memory, a limit is set on the number of records that can be buffered (defaults to 1000000).
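
The buffering idea can be sketched as follows. This is a simplified illustration, not bamUtil's implementation: it assumes each record is reduced to a hypothetical <code>(original start, start after clipping, name)</code> tuple, that clipping can only move a start position later, and that the input is sorted by original start. When the pool is exhausted this sketch raises an error, whereas the real program has its own fallback behavior (see <code>--poolSkipClip</code>).

```python
import heapq
from typing import Iterable, Iterator, List, Tuple


class PoolExhausted(RuntimeError):
    """Raised by this sketch when the record pool limit is reached."""


def write_in_order(
    records: Iterable[Tuple[int, int, str]],
    pool_size: int = 1000000,
) -> Iterator[Tuple[int, str]]:
    """Yield (start, name) sorted by the start position after clipping.

    records: (original_start, clipped_start, name) tuples sorted by
    original_start, with clipped_start >= original_start (assumption).
    """
    pool: List[Tuple[int, str]] = []  # min-heap keyed on clipped start
    for original_start, clipped_start, name in records:
        # Every later input record starts at or after original_start, so a
        # buffered record that starts before it can safely be written now.
        while pool and pool[0][0] < original_start:
            yield heapq.heappop(pool)
        if len(pool) >= pool_size:
            raise PoolExhausted("record pool limit reached")
        heapq.heappush(pool, (clipped_start, name))
    # End of input: flush whatever is still buffered, in order.
    while pool:
        yield heapq.heappop(pool)


# Read "b" is clipped from the front, moving its start from 2 to 8,
# so it must be written after "c" to keep the output sorted.
reads = [(1, 1, "a"), (2, 8, "b"), (3, 3, "c"), (9, 9, "d")]
print(list(write_in_order(reads)))
```

A larger pool lets more records wait for their final position, at the cost of memory.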
Line 159: Line 176:       −
== Skip Clipping Coordinate Sorted Files When Out of Records (<code>--poolSkipClip</code>) ==
+
=== Skip Clipping Coordinate Sorted Files When Out of Records (<code>--poolSkipClip</code>) ===
    
When clipping coordinate sorted SAM/BAM files, we can run out of buffers available in the pool (<code>--poolSize</code>).
 
When clipping coordinate sorted SAM/BAM files, we can run out of buffers available in the pool (<code>--poolSize</code>).
Line 171: Line 188:  
With either option, the resulting file will still be sorted by coordinate.
 
With either option, the resulting file will still be sorted by coordinate.
    +
{{PhoneHomeParameters}}
    
= Return Value =
 
= Return Value =
