Difference between revisions of "BamUtil: clipOverlap"

From Genome Analysis Wiki

Revision as of 16:20, 28 October 2011


Overview of the clipOverlap function of bamUtil

The clipOverlap option on the bamUtil executable clips overlapping read pairs.

RESTRICTIONS

  • Assumes the file is sorted by ReadName
  • Assumes only 2 reads have matching ReadNames
    • Reads are matched in pairs: if 3 reads share a ReadName, the first 2 are matched and compared, but the 3rd is not. If 4 reads share a ReadName, the first 2 are matched and compared, and the last 2 are matched and compared.
  • Only mapped reads will be clipped
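
The pairing restriction above can be illustrated with a minimal Python sketch. It is not bamUtil's implementation: the function name `pair_by_read_name` and the sample read names are hypothetical, and the input is assumed to be a list of read names already sorted by ReadName.

```python
from itertools import groupby


def pair_by_read_name(read_names):
    """Yield (first_index, second_index) for reads that would be compared.

    Assumes read_names is already sorted (grouped) by ReadName.
    """
    index = 0
    for _, group in groupby(read_names):
        size = len(list(group))
        # Reads sharing a name are taken two at a time; a leftover
        # odd read has no partner and is skipped.
        for offset in range(0, size - 1, 2):
            yield (index + offset, index + offset + 1)
        index += size


# Hypothetical read names: r1 appears 2 times, r2 3 times, r3 4 times, r4 once.
names = ["r1", "r1", "r2", "r2", "r2", "r3", "r3", "r3", "r3", "r4"]
pairs = list(pair_by_read_name(names))
print(pairs)
```

With 3 reads named r2, only the first 2 are paired; with 4 reads named r3, the first 2 and the last 2 form separate pairs.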

Rules for Clipping

Clipping from the front

After clipping from the front, the first operation following the soft clip is always a Match/Mismatch: an insertion immediately after the clip is soft clipped as well, and any pads, deletions, or skips immediately after the clip are removed.

Clip Location | How it is handled
If the clip position falls in a skip/deletion | Removes the entire skip/deletion
If the position immediately after the clip is a skip/deletion | Also removes the skip/deletion
If the position immediately after the clip is an Insert | Softclips the insert
If the position immediately after the clip is a Pad | Removes the pad
Clip occurs at the last match/mismatch position of the read (the entire read is clipped) | Entire read is soft clipped, 0-based position is left as the original (not modified)
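
The front-clipping rules can be sketched in Python as follows. This is an illustration under stated assumptions, not bamUtil's code: the function `clip_front` is hypothetical, the clip is given as a number of read bases rather than a reference position (so the "falls in a skip/deletion" and "immediately after" cases collapse into one), hard clips are not handled, and the position is shifted by the reference bases covered by the clipped or removed operations, following the usual SAM meaning of the alignment position.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_OPS = set("MIS=X")   # operations that consume read bases
REF_OPS = set("MDN=X")     # operations that consume reference bases
MATCH_OPS = set("M=X")     # match/mismatch operations


def clip_front(cigar, pos, clip_len):
    """Soft clip the first clip_len read bases; return (new_cigar, new_pos)."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    if any(op == "H" for _, op in ops):
        raise ValueError("hard clips are not handled in this sketch")
    read_len = sum(n for n, op in ops if op in QUERY_OPS)

    clipped = 0      # read bases soft clipped so far
    ref_shift = 0    # reference bases covered by clipped or removed operations
    i = 0
    while i < len(ops) and clipped < clip_len:
        n, op = ops[i]
        if op in QUERY_OPS:
            take = min(n, clip_len - clipped)
            clipped += take
            if op in REF_OPS:
                ref_shift += take
            if take < n:
                ops[i] = (n - take, op)   # the clip ends inside this operation
                break
        elif op in REF_OPS:
            ref_shift += n                # deletion/skip inside the clipped region
        i += 1

    # The first operation after the soft clip must be a match/mismatch:
    # insertions are soft clipped, deletions/skips/pads are dropped.
    while i < len(ops) and ops[i][1] not in MATCH_OPS:
        n, op = ops[i]
        if op in QUERY_OPS:
            clipped += n
        elif op in REF_OPS:
            ref_shift += n
        i += 1

    if i == len(ops):
        # Entire read clipped: position is left as the original.
        return f"{read_len}S", pos
    prefix = f"{clipped}S" if clipped else ""
    return prefix + "".join(f"{n}{op}" for n, op in ops[i:]), pos + ref_shift


print(clip_front("10M5D90M", 1000, 10))
print(clip_front("10M3I87M", 1000, 10))
print(clip_front("50M", 1000, 50))
```

The three calls exercise a deletion right after the clip, an insertion right after the clip, and a fully clipped read.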

Clipping from the back

Clip Location | How it is handled
If the clip position falls in a skip/deletion | Removes the entire skip/deletion
If the position immediately before the clip is a deletion/skip/pad | Remove the deletion/skip/pad
If the position immediately before the clip is an insertion | Leave the insertion, even if it results in a 70M3I27S
Clip occurs at the first position of the read (the entire read is clipped) | Entire read is soft clipped, 0-based position is left as the original (not modified)
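
A comparable Python sketch for clipping from the back is shown here. Again this is only an illustration, not bamUtil's code: `clip_back` is a hypothetical function, the clip is expressed as the total number of read bases at the end of the read that should end up soft clipped, and hard clips are not handled. Clipping from the back does not move the alignment start, so only the CIGAR is returned.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_OPS = set("MIS=X")   # operations that consume read bases
MATCH_OPS = set("M=X")     # match/mismatch operations


def clip_back(cigar, clip_len):
    """Soft clip the last clip_len read bases and return the new CIGAR."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    if any(op == "H" for _, op in ops):
        raise ValueError("hard clips are not handled in this sketch")
    read_len = sum(n for n, op in ops if op in QUERY_OPS)

    # Fold an existing trailing soft clip into the requested clip length.
    if len(ops) > 1 and ops[-1][1] == "S":
        clip_len = max(clip_len, ops.pop()[0])

    keep = read_len - clip_len   # read bases that stay unclipped
    kept = []
    for n, op in ops:
        # Stop once the kept bases are used up, so a deletion, skip, or pad
        # sitting right before the clip is never copied to the output.
        if keep <= 0:
            break
        if op in QUERY_OPS:
            take = min(n, keep)
            kept.append((take, op))   # an insertion before the clip is kept
            keep -= take
        else:
            kept.append((n, op))

    if not any(op in MATCH_OPS for _, op in kept):
        return f"{read_len}S"         # entire read is soft clipped
    soft = read_len - sum(n for n, op in kept if op in QUERY_OPS)
    tail = f"{soft}S" if soft else ""
    return "".join(f"{n}{op}" for n, op in kept) + tail


print(clip_back("70M3I27M", 27))
print(clip_back("70M3D30M", 30))
print(clip_back("50M", 50))
```

The first call reproduces the 70M3I27S case from the table: the insertion just before the clip is left in place.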


Usage

Parameters

	Required Parameters:
		--in         : the SAM/BAM file to be read
		--out        : the SAM/BAM file to be written
	Optional Parameters:
		--noeof      : do not expect an EOF block on a bam file.
		--params     : print the parameter settings
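
As a hedged illustration of how these parameters combine on a command line, the Python sketch below only assembles an argument list; it does not run anything. The executable name `bam`, the helper `build_clip_overlap_command`, and the file names are assumptions made for the example; only the `clipOverlap` option and the four flags come from the parameter list above.

```python
def build_clip_overlap_command(in_path, out_path, noeof=False, params=False, exe="bam"):
    """Assemble an argument list for a clipOverlap run (nothing is executed)."""
    # exe="bam" is an assumed executable name; adjust it to the local install.
    cmd = [exe, "clipOverlap", "--in", in_path, "--out", out_path]
    if noeof:
        cmd.append("--noeof")    # do not expect an EOF block on a BAM file
    if params:
        cmd.append("--params")   # print the parameter settings
    return cmd


# Hypothetical file names; the input must already be sorted by ReadName.
cmd = build_clip_overlap_command("sortedByName.bam", "clipped.bam", params=True)
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` once the executable name and paths match the local setup.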


Input File (--in)

Use --in followed by your file name to specify the SAM/BAM input file.

The program automatically determines whether the input file is SAM, BAM, or uncompressed BAM. The user supplies nothing other than the filename, unless the input is stdin.

Use - to read from stdin. In that case the extension determines the file type (no extension indicates SAM).

SAM/BAM/Uncompressed BAM from file | --in yourFileName
SAM from stdin | --in -
BAM from stdin | --in -.bam
Uncompressed BAM from stdin | --in -.ubam


Note: Uncompressed BAM is compressed using compression level-0 (so it is not an entirely uncompressed file). This matches the samtools implementation so pipes between our tools and samtools are supported.
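
What "compression level 0" means can be seen with Python's standard `zlib` module. This is only an analogy: BAM files use BGZF blocks rather than a bare zlib stream, and the payload here is made up. The point is that a level-0 stream stores the data verbatim but still wraps it in framing bytes, so it is not a raw uncompressed file.

```python
import zlib

payload = b"ACGT" * 8                  # made-up data standing in for BAM records

stored = zlib.compress(payload, 0)     # level 0: data stored as-is, but still framed
packed = zlib.compress(payload, 9)     # level 9: actually compressed

print(len(payload), len(stored), len(packed))
print(payload in stored)               # the raw bytes appear unchanged inside the stream
print(zlib.decompress(stored) == payload)
```

A reader therefore still has to go through the normal decompression path, which is why level-0 output can be piped between tools that expect compressed BAM framing.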

Output File (--out)

Use --out followed by your file name to specify the SAM/BAM output file.

The file extension determines whether SAM, BAM, or uncompressed BAM is written. Use - to write to stdout; the extension again determines the file type (no extension indicates SAM).

SAM to file | --out yourFileName.sam
BAM to file | --out yourFileName.bam
Uncompressed BAM to file | --out yourFileName.ubam
SAM to stdout | --out -
BAM to stdout | --out -.bam
Uncompressed BAM to stdout | --out -.ubam
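
The mapping from an --out argument to a destination and file type can be summarized in a short Python sketch. The helper `describe_out_argument` is hypothetical and is not part of bamUtil; treating an unrecognized extension as SAM is an assumption, since the page only states that no extension means SAM.

```python
FORMATS = {"sam": "SAM", "bam": "BAM", "ubam": "uncompressed BAM"}


def describe_out_argument(arg):
    """Return (destination, file_type) for an --out argument."""
    stem, dot, ext = arg.rpartition(".")
    if not dot:
        # No "." at all: the whole argument is the name, with no extension.
        stem, ext = arg, ""
    destination = "stdout" if stem == "-" else "file"
    # Assumption: anything that is not .bam or .ubam is written as SAM.
    return destination, FORMATS.get(ext.lower(), "SAM")


for arg in ["yourFileName.sam", "yourFileName.bam", "yourFileName.ubam",
            "-", "-.bam", "-.ubam"]:
    print(arg, describe_out_argument(arg))
```

The same naming scheme applies to --in when reading from stdin, where the extension after - is the only hint about the file type.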


Note: Uncompressed BAM is compressed using compression level-0 (so it is not an entirely uncompressed file). This matches the samtools implementation so pipes between our tools and samtools are supported.



Return Value

Returns the SamStatus for the reads/writes.


Example Output