Difference between revisions of "BamUtil: clipOverlap"
Revision as of 16:20, 28 October 2011
Overview of the clipOverlap function of bamUtil
The clipOverlap option on the bamUtil executable clips overlapping read pairs.
RESTRICTIONS
- Assumes the file is sorted by ReadName
- Assumes only 2 reads have matching ReadNames
  - Reads are matched in pairs: if 3 reads share a ReadName, the first 2 are matched and compared, but the 3rd is not. If 4 reads share a ReadName, the first 2 are matched and compared, and then the last 2 are matched and compared.
- Only mapped reads will be clipped
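The pairing restriction above can be illustrated with a minimal Python sketch. It is not taken from the bamUtil source: the function name `pair_adjacent` and the use of a plain list of read names are assumptions made for illustration only.

```python
def pair_adjacent(read_names):
    """Pair neighbouring records that share a read name (hypothetical helper).

    Assumes the input is already sorted by read name, as the tool requires.
    Returns (pairs, unpaired), both expressed as indexes into read_names.
    """
    pairs, unpaired = [], []
    i = 0
    while i < len(read_names):
        # Two adjacent records with the same name form one pair.
        if i + 1 < len(read_names) and read_names[i] == read_names[i + 1]:
            pairs.append((i, i + 1))
            i += 2
        else:
            # A leftover record (for example the 3rd of 3) is never compared.
            unpaired.append(i)
            i += 1
    return pairs, unpaired


three = pair_adjacent(["r1", "r1", "r1"])
four = pair_adjacent(["r1", "r1", "r1", "r1"])
```

With three records of the same name, only the first two are paired; with four, two separate pairs are formed, which mirrors the behaviour described in the list above.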
Rules for Clipping
Clipping from the front
The first operation after the soft clip will always be a Match/Mismatch. Any pads, deletions, insertions, or skips that immediately follow the clip are therefore removed or soft clipped as well, as listed in the table below.
Clip Location | How it is handled |
---|---|
If the clip position falls in a skip/deletion | Removes the entire skip/deletion |
If the position immediately after the clip is a skip/deletion | Also removes the skip/deletion |
If the position immediately after the clip is an Insert | Softclips the insert |
If the position immediately after the clip is a Pad | Removes the pad |
Clip occurs at the last match/mismatch position of the read (the entire read is clipped) | Entire read is soft clipped, 0-based position is left as the original (not modified) |
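The front-clipping rules in the table can be sketched in Python as follows. This is a minimal illustration, not the bamUtil implementation. It assumes that the clip position is a 0-based reference position and is the last position clipped, that the CIGAR contains no hard clips, and that the helper name `clip_front` is hypothetical.

```python
import re


def clip_front(cigar, start, clip_pos):
    """Soft clip every read base aligned at or before clip_pos (sketch).

    Returns (new_cigar, new_start). Assumes no hard clips in the CIGAR.
    """
    ops = [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]
    kept, clipped, ref, new_start = [], 0, start, start
    for n, op in ops:
        if kept:
            # Past the clip: everything else is copied unchanged.
            kept.append((n, op))
        elif op in "M=X":
            # Clip the part of this match/mismatch that is at or before clip_pos.
            c = min(n, max(0, clip_pos - ref + 1))
            clipped += c
            ref += c
            if c < n:
                kept.append((n - c, op))
                new_start = ref
        elif op in "DN":
            # Deletions/skips inside or right after the clip are removed.
            ref += n
        elif op in "IS":
            # Insertions (and an existing soft clip) become soft clipped.
            clipped += n
        # Pads are simply dropped.
    if not kept:
        # Entire read clipped: the position is left as the original.
        new_start = start
    result = ([(clipped, "S")] if clipped else []) + kept
    return "".join(f"{n}{op}" for n, op in result), new_start
```

For example, under these assumptions `clip_front("10M3I20M", 100, 109)` soft clips the insertion along with the first 10 bases, and `clip_front("30M", 100, 129)` soft clips the whole read while leaving the position at 100.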
Clipping from the back
Clip Location | How it is handled |
---|---|
If the clip position falls in a skip/deletion | Removes the entire skip/deletion |
If the position immediately before the clip is a deletion/skip/pad | Removes the deletion/skip/pad |
If the position immediately before the clip is an insertion | Leaves the insertion, even if the resulting CIGAR is, for example, 70M3I27S |
Clip occurs at the first position of the read (the entire read is clipped) | Entire read is soft clipped, 0-based position is left as the original (not modified) |
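The back-clipping rules in the table can be sketched in Python in the same spirit. Again, this is an illustration rather than the bamUtil code. It assumes that the clip position is a 0-based reference position and is the first position clipped, that the CIGAR contains no hard clips, and that the helper name `clip_back` is hypothetical. The alignment position does not change when clipping from the back, so only the CIGAR is returned.

```python
import re


def clip_back(cigar, start, clip_pos):
    """Soft clip every read base aligned at or after clip_pos (sketch)."""
    ops = [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]
    kept, clipped, ref, clipping = [], 0, start, False
    for n, op in ops:
        if clipping:
            # Read bases after the clip point are soft clipped;
            # deletions, skips, and pads are dropped.
            if op in "MIS=X":
                clipped += n
            continue
        if op in "M=X":
            keep = min(n, max(0, clip_pos - ref))
            if keep:
                kept.append((keep, op))
            ref += keep
            if keep < n:
                clipped += n - keep
                clipping = True
        elif op in "DN":
            if ref + n <= clip_pos:
                kept.append((n, op))
                ref += n
            else:
                # The clip position falls inside this deletion/skip: remove it.
                clipping = True
        else:
            # Insertions, pads, and a leading soft clip are kept for now.
            kept.append((n, op))
    if clipping and clipped:
        # A deletion/skip/pad immediately before the clip is removed;
        # an insertion immediately before the clip is left in place.
        while kept and kept[-1][1] in "DNP":
            kept.pop()
        # If only a leading soft clip remains, the whole read is clipped.
        if kept and kept[-1][1] == "S":
            clipped += kept.pop()[0]
        kept.append((clipped, "S"))
    return "".join(f"{n}{op}" for n, op in kept)
```

Under these assumptions, `clip_back("70M3I27M", 0, 70)` reproduces the 70M3I27S case from the table: the insertion sits immediately before the clip and is left in place.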
Usage
Parameters
    Required Parameters:
        --in : the SAM/BAM file to be read
        --out : the SAM/BAM file to be written
    Optional Parameters:
        --noeof : do not expect an EOF block on a bam file.
        --params : print the parameter settings
Input File (--in)
Use --in followed by your file name to specify the SAM/BAM input file.
The program automatically determines if your input file is SAM/BAM/uncompressed BAM without any input other than a filename from the user, unless your input file is stdin.
Use - to read from stdin; in that case the extension following the - determines the file type (no extension indicates SAM).
SAM/BAM/Uncompressed BAM from file | --in yourFileName |
SAM from stdin | --in - |
BAM from stdin | --in -.bam |
Uncompressed BAM from stdin | --in -.ubam |
Note: Uncompressed BAM is compressed using compression level-0 (so it is not an entirely uncompressed file). This matches the samtools implementation so pipes between our tools and samtools are supported.
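As a rough illustration of how the --in value is interpreted, the following Python sketch maps the argument to a source and a file type. The function name `describe_in_arg` and the returned labels are hypothetical; the tool itself performs this logic internally.

```python
def describe_in_arg(arg):
    """Map a --in value to (source, file_type) following the table above."""
    # For stdin the type cannot be detected from a file name alone,
    # so the extension after "-" selects it.
    stdin_types = {
        "-": "SAM",
        "-.bam": "BAM",
        "-.ubam": "uncompressed BAM",
    }
    if arg in stdin_types:
        return "stdin", stdin_types[arg]
    # For a regular file the program determines the type by itself.
    return arg, "auto-detected"


source, file_type = describe_in_arg("-.ubam")
```

Any value other than the three stdin forms is treated here as a file name whose type the program detects on its own.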
Output File (--out)
Use --out followed by your file name to specify the SAM/BAM output file.
The file extension is used to determine whether to write SAM/BAM/uncompressed BAM. Use - to write to stdout; the extension following the - determines the file type (no extension is SAM).
SAM to file | --out yourFileName.sam |
BAM to file | --out yourFileName.bam |
Uncompressed BAM to file | --out yourFileName.ubam |
SAM to stdout | --out - |
BAM to stdout | --out -.bam |
Uncompressed BAM to stdout | --out -.ubam |
Note: Uncompressed BAM is compressed using compression level-0 (so it is not an entirely uncompressed file). This matches the samtools implementation so pipes between our tools and samtools are supported.
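The extension rule for --out can be sketched in Python as below. The function name `describe_out_arg` is hypothetical, and treating every extension other than .bam and .ubam as SAM is an assumption of this sketch; the page only states that .sam and no extension produce SAM.

```python
def describe_out_arg(arg):
    """Map a --out value to (destination, file_type) following the table above."""
    # "-" alone, or "-" followed by an extension, means stdout.
    destination = "stdout" if arg == "-" or arg.startswith("-.") else arg
    if arg.endswith(".ubam"):
        file_type = "uncompressed BAM"
    elif arg.endswith(".bam"):
        file_type = "BAM"
    else:
        # Assumption: anything else (including .sam and no extension) is SAM.
        file_type = "SAM"
    return destination, file_type


destination, file_type = describe_out_arg("-.bam")
```

The same extension check applies whether the output goes to a file or to stdout, which is what lets a stream be written in a form that another tool can read directly from a pipe.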
Return Value
Returns the SamStatus for the reads/writes.