BamUtil: clipOverlap
Overview of the clipOverlap function of bamUtil
The clipOverlap option on the bamUtil executable clips overlapping read pairs.
RESTRICTIONS
- Assumes the file is sorted by ReadName
- Assumes only 2 reads have matching ReadNames
- Reads are matched in pairs: if 3 reads share a ReadName, the first 2 are matched and compared, but the 3rd is not. If 4 reads share a ReadName, the first 2 are matched and compared, and then the last 2 are matched and compared.
- Only mapped reads will be clipped
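The pairing restriction above can be illustrated with a minimal Python sketch. It is not bamUtil's code: the function name `pair_reads` and the use of record indices are assumptions made for illustration, and the input is assumed to be a list of read names already sorted by ReadName.

```python
from itertools import groupby


def pair_reads(read_names):
    """Yield (first, second) index pairs for consecutive records sharing a name.

    Hypothetical helper: within each run of identical names, records are
    paired two at a time, and a leftover record in an odd-length run is
    left unpaired.
    """
    index = 0
    for _, run in groupby(read_names):
        size = len(list(run))
        # Step through the run two records at a time.
        for offset in range(0, size - 1, 2):
            yield (index + offset, index + offset + 1)
        index += size


# Three "a" records, four "b" records, one "c" record.
pairs = list(pair_reads(["a", "a", "a", "b", "b", "b", "b", "c"]))
print(pairs)  # [(0, 1), (3, 4), (5, 6)]
```

The third "a" record and the lone "c" record are never compared, which mirrors the behaviour described in the restrictions.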
Rules for Clipping
Clipping from the front
The first operation after the soft clip will be a Match/Mismatch, so any pads, deletions, insertions, or skips that immediately follow the clip are also clipped: insertions are soft clipped, while pads, deletions, and skips are removed.
Clip Location | How it is handled |
---|---|
If the clip position falls in a skip/deletion | Removes the entire skip/deletion |
If the position immediately after the clip is a skip/deletion | Also removes the skip/deletion |
If the position immediately after the clip is an Insert | Softclips the insert |
If the position immediately after the clip is a Pad | Removes the pad |
Clip occurs at the last match/mismatch position of the read (the entire read is clipped) | Entire read is soft clipped, 0-based position is left as the original (not modified) |
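As a rough illustration of the front-clipping rules in the table above, the following Python sketch applies them to a CIGAR string. It is an interpretation of the table, not bamUtil's implementation: the function name `soft_clip_front`, its arguments, and the choice of a 0-based inclusive reference clip position are all assumptions.

```python
import re

MATCH_OPS = set("M=X")


def soft_clip_front(cigar, pos, clip_ref_pos):
    """Soft clip from the read start through clip_ref_pos (0-based, inclusive).

    Hypothetical helper; returns (new_cigar, new_pos).
    """
    ops = [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]
    if clip_ref_pos < pos:
        return cigar, pos  # clip lies before the alignment: nothing to do
    lead_hard = []
    while ops and ops[0][1] == "H":
        lead_hard.append(ops.pop(0))  # leading hard clips are kept as-is
    clipped = 0  # read bases folded into the soft clip
    ref = pos    # reference position of the next aligned base
    i = 0
    while i < len(ops) and ops[i][1] != "H":
        n, op = ops[i]
        if op in MATCH_OPS:
            if ref > clip_ref_pos:
                break  # first match/mismatch after the clip: stop here
            take = min(n, clip_ref_pos - ref + 1)
            clipped += take
            ref += take
            if take < n:
                ops[i] = (n - take, op)  # keep the unclipped remainder
                break
        elif op in "DN":
            ref += n       # deletion/skip in or right after the clip: removed
        elif op in "IS":
            clipped += n   # insertions and existing soft clips: soft clipped
        # Pads ("P") consume nothing and are simply dropped.
        i += 1
    rest = ops[i:]
    whole_read = not any(op in MATCH_OPS for _, op in rest)
    new_ops = lead_hard + ([(clipped, "S")] if clipped else []) + rest
    new_cigar = "".join(f"{n}{op}" for n, op in new_ops)
    # When the entire read is clipped, the position is left unmodified.
    return new_cigar, (pos if whole_read else ref)


print(soft_clip_front("10M5D85M", 0, 12))  # ('10S85M', 15)
print(soft_clip_front("10M3I87M", 0, 9))   # ('13S87M', 10)
print(soft_clip_front("50M", 100, 200))    # ('50S', 100)
```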
Clipping from the back
Clip Location | How it is handled |
---|---|
If the clip position falls in a skip/deletion | Removes the entire skip/deletion |
If the position immediately before the clip is a deletion/skip/pad | Removes the deletion/skip/pad |
If the position immediately before the clip is an insertion | Leaves the insertion, even if it results in a CIGAR such as 70M3I27S |
Clip occurs at the first position of the read (the entire read is clipped) | Entire read is soft clipped, 0-based position is left as the original (not modified) |
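The back-clipping rules in the table above can be sketched in the same way. Again, this is an interpretation of the table rather than bamUtil's code: `soft_clip_back`, its arguments, and the convention that the clip starts at a 0-based reference position and runs to the end of the read are assumptions.

```python
import re

MATCH_OPS = set("M=X")


def soft_clip_back(cigar, pos, clip_ref_pos):
    """Soft clip from clip_ref_pos (0-based, inclusive) through the read end.

    Hypothetical helper; returns (new_cigar, pos). The start position is
    never changed by clipping from the back.
    """
    ops = [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]
    lead_hard, tail_hard = [], []
    while ops and ops[0][1] == "H":
        lead_hard.append(ops.pop(0))
    while ops and ops[-1][1] == "H":
        tail_hard.insert(0, ops.pop())
    read_len = sum(n for n, op in ops if op in "MIS=X")
    kept = []
    ref = pos  # reference position of the next aligned base
    for n, op in ops:
        if op in MATCH_OPS:
            keep = max(0, min(n, clip_ref_pos - ref))
            if keep:
                kept.append((keep, op))
            ref += n
        elif op in "DN":
            if ref + n <= clip_ref_pos:
                kept.append((n, op))  # lies entirely before the clip
            ref += n                  # otherwise the whole deletion/skip is removed
        elif op == "S":
            if not kept:
                kept.append((n, op))  # leading soft clip is preserved
        elif ref <= clip_ref_pos:
            kept.append((n, op))      # insertion/pad at or before the clip point
    # A deletion, skip, or pad immediately before the clip is removed;
    # an insertion there is left in place.
    while kept and kept[-1][1] in "DNP":
        kept.pop()
    if not any(op in MATCH_OPS for _, op in kept):
        kept = []  # entire read is clipped
    clipped = read_len - sum(n for n, op in kept if op in "MIS=X")
    new_ops = lead_hard + kept + ([(clipped, "S")] if clipped else []) + tail_hard
    return "".join(f"{n}{op}" for n, op in new_ops), pos


print(soft_clip_back("70M3I27M", 0, 70))  # ('70M3I27S', 0)
print(soft_clip_back("50M5D45M", 0, 52))  # ('50M45S', 0)
print(soft_clip_back("50M", 100, 100))    # ('50S', 100)
```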
Usage
Parameters
    Required Parameters:
        --in : the SAM/BAM file to be read
        --out : the SAM/BAM file to be written
    Optional Parameters:
        --noeof : do not expect an EOF block on a bam file.
        --params : print the parameter settings
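For scripted use, the parameters above could be assembled into an argument list as in the sketch below. The helper `clip_overlap_command` is hypothetical, and both the executable name `bam` and the `bam clipOverlap ...` invocation form are assumptions about the local installation; only the four flags come from the parameter list.

```python
def clip_overlap_command(in_path, out_path, noeof=False, params=False, exe="bam"):
    """Build an argument list for a clipOverlap run.

    Hypothetical helper: `exe` is assumed to be the bamUtil executable
    on the PATH; adjust it to match your installation.
    """
    cmd = [exe, "clipOverlap", "--in", in_path, "--out", out_path]
    if noeof:
        cmd.append("--noeof")   # do not expect an EOF block on a BAM file
    if params:
        cmd.append("--params")  # print the parameter settings
    return cmd


print(clip_overlap_command("sorted.bam", "clipped.bam", params=True))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` if the executable is available.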
Input File (--in)
Use --in followed by your file name to specify the SAM/BAM input file.
The program automatically determines whether your input file is SAM, BAM, or uncompressed BAM; you only need to supply the filename, unless the input is stdin.
A - indicates that the input is read from stdin, and the extension is used to determine the file type (no extension indicates SAM).
Input | Option |
---|---|
SAM/BAM/Uncompressed BAM from file | --in yourFileName |
SAM from stdin | --in - |
BAM from stdin | --in -.bam |
Uncompressed BAM from stdin | --in -.ubam |
Note: Uncompressed BAM is compressed using compression level 0 (so it is not an entirely uncompressed file). This matches the samtools implementation, so pipes between our tools and samtools are supported.
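The point of the note, that level-0 output is still a compressed-format stream rather than raw data, can be seen with Python's standard `gzip` module. This sketch uses plain gzip framing as a stand-in; it does not produce the block layout used by BAM files, and the payload is made up for illustration.

```python
import gzip

payload = b"ACGT" * 1000  # illustrative data, 4000 bytes

level0 = gzip.compress(payload, compresslevel=0)
level6 = gzip.compress(payload, compresslevel=6)

# Level 0 stores the bytes without shrinking them, but still wraps them
# in a gzip header and trailer, so the result is slightly larger than the input.
print(len(payload), len(level0), len(level6))
print(level0[:2] == b"\x1f\x8b")            # gzip magic bytes are present
print(gzip.decompress(level0) == payload)   # still needs a decompression step
```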
Output File (--out)
Use --out followed by your file name to specify the SAM/BAM output file.
The file extension is used to determine whether to write SAM, BAM, or uncompressed BAM. A - indicates stdout, and the extension determines the file type (no extension is SAM).
Output | Option |
---|---|
SAM to file | --out yourFileName.sam |
BAM to file | --out yourFileName.bam |
Uncompressed BAM to file | --out yourFileName.ubam |
SAM to stdout | --out - |
BAM to stdout | --out -.bam |
Uncompressed BAM to stdout | --out -.ubam |
Note: Uncompressed BAM is compressed using compression level 0 (so it is not an entirely uncompressed file). This matches the samtools implementation, so pipes between our tools and samtools are supported.
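The extension rules for --out can be summarized with a small Python sketch. The function `describe_output` is hypothetical and only mirrors the table above; treating a file name with no extension as SAM and rejecting unknown extensions are assumptions, since the text only states the no-extension rule for stdout.

```python
import os

# Extension-to-format mapping taken from the table of --out values.
FORMATS = {"": "SAM", ".sam": "SAM", ".bam": "BAM", ".ubam": "uncompressed BAM"}


def describe_output(name):
    """Return (destination, format) for a --out value.

    Hypothetical helper: raises ValueError for extensions the
    documentation does not list (an assumption of this sketch).
    """
    stem, ext = os.path.splitext(name)
    if ext not in FORMATS:
        raise ValueError(f"unrecognized extension: {ext!r}")
    destination = "stdout" if stem == "-" else name
    return destination, FORMATS[ext]


print(describe_output("-"))                  # ('stdout', 'SAM')
print(describe_output("-.ubam"))             # ('stdout', 'uncompressed BAM')
print(describe_output("yourFileName.bam"))   # ('yourFileName.bam', 'BAM')
```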
Return Value
Returns the SamStatus for the reads/writes.