Line 1: |
Line 1: |
| = Overview of the <code>trimBam</code> function of <code>bamUtil</code> = | | = Overview of the <code>trimBam</code> function of <code>bamUtil</code> = |
− | The <code>trimBam</code> option on the [[bamUtil]] executable trims the end of reads in a SAM/BAM file, changing read ends to ‘N’ and quality to ‘!’. | + | The <code>trimBam</code> option on the [[bamUtil]] executable trims the ends of reads in a SAM/BAM file, either by changing the trimmed bases to ‘N’ and their qualities to ‘!’ or by soft clipping them (if the command-line option <code>--clip</code> is specified). |
− | | |
| | | |
| = Usage = | | = Usage = |
Line 10: |
Line 9: |
| Alternately, the number of bases from each side can be specified (either or both -L/-R (--left/--right) can be specified): | | Alternately, the number of bases from each side can be specified (either or both -L/-R (--left/--right) can be specified): |
| ./bam trimBam [inFile] [outFile] -L [num-bases-to-trim-from-left] -R [num-bases-to-trim-from-right] | | ./bam trimBam [inFile] [outFile] -L [num-bases-to-trim-from-left] -R [num-bases-to-trim-from-right] |
− | By default Left/Right is as the reads are in the SAM/BAM file.
| |
| | | |
− | Optionally --reverse/-r can be specified to reverse the left/right for reverse reads | + | By default reverse strands are reversed and then the left & right are trimmed. |
| + | |
| + | This means that --left actually trims from the right of the read in the SAM/BAM for reverse reads. |
| + | |
| + | Optionally --ignoreStrand/-i can be specified to ignore the strand information and treat forward/reverse the same. |
| + | |
| + | trimBam will modify the sequences to 'N', and the quality string to '!' unless the optional parameter --clip/-c is specified. If --clip/-c is specified, the ends will be soft clipped instead of modified. |
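The default (non-clipping) behaviour described above can be illustrated with a minimal Python sketch. This is not bamUtil's implementation; the function name `mask_ends` and its handling of trims longer than the read are assumptions made only for illustration.

```python
def mask_ends(seq: str, qual: str, left: int, right: int) -> tuple[str, str]:
    """Hypothetical helper: replace trimmed bases with 'N' and qualities with '!'.

    `left` and `right` are counts relative to the read as stored in the file.
    Assumption: trims that exceed the read length simply mask the whole read.
    """
    n = len(seq)
    left = min(left, n)            # never mask more bases than exist
    right = min(right, n - left)   # the remainder available for the right side
    keep = slice(left, n - right)  # untouched middle part of the read
    masked_seq = "N" * left + seq[keep] + "N" * right
    masked_qual = "!" * left + qual[keep] + "!" * right
    return masked_seq, masked_qual


if __name__ == "__main__":
    # Trim 1 base from the left and 2 from the right of an 8-base read.
    print(mask_ends("ACGTACGT", "IIIIIIII", 1, 2))
```

The read length is unchanged by masking, which is why this mode does not require the mate fix-up or re-sorting that soft clipping does.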
| + | |
| + | |
| + | == Soft Clipping Notes (--clip/-c) == |
| + | Available in version 1.0.14 and later. |
| + | |
| + | When soft clipping: |
| + | :* if the entire read would be soft clipped, no clipping is done, and instead the read is marked as unmapped |
| + | :* mate information is not updated (start positions/mapping may change after soft clipping) |
| + | :** run samtools fixmate to fix mate information (will first need to sort by read name) |
| + | :* output is not sorted (start positions/mapping may change after soft clipping) |
| + | :** run samtools sort to resort by coordinate (after fixmate) |
| + | :* soft clips already in the read are maintained or added to |
| + | :** if 3 bases were clipped and 2 are specified to be clipped, no change is made to that end |
| + | :** if 3 bases were clipped and 5 are specified to be clipped, 2 additional bases are clipped from that end |
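The rule for existing soft clips in the list above can be written as simple arithmetic. The sketch below is illustrative only; the function names are hypothetical and are not taken from the bamUtil source.

```python
def additional_soft_clip(existing: int, requested: int) -> int:
    """Hypothetical helper: extra bases to clip from one end of a read.

    `existing` is the number of bases already soft clipped at that end,
    `requested` is the number of bases asked to be clipped there.
    """
    return max(0, requested - existing)


def total_soft_clip(existing: int, requested: int) -> int:
    """Resulting soft clip length at that end after trimming."""
    return existing + additional_soft_clip(existing, requested)


if __name__ == "__main__":
    # 3 already clipped, 2 requested: no change to that end.
    print(additional_soft_clip(3, 2), total_soft_clip(3, 2))
    # 3 already clipped, 5 requested: 2 additional bases are clipped.
    print(additional_soft_clip(3, 5), total_soft_clip(3, 5))
```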
| + | |
| + | === Fixing the mate/resorting === |
| + | In order to update the mate, samtools fixmate must be run. |
| + | |
| + | In order to reorder the file, samtools sort must be run. |
| + | |
| + | Notes about the samtools programs: |
| + | * samtools fixmate requires the file to be sorted by query name. |
| + | * samtools sort cannot write to pipes. |
| | | |
− | trimBam will modify the sequences to 'N', and the quality string to '!' | + | ====Steps==== |
| + | # Run this program and pipe it into samtools sort by query name |
| + | #* <pre>./bam trimBam <your InputFile> - [#basesToTrim] [any other options] -c | samtools sort -n - tempQuerySort</pre> |
| + | # Run samtools fixmate and pipe it into samtools sort by position |
| + | #* <pre> samtools fixmate tempQuerySort.bam - | samtools sort - finalResult</pre> |
| | | |
| = Parameters = | | = Parameters = |
Line 22: |
Line 53: |
| outFile : the SAM/BAM file to be written | | outFile : the SAM/BAM file to be written |
| num-bases-to-trim-on-each-side : the number of bases/qualities to trim from each side | | num-bases-to-trim-on-each-side : the number of bases/qualities to trim from each side |
− | Instead of num-bases-to-trim-on-each-side, -L/-R (or --left/--right) can be specified to indicate the number of bases to trim from the left/right | + | Instead of num-bases-to-trim-on-each-side, -L/-R (or --left/--right) can be specified to indicate the number of bases to trim from the left/right (left/right are reversed for reverse strands) |
| Optional Parameters: | | Optional Parameters: |
− | --reverse : reverse the left/right for reverse reads | + | --ignoreStrand : ignore strand information - do not reverse left/right for reverse reads |
| + | --clip : soft clip the ends rather than setting to N/! |
| </pre> | | </pre> |
| + | {{PhoneHomeParamDesc}} |
| + | |
| + | == Required Parameters== |
| + | |
| + | {{InBAMInputFile|noParam=1st}} |
| + | {{OutBAMOutputFile|noParam=2nd}} |
| + | |
| + | ==Optional parameters== |
| + | === Number of Bases to Trim from Each End (3rd argument) === |
| + | If the 3rd argument is a number (with no flag/option), it is the number of bases to trim from each end of the reads. |
| + | |
| + | ===Trim Bases from the Left (<code>--left</code> or <code>-L</code>)=== |
| + | Use <code>--left</code> or <code>-L</code> followed by the number of bases to be trimmed from the left. |
| + | |
| + | By default reverse strands are reversed and then the left is trimmed, meaning that <code>--left</code> actually trims from the right of the read in the SAM/BAM for reverse reads. |
| + | |
| + | Use [[#Ignore the Strand when Trimming (--ignoreStrand or -i)|--ignoreStrand/-i]] to ignore the strand information and treat forward/reverse the same. |
| + | |
| + | ===Trim Bases from the Right (<code>--right</code> or <code>-R</code>)=== |
| + | Use <code>--right</code> or <code>-R</code> followed by the number of bases to be trimmed from the right. |
| + | |
| + | By default reverse strands are reversed and then the right is trimmed, meaning that <code>--right</code> actually trims from the left of the read in the SAM/BAM for reverse reads. |
| + | |
| + | Use [[#Ignore the Strand when Trimming (--ignoreStrand or -i)|--ignoreStrand/-i]] to ignore the strand information and treat forward/reverse the same. |
| + | |
| + | === Ignore the Strand when Trimming (<code>--ignoreStrand</code> or <code>-i</code>) === |
| + | Use <code>--ignoreStrand</code> or <code>-i</code> to ignore the strand information and treat forward/reverse the same. When <code>--ignoreStrand</code> or <code>-i</code> is set, reverse-strand reads are not reversed prior to trimming left/right. |
| + | |
| + | === SoftClip the Ends (<code>--clip</code> or <code>-c</code>) === |
| + | Use <code>--clip</code> or <code>-c</code> to soft clip the ends instead of setting to N/! (or set to unmapped if the entire read would be soft clipped). |
| + | |
| + | See [[#Soft Clipping Notes (--clip/-c)|Soft Clipping Notes]] for more information about clipping and post processing that will need to be done. |
| + | |
| + | {{noeofBGZFParameter}} |
| + | |
| + | {{PhoneHomeParameters}} |
| | | |
| = Return Value = | | = Return Value = |
− | Returns the SamStatus for the reads/writes. 0 on success. | + | Returns the SamStatus for the reads/writes. 0 on success, non-0 on failure. |
| | | |
| = Examples = | | = Examples = |
Line 48: |
Line 116: |
| </pre> | | </pre> |
| | | |
− | ==Trim different bases from each side, but treat forward & reverse the same== | + | |
− | Example Input, trimming 1 base from the left and 2 bases from the right: | + | ==Trim different bases from each side, but treat reverse strands the opposite== |
| + | Example Input, trimming 1 base from the left and 2 bases from the right for forward strands and the opposite for reverse strands: |
| <pre> | | <pre> |
| ./bin trimBam testFiles/testSam.sam results/trimSam.sam -L 1 -R 2 | | ./bin trimBam testFiles/testSam.sam results/trimSam.sam -L 1 -R 2 |
Line 60: |
Line 129: |
| #Bases to trim from the left of forward strands : 1 | | #Bases to trim from the left of forward strands : 1 |
| #Bases to trim from the right of forward strands: 2 | | #Bases to trim from the right of forward strands: 2 |
− | #Bases to trim from the left of reverse strands : 1 | + | #Bases to trim from the left of reverse strands : 2 |
− | #Bases to trim from the right of reverse strands : 2 | + | #Bases to trim from the right of reverse strands : 1 |
| | | |
| Number of records read = 10 | | Number of records read = 10 |
Line 68: |
Line 137: |
| | | |
| | | |
− | ==Trim different bases from each side, but treat reverse strands the opposite== | + | ==Trim different bases from each side, but treat forward & reverse the same== |
− | Example Input, trimming 1 base from the left and 2 bases from the right for forward strands and do the opposite for reverse strands: | + | Example Input, trimming 1 base from the left and 2 bases from the right ignoring strand information: |
| <pre> | | <pre> |
− | ./bin trimBam testFiles/testSam.sam results/trimSam.sam -L 1 -R 2 --reverse | + | ./bin trimBam testFiles/testSam.sam results/trimSam.sam -L 1 -R 2 --ignoreStrand |
| </pre> | | </pre> |
| Example Output: | | Example Output: |
Line 80: |
Line 149: |
| #Bases to trim from the left of forward strands : 1 | | #Bases to trim from the left of forward strands : 1 |
| #Bases to trim from the right of forward strands: 2 | | #Bases to trim from the right of forward strands: 2 |
− | #Bases to trim from the left of reverse strands : 2 | + | #Bases to trim from the left of reverse strands : 1 |
− | #Bases to trim from the right of reverse strands : 1 | + | #Bases to trim from the right of reverse strands : 2 |
| | | |
| Number of records read = 10 | | Number of records read = 10 |
| Number of records written = 10 | | Number of records written = 10 |
| </pre> | | </pre> |
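The per-strand trim counts reported in the two example outputs above follow from the strand rule. The Python sketch below reproduces that mapping; the function name `effective_trim` and its arguments are hypothetical and do not reflect how bamUtil implements it internally.

```python
def effective_trim(left: int, right: int,
                   is_reverse: bool, ignore_strand: bool = False) -> tuple[int, int]:
    """Hypothetical helper: (left, right) trim counts as applied to the read
    in the order it is stored in the SAM/BAM file.

    By default the counts are swapped for reverse-strand reads; with
    ignore_strand=True (modelling --ignoreStrand) all reads are treated alike.
    """
    if is_reverse and not ignore_strand:
        return right, left
    return left, right


if __name__ == "__main__":
    # -L 1 -R 2 with default strand handling
    print(effective_trim(1, 2, is_reverse=False))
    print(effective_trim(1, 2, is_reverse=True))
    # -L 1 -R 2 --ignoreStrand
    print(effective_trim(1, 2, is_reverse=True, ignore_strand=True))
```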
| + | |
| | | |
| | | |