Line 1: |
Line 1: |
| = Overview of the <code>trimBam</code> function of <code>bamUtil</code> = | | = Overview of the <code>trimBam</code> function of <code>bamUtil</code> = |
− | The <code>trimBam</code> option on the [[bamUtil]] executable trims the end of reads in a SAM/BAM file, changing read ends to ‘N’ and quality to ‘!’. | + | The <code>trimBam</code> option on the [[bamUtil]] executable trims the ends of reads in a SAM/BAM file, either by changing read ends to ‘N’ and quality to ‘!’ or, if the command-line option <code>--clip</code> is specified, by soft clipping them. |
− | | |
| | | |
| = Usage = | | = Usage = |
Line 17: |
Line 16: |
| Optionally --ignoreStrand/-i can be specified to ignore the strand information and treat forward/reverse the same. | | Optionally --ignoreStrand/-i can be specified to ignore the strand information and treat forward/reverse the same. |
| | | |
− | trimBam will modify the sequences to 'N', and the quality string to '!' | + | trimBam will modify the sequences to 'N', and the quality string to '!' unless the optional parameter --clip/-c is specified. If --clip/-c is specified, the ends will be soft clipped instead of modified. |
| + | |
| + | |
| + | == Soft Clipping Notes (--clip/-c) == |
| + | Available in version 1.0.14 and later. |
| + | |
| + | When soft clipping: |
| + | :* if the entire read would be soft clipped, no clipping is done, and instead the read is marked as unmapped |
| + | :* mate information is not updated (start positions/mapping may change after soft clipping) |
| + | :** run samtools fixmate to fix mate information (will first need to sort by read name) |
| + | :* output is not sorted (start positions/mapping may change after soft clipping) |
| + | :** run samtools sort to resort by coordinate (after fixmate) |
| + | :* soft clips already in the read are maintained or added to |
| + | :** if 3 bases were clipped and 2 are specified to be clipped, no change is made to that end |
| + | :** if 3 bases were clipped and 5 are specified to be clipped, 2 additional bases are clipped from that end |
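The two soft clip examples above can be sketched as simple arithmetic. This is only an illustration of the rule as described here, not the trimBam source code; the function names are hypothetical, and the test for "entire read would be soft clipped" (left plus right clips reaching the read length) is an assumption.

```python
def additional_clip(existing: int, requested: int) -> int:
    """Extra bases to soft clip at one end, given the bases already clipped there."""
    return max(0, requested - existing)


def total_clip(existing: int, requested: int) -> int:
    """Total soft-clipped bases at one end once the request has been applied."""
    return max(existing, requested)


def becomes_unmapped(read_length: int, left_clip: int, right_clip: int) -> bool:
    """Assumption: the whole read counts as clipped once both clips cover it."""
    return left_clip + right_clip >= read_length


# 3 bases already clipped, 2 requested: no change to that end.
print(additional_clip(3, 2))  # 0
# 3 bases already clipped, 5 requested: 2 additional bases are clipped.
print(additional_clip(3, 5))  # 2
```

Existing soft clips are never reduced in this sketch, which matches the "maintained or added to" wording above.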
| + | |
| + | === Fixing the mate/resorting === |
| + | In order to update the mate, samtools fixmate must be run. |
| + | |
| + | In order to reorder the file, samtools sort must be run. |
| | | |
| + | Notes about the samtools programs: |
| + | * samtools fixmate requires the file to be sorted by query name. |
| + | * samtools sort cannot write to pipes. |
| + | |
| + | ====Steps==== |
| + | # Run this program and pipe it into samtools sort by query name |
| + | #* <pre>./bam trimBam <your InputFile> - [#basesToTrim] [any other options] -c | samtools sort -n - tempQuerySort</pre> |
| + | # Run samtools fixmate and pipe it into samtools sort by position |
| + | #* <pre> samtools fixmate tempQuerySort.bam - | samtools sort - finalResult</pre> |
| | | |
| = Parameters = | | = Parameters = |
Line 29: |
Line 56: |
| Optional Parameters: | | Optional Parameters: |
| --ignoreStrand : ignore strand information - do not reverse left/right for reverse reads | | --ignoreStrand : ignore strand information - do not reverse left/right for reverse reads |
| + | --clip : soft clip the ends rather than setting to N/! |
| </pre> | | </pre> |
| + | {{PhoneHomeParamDesc}} |
| | | |
| == Required Parameters== | | == Required Parameters== |
− | === Input File (1st argument) ===
| |
| | | |
− | The first argument is the name of the SAM/BAM input file.
| + | {{InBAMInputFile|noParam=1st}} |
| + | {{OutBAMOutputFile|noParam=2nd}} |
| | | |
− | The program automatically determines if your input file is SAM/BAM/uncompressed BAM without any input other than a filename from the user, unless your input file is stdin.
| |
− |
| |
− | A <code>-</code> is used to indicate to read from stdin and the extension is used to determine the file type (no extension indicates SAM).
| |
− |
| |
− | {|border="1" cellspacing="0" cellpadding="2"
| |
− | |SAM/BAM/Uncompressed BAM from file
| |
− | | <code>yourFileName</code>
| |
− | |-
| |
− | |SAM from stdin
| |
− | | <code>-</code>
| |
− | |-
| |
− | |BAM from stdin
| |
− | | <code>-.bam</code>
| |
− | |-
| |
− | |Uncompressed BAM from stdin
| |
− | | <code>-.ubam</code>
| |
− | |}
| |
− |
| |
− |
| |
− | Note: Uncompressed BAM is compressed using compression level-0 (so it is not an entirely uncompressed file). This matches the <code>samtools</code> implementation so pipes between our tools and <code>samtools</code> are supported.
| |
− |
| |
− | === output File (2nd argument) ===
| |
− |
| |
− | The second argument is the name of the SAM/BAM output file.
| |
− |
| |
− | The file extension is used to determine whether to write SAM/BAM/uncompressed BAM. A <code>-</code> is used to indicate stdout and the extension for file type (no extension is SAM).
| |
− |
| |
− | {|border="1" cellspacing="0" cellpadding="2"
| |
− | |SAM to file
| |
− | | <code>yourFileName.sam</code>
| |
− | |-
| |
− | |BAM to file
| |
− | | <code>yourFileName.bam</code>
| |
− | |-
| |
− | |Uncompressed BAM to file
| |
− | | <code>yourFileName.ubam</code>
| |
− | |-
| |
− | |SAM to stdout
| |
− | | <code>-</code>
| |
− | |-
| |
− | |BAM to stdout
| |
− | | <code>-.bam</code>
| |
− | |-
| |
− | |Uncompressed BAM to stdout
| |
− | | <code>-.ubam</code>
| |
− | |}
| |
− |
| |
− |
| |
− | Note: Uncompressed BAM is compressed using compression level-0 (so it is not an entirely uncompressed file). This matches the <code>samtools</code> implementation so pipes between our tools and <code>samtools</code> are supported.
| |
| ==Optional parameters== | | ==Optional parameters== |
| === Number of Bases to Trim from Each End (3rd argument) === | | === Number of Bases to Trim from Each End (3rd argument) === |
| If the 3rd argument is a number (with no flag/option), it is the number of bases to trim from each end of the reads. | | If the 3rd argument is a number (with no flag/option), it is the number of bases to trim from each end of the reads. |
| | | |
− | ===Trim Bases from the Left (<code>--left</code> or <code>--L</code>)=== | + | ===Trim Bases from the Left (<code>--left</code> or <code>-L</code>)=== |
− | Use <code>--left</code> or <code>--L</code> followed by the number of bases to be trimmed from the left. | + | Use <code>--left</code> or <code>-L</code> followed by the number of bases to be trimmed from the left. |
| | | |
| By default reverse strands are reversed and then the left is trimmed, meaning that <code>--left</code> actually trims from the right of the read in the SAM/BAM for reverse reads. | | By default reverse strands are reversed and then the left is trimmed, meaning that <code>--left</code> actually trims from the right of the read in the SAM/BAM for reverse reads. |
| | | |
− | Use [[#Ignore the Strand when Trimming (--ignoreStrand or --i)|--ignoreStrand/-i]] to ignore the strand information and treat forward/reverse the same. | + | Use [[#Ignore the Strand when Trimming (--ignoreStrand or -i)|--ignoreStrand/-i]] to ignore the strand information and treat forward/reverse the same. |
| | | |
− | ===Trim Bases from the Right (<code>--right</code> or <code>--R</code>)=== | + | ===Trim Bases from the Right (<code>--right</code> or <code>-R</code>)=== |
− | Use <code>--right</code> or <code>--R</code> followed by the number of bases to be trimmed from the right. | + | Use <code>--right</code> or <code>-R</code> followed by the number of bases to be trimmed from the right. |
| | | |
| By default reverse strands are reversed and then the right is trimmed, meaning that <code>--right</code> actually trims from the left of the read in the SAM/BAM for reverse reads. | | By default reverse strands are reversed and then the right is trimmed, meaning that <code>--right</code> actually trims from the left of the read in the SAM/BAM for reverse reads. |
| | | |
− | Use [[#Ignore the Strand when Trimming (--ignoreStrand or --i)|--ignoreStrand/-i]] to ignore the strand information and treat forward/reverse the same. | + | Use [[#Ignore the Strand when Trimming (--ignoreStrand or -i)|--ignoreStrand/-i]] to ignore the strand information and treat forward/reverse the same. |
| + | |
| + | === Ignore the Strand when Trimming (<code>--ignoreStrand</code> or <code>-i</code>) === |
| + | Use <code>--ignoreStrand</code> or <code>-i</code> to ignore the strand information and treat forward/reverse the same. When <code>--ignoreStrand</code> or <code>-i</code> is set, reverse-strand reads are not reversed prior to trimming left/right. |
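The interaction between left/right trimming and strand can be sketched as follows. This is a minimal illustration of the default (non-clip) behavior described on this page, not the trimBam implementation; <code>trim_read</code> and its parameters are hypothetical names, and clamping the trim counts to the read length is an assumption.

```python
def trim_read(seq, qual, left, right, is_reverse=False, ignore_strand=False):
    """Mask trimmed bases with 'N' and their qualities with '!'.

    seq and qual are given as stored in the SAM/BAM record.
    """
    # For reverse-strand reads, left/right are swapped unless strand is ignored.
    if is_reverse and not ignore_strand:
        left, right = right, left

    n = len(seq)
    # Assumption: never trim more bases than the read contains.
    left = min(left, n)
    right = min(right, n - left)
    keep_end = n - right

    new_seq = "N" * left + seq[left:keep_end] + "N" * right
    new_qual = "!" * left + qual[left:keep_end] + "!" * right
    return new_seq, new_qual


# Forward read, 2 bases trimmed from the left.
print(trim_read("ACGTACGT", "IIIIIIII", 2, 0))
# Reverse read, same request: the right end of the stored sequence is masked.
print(trim_read("ACGTACGT", "IIIIIIII", 2, 0, is_reverse=True))
# Reverse read with strand ignored: behaves like the forward read.
print(trim_read("ACGTACGT", "IIIIIIII", 2, 0, is_reverse=True, ignore_strand=True))
```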
| + | |
| + | === SoftClip the Ends (<code>--clip</code> or <code>-c</code>) === |
| + | Use <code>--clip</code> or <code>-c</code> to soft clip the ends instead of setting them to N/!. If the entire read would be soft clipped, the read is marked as unmapped instead. |
| | | |
− | === Ignore the Strand when Trimming (<code>--ignoreStrand</code> or <code>--i</code>) ===
| + | See [[#Soft Clipping Notes (--clip/-c)|Soft Clipping Notes]] for more information about clipping and post processing that will need to be done. |
− | Use <code>--ignoreStrand</code> or <code>--i</code> to ignore the strand information and treat forward/reverese the same. When <code>--ignoreStrand</code> or <code>--i</code> is set, do not reverse reverse reads prior to trimming left/right.
| |
| | | |
| {{noeofBGZFParameter}} | | {{noeofBGZFParameter}} |