Changes

From Genome Analysis Wiki
409 bytes added ,  23:28, 12 November 2017
Line 1: Line 1:  
= Overview of the <code>trimBam</code> function of <code>bamUtil</code> =
 
= Overview of the <code>trimBam</code> function of <code>bamUtil</code> =
The <code>trimBam</code> option on the [[bamUtil]] executable trims the end of reads in a SAM/BAM file, changing read ends to ‘N’ and quality to ‘!’.
+
The <code>trimBam</code> option on the [[bamUtil]] executable trims the ends of reads in a SAM/BAM file, either by changing read ends to ‘N’ and quality to ‘!’ or, if the command-line option <code>--clip</code> is specified, by soft clipping them.
 
      
= Usage =
 
= Usage =
Line 17: Line 16:  
Optionally --ignoreStrand/-i can be specified to ignore the strand information and treat forward/reverse the same.
 
Optionally --ignoreStrand/-i can be specified to ignore the strand information and treat forward/reverse the same.
   −
trimBam will modify the sequences to 'N', and the quality string to '!'
+
trimBam will modify the sequences to 'N', and the quality string to '!' unless the optional parameter --clip/-c is specified.  If --clip/-c is specified, the ends will be soft clipped instead of modified.
 +
 
 +
 
 +
== Soft Clipping Notes (--clip/-c) ==
 +
Available in version 1.0.14 and later.
 +
 
 +
When soft clipping:
 +
:* if the entire read would be soft clipped, no clipping is done, and instead the read is marked as unmapped
 +
:* mate information is not updated (start positions/mapping may change after soft clipping)
 +
:** run samtools fixmate to fix mate information (will first need to sort by read name)
 +
:* output is not sorted (start positions/mapping may change after soft clipping)
 +
:** run samtools sort to resort by coordinate (after fixmate)
 +
:* soft clips already in the read are maintained or added to
 +
:** if 3 bases were clipped and 2 are specified to be clipped, no change is made to that end
 +
:** if 3 bases were clipped and 5 are specified to be clipped, 2 additional bases are clipped from that end
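The soft clipping rules above can be illustrated with a minimal Python sketch. This is not the bamUtil implementation: the function name <code>clip_read</code>, its parameters, and the test used for "the entire read would be soft clipped" (left plus right clip covering the full read length) are assumptions made for illustration only.

```python
def clip_read(read_length, existing_left, existing_right,
              requested_left, requested_right):
    """Hypothetical helper mirroring the notes above (not bamUtil code).

    Returns a (status, left_clip, right_clip) tuple.
    """
    # Existing soft clips are maintained or added to: each end ends up
    # with the larger of the existing and the requested clip length.
    left = max(existing_left, requested_left)
    right = max(existing_right, requested_right)

    # Assumption: "entire read would be soft clipped" means the two
    # clips together cover every base of the read.
    if left + right >= read_length:
        # No clipping is done; the read is marked as unmapped instead.
        return ("unmapped", existing_left, existing_right)

    return ("clipped", left, right)


# 3 bases already clipped, 2 requested: no change to that end.
print(clip_read(10, 3, 0, 2, 0))
# 3 bases already clipped, 5 requested: 2 additional bases are clipped.
print(clip_read(10, 3, 0, 5, 0))
# Clipping 5 from each end of a 10-base read would clip everything.
print(clip_read(10, 0, 0, 5, 5))
```

Taking the maximum per end reproduces both examples from the list: a smaller request leaves the existing clip alone, and a larger request extends it by the difference.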
 +
 
 +
=== Fixing the mate/resorting ===
 +
In order to update the mate, samtools fixmate must be run.
 +
 
 +
In order to reorder the file, samtools sort must be run.
    +
Notes about the samtools programs:
 +
* samtools fixmate requires the file to be sorted by query name.
 +
* samtools sort cannot write to pipes.
 +
 +
====Steps====
 +
# Run this program and pipe it into samtools sort by query name
 +
#* <pre>./bam trimBam <your InputFile> - [#basesToTrim] [any other options] -c | samtools sort -n - tempQuerySort</pre>
 +
# Run samtools fixmate and pipe it into samtools sort by position
 +
#* <pre> samtools fixmate tempQuerySort.bam - | samtools sort - finalResult</pre>
    
= Parameters =
 
= Parameters =
Line 29: Line 56:  
     Optional Parameters:
 
     Optional Parameters:
 
         --ignoreStrand : ignore strand information - do not reverse left/right for reverse reads
 
         --ignoreStrand : ignore strand information - do not reverse left/right for reverse reads
 +
        --clip        : soft clip the ends rather than setting to N/!
 
</pre>
 
</pre>
 
{{PhoneHomeParamDesc}}
 
{{PhoneHomeParamDesc}}
    
== Required Parameters==
 
== Required Parameters==
=== Input File (1st argument) ===
     −
The first argument is the name of the SAM/BAM input file.
+
{{InBAMInputFile|noParam=1st}}
 +
{{OutBAMOutputFile|noParam=2nd}}
   −
The program automatically determines if your input file is SAM/BAM/uncompressed BAM without any input other than a filename from the user, unless your input file is stdin.
  −
  −
A <code>-</code> is used to indicate to read from stdin and the extension is used to determine the file type (no extension indicates SAM).
  −
  −
{|border="1" cellspacing="0" cellpadding="2"
  −
|SAM/BAM/Uncompressed BAM from file
  −
| <code>yourFileName</code>
  −
|-
  −
|SAM from stdin
  −
| <code>-</code>
  −
|-
  −
|BAM from stdin
  −
| <code>-.bam</code>
  −
|-
  −
|Uncompressed BAM from stdin
  −
| <code>-.ubam</code>
  −
|}
  −
  −
  −
Note: Uncompressed BAM is compressed using compression level-0 (so it is not an entirely uncompressed file).  This matches the <code>samtools</code> implementation so pipes between our tools and <code>samtools</code> are supported.
  −
  −
=== output File (2nd argument) ===
  −
  −
The second argument is the name of the SAM/BAM output file.
  −
  −
The file extension is used to determine whether to write SAM/BAM/uncompressed BAM.  A <code>-</code> is used to indicate stdout and the extension for file type (no extension is SAM).
  −
  −
{|border="1" cellspacing="0" cellpadding="2"
  −
|SAM to file
  −
| <code>yourFileName.sam</code>
  −
|-
  −
|BAM to file
  −
| <code>yourFileName.bam</code>
  −
|-
  −
|Uncompressed BAM to file
  −
| <code>yourFileName.ubam</code>
  −
|-
  −
|SAM to stdout
  −
| <code>-</code>
  −
|-
  −
|BAM to stdout
  −
| <code>-.bam</code>
  −
|-
  −
|Uncompressed BAM to stdout
  −
| <code>-.ubam</code>
  −
|}
  −
  −
  −
Note: Uncompressed BAM is compressed using compression level-0 (so it is not an entirely uncompressed file).  This matches the <code>samtools</code> implementation so pipes between our tools and <code>samtools</code> are supported.
   
==Optional parameters==
 
==Optional parameters==
 
=== Number of Bases to Trim from Each End (3rd argument) ===
 
=== Number of Bases to Trim from Each End (3rd argument) ===
 
If the 3rd argument is a number (with no flag/option), it is the number of bases to trim from each end of the reads.
 
If the 3rd argument is a number (with no flag/option), it is the number of bases to trim from each end of the reads.
   −
===Trim Bases from the Left (<code>--left</code> or <code>--L</code>)===
+
===Trim Bases from the Left (<code>--left</code> or <code>-L</code>)===
Use <code>--left</code> or <code>--L</code> followed by the number of bases to be trimmed from the left.
+
Use <code>--left</code> or <code>-L</code> followed by the number of bases to be trimmed from the left.
    
By default reverse strands are reversed and then the left is trimmed, meaning that <code>--left</code> actually trims from the right of the read in the SAM/BAM for reverse reads.
 
By default reverse strands are reversed and then the left is trimmed, meaning that <code>--left</code> actually trims from the right of the read in the SAM/BAM for reverse reads.
   −
Use [[#Ignore the Strand when Trimming (--ignoreStrand or --i)|--ignoreStrand/-i]] to ignore the strand information and treat forward/reverse the same.
+
Use [[#Ignore the Strand when Trimming (--ignoreStrand or -i)|--ignoreStrand/-i]] to ignore the strand information and treat forward/reverse the same.
   −
===Trim Bases from the Right (<code>--right</code> or <code>--R</code>)===
+
===Trim Bases from the Right (<code>--right</code> or <code>-R</code>)===
Use <code>--right</code> or <code>--R</code> followed by the number of bases to be trimmed from the right.
+
Use <code>--right</code> or <code>-R</code> followed by the number of bases to be trimmed from the right.
    
By default reverse strands are reversed and then the right is trimmed, meaning that <code>--right</code> actually trims from the left of the read in the SAM/BAM for reverse reads.
 
By default reverse strands are reversed and then the right is trimmed, meaning that <code>--right</code> actually trims from the left of the read in the SAM/BAM for reverse reads.
   −
Use [[#Ignore the Strand when Trimming (--ignoreStrand or --i)|--ignoreStrand/-i]] to ignore the strand information and treat forward/reverse the same.
+
Use [[#Ignore the Strand when Trimming (--ignoreStrand or -i)|--ignoreStrand/-i]] to ignore the strand information and treat forward/reverse the same.
 +
 
 +
=== Ignore the Strand when Trimming (<code>--ignoreStrand</code> or <code>-i</code>) ===
 +
Use <code>--ignoreStrand</code> or <code>-i</code> to ignore the strand information and treat forward/reverse the same.  When <code>--ignoreStrand</code> or <code>-i</code> is set, reverse reads are not reversed prior to trimming left/right.
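How <code>--left</code>, <code>--right</code>, and <code>--ignoreStrand</code> interact in the default N/! mode can be sketched in Python as follows. This is an illustrative model of the behavior described in these sections, not bamUtil's code: the function <code>trim_read</code> and its parameters are hypothetical, and the sequence is assumed to be given as it is stored in the SAM/BAM record.

```python
def trim_read(seq, qual, left=0, right=0,
              is_reverse=False, ignore_strand=False):
    """Hypothetical model of trimBam's default N/! trimming."""
    # For reverse reads, left and right refer to the read before it was
    # reversed, so they swap relative to the stored sequence, unless
    # strand information is being ignored.
    if is_reverse and not ignore_strand:
        left, right = right, left

    n = len(seq)
    left = min(left, n)           # never trim more than the read holds
    right = min(right, n - left)
    keep_end = n - right

    # Trimmed bases become 'N' and their qualities become '!'.
    new_seq = "N" * left + seq[left:keep_end] + "N" * right
    new_qual = "!" * left + qual[left:keep_end] + "!" * right
    return new_seq, new_qual


# Forward read: --left 2 masks the first two stored bases.
print(trim_read("ACGTACGT", "IIIIIIII", left=2))
# Reverse read: --left 2 masks the last two stored bases instead.
print(trim_read("ACGTACGT", "IIIIIIII", left=2, is_reverse=True))
# Reverse read with --ignoreStrand: treated like a forward read.
print(trim_read("ACGTACGT", "IIIIIIII", left=2,
                is_reverse=True, ignore_strand=True))
```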
 +
 
 +
=== SoftClip the Ends (<code>--clip</code> or <code>-c</code>) ===
 +
Use <code>--clip</code> or <code>-c</code> to soft clip the ends instead of setting to N/! (or set to unmapped if the entire read would be soft clipped).
   −
=== Ignore the Strand when Trimming (<code>--ignoreStrand</code> or <code>--i</code>) ===
+
See [[#Soft Clipping Notes (--clip/-c)|Soft Clipping Notes]] for more information about clipping and post processing that will need to be done.
Use <code>--ignoreStrand</code> or <code>--i</code> to ignore the strand information and treat forward/reverese the same.  When <code>--ignoreStrand</code> or <code>--i</code> is set, do not reverse reverse reads prior to trimming left/right.
      
{{noeofBGZFParameter}}
 
{{noeofBGZFParameter}}
