Changes

From Genome Analysis Wiki
1,073 bytes added, 14:07, 6 January 2014
no edit summary
Line 1: Line 1: −
[[Category:libbam]]
+
[[Category:BamUtil|filter]]
 +
[[Category:BAM Software]]
 +
[[Category:Software]]
   −
== filter ==
+
= Overview of the <code>filter</code> function of <code>bamUtil</code> =
 
+
The <code>filter</code> option on the [[bamUtil]] executable writes the alignments, filtering them by clipping ends whose mismatch percentage is too high and by marking reads unmapped if the quality of mismatches is too high.
The <code>filter</code> option on the [[Bam|bam executable]] writes the alignments filtering them by clipping ends with too high of a mismatch percentage and by marking reads unmapped if the quality of mismatches is too high.
      
The following modifications may occur in an alignment:
 
The following modifications may occur in an alignment:
Line 10: Line 11:  
* FLAG updated to reflect that a read is unmapped if the quality of mismatches is too high, or if clipping would cause the entire read to be clipped.
 
* FLAG updated to reflect that a read is unmapped if the quality of mismatches is too high, or if clipping would cause the entire read to be clipped.
   −
=== NOTES ===
+
The POS and FLAG fields of an alignment are reflected in the mate's alignment.  Thus, the mate also needs to be updated.
The POS and FLAG fields of an alignment are reflected in the mate's alignment.  Thus, when the mate also needs to be updated.
      
Also, if the file was sorted, and a POS was changed, the file may no longer be sorted.
 
Also, if the file was sorted, and a POS was changed, the file may no longer be sorted.
   −
'''NOTE: This program does NOT update the mate or resort the file.'''
+
'''NOTE: This program does NOT update the mate or re-sort the file.'''
    +
== Fixing the mate/resorting ==
 
In order to update the mate, samtools fixmate must be run.  
 
In order to update the mate, samtools fixmate must be run.  
   Line 25: Line 26:  
* samtools sort cannot write to pipes.
 
* samtools sort cannot write to pipes.
   −
Steps:
+
===Steps===
 
# Run this program and pipe it into samtools sort by query name
 
# Run this program and pipe it into samtools sort by query name
 
#* <pre>./bam filter --in <your InputFile> --refFile <your reference file> --out -.bam <any other options> | samtools sort -n - tempQuerySort</pre>
 
#* <pre>./bam filter --in <your InputFile> --refFile <your reference file> --out -.bam <any other options> | samtools sort -n - tempQuerySort</pre>
Line 31: Line 32:  
#* <pre> samtools fixmate tempQuerySort.bam - | samtools sort - finalResult</pre>
 
#* <pre> samtools fixmate tempQuerySort.bam - | samtools sort - finalResult</pre>
   −
For Example:
+
===Example===
 
  ~/pipeFilter/bam/bam filter --in ../../originalBamFile.bam --refFile ~/data/human.g1k.v37.fa --out -.bam | samtools sort -n - tempQuerySort; samtools fixmate tempQuerySort.bam - | samtools sort - newResult
 
  ~/pipeFilter/bam/bam filter --in ../../originalBamFile.bam --refFile ~/data/human.g1k.v37.fa --out -.bam | samtools sort -n - tempQuerySort; samtools fixmate tempQuerySort.bam - | samtools sort - newResult
      −
=== Parameters ===
+
= Usage =
 +
 
 +
./bam filter --in <inputFilename>  --refFile <referenceFilename>  --out <outputFilename> [--noeof] [--qualityThreshold <qualThresh>] [--defaultQualityInt <defaultQual>] [--mismatchThreshold <mismatchThresh>] [--params]
 +
 
 +
= Parameters =
 
<pre>
 
<pre>
Required Parameters:
+
    Required Parameters:
--in      : the SAM/BAM file to be read
+
        --in      : the SAM/BAM file to be read
--refFile  : the reference file
+
        --refFile  : the reference file
--out      : the SAM/BAM file to write to
+
        --out      : the SAM/BAM file to write to
Optional Parameters:
+
    Optional Parameters:
--noeof            : do not expect an EOF block on a bam file.
+
        --noeof            : do not expect an EOF block on a bam file.
--qualityThreshold  : maximum sum of the mismatch qualities before marking
+
        --qualityThreshold  : maximum sum of the mismatch qualities before marking
                      a read unmapped. (Defaults to 60)
+
                              a read unmapped. (Defaults to 60)
--defaultQualityInt : quality value to use for mismatches that do not have a quality
+
        --defaultQualityInt : quality value to use for mismatches that do not have a quality
                      (Defaults to 20)
+
                              (Defaults to 20)
--mismatchThreshold : decimal value indicating the maximum ration of mismatches to
+
        --mismatchThreshold : decimal value indicating the maximum ratio of mismatches to
                      matches and mismatches allowed before clipping from the ends
+
                              matches and mismatches allowed before clipping from the ends
                      (Defaults to .10)
+
                              (Defaults to .10)
 +
        --params            : print the parameter settings
 +
</pre>
 +
{{PhoneHomeParamDesc}}
   −
</pre>
+
== Required Parameters==
 +
{{InBAMInputFile}}
 +
{{refFile}}
 +
{{OutBAMOutputFile}}
 +
 
 +
== Optional Parameters ==
 +
{{noeofBGZFParameter}}
 +
=== Quality Threshold (<code>--qualityThreshold</code>) ===
 +
In <code>filter</code>, when the sum of the mismatching base qualities is higher than the <code>--qualityThreshold</code>, the read is marked as unmapped.  The default threshold is 60.
 +
 
 +
=== Default Quality (<code>--defaultQualityInt</code>) ===
 +
<code>filter</code> filters reads based on the sum of the base qualities of mismatches.  Some reads, however, do not have base qualities.  Use <code>--defaultQualityInt</code> to specify the base qualities to use for mismatches that do not have quality values.  The default is 20.
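
The interaction of <code>--qualityThreshold</code> and <code>--defaultQualityInt</code> can be sketched as a small shell function. This only illustrates the rule described here and is not bamUtil's code; the function name <code>would_mark_unmapped</code> and the use of <code>-</code> to stand for a missing quality are assumptions made for the example.

```shell
# Hypothetical helper (not part of bamUtil): decide whether a read would be
# marked unmapped under the rule described in this section.
# usage: would_mark_unmapped <qualityThreshold> <defaultQualityInt> <qual>...
# Each <qual> is the integer base quality of one mismatching base, or "-"
# (an assumption of this sketch) when that base has no quality value.
would_mark_unmapped() {
    threshold=$1
    default_qual=$2
    shift 2
    sum=0
    for q in "$@"; do
        # Substitute the default quality for mismatches without a quality.
        if [ "$q" = "-" ]; then
            q=$default_qual
        fi
        sum=$((sum + q))
    done
    # The read is marked unmapped only when the sum is higher than the threshold.
    if [ "$sum" -gt "$threshold" ]; then
        echo "unmapped (sum=$sum)"
    else
        echo "kept (sum=$sum)"
    fi
}

would_mark_unmapped 60 20 30 - 15   # 30 + 20 + 15 = 65, higher than 60
would_mark_unmapped 60 20 30 - 10   # 30 + 20 + 10 = 60, not higher than 60
```

With the default values of 60 and 20, the first call reports the read as unmapped and the second reports it as kept.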
 +
 
 +
=== Mismatch Threshold (<code>--mismatchThreshold</code>) ===
 +
<code>filter</code> clips the ends of reads if the ratio of mismatches to matches and mismatches is higher than the decimal parameter, <code>--mismatchThreshold</code>.  The default is .10.
 +
 
 +
{{paramsParameter}}
   −
=== Usage ===
+
{{PhoneHomeParameters}}
   −
./bam filter --in <inputFilename>  --refFile <referenceFilename>  --out <outputFilename> [--noeof] [--qualityThreshold <qualThresh>] [--defaultQualityInt <defaultQual>] [--mismatchThreshold <mismatchThresh>]
     −
=== Return Value ===
+
= Return Value =
 
*    0: all records are successfully read and written.
 
*    0: all records are successfully read and written.
 
* non-0: at least one record was not successfully read or written.
 
* non-0: at least one record was not successfully read or written.
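
A calling script can act on the return value. The following is a minimal sketch of one way to do that; the wrapper name <code>run_checked</code> and the file names in the usage comment are placeholders, not part of bamUtil.

```shell
# Hypothetical wrapper: run a command and report a non-zero return value.
run_checked() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        # Non-0 means at least one record was not successfully read or written.
        echo "command failed with return value $status: $*" >&2
    fi
    return "$status"
}

# Example use (file names are placeholders):
#   run_checked ./bam filter --in in.bam --refFile ref.fa --out filtered.bam || exit 1
```

Note that when the output is piped into another program, the shell normally reports the return value of the last command in the pipeline rather than that of <code>filter</code>.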
   −
=== Example Output ===
+
= Example Output =
 
<pre>
 
<pre>
The following parameters are available.  Ones with "[]" are in effect:
  −
  −
Input Parameters
  −
--in [../../originalBamFile.bam],
  −
--out [-.bam], --refFile [/home/mktrost/data/human.g1k.v37.fa], --noeof,
  −
--qualityThreshold [60], --defaultQualityInt [20], --mismatchThreshold [0.10]
      
open and prefetch reference genome /home/mktrost/data/human.g1k.v37.fa: done.
 
open and prefetch reference genome /home/mktrost/data/human.g1k.v37.fa: done.
Line 79: Line 96:       −
=== FAQ ===
+
= FAQ =
 
This section contains information about what the filters mean, how they work, etc.
 
This section contains information about what the filters mean, how they work, etc.
   Line 100: Line 117:     
'''How is the mismatch threshold checked?'''
 
'''How is the mismatch threshold checked?'''
:This is requires a bit more logic...and thus gets its own section.
+
:This requires a bit more logic and thus gets its own section: [[Bam Executable: Filter#Mismatch Threshold|Mismatch Threshold]].
   −
==== Mismatch Threshold ====
+
= Mismatch Threshold =
    
Mismatch threshold is:
 
Mismatch threshold is:
Line 116: Line 133:  
This base will be clipped since 1 is greater than the mismatch threshold (assuming that wasn't set to 1).
 
This base will be clipped since 1 is greater than the mismatch threshold (assuming that wasn't set to 1).
   −
If the first base is a mismatch, it is
+
If the first base is a match, it is
 
:<math>{0 \over 1 + 0} = {0 \over 1} = 0</math>
 
:<math>{0 \over 1 + 0} = {0 \over 1} = 0</math>
 
At this point, this base will not be clipped since it is not greater than the mismatch threshold.
 
At this point, this base will not be clipped since it is not greater than the mismatch threshold.
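
The two cases above can be reproduced with a short shell sketch. It only evaluates the ratio of mismatches to matches and mismatches for given counts and compares it with the threshold; it does not reproduce how <code>filter</code> walks along a read. The function name <code>clip_decision</code> and the handling of zero counts are assumptions made for the example.

```shell
# Hypothetical helper (not part of bamUtil): prints "clip" if
# mismatches / (matches + mismatches) is greater than the threshold,
# otherwise "keep".
# usage: clip_decision <matches> <mismatches> <mismatchThreshold>
clip_decision() {
    awk -v m="$1" -v x="$2" -v t="$3" 'BEGIN {
        total = m + x
        # Assumption of this sketch: with no bases counted, nothing is clipped.
        if (total == 0) { print "keep"; exit }
        ratio = x / total
        if (ratio > t) print "clip"; else print "keep"
    }'
}

clip_decision 0 1 0.10   # first base is a mismatch: 1 / (0 + 1) = 1
clip_decision 1 0 0.10   # first base is a match:    0 / (1 + 0) = 0
```

The first call prints <code>clip</code> and the second prints <code>keep</code>, matching the two cases worked through above.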
