Line 1: |
Line 1: |
− | [[Category:libbam]] | + | [[Category:BamUtil|filter]] |
| + | [[Category:BAM Software]] |
| + | [[Category:Software]] |
| | | |
− | == filter == | + | = Overview of the <code>filter</code> function of <code>bamUtil</code> = |
− | | + | The <code>filter</code> option on the [[bamUtil]] executable writes the alignments, filtering them by clipping ends whose mismatch percentage is too high and by marking reads unmapped if the quality of mismatches is too high. |
− | The <code>filter</code> option on the [[Bam|bam executable]] writes the alignments filtering them by clipping ends with too high of a mismatch percentage and by marking reads unmapped if the quality of mismatches is too high. | |
| | | |
| The following modifications may occur in an alignment: | | The following modifications may occur in an alignment: |
Line 10: |
Line 11: |
| * FLAG updated to mark a read as unmapped if the quality of its mismatches is too high, or if clipping would clip the entire read. | | * FLAG updated to mark a read as unmapped if the quality of its mismatches is too high, or if clipping would clip the entire read. |
| | | |
− | === NOTES ===
| + | The POS and FLAG fields of an alignment are reflected in the mate's alignment. Thus, the mate also needs to be updated. |
− | The POS and FLAG fields of an alignment are reflected in the mate's alignment. Thus, when the mate also needs to be updated. | |
| | | |
| Also, if the file was sorted, and a POS was changed, the file may no longer be sorted. | | Also, if the file was sorted, and a POS was changed, the file may no longer be sorted. |
| | | |
− | '''NOTE: This program does NOT update the mate or resort the file.''' | + | '''NOTE: This program does NOT update the mate or re-sort the file.''' |
| | | |
| + | == Fixing the mate/resorting == |
| In order to update the mate, samtools fixmate must be run. | | In order to update the mate, samtools fixmate must be run. |
| | | |
Line 25: |
Line 26: |
| * samtools sort cannot write to pipes. | | * samtools sort cannot write to pipes. |
| | | |
− | Steps: | + | ===Steps=== |
| # Run this program and pipe it into samtools sort by query name | | # Run this program and pipe it into samtools sort by query name |
| #* <pre>./bam filter --in <your InputFile> --refFile <your reference file> --out -.bam <any other options> | samtools sort -n - tempQuerySort</pre> | | #* <pre>./bam filter --in <your InputFile> --refFile <your reference file> --out -.bam <any other options> | samtools sort -n - tempQuerySort</pre> |
Line 31: |
Line 32: |
| #* <pre> samtools fixmate tempQuerySort.bam - | samtools sort - finalResult</pre> | | #* <pre> samtools fixmate tempQuerySort.bam - | samtools sort - finalResult</pre> |
| | | |
− | For Example:
| + | ===Example=== |
| ~/pipeFilter/bam/bam filter --in ../../originalBamFile.bam --refFile ~/data/human.g1k.v37.fa --out -.bam | samtools sort -n - tempQuerySort; samtools fixmate tempQuerySort.bam - | samtools sort - newResult | | ~/pipeFilter/bam/bam filter --in ../../originalBamFile.bam --refFile ~/data/human.g1k.v37.fa --out -.bam | samtools sort -n - tempQuerySort; samtools fixmate tempQuerySort.bam - | samtools sort - newResult |
| | | |
| | | |
− | === Parameters === | + | = Usage = |
| + | |
| + | ./bam filter --in <inputFilename> --refFile <referenceFilename> --out <outputFilename> [--noeof] [--qualityThreshold <qualThresh>] [--defaultQualityInt <defaultQual>] [--mismatchThreshold <mismatchThresh>] [--params] |
| + | |
| + | = Parameters = |
| <pre> | | <pre> |
| Required Parameters: | | Required Parameters: |
Line 47: |
Line 52: |
| --defaultQualityInt : quality value to use for mismatches that do not have a quality | | --defaultQualityInt : quality value to use for mismatches that do not have a quality |
| (Defaults to 20) | | (Defaults to 20) |
− | --mismatchThreshold : decimal value indicating the maximum ration of mismatches to | + | --mismatchThreshold : decimal value indicating the maximum ratio of mismatches to |
| matches and mismatches allowed before clipping from the ends | | matches and mismatches allowed before clipping from the ends |
| (Defaults to .10) | | (Defaults to .10) |
| --params : print the parameter settings | | --params : print the parameter settings |
| </pre> | | </pre> |
| + | {{PhoneHomeParamDesc}} |
| | | |
− | === Usage === | + | == Required Parameters == |
| + | {{InBAMInputFile}} |
| + | {{refFile}} |
| + | {{OutBAMOutputFile}} |
| + | |
| + | == Optional Parameters == |
| + | {{noeofBGZFParameter}} |
| + | === Quality Threshold (<code>--qualityThreshold</code>) === |
| + | In <code>filter</code>, when the sum of the mismatching base qualities is higher than the <code>--qualityThreshold</code>, the read is marked as unmapped. The default threshold is 60. |
| + | |
| + | === Default Quality (<code>--defaultQualityInt</code>) === |
| + | <code>filter</code> filters reads based on the sum of the base qualities of mismatches. Some reads, however, do not have base qualities. Use <code>--defaultQualityInt</code> to specify the base qualities to use for mismatches that do not have quality values. The default is 20. |
| + | |
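The quality threshold and the default quality work together, so a small sketch may help. The Python below is illustrative only and is not bamUtil's code: the function names and the use of <code>None</code> to represent a missing base quality are assumptions made for this example, while the defaults of 60 and 20 are taken from the parameter descriptions.

```python
from typing import Optional, Sequence


def mismatch_quality_sum(mismatch_quals: Sequence[Optional[int]],
                         default_quality: int = 20) -> int:
    """Sum the base qualities of the mismatching bases.

    A quality of None (hypothetical marker for "no quality available")
    is replaced by default_quality, mirroring --defaultQualityInt.
    """
    return sum(default_quality if q is None else q for q in mismatch_quals)


def should_mark_unmapped(mismatch_quals: Sequence[Optional[int]],
                         quality_threshold: int = 60,
                         default_quality: int = 20) -> bool:
    """Return True when the summed mismatch quality is higher than the threshold."""
    total = mismatch_quality_sum(mismatch_quals, default_quality)
    return total > quality_threshold
```

Under these assumptions, a read with mismatch qualities of 30 and 40 sums to 70 and would be marked unmapped, while three mismatches without qualities sum to 60 and would not, since 60 is not higher than the default threshold.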
| + | === Mismatch Threshold (<code>--mismatchThreshold</code>) === |
| + | <code>filter</code> clips the ends of reads if the ratio of mismatches to matches and mismatches is higher than the decimal parameter, <code>--mismatchThreshold</code>. The default is .10. |
| + | |
| + | {{paramsParameter}} |
| + | |
| + | {{PhoneHomeParameters}} |
| | | |
− | ./bam filter --in <inputFilename> --refFile <referenceFilename> --out <outputFilename> [--noeof] [--qualityThreshold <qualThresh>] [--defaultQualityInt <defaultQual>] [--mismatchThreshold <mismatchThresh>] [--params]
| |
| | | |
− | === Return Value ===
| + | = Return Value = |
| * 0: all records are successfully read and written. | | * 0: all records are successfully read and written. |
| * non-0: at least one record was not successfully read or written. | | * non-0: at least one record was not successfully read or written. |
| | | |
− | === Example Output ===
| + | = Example Output = |
| <pre> | | <pre> |
| | | |
Line 73: |
Line 96: |
| | | |
| | | |
− | === FAQ ===
| + | = FAQ = |
| This section contains information about what the filters mean, how they work, etc. | | This section contains information about what the filters mean, how they work, etc. |
| | | |
Line 94: |
Line 117: |
| | | |
| '''How is the mismatch threshold checked?''' | | '''How is the mismatch threshold checked?''' |
− | :This is requires a bit more logic...and thus gets its own section. | + | :This requires a bit more logic and thus gets its own section: [[Bam Executable: Filter#Mismatch Threshold|Mismatch Threshold]]. |
| | | |
− | ==== Mismatch Threshold ====
| + | = Mismatch Threshold = |
| | | |
| Mismatch threshold is: | | Mismatch threshold is: |
Line 110: |
Line 133: |
| This base will be clipped since 1 is greater than the mismatch threshold (assuming the threshold was not set to 1). | | This base will be clipped since 1 is greater than the mismatch threshold (assuming the threshold was not set to 1). |
| | | |
− | If the first base is a mismatch, it is | + | If the first base is a match, it is |
| :<math>{0 \over 1 + 0} = {0 \over 1} = 0</math> | | :<math>{0 \over 1 + 0} = {0 \over 1} = 0</math> |
| At this point, this base will not be clipped since it is not greater than the mismatch threshold. | | At this point, this base will not be clipped since it is not greater than the mismatch threshold. |
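The two first-base cases above can be reproduced with a short Python sketch of the running ratio of mismatches to matches plus mismatches. This is only an illustration, not bamUtil's implementation: the function names are hypothetical, the input is assumed to be a list of booleans ordered from the end of the read inward, and the default of 0.10 is taken from the <code>--mismatchThreshold</code> description. It does not attempt to decide the final clip position.

```python
from typing import List, Sequence


def running_mismatch_ratios(is_mismatch: Sequence[bool]) -> List[float]:
    """Return mismatches / (matches + mismatches) after each base.

    is_mismatch[i] is True when base i (counted from the end of the
    read inward) does not match the reference.
    """
    ratios = []
    mismatches = 0
    for bases_seen, mismatch in enumerate(is_mismatch, start=1):
        if mismatch:
            mismatches += 1
        # bases_seen equals matches + mismatches so far
        ratios.append(mismatches / bases_seen)
    return ratios


def exceeds_threshold(ratio: float, mismatch_threshold: float = 0.10) -> bool:
    """True when the ratio is greater than the mismatch threshold."""
    return ratio > mismatch_threshold
```

With a single mismatching base the ratio is 1, which exceeds the default threshold; with a single matching base the ratio is 0, which does not.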