Line 1: |
Line 1: |
| + | [[Category:BamUtil|validate]] |
| + | [[Category:BAM Software]] |
| [[Category:Software]] | | [[Category:Software]] |
− | [[Category:StatGen Download]]
| |
− | [[Category:BAM Software]]
| |
| | | |
− | == Status ==
| + | = Status = |
| | | |
| The initial version of a SAM/BAM Validator is complete, but does not yet validate all fields or produce all desired statistics. Future releases will add more validation and more statistics. | | The initial version of a SAM/BAM Validator is complete, but does not yet validate all fields or produce all desired statistics. Future releases will add more validation and more statistics. |
| | | |
− | == Download ==
| + | = Download = |
− | http://genome.sph.umich.edu/wiki/Software#Download | + | http://genome.sph.umich.edu/wiki/BamUtil |
− | The BAM Validator is found in stagen/src/bam and is called bam (statgen/src/bin/bam).
| + | After compiling, the BAM Validator is found in bamUtil/bin/bam and is the "validate" subprogram (bamUtil/bin/bam validate). |
| | | |
− | == Purpose ==
| + | = Purpose = |
| | | |
| The BamValidator processes the specified SAM/BAM file: | | The BamValidator processes the specified SAM/BAM file: |
Line 20: |
Line 20: |
| | | |
| | | |
− | === Valid SAM/BAM File Requirements ===
| + | == Valid SAM/BAM File Requirements == |
| | | |
| A valid SAM/BAM file meets the validation criteria specified in [[SAM Validation Criteria]]. | | A valid SAM/BAM file meets the validation criteria specified in [[SAM Validation Criteria]]. |
| | | |
− | === Statistic Generation ===
| + | == Statistic Generation == |
| | | |
| Statistics are generated by the BAM Validator unless the <code>--disableStatistics</code> option is set. The generated statistics are described at: [[C++ Class: SamFile#Statistic Generation|Sam File Statistics]] | | Statistics are generated by the BAM Validator unless the <code>--disableStatistics</code> option is set. The generated statistics are described at: [[C++ Class: SamFile#Statistic Generation|Sam File Statistics]] |
| | | |
− | == How to Use the Bam Validator Executable ==
| + | = Usage = |
− | === Parameters ===
| |
− | <pre>
| |
− | Required Parameters:
| |
− | --in : the SAM/BAM file to be validated
| |
− | Optional Parameters:
| |
− | --noeof : do not expect an EOF block on a bam file.
| |
− | --so_flag : validate the file is sorted based on the header's @HD SO flag.
| |
− | --so_coord : validate the file is sorted based on the coordinate.
| |
− | --so_query : validate the file is sorted based on the query name.
| |
− | --maxErrors : Number of records with errors/invalids to allow before quiting.
| |
− | -1 (default) indicates to not quit until the entire file is validated.
| |
− | 0 indicates not to read/validate anything.
| |
− | --verbose : Print specific error details rather than just a summary
| |
− | --printableErrors : Maximum number of records with errors to print the details of
| |
− | before suppressing them when in verbose (defaults to 100)
| |
− | --disableStatistics : Turn off statistic generation
| |
− | --params : Print the parameter settings
| |
− | </pre>
| |
− | | |
− | === Usage ===
| |
| | | |
| ./bam validate --in <inputFile> [--noeof] [--so_flag|--so_coord|--so_query] [--maxErrors <numErrors>] [--verbose] [--printableErrors <numReportedErrors>] [--disableStatistics] [--params] | | ./bam validate --in <inputFile> [--noeof] [--so_flag|--so_coord|--so_query] [--maxErrors <numErrors>] [--verbose] [--printableErrors <numReportedErrors>] [--disableStatistics] [--params] |
| | | |
− | ==== Recommended Usage ====
| + | == Recommended Usage == |
| If you don't want the file statistics, use --disableStatistics. | | If you don't want the file statistics, use --disableStatistics. |
| | | |
Line 64: |
Line 44: |
| ./bam validate --in <inputFile> --verbose | | ./bam validate --in <inputFile> --verbose |
| | | |
− | === Output === | + | = Parameters = |
| + | <pre> |
| + | Required Parameters: |
| + | --in : the SAM/BAM file to be validated |
| + | Optional Parameters: |
| + | --noeof : do not expect an EOF block on a bam file. |
| + | --refFile : the reference file |
| + | --so_flag : validate the file is sorted based on the header's @HD SO flag. |
| + | --so_coord : validate the file is sorted based on the coordinate. |
| + | --so_query : validate the file is sorted based on the query name. |
| + | --maxErrors : Number of records with errors/invalids to allow before quiting. |
| + | -1 (default) indicates to not quit until the entire file is validated. |
| + | 0 indicates not to read/validate anything. |
| + | --verbose : Print specific error details rather than just a summary |
| + | --printableErrors : Maximum number of records with errors to print the details of |
| + | before suppressing them when in verbose (defaults to 100) |
| + | --disableStatistics : Turn off statistic generation |
| + | --params : Print the parameter settings |
| + | </pre> |
| + | {{PhoneHomeParamDesc}} |
| + | |
| + | == Required Parameters == |
| + | {{inBAMInputFile|hdr======}} |
| + | |
| + | == Optional Parameters == |
| + | {{noeofBGZFParameter}} |
| + | {{refFile}} |
| + | |
| + | === Validate Sort Order (<code>--so_flag</code>, <code>--so_coord</code>,<code>--so_query</code>)=== |
| + | Validate the sort order of the file: |
| + | * <code>--so_flag</code> - based on the flag in the header |
| + | * <code>--so_coord</code> - based on the coordinates/positions |
| + | * <code>--so_query</code> - based on the query/read names |
| + | |
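The sort-order options above can be wrapped in a small shell helper. This is a minimal sketch, not part of bamUtil: the function name <code>validate_sorted</code> and the <code>BAM_BIN</code> variable are hypothetical, and it assumes only one sort check is passed per run, as the usage line suggests.

```shell
# Hypothetical helper (not part of bamUtil): validate a file with one sort check.
# $1 = SAM/BAM file, $2 = sort check: flag, coord, or query.
# BAM_BIN is an assumed variable naming the executable; it defaults to ./bam.
validate_sorted() {
    case $2 in
        flag|coord|query) ;;   # maps to --so_flag, --so_coord, --so_query
        *) echo "unknown sort check: $2" >&2; return 2 ;;
    esac
    "${BAM_BIN:-./bam}" validate --in "$1" "--so_$2"
}
```

For example, <code>validate_sorted input.bam coord</code> would run <code>./bam validate --in input.bam --so_coord</code>.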
| + | === Maximum Number of Errors (<code>--maxErrors</code>)=== |
| + | Use <code>--maxErrors</code> followed by a number to specify the maximum number of records with errors/invalids to process before quitting. |
| + | |
| + | -1 (default) indicates to not quit until the entire file is validated. |
| + | |
| + | 0 indicates not to read/validate anything. |
| + | |
| + | === Print Specific Errors (<code>--verbose</code>)=== |
| + | Use <code>--verbose</code> to print specific error details rather than just a summary. |
| + | |
| + | === Maximum Number of Record Error Details to Print (<code>--printableErrors</code>)=== |
| + | Use <code>--printableErrors</code> followed by a number to specify the maximum number of records with errors to print the details of before suppressing them. This parameter is only valid when [[#Print Specific Errors (--verbose)|<code>--verbose</code>]] is also specified. |
| + | |
| + | The default is 100. |
| + | |
| + | === Disable Statistic Generation (<code>--disableStatistics</code>)=== |
| + | Use <code>--disableStatistics</code> to turn off statistic generation (statistics are generated by default). |
| + | |
| + | {{paramsParameter}} |
| + | |
| + | {{PhoneHomeParameters}} |
| + | |
| + | = Output = |
| The error details (--verbose) and the statistics are printed to stderr. To save them to a file, redirect stderr. | | The error details (--verbose) and the statistics are printed to stderr. To save them to a file, redirect stderr. |
| | | |
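One way to capture the stderr output in a file is sketched below. The helper name <code>validate_to_log</code> and the <code>BAM_BIN</code> variable are hypothetical, introduced only for illustration; the redirection itself is standard POSIX shell.

```shell
# Hypothetical helper (not part of bamUtil): run the validator in verbose mode
# and write everything it prints to stderr into a log file.
# $1 = SAM/BAM file, $2 = log file.
# BAM_BIN is an assumed variable naming the executable; it defaults to ./bam.
validate_to_log() {
    input_file=$1
    log_file=$2
    # "2>" redirects stderr only; anything written to stdout is left untouched.
    "${BAM_BIN:-./bam}" validate --in "$input_file" --verbose 2> "$log_file"
}
```

For example, <code>validate_to_log input.bam validate.log</code> would leave the error details and statistics in <code>validate.log</code>.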
Line 71: |
Line 107: |
| | | |
| | | |
− | === Return Value ===
| + | = Return Value = |
| * 0: all records are successfully read, are valid, and are properly sorted. | | * 0: all records are successfully read, are valid, and are properly sorted. |
| * non-0: at least one record was not successfully read, not valid, or not properly sorted. | | * non-0: at least one record was not successfully read, not valid, or not properly sorted. |
| | | |
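Because the return value separates a clean file from a problematic one, a script can branch on it. The following is a minimal sketch under stated assumptions: the function name <code>check_bam</code> and the <code>BAM_BIN</code> variable are hypothetical, and all validator output is discarded so that only the exit status is used.

```shell
# Hypothetical helper (not part of bamUtil): report pass/fail from the exit status.
# $1 = SAM/BAM file.
# BAM_BIN is an assumed variable naming the executable; it defaults to ./bam.
check_bam() {
    # Discard stdout and stderr; only the return value matters here.
    if "${BAM_BIN:-./bam}" validate --in "$1" --disableStatistics > /dev/null 2>&1; then
        echo "valid: $1"
    else
        # Non-0: a record was not read successfully, not valid, or not properly sorted.
        echo "invalid: $1"
        return 1
    fi
}
```

For example, <code>check_bam input.bam || exit 1</code> would stop a pipeline when validation fails.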
− | === Example Outputs ===
| + | = Example Outputs = |
| | | |
− | ==== Valid File ====
| + | == Valid File == |
| <pre> | | <pre> |
| ./bam validate --in ~/data/bamExample/37mer_alt.bwa.bam | | ./bam validate --in ~/data/bamExample/37mer_alt.bwa.bam |
Line 102: |
Line 138: |
| </pre> | | </pre> |
| | | |
− | ==== Invalid File ====
| + | == Invalid File == |
| <pre> | | <pre> |
| ./bam validate --in test/testFiles/testInvalid.sam | | ./bam validate --in test/testFiles/testInvalid.sam |
Line 136: |
Line 172: |
| </pre> | | </pre> |
| | | |
− | ==== Invalid File with Verbose ====
| + | == Invalid File with Verbose == |
| <code>--printableErrors</code> is specified to produce a smaller example; printing all the errors would take up more space. | | <code>--printableErrors</code> is specified to produce a smaller example; printing all the errors would take up more space. |
| | | |