Changes

From Genome Analysis Wiki

BamUtil: validate

1,333 bytes added, 13:05, 6 January 2014
[[Category:BamUtil|validate]][[Category:BAM Software]][[Category:Software|BamValidator]]
== Status ==
The initial version of a SAM/BAM Validator is complete, but does not yet validate all fields or produce all desired statistics. Future releases will add more validation and more statistics.
== Download ==
Download the source code from the BamUtil page: http://genome.sph.umich.edu/wiki/BamUtil

If you use this software, please e-mail me, Mary Kate Trost, at mktrost@umich.edu.

The software is recommended for Unix users with access to the GNU C++ compiler. To install, unpack the downloaded file (<code>tar xvf</code>) and type <code>make</code>. After compiling, the BAM Validator is the "validate" subprogram of the <code>bam</code> executable found in bamUtil/bin/bam (<code>bamUtil/bin/bam validate</code>).

== Purpose ==
The BamValidator processes the specified SAM/BAM file:
=== Valid SAM/BAM File Requirements ===
A valid SAM/BAM file meets the validation criteria specified in [[SAM Validation Criteria]].
=== Statistic Generation ===
Statistics are generated by the BAM Validator unless the <code>--disableStatistics</code> option is set. A description of the generated statistics is found at: [[C++ Class: SamFile#Statistic Generation|Sam File Statistics]]
== How to Use the Bam Validator Executable ==
=== Usage ===
./bam validate --in <inputFile> [--noeof] [--so_flag|--so_coord|--so_query] [--maxErrors <numErrors>] [--verbose] [--printableErrors <numReportedErrors>] [--disableStatistics] [--params]
==== Recommended Usage ====
If you don't want the file statistics, use --disableStatistics.
./bam validate --in <inputFile> --verbose
= Parameters =
<pre>
    Required Parameters:
        --in                : the SAM/BAM file to be validated
    Optional Parameters:
        --noeof             : do not expect an EOF block on a bam file.
        --refFile           : the reference file
        --so_flag           : validate the file is sorted based on the header's @HD SO flag.
        --so_coord          : validate the file is sorted based on the coordinate.
        --so_query          : validate the file is sorted based on the query name.
        --maxErrors         : Number of records with errors/invalids to allow before quiting.
                              -1 (default) indicates to not quit until the entire file is validated.
                              0 indicates not to read/validate anything.
        --verbose           : Print specific error details rather than just a summary
        --printableErrors   : Maximum number of records with errors to print the details of
                              before suppressing them when in verbose (defaults to 100)
        --disableStatistics : Turn off statistic generation
        --params            : Print the parameter settings
</pre>
{{PhoneHomeParamDesc}}
 
== Required Parameters ==
{{inBAMInputFile|hdr======}}
 
== Optional Parameters ==
{{noeofBGZFParameter}}
{{refFile}}
 
=== Validate Sort Order (<code>--so_flag</code>, <code>--so_coord</code>, <code>--so_query</code>)===
Validate the sort order of the file:
* <code>--so_flag</code> - based on the flag in the header
* <code>--so_coord</code> - based on the coordinates/positions
* <code>--so_query</code> - based on the query/read names
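The three sort-order options above can be selected from a script. The following is a minimal sketch, not part of bamUtil: the helper name <code>sort_flag</code> and the labels <code>header</code>, <code>coordinate</code>, and <code>queryname</code> are hypothetical conveniences, and only the three flags themselves come from this page.

```shell
# Hypothetical helper: map a human-readable sort check to the validator flag.
# Only --so_flag, --so_coord and --so_query are taken from this page.
sort_flag() {
    case "$1" in
        header)     echo "--so_flag" ;;   # trust the @HD SO flag in the header
        coordinate) echo "--so_coord" ;;  # check coordinate/position order
        queryname)  echo "--so_query" ;;  # check query/read name order
        *)          echo "unknown sort check: $1" >&2; return 1 ;;
    esac
}
```

It could then be used as <code>./bam validate --in <inputFile> "$(sort_flag coordinate)"</code>.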
 
=== Maximum Number of Errors (<code>--maxErrors</code>)===
Use <code>--maxErrors</code> followed by a number to specify the maximum number of records with errors/invalids to process before quitting.
* -1 (default): do not quit until the entire file is validated.
* 0: do not read or validate anything.
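The special values of <code>--maxErrors</code> are easy to mix up, so a script can check a value before passing it on. This is a minimal sketch under the semantics described above; <code>describe_max_errors</code> is a hypothetical helper and is not part of bamUtil.

```shell
# Hypothetical helper: explain what a --maxErrors value means,
# following the description on this page. Rejects non-integer input.
describe_max_errors() {
    case "$1" in
        -1) echo "validate the entire file" ;;
        0)  echo "read and validate nothing" ;;
        *[!0-9]*|"")
            # Anything else containing a non-digit (or empty) is rejected.
            echo "not a valid --maxErrors value: $1" >&2
            return 1 ;;
        *)  echo "allow up to $1 records with errors before quitting" ;;
    esac
}
```

For example, <code>describe_max_errors 0</code> is a reminder that <code>--maxErrors 0</code> reads nothing at all, which is rarely what is wanted.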
 
=== Print Specific Errors (<code>--verbose</code>)===
Use <code>--verbose</code> to print specific error details rather than just a summary.
=== Maximum Number of Record Error Details to Print (<code>--printableErrors</code>)===
Use <code>--printableErrors</code> followed by a number to specify the maximum number of records with errors to print the details of before suppressing them. This parameter is only valid when [[#Print Specific Errors (--verbose)|<code>--verbose</code>]] is also specified. The default is 100.

=== Disable Statistic Generation (<code>--disableStatistics</code>)===
Use <code>--disableStatistics</code> to turn off statistic generation (statistics are generated by default).

{{paramsParameter}}

{{PhoneHomeParameters}}

= Output =
The error details (<code>--verbose</code>) and the statistics are printed to stderr. To send them to a file, redirect stderr. In a bash shell:
 ./bam validate --in <inputFile> --verbose 2> outputFile.txt

= Return Value =
* 0: all records are successfully read, are valid, and are properly sorted.
* non-0: at least one record was not successfully read, not valid, or not properly sorted.
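The return values above, together with the stderr output, can be used from a script. The following is a minimal sketch, assuming the executable is reachable as <code>./bam</code> (overridable through a hypothetical <code>BAM</code> variable); the function name <code>validate_file</code> and the log file argument are illustrative and not part of bamUtil.

```shell
# Path to the bam executable; ./bam matches the usage shown on this page.
# The BAM variable itself is an assumption made for this sketch.
BAM="${BAM:-./bam}"

# Hypothetical helper: validate one file, keep the stderr report in a log,
# and report success or failure based on the exit status.
validate_file() {
    local input="$1" log="$2"
    # Error details and statistics are written to stderr, so capture stream 2.
    if "$BAM" validate --in "$input" --verbose 2> "$log"; then
        echo "OK: $input"
    else
        local rc=$?   # non-0: a record was unreadable, invalid, or unsorted
        echo "FAILED (exit status $rc): $input, see $log"
        return "$rc"
    fi
}
```

A call such as <code>validate_file sample.bam sample.validate.log</code> prints one summary line and leaves the full report in the log file; the file names here are placeholders.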
=== Example Outputs ===
==== Valid File ====
<pre>
./bam validate --in ~/data/bamExample/37mer_alt.bwa.bam
</pre>
==== Invalid File ====
<pre>
./bam validate --in test/testFiles/testInvalid.sam
</pre>
==== Invalid File with Verbose ====
<code>--printableErrors</code> is specified to keep this example short, since printing every error would take up more space.
<pre>
Returning: 7 (INVALID)
</pre>
 
 
== Libraries ==
*[[C++ Library: libbam|libbam.a]]
*[[C++ Library: libcsg|libcsg.a]]
