Changes

From Genome Analysis Wiki
1,418 bytes added, 14:05, 6 January 2014
no edit summary
Line 1: Line 1:  +
[[Category:BamUtil|validate]]
 +
[[Category:BAM Software]]
 
[[Category:Software]]
 
[[Category:Software]]
[[Category:BAM Software]]
  −
[[Category:BamUtil | validate]]
     −
== Status  ==
+
= Status  =
    
The initial version of a SAM/BAM Validator is complete, but does not yet validate all fields or produce all desired statistics.  Future releases will add more validation and more statistics.
   −
== Download ==
+
= Download =
http://genome.sph.umich.edu/wiki/Software#Download
+
http://genome.sph.umich.edu/wiki/BamUtil
The BAM Validator is found in stagen/src/bam and is called bam (statgen/src/bin/bam).  
+
After compiling, the BAM Validator is found in bamUtil/bin/bam and is the "validate" subprogram (bamUtil/bin/bam validate).  
   −
== Purpose ==
+
= Purpose =
    
The BamValidator processes the specified SAM/BAM file:
Line 20: Line 20:       −
=== Valid SAM/BAM File Requirements ===
+
== Valid SAM/BAM File Requirements ==
    
A valid SAM/BAM file meets the validation criteria specified in [[SAM Validation Criteria]].
   −
=== Statistic Generation ===
+
== Statistic Generation ==
    
Statistics are generated by the BAM Validator unless the <code>--disableStatistics</code> option is set.  A description of the generated statistics is found at: [[C++ Class: SamFile#Statistic Generation|Sam File Statistics]]
   −
== How to Use the Bam Validator Executable ==
+
= Usage =
=== Parameters ===
  −
<pre>
  −
    Required Parameters:
  −
        --in : the SAM/BAM file to be validated
  −
    Optional Parameters:
  −
        --noeof            : do not expect an EOF block on a bam file.
  −
        --so_flag          : validate the file is sorted based on the header's @HD SO flag.
  −
        --so_coord          : validate the file is sorted based on the coordinate.
  −
        --so_query          : validate the file is sorted based on the query name.
  −
        --maxErrors        : Number of records with errors/invalids to allow before quiting.
  −
                              -1 (default) indicates to not quit until the entire file is validated.
  −
                              0 indicates not to read/validate anything.
  −
        --verbose          : Print specific error details rather than just a summary
  −
        --printableErrors  : Maximum number of records with errors to print the details of
  −
                              before suppressing them when in verbose (defaults to 100)
  −
        --disableStatistics : Turn off statistic generation
  −
        --params            : Print the parameter settings
  −
</pre>
  −
 
  −
=== Usage ===
      
  ./bam validate --in <inputFile> [--noeof] [--so_flag|--so_coord|--so_query] [--maxErrors <numErrors>] [--verbose] [--printableErrors <numReportedErrors>] [--disableStatistics] [--params]
   −
==== Recommended Usage ====
+
== Recommended Usage ==
 
If you don't want the file statistics, use --disableStatistics.
   Line 64: Line 44:  
  ./bam validate --in <inputFile> --verbose
   −
=== Output ===
+
= Parameters =
 +
<pre>
 +
Required Parameters:
 +
--in : the SAM/BAM file to be validated
 +
Optional Parameters:
 +
--noeof            : do not expect an EOF block on a bam file.
 +
--refFile          : the reference file
 +
--so_flag          : validate the file is sorted based on the header's @HD SO flag.
 +
--so_coord          : validate the file is sorted based on the coordinate.
 +
--so_query          : validate the file is sorted based on the query name.
 +
--maxErrors        : Number of records with errors/invalids to allow before quiting.
 +
                      -1 (default) indicates to not quit until the entire file is validated.
 +
                      0 indicates not to read/validate anything.
 +
--verbose          : Print specific error details rather than just a summary
 +
--printableErrors  : Maximum number of records with errors to print the details of
 +
                      before suppressing them when in verbose (defaults to 100)
 +
--disableStatistics : Turn off statistic generation
 +
--params            : Print the parameter settings
 +
</pre>
 +
{{PhoneHomeParamDesc}}
 +
 
 +
== Required Parameters ==
 +
{{inBAMInputFile|hdr======}}
 +
 
 +
== Optional Parameters ==
 +
{{noeofBGZFParameter}}
 +
{{refFile}}
 +
 
 +
=== Validate Sort Order (<code>--so_flag</code>, <code>--so_coord</code>,<code>--so_query</code>)===
 +
Validate the sort order of the file:
 +
* <code>--so_flag</code> - based on the flag in the header
 +
* <code>--so_coord</code> - based on the coordinates/positions
 +
* <code>--so_query</code> - based on the query/read names
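The sort-order checks listed above could be wrapped in a small shell helper. This is only a sketch: the function name <code>validate_sorted</code> and the <code>BAM_BIN</code> variable are illustrative assumptions and are not part of bamUtil; only the <code>validate</code> subprogram and the <code>--in</code> and <code>--so_*</code> options come from this page.

```shell
#!/bin/sh
# Path to the bam executable; BAM_BIN is an assumed, overridable variable.
BAM_BIN="${BAM_BIN:-./bam}"

# validate_sorted <inputFile> <flag|coord|query>   (hypothetical helper)
# Runs the validator with exactly one of --so_flag, --so_coord, --so_query.
validate_sorted() {
    case "$2" in
        flag|coord|query) ;;                       # accepted sort checks
        *) echo "unknown sort check: $2" >&2
           return 2 ;;
    esac
    # Builds --so_flag, --so_coord or --so_query from the second argument.
    "$BAM_BIN" validate --in "$1" "--so_$2"
}
```

For instance, <code>validate_sorted input.bam coord</code> would run <code>./bam validate --in input.bam --so_coord</code>.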
 +
 
 +
=== Maximum Number of Errors (<code>--maxErrors</code>)===
 +
Use <code>--maxErrors</code> followed by a number to specify the maximum number of records with errors/invalids to process before quitting.
 +
 
 +
-1 (default) indicates to not quit until the entire file is validated.
 +
 
 +
0 indicates not to read/validate anything.
 +
 
 +
=== Print Specific Errors (<code>--verbose</code>)===
 +
Use <code>--verbose</code> to print specific error details rather than just a summary.
 +
 
 +
=== Maximum Number of Record Error Details to Print (<code>--printableErrors</code>)===
 +
Use <code>--printableErrors</code> followed by a number to specify the maximum number of records with errors whose details are printed before further details are suppressed.  This parameter is only valid when [[#Print Specific Errors (--verbose)|<code>--verbose</code>]] is also specified.
 +
 
 +
The default is 100.
 +
 
 +
=== Disable Statistic Generation (<code>--disableStatistics</code>)===
 +
Use <code>--disableStatistics</code> to turn off statistic generation (statistics are generated by default).
 +
 
 +
{{paramsParameter}}
 +
 
 +
{{PhoneHomeParameters}}
 +
 
 +
= Output =
 
The error details (--verbose) and the statistics are printed to stderr.  To save them to a file, redirect stderr.
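A minimal sketch of redirecting stderr to a file is given here. The helper name <code>validate_to_log</code> and the <code>BAM_BIN</code> variable are assumptions made for illustration; the <code>2&gt;</code> redirection is standard shell, and the options used are the ones documented on this page.

```shell
#!/bin/sh
# Path to the bam executable; BAM_BIN is an assumed, overridable variable.
BAM_BIN="${BAM_BIN:-./bam}"

# validate_to_log <inputFile> <logFile>   (hypothetical helper)
# Runs the validator in verbose mode and sends stderr, which carries the
# error details and the statistics, to <logFile>.
# The function returns the validator's own exit status.
validate_to_log() {
    "$BAM_BIN" validate --in "$1" --verbose 2> "$2"
}
```

For instance, <code>validate_to_log input.bam validate.log</code> would leave the report in <code>validate.log</code> instead of on the terminal.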
   Line 71: Line 107:       −
=== Return Value ===
+
= Return Value =
 
*    0: all records are successfully read, are valid, and are properly sorted.
* non-0: at least one record was not successfully read, not valid, or not properly sorted.
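Because the result is reported through the return value, a script can branch on it. The sketch below assumes a hypothetical helper <code>check_bam</code> and an assumed <code>BAM_BIN</code> variable; it relies only on the 0 / non-0 convention described above and does not assume any particular non-0 value.

```shell
#!/bin/sh
# Path to the bam executable; BAM_BIN is an assumed, overridable variable.
BAM_BIN="${BAM_BIN:-./bam}"

# check_bam <inputFile>   (hypothetical helper)
# Prints PASS when the validator returns 0 and FAIL otherwise,
# and passes a non-0 status back to the caller.
check_bam() {
    if "$BAM_BIN" validate --in "$1" --disableStatistics; then
        echo "PASS: $1"
    else
        status=$?    # exit status of the validator command in the condition
        echo "FAIL: $1 (validator exit status $status)"
        return "$status"
    fi
}
```

A pipeline step could then be written as <code>check_bam input.bam || exit 1</code>.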
   −
=== Example Outputs ===
+
= Example Outputs =
   −
==== Valid File ====
+
== Valid File ==
 
<pre>
./bam validate --in ~/data/bamExample/37mer_alt.bwa.bam
Line 102: Line 138:
</pre>
   −
==== Invalid File ====
+
== Invalid File ==
 
<pre>
./bam validate --in test/testFiles/testInvalid.sam
Line 136: Line 172:
</pre>
   −
==== Invalid File with Verbose ====  
+
== Invalid File with Verbose ==  
 
<code>--printableErrors</code> is specified so that the example is shorter and does not print all the errors.
  
