Difference between revisions of "BamUtil"
Revision as of 12:43, 21 May 2010
bam Executable
When the pipeline is compiled, the SAM/BAM executable, "bam", is generated in the pipeline/bam/ directory.
The software reads the beginning of an input file to determine whether it is SAM or BAM. To determine the format of the output file, the software checks the output file's extension: if the extension is ".bam", it writes a BAM file; otherwise it writes a SAM file.
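The two format rules above can be sketched in a few lines of Python. This is only an illustration, not the bam executable's actual logic: the function names are hypothetical, the input check assumes that a BAM file can be recognised by the gzip magic bytes that begin every BGZF file, and the ".ubam" case is taken from the usage line further down this page rather than from the rule stated here.

```python
import os

# BGZF files (and therefore BAM files) begin with the gzip magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def guess_input_format(path):
    """Guess SAM vs. BAM from the first two bytes (an assumed heuristic)."""
    with open(path, "rb") as handle:
        return "BAM" if handle.read(2) == GZIP_MAGIC else "SAM"


def output_format_from_name(filename):
    """Pick the output format from the file extension.

    ".bam" means BAM and anything else means SAM, as described above.
    ".ubam" (uncompressed BAM) comes from the usage line on this page.
    """
    extension = os.path.splitext(filename)[1]
    if extension == ".bam":
        return "BAM"
    if extension == ".ubam":
        return "UBAM"
    return "SAM"
```

For example, output_format_from_name("sample.bam") gives "BAM", while any other extension except ".ubam" falls through to "SAM".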
The bam executable has the following functions.
- Read and Validate a SAM/BAM file
- Read a SAM/BAM file and write as a SAM/BAM file
- Print SAM/BAM header
- Dump a BAM index file into an easy to read text version
- Read an indexed BAM file one reference at a time, for reference ids -1 to 22, and write it out as a SAM/BAM file
This executable is built using the bam library.
Running ./bam with no arguments prints the usage information for the bam executable.
Read and Validate a SAM/BAM file
The validate option on the bam executable validates a SAM/BAM file.
The validation checks that the file is sorted as specified in the user options. Default is unsorted, in which case, no order validation is done.
SAM fields are validated against: SAM Validation Criteria
NOTE: Only minimal validation is currently done.
Parameters
    Required Parameters:
        --in                : the SAM/BAM file to be validated
    Optional Parameters:
        --noeof             : do not expect an EOF block on a bam file.
        --so_flag           : validate the file is sorted based on the header's @HD SO flag.
        --so_coord          : validate the file is sorted based on the coordinate.
        --so_query          : validate the file is sorted based on the query name.
        --quitAfterErrorNum : Number of records with errors/invalids to allow before quitting.
                              -1 (default) indicates to not quit until the entire file is validated.
                              0 indicates not to read/validate anything.
        --maxReportedErrors : Maximum number of errors to print (defaults to 100)
Usage
./bam validate --in <inputFile> [--noeof] [--so_flag|--so_coord|--so_query] [--quitAfterErrorNum <numErrors>] [--maxReportedErrors <numReportedErrors>]
Return Value
- 0: all records are successfully read, are valid, and are properly sorted.
- non-0: at least one record was not successfully read, not valid, or not properly sorted.
Example Output
    The following parameters are in effect:
    Input Parameters
        --in [t.sam], --noeof, --quitAfterErrorNum [-1], --maxReportedErrors [100]
        SortOrder : --so_flag, --so_coord, --so_query
    Record 1
    FAIL_PARSE: Too few columns in the Record
    Record 2
    FAIL_PARSE: Too few columns in the Record
    Number of records read = 2
    Number of valid records = 0
    Returning: 5 (FAIL_PARSE)
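When validation is run from a script, it can help to assemble the command line programmatically and act on the return value. The Python sketch below is an illustration only: the helper names and the default "./bam" path are assumptions, and it uses only the flags listed in the Parameters section above.

```python
import subprocess

# Maps a short name to the sort-order flags listed under Parameters.
SORT_FLAGS = {"flag": "--so_flag", "coord": "--so_coord", "query": "--so_query"}


def build_validate_command(input_file, noeof=False, sort_order=None,
                           quit_after_error_num=None, max_reported_errors=None,
                           executable="./bam"):
    """Build the argument list for './bam validate' (hypothetical helper)."""
    command = [executable, "validate", "--in", input_file]
    if noeof:
        command.append("--noeof")
    if sort_order is not None:
        command.append(SORT_FLAGS[sort_order])
    if quit_after_error_num is not None:
        command += ["--quitAfterErrorNum", str(quit_after_error_num)]
    if max_reported_errors is not None:
        command += ["--maxReportedErrors", str(max_reported_errors)]
    return command


def is_valid(input_file, **options):
    """Run the validator; a return value of 0 means every record passed."""
    command = build_validate_command(input_file, **options)
    return subprocess.run(command).returncode == 0
```

Calling build_validate_command("t.sam", noeof=True, sort_order="coord") produces the list ["./bam", "validate", "--in", "t.sam", "--noeof", "--so_coord"], which can be inspected before anything is executed.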
Read a SAM/BAM file and write as a SAM/BAM file
This function takes two or three arguments. The first argument is the input file and the second is the output file. The executable converts the first file into the format of the second file. For example, to convert a BAM file to a SAM file, call the following from the pipeline/bam/ directory:
./bam <bamFile>.bam <newSamFile>.sam
Remember to include the paths to the executable and to your test files.
The third argument, NOEOF, specifies that the End-Of-File block should not be checked for when opening the file.
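To decide whether NOEOF is needed for a particular file, the End-Of-File block can be looked for directly. The sketch below assumes the file uses the standard 28-byte empty BGZF block that the SAM/BAM specification defines as the end-of-file marker. The function name is hypothetical, and this is not necessarily how the bam executable performs its own check.

```python
import os

# The 28-byte empty BGZF block that marks the end of a BAM file.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600" "4243" "0200"
    "1b00" "0300" "00000000" "00000000"
)


def has_eof_block(path):
    """Return True if the file ends with the BGZF end-of-file marker."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Read only the final 28 bytes of the file.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

A BAM file for which has_eof_block returns False is a candidate for the NOEOF argument (or --noeof when validating), although a missing marker can also mean the file was truncated.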
Usage
./bam <inputFile> <outputFile.sam/bam/ubam (ubam is uncompressed bam)> [NOEOF]
Return Value
Returns the SamStatus for the reads/writes.
Example Output
    Number of records read = 10
    Number of records written = 10
Print SAM/BAM header
The dump_header option on the bam executable prints the header of the specified SAM/BAM file to cout.
Parameters
    Required Parameters:
        filename : the sam/bam filename whose header should be printed.
Usage
./bam dump_header <inputFile>
Return Value
- 0: the header was successfully read and printed.
- non-0: the header was not successfully read or was not printed. (Returns the SamStatus.)
Example Output
    @SQ SN:1 LN:247249719
    @SQ SN:2 LN:242951149
    @SQ SN:3 LN:199501827
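Header output like the example above is easy to post-process. The following Python sketch collects the reference names and lengths from the @SQ lines. It assumes the fields are tab-separated, as the SAM specification requires (the example on this page may show them with spaces), and the function names are hypothetical.

```python
def parse_header_line(line):
    """Split one tab-separated header line into its type and TAG:VALUE pairs."""
    fields = line.rstrip("\n").split("\t")
    record_type = fields[0].lstrip("@")
    tags = dict(field.split(":", 1) for field in fields[1:])
    return record_type, tags


def reference_lengths(header_text):
    """Map each reference name (SN) to its length (LN) from the @SQ lines."""
    lengths = {}
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            _, tags = parse_header_line(line)
            lengths[tags["SN"]] = int(tags["LN"])
    return lengths
```

Only @SQ lines are parsed here, so free-text @CO comment lines in a real header are skipped rather than split into tags.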
Dump a BAM index file
Usage
./bam dump_index <bamIndexFile>
Return Value
- -1 if the bam index file could not be opened.
- 0 if the bam index file could be opened.
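One practical note when checking this return value from a script: on POSIX systems a process exit status is reported as an unsigned 8-bit number, so a return value of -1 is seen by the caller as 255. The helper below is a small Python sketch of the conversion. Its name is hypothetical, and it assumes the status was obtained from a normally exited process, for example via subprocess.run(...).returncode.

```python
def to_signed_status(returncode):
    """Map an 8-bit exit status (0..255) back to a signed value (-128..127)."""
    return returncode - 256 if returncode > 127 else returncode
```

With this helper, a status of 255 reported for ./bam dump_index maps back to -1, meaning the index file could not be opened, while 0 stays 0.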
Read & Write indexed BAM file
Usage
./bam read_indexed_bam <inputFilename> <outputFile.sam/bam> <bamIndexFile>
Return Value
- 0