BamUtil: convert
Overview of the convert function of bamUtil
The convert option of the bamUtil executable reads a SAM/BAM file and writes it out as a SAM/BAM file, converting the input into the format of the output file.
When a reference sequence is supplied, it can also convert the read sequence between the actual bases and '=', which marks a base that matches the reference.
If you want to convert a BAM file to a SAM file, just call:
<pathToExe>/bam convert --in <bamFile>.bam --out <newSamFile>.sam
Don't forget to put in the paths to the executable and your test files.
Parameters
Required Parameters:
    --in        : the SAM/BAM file to be read
    --out       : the SAM/BAM file to be written
Optional Parameters:
    --refFile   : reference file name
    --noeof     : do not expect an EOF block on a bam file.
    --params    : print the parameter settings
    --recover   : attempt to recover the input bam file.
Optional Sequence Parameters (only specify one):
    --seqOrig   : Leave the sequence as is (default & used if reference is not specified).
    --seqBases  : Convert any '=' in the sequence to the appropriate base using the reference (requires --ref).
    --seqEquals : Convert any bases that match the reference to '=' (requires --ref).
Sequence Representation Parameters
The sequence parameters specify how to represent the sequence when a reference is specified (refFile option). If the reference is not specified, or seqOrig is specified, no modifications are made to the sequence. If the reference and seqBases are specified, any match between the sequence and the reference is written as the actual base. If the reference and seqEquals are specified, any match between the sequence and the reference is written as '='.
Examples
ExtendedCigar: SSMMMDDMMMIMNNNMPMSSS
Sequence:      AATAA  CTAGA   T AGGG
Reference:       TAACCCTA ACCCT A
Sequence with Orig:   AATAACTAGATAGGG
Sequence with Bases:  AATAACTAGATAGGG
Sequence with Equals: AA======G===GGG

ExtendedCigar: SSMMMDDMMMIMNNNMPMSSS
Sequence:      AATGA  CTGGA   T AGGG
Reference:       TAACCCTA ACCCT A
Sequence with Orig:   AATGACTGGATAGGG
Sequence with Bases:  AATGACTGGATAGGG
Sequence with Equals: AA=G===GG===GGG

ExtendedCigar: SSMMMDDMMMIMNNNMPMSSS
Sequence:      AAT=A  CT=GA   T AGGG
Reference:       TAACCCTA ACCCT A
Sequence with Orig:   AAT=ACT=GATAGGG
Sequence with Bases:  AATAACTAGATAGGG
Sequence with Equals: AA======G===GGG

ExtendedCigar: SSMMMDDMMMIMNNNMPMSSS
Sequence:      AA===  ===G=   = =GGG
Reference:       TAACCCTA ACCCT A
Sequence with Orig:   AA======G===GGG
Sequence with Bases:  AATAACTAGATAGGG
Sequence with Equals: AA======G===GGG
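The conversion rules illustrated by these examples can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not bamUtil's own implementation: the function name `convert_sequence`, the mode names `orig`/`bases`/`equals`, and the exact-character comparison are assumptions made for the sketch. It walks the expanded CIGAR one operation at a time, rewriting only aligned (M) positions and copying soft-clipped and inserted bases unchanged, as in the examples.

```python
def convert_sequence(expanded_cigar, sequence, reference, mode):
    """Rewrite `sequence` according to `mode` ('orig', 'bases' or 'equals').

    expanded_cigar: one operation character per position, e.g. "SSMMMDD..."
    sequence:       the read bases, without gaps
    reference:      the reference bases spanned by the alignment, without gaps
    (Hypothetical helper for illustration; not part of bamUtil.)
    """
    if mode not in ("orig", "bases", "equals"):
        raise ValueError("mode must be 'orig', 'bases' or 'equals'")
    out = []
    seq_i = 0  # next unread position in the read
    ref_i = 0  # next unread position in the reference
    for op in expanded_cigar:
        if op == "M":
            base = sequence[seq_i]
            ref_base = reference[ref_i]
            # Assumption: a plain character comparison decides a match.
            matches = base == "=" or base == ref_base
            if mode == "equals" and matches:
                out.append("=")
            elif mode == "bases" and base == "=":
                out.append(ref_base)
            else:
                out.append(base)
            seq_i += 1
            ref_i += 1
        elif op in ("S", "I"):
            # Soft clips and insertions have no reference base: copy as is.
            out.append(sequence[seq_i])
            seq_i += 1
        elif op in ("D", "N"):
            # Deletions and skipped regions consume reference only.
            ref_i += 1
        # Other operations (e.g. P) consume neither read nor reference.
    return "".join(out)


cigar = "SSMMMDDMMMIMNNNMPMSSS"
reference = "TAACCCTAACCCTA"  # the Reference row above with the gaps removed

print(convert_sequence(cigar, "AATAACTAGATAGGG", reference, "equals"))
# prints AA======G===GGG
print(convert_sequence(cigar, "AA======G===GGG", reference, "bases"))
# prints AATAACTAGATAGGG
```

The two calls reproduce the "Sequence with Equals" row of the first example and the "Sequence with Bases" row of the last one.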
Usage
./bam convert --in <inputFile> --out <outputFile.sam/bam/ubam (ubam is uncompressed bam)> [--refFile <reference filename>] [--seqBases|--seqEquals|--seqOrig] [--noeof] [--params]
Return Value
Returns the SamStatus for the reads/writes.
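When convert is driven from a script, the command line from the Usage section can be assembled and run with Python's standard `subprocess` module. The sketch below is a hedged example: the helper names `build_convert_command` and `run_convert` are hypothetical, the executable is assumed to be reachable as `bam`, and the process exit code is simply handed back to the caller, since how it maps to individual SamStatus values is not described here.

```python
import subprocess


def build_convert_command(in_file, out_file, ref_file=None, seq_mode=None,
                          bam_exe="bam"):
    """Build the argument list for `bam convert` (hypothetical helper).

    seq_mode is one of None, 'seqOrig', 'seqBases' or 'seqEquals'.
    """
    if seq_mode not in (None, "seqOrig", "seqBases", "seqEquals"):
        raise ValueError("unknown sequence option: %r" % (seq_mode,))
    if seq_mode in ("seqBases", "seqEquals") and ref_file is None:
        # The parameter list states that these two options need a reference.
        raise ValueError("--%s requires a reference file" % seq_mode)
    cmd = [bam_exe, "convert", "--in", in_file, "--out", out_file]
    if ref_file is not None:
        cmd += ["--refFile", ref_file]
    if seq_mode is not None:
        cmd.append("--" + seq_mode)
    return cmd


def run_convert(in_file, out_file, **options):
    """Run the conversion; return (exit code, captured stdout + stderr)."""
    cmd = build_convert_command(in_file, out_file, **options)
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return completed.returncode, completed.stdout + completed.stderr


# Show the command that would be run for a plain BAM to SAM conversion.
print(" ".join(build_convert_command("sample.bam", "sample.sam")))
# prints bam convert --in sample.bam --out sample.sam
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.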
Example Output
Number of records read = 10
Number of records written = 10
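A script that captures this summary may want the two counts as numbers, for example to confirm that every record read was also written. A minimal sketch, assuming the summary text has already been captured into a string and keeps the wording shown above (the function name `parse_record_counts` is hypothetical):

```python
import re


def parse_record_counts(text):
    """Return (records_read, records_written) parsed from the summary text."""
    read = re.search(r"Number of records read = (\d+)", text)
    written = re.search(r"Number of records written = (\d+)", text)
    if read is None or written is None:
        raise ValueError("record counts not found in output")
    return int(read.group(1)), int(written.group(1))


summary = "Number of records read = 10\nNumber of records written = 10\n"
records_read, records_written = parse_record_counts(summary)
print(records_read, records_written)
# prints 10 10
```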