Line 8: |
Line 8: |
| | | |
| It has options to convert the sequence between '=' and the actual bases, in either direction, using the reference sequence. | | It has options to convert the sequence between '=' and the actual bases, in either direction, using the reference sequence. |
| + | |
| + | It also has an option to left shift indels in the CIGARs before writing the output file. |
| | | |
| If you want to convert a BAM file to a SAM file, just call: | | If you want to convert a BAM file to a SAM file, just call: |
Line 15: |
Line 17: |
| Don't forget to put in the paths to the executable and your test files. | | Don't forget to put in the paths to the executable and your test files. |
| | | |
− | = Parameters =
| |
− | <pre> Required Parameters:
| |
− | --in : the SAM/BAM file to be read
| |
− | --out : the SAM/BAM file to be written
| |
− | Optional Parameters:
| |
− | --refFile : reference file name
| |
− | --noeof : do not expect an EOF block on a bam file.
| |
− | --params : print the parameter settings
| |
− | --recover : attempt to recover the input bam file.
| |
− | Optional Sequence Parameters (only specify one):
| |
− | --seqOrig : Leave the sequence as is (default & used if reference is not specified).
| |
− | --seqBases : Convert any '=' in the sequence to the appropriate base using the reference (requires --ref).
| |
− | --seqEquals : Convert any bases that match the reference to '=' (requires --ref).
| |
− | </pre>
| |
− | == input File (<code>--in</code>) ==
| |
− |
| |
− | Use <code>--in</code> followed by your file name to specify the SAM/BAM input file.
| |
| | | |
− | The program automatically determines if your input file is SAM/BAM/uncompressed BAM without any input other than a filename from the user, unless your input file is stdin.
| + | = Usage = |
| | | |
− | A <code>-</code> is used to indicate to read from stdin and the extension is used to determine the file type (no extension indicates SAM).
| + | ./bam convert --in <inputFile> --out <outputFile.sam/bam/ubam (ubam is uncompressed bam)> [--refFile <reference filename>] [--useBases|--useEquals|--useOrigSeq] [--lshift] [--noeof] [--params] |
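The usage line above can also be assembled programmatically. The following is a minimal sketch, not part of bamUtil: the helper name <code>build_convert_command</code>, the default executable path <code>./bam</code>, and the file names in the usage note are placeholders, and only flags that appear in the usage line are emitted.

```python
from typing import List, Optional

# Sequence flags taken from the usage line; only one may be given.
SEQ_MODES = ("useOrigSeq", "useBases", "useEquals")


def build_convert_command(
    in_file: str,
    out_file: str,
    ref_file: Optional[str] = None,
    seq_mode: Optional[str] = None,
    lshift: bool = False,
    bam_exe: str = "./bam",  # placeholder path to the executable
) -> List[str]:
    """Build an argument list for 'bam convert' (hypothetical helper)."""
    cmd = [bam_exe, "convert", "--in", in_file, "--out", out_file]
    if ref_file is not None:
        cmd += ["--refFile", ref_file]
    if seq_mode is not None:
        if seq_mode not in SEQ_MODES:
            raise ValueError(f"unknown sequence mode: {seq_mode}")
        # --useBases and --useEquals are documented as requiring --refFile.
        if seq_mode != "useOrigSeq" and ref_file is None:
            raise ValueError(f"--{seq_mode} requires a reference file")
        cmd.append("--" + seq_mode)
    if lshift:
        cmd.append("--lshift")
    return cmd
```

The resulting list can be passed to <code>subprocess.run</code>, for example <code>subprocess.run(build_convert_command("in.bam", "out.sam"))</code>, and the process return code inspected afterwards.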
| | | |
− | {|border="1" cellspacing="0" cellpadding="2"
| |
− | |SAM/BAM/Uncompressed BAM from file
| |
− | | <code>--in yourFileName</code>
| |
− | |-
| |
− | |SAM from stdin
| |
− | | <code>--in -</code>
| |
− | |-
| |
− | |BAM from stdin
| |
− | | <code>--in -.bam</code>
| |
− | |-
| |
− | |Uncompressed BAM from stdin
| |
− | | <code>--in -.ubam</code>
| |
− | |}
| |
| | | |
| + | = Parameters = |
| + | <pre> Required Parameters: |
| + | --in : the SAM/BAM file to be read |
| + | --out : the SAM/BAM file to be written |
| + | Optional Parameters: |
| + | --refFile : reference file name |
| + | --lshift : left shift indels when writing records |
| + | --noeof : do not expect an EOF block on a bam file |
| + | --params : print the parameter settings |
| + | --recover : attempt error recovery while reading a bam file |
| + | Optional Sequence Parameters (only specify one): |
| + | --useOrigSeq : Leave the sequence as is (default & used if reference is not specified) |
| + | --useBases : Convert any '=' in the sequence to the appropriate base using the reference (requires --refFile) |
| + | --useEquals : Convert any bases that match the reference to '=' (requires --refFile) |
| + | </pre> |
| + | {{PhoneHomeParamDesc}} |
| | | |
− | Note: Uncompressed BAM is compressed using compression level-0 (so it is not an entirely uncompressed file). This matches the <code>samtools</code> implementation so pipes between our tools and <code>samtools</code> are supported.
| + | == Required Parameters == |
− | | + | {{InBAMInputFile}} |
− | == output File (<code>--out</code>) == | + | {{OutBAMOutputFile}} |
− | | |
− | Use <code>--out</code> followed by your file name to specify the SAM/BAM output file.
| |
− | | |
− | The file extension is used to determine whether to write SAM/BAM/uncompressed BAM. A <code>-</code> is used to indicate stdout and the extension for file type (no extension is SAM).
| |
| | | |
− | {|border="1" cellspacing="0" cellpadding="2"
| + | == Optional Parameters == |
− | |SAM to file
| + | {{refFile}} |
− | | <code>--out yourFileName.sam</code>
| |
− | |-
| |
− | |BAM to file
| |
− | | <code>--out yourFileName.bam</code>
| |
− | |-
| |
− | |Uncompressed BAM to file
| |
− | | <code>--out yourFileName.ubam</code>
| |
− | |-
| |
− | |SAM to stdout
| |
− | | <code>--out -</code>
| |
− | |-
| |
− | |BAM to stdout
| |
− | | <code>--out -.bam</code>
| |
− | |-
| |
− | |Uncompressed BAM to stdout
| |
− | | <code>--out -.ubam</code>
| |
− | |}
| |
| | | |
| + | === Left Shift Indels in the CIGAR (<code>--lshift</code>) === |
| | | |
− | Note: Uncompressed BAM is compressed using compression level-0 (so it is not an entirely uncompressed file). This matches the <code>samtools</code> implementation so pipes between our tools and <code>samtools</code> are supported.
| + | Left shift indels as far as they can go in the read. |
| | | |
| + | {{noeofBGZFParameter}} |
| + | {{paramsParameter}} |
| | | |
− | == Recover a corrupted BAM file (<code>--recover</code>) == | + | === Recover a corrupted BAM file (<code>--recover</code>) === |
| | | |
| See [[#BAM File Recovery|BAM File Recovery]]. | | See [[#BAM File Recovery|BAM File Recovery]]. |
| | | |
− | | + | == Sequence Representation Parameters (<code>--useOrigSeq</code>, <code>--useBases</code>, <code>--useEquals</code>, <code>--refFile</code>) == |
− | == Sequence Representation Parameters (<code>--seqOrig</code>, <code>--seqBases</code>, <code>--seqEquals</code>, <code>--refFile</code>) == | |
| | | |
| The sequence parameter options specify how to represent the sequence when the reference is specified (<code>--refFile</code> option). | | The sequence parameter options specify how to represent the sequence when the reference is specified (<code>--refFile</code> option). |
| | | |
− | If the reference is not specified or seqOrig is specified, no modifications are made to the sequence. | + | If the reference is not specified or useOrigSeq is specified, no modifications are made to the sequence. |
| | | |
− | If the reference and seqBases are specified, any matches between the sequence and the reference are represented in the sequence as the appropriate base. | + | If the reference and useBases are specified, any matches between the sequence and the reference are represented in the sequence as the appropriate base. |
| | | |
− | If the reference and seqEquals are specified, any matches between the sequence and the reference are represented in the sequence as '='. | + | If the reference and useEquals are specified, any matches between the sequence and the reference are represented in the sequence as '='. |
| | | |
| === Examples === | | === Examples === |
Line 129: |
Line 98: |
| Sequence with Equals: AA======G===GGG | | Sequence with Equals: AA======G===GGG |
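The two conversions described in this section can be illustrated with a short sketch. It is not the bamUtil implementation: it assumes the read and the reference are already aligned base for base (equal length, no indels or clipping), whereas the real tool has to follow the CIGAR. The function names and the example sequences are made up for illustration.

```python
def to_equals(read_seq: str, ref_seq: str) -> str:
    """Replace each base that matches the reference with '=' (the --useEquals idea)."""
    if len(read_seq) != len(ref_seq):
        raise ValueError("sketch assumes an ungapped, equal-length alignment")
    return "".join(
        "=" if r.upper() == f.upper() else r
        for r, f in zip(read_seq, ref_seq)
    )


def to_bases(read_seq: str, ref_seq: str) -> str:
    """Replace each '=' with the reference base at that position (the --useBases idea)."""
    if len(read_seq) != len(ref_seq):
        raise ValueError("sketch assumes an ungapped, equal-length alignment")
    return "".join(f if r == "=" else r for r, f in zip(read_seq, ref_seq))


if __name__ == "__main__":
    reference = "ACGTACGT"   # hypothetical reference bases
    read = "ACCTACGA"        # hypothetical read with two mismatches
    with_equals = to_equals(read, reference)
    print(with_equals)                        # ==C====A
    print(to_bases(with_equals, reference))   # ACCTACGA
```

Under these assumptions the two functions are inverses of each other: converting to '=' and back to bases returns the original read.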
| | | |
| + | {{PhoneHomeParameters}} |
| | | |
| = BAM File Recovery = | | = BAM File Recovery = |
Line 148: |
Line 118: |
| In real cases, we have recovered better than 94% of reads from a set of severely damaged files (numerous 64K chunks of a RAID were lost), and better than 99.9% recovery from a moderately damaged file (3 disk pages were corrupt). | | In real cases, we have recovered better than 94% of reads from a set of severely damaged files (numerous 64K chunks of a RAID were lost), and better than 99.9% recovery from a moderately damaged file (3 disk pages were corrupt). |
| | | |
− | = Usage =
| |
− |
| |
− | ./bam convert --in <inputFile> --out <outputFile.sam/bam/ubam (ubam is uncompressed bam)> [--refFile <reference filename>] [--seqBases|--seqEquals|--seqOrig] [--recover] [--noeof] [--params]
| |
| | | |
| = Return Value = | | = Return Value = |
| | | |
− | Returns the SamStatus for the reads/writes. | + | Returns the SamStatus for the reads/writes (0 for success, non-0 for failure). |
| | | |
| = Example Output = | | = Example Output = |