Line 1: |
Line 1: |
− | =This functionality will be released on 5/17/2012=
| |
− |
| |
| = Overview of the <code>bam2FastQ</code> function of <code>[[bamUtil]]</code> = | | = Overview of the <code>bam2FastQ</code> function of <code>[[bamUtil]]</code> = |
| The <code>bam2FastQ</code> option of [[bamUtil]] converts a BAM file into FastQ files. This is necessary when only BAM files are delivered but a new alignment is desired: once the BAM file has been converted, the new alignment can be run on the resulting FastQ files. | | The <code>bam2FastQ</code> option of [[bamUtil]] converts a BAM file into FastQ files. This is necessary when only BAM files are delivered but a new alignment is desired: once the BAM file has been converted, the new alignment can be run on the resulting FastQ files. |
Line 16: |
Line 14: |
| | | |
| When processing files sorted by read name, the only requirement is that matching read names are next to each other. It does not need to be in strict alphabetical order. | | When processing files sorted by read name, the only requirement is that matching read names are next to each other. It does not need to be in strict alphabetical order. |
| + | |
| + | In paired-end FASTQ files, "/1" is appended to the read name of the first in the pair and "/2" to the read name of the second in the pair. Override these defaults using [[#First in Pair FastQ ReadName Extension (--firstRNExt)|--firstRNExt]] and [[#Second in Pair FastQ ReadName Extension (--secondRNExt)|--secondRNExt]]. |
| + | |
| + | Sequences marked as reverse strand in the SAM/BAM file are reverse complemented prior to writing to the FASTQ files. To skip this step, specify [[#Do Not Reverse Complement Reverse Strands (--noReverseComp)|--noReverseComp]]. |
| | | |
| Any errors and a summary of how many pairs and unpaired reads were processed are written to stderr. | | Any errors and a summary of how many pairs and unpaired reads were processed are written to stderr. |
Line 83: |
Line 85: |
| | | |
| {{inBAMInputFile}} | | {{inBAMInputFile}} |
| + | |
| + | == BAM File Is Sorted By Read Name (<code>--readname</code>) == |
| + | |
| + | The bam2FastQ program by default checks the sort order in the SAM/BAM header when converting to FASTQ, and if that is not specified, assumes it is sorted by coordinate. |
| + | |
| + | To override the default and force the program to assume the file is sorted by read name, specify the <code>--readName</code> option. |
| + | |
| + | The file does not need to be strictly sorted by read name. The only requirement is that matching read names are next to each other. |
| + | |
| + | == Reference File for Converting '=' in the Sequence to Bases (<code>--refFile</code>) == |
| + | If the SAM/BAM file contains '=' in the sequence instead of the actual bases, the bam2FastQ program needs to convert the '=' back to the bases. To do that it needs the reference. Specify the reference by using <code>--refFile</code> followed by the reference filename. |
| + | |
| + | For example: |
| + | ./bam bam2FastQ --in myFile.bam --refFile myPath/myRefFile.fa |
| | | |
| == Output FastQ File Base Name (<code>--outBase</code>) == | | == Output FastQ File Base Name (<code>--outBase</code>) == |
Line 93: |
Line 109: |
| | | |
| The value specified by this parameter is overridden by <code>--firstOut</code>, <code>--secondOut</code>, and <code>--unpairedOut</code>, but is used for whichever output files are not specified. | | The value specified by this parameter is overridden by <code>--firstOut</code>, <code>--secondOut</code>, and <code>--unpairedOut</code>, but is used for whichever output files are not specified. |
− |
| |
| | | |
| == Output FastQ File Name For the First End of Paired End (<code>--firstOut</code>) == | | == Output FastQ File Name For the First End of Paired End (<code>--firstOut</code>) == |
Line 105: |
Line 120: |
| For example: | | For example: |
| ./bam bam2FastQ --in myFile.bam --firstOut myFileEnd1.fastq | | ./bam bam2FastQ --in myFile.bam --firstOut myFileEnd1.fastq |
− |
| |
| | | |
| == Output FastQ File Name For the Second End of Paired End (<code>--secondOut</code>) == | | == Output FastQ File Name For the Second End of Paired End (<code>--secondOut</code>) == |
Line 117: |
Line 131: |
| For example: | | For example: |
| ./bam bam2FastQ --in myFile.bam --secondOut myFileEnd2.fastq | | ./bam bam2FastQ --in myFile.bam --secondOut myFileEnd2.fastq |
− |
| |
| | | |
| == Output FastQ File Name For Unpaired Reads (<code>--unpairedOut</code>) == | | == Output FastQ File Name For Unpaired Reads (<code>--unpairedOut</code>) == |
Line 130: |
Line 143: |
| ./bam bam2FastQ --in myFile.bam --unpairedOut myFileUnpaired.fastq | | ./bam bam2FastQ --in myFile.bam --unpairedOut myFileUnpaired.fastq |
| | | |
| + | == First in Pair FastQ ReadName Extension (<code>--firstRNExt</code>) == |
| | | |
− | == BAM File Is Sorted By Read Name (<code>--readname</code>) == | + | <code>--firstRNExt</code> replaces the default "/1" that is appended to the read name of the first end of a read pair with the specified value. |
| + | |
| + | == Second in Pair FastQ ReadName Extension (<code>--secondRNExt</code>) == |
| + | |
| + | <code>--secondRNExt</code> replaces the default "/2" that is appended to the read name of the second end of a read pair with the specified value. |
| + | |
| + | == Include the Read Name on the "+" line of the FASTQ (<code>--rnPlus</code>) == |
| + | |
| + | By default the read name is not included on the "+" line of the FASTQ files. To include the read name and the extension for paired-end reads, specify <code>--rnPlus</code>. |
| + | |
| + | == Do Not Reverse Complement Reverse Strands (<code>--noReverseComp</code>) == |
| + | |
| + | By default, reads marked as reverse in the BAM file are reverse complemented prior to writing to the FASTQ files. <code>--noReverseComp</code> skips the reverse complement step, so the sequence is written as it appears in the BAM file. |
| + | |
| + | For example, if a sequence is ACCGTG and is marked as reverse, the sequence in the default FASTQ record will be written as CACGGT. |
| | | |
− | The bam2FastQ program by default checks the sort order in the SAM/BAM header when converting to FASTQ, and if that is not specified, assumes it is sorted by coordinate.
| + | Specifying <code>--noReverseComp</code> would result in a FASTQ sequence of ACCGTG. |
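The reverse complement in this example can be reproduced with standard Unix tools. The following is a minimal sketch and is not part of bamUtil: the <code>revcomp</code> function name is hypothetical, and it assumes <code>rev</code> and <code>tr</code> are available and that the sequence contains only the bases A, C, G, T, and N.

```shell
# Hypothetical helper, not part of bamUtil: reverse complement a sequence.
revcomp() {
    # rev reverses the characters; tr swaps each base for its complement.
    printf '%s\n' "$1" | rev | tr 'ACGTNacgtn' 'TGCANtgcan'
}

# The sequence from the example above; prints CACGGT.
revcomp ACCGTG
```

Applying the function twice returns the original sequence, which is a quick way to sanity-check it.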
| | | |
− | To override the default and force it to assume the file is sorted by readname, specify the <code>--readName</code> option
| + | {{noeofBGZFParameter}} |
| + | {{paramsParameter}} |
| | | |
− | The file does not need to be strictly sorted by read name. The only requirement is that matching read names are next to each other.
| |
| | | |
| + | = Return Value = |
| | | |
− | == Reference File for Converting '=' in the Sequence to Bases <code>--refFile</code>==
| + | Returns -1 if input parameters are invalid. |
− | If the SAM/BAM file contains '=' in the sequence instead of the actual bases, the bam2FastQ program needs to convert the '=' back to the bases. To do that it needs the reference. Specify the reference by using <code>--refFile</code> followed by the reference filename.
| |
| | | |
− | For example:
| + | Returns the SamStatus for the reads/writes (0 on success). |
− | ./bam bam2FastQ --in myFile.bam --refFile myPath/myRefFile.fa
| |
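A calling script can act on the return value through the shell's exit status. The following is a minimal sketch, assuming a POSIX shell: the <code>run_checked</code> wrapper is hypothetical, and the file names are the placeholders used in the examples above. A return value of -1 is typically reported by the shell as exit status 255.

```shell
# Hypothetical wrapper: run a command and report a non-zero exit status.
run_checked() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "command failed with exit status $status: $*" >&2
    fi
    return "$status"
}

# Only attempt the conversion if the bam executable is present here.
if [ -x ./bam ]; then
    run_checked ./bam bam2FastQ --in myFile.bam --firstOut myFileEnd1.fastq
fi
```

Because the wrapper returns the original status, it can be used directly in an <code>if</code> condition or with <code>||</code> to stop a pipeline when the conversion fails.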