Line 26: |
Line 26: |
| | | |
| === Output Files === | | === Output Files === |
− | This program produces 3 output fastq files.
| + | By default, this program produces 3 output fastq files. |
| # unpaired reads | | # unpaired reads |
| # first end of paired reads | | # first end of paired reads |
| # second end of paired reads | | # second end of paired reads |
| + | |
| + | If the [[#Generate 1 Paired-End Output File (--merge)|<code>--merge</code>]] option is specified, the program produces 2 output fastq files. |
| + | # unpaired reads |
| + | # interleaved paired-end reads |
| | | |
| The default fastq file names are determined by taking the base name of the input file and adding an extension for each filetype. | | The default fastq file names are determined by taking the base name of the input file and adding an extension for each filetype. |
| {|border="1" cellspacing="0" cellpadding="2" | | {|border="1" cellspacing="0" cellpadding="2" |
− | ! Output File Contents !! Extension | + | ! colspan="2"|Default !!colspan="2"|[[#Generate 1 Paired-End Output File (--merge)|<code>--merge</code>]] |
| + | |- |
| + | ! Output File Contents !! Extension !! Output File Contents !! Extension |
| |- | | |- |
| + | |unpaired reads |
| + | | .fastq |
| |unpaired reads | | |unpaired reads |
| | .fastq | | | .fastq |
Line 40: |
Line 48: |
| |first end of paired reads | | |first end of paired reads |
| | _1.fastq | | | _1.fastq |
| + | | rowspan="2"|interleaved paired-end reads |
| + | (both first & second end) |
| + | | rowspan="2"|_interleaved.fastq |
| |- | | |- |
| |second end of paired reads | | |second end of paired reads |
Line 45: |
Line 56: |
| |} | | |} |
| | | |
− | If the inputFile was "myPath/myFile.bam", the resulting fastq's would be: | + | If the inputFile was "myPath/myFile.bam", the resulting fastqs would be: |
| #myPath/myFile.fastq | | #myPath/myFile.fastq |
| #myPath/myFile_1.fastq | | #myPath/myFile_1.fastq |
| #myPath/myFile_2.fastq | | #myPath/myFile_2.fastq |
| + | |
| + | With the [[#Generate 1 Paired-End Output File (--merge)|<code>--merge</code>]] option, the resulting fastqs would be: |
| + | #myPath/myFile.fastq |
| + | #myPath/myFile_interleaved.fastq |
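The naming rule above can be sketched in a few lines of shell. This is an illustration only, not the program's own code: it assumes the base name is the input path with its final extension removed, which matches the "myPath/myFile.bam" example, and the variable names are hypothetical.

```shell
# Illustrative sketch of the default output names (assumption: the base
# name is the input path with its last extension stripped).
input="myPath/myFile.bam"
base="${input%.*}"                          # myPath/myFile

unpaired="${base}.fastq"                    # unpaired reads
first="${base}_1.fastq"                     # first end of paired reads
second="${base}_2.fastq"                    # second end of paired reads
interleaved="${base}_interleaved.fastq"     # paired reads when --merge is used

printf '%s\n' "$unpaired" "$first" "$second" "$interleaved"
```

Without <code>--merge</code> only the first three names apply; with <code>--merge</code> only the first and last apply.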
| | | |
| Instead of using the inputFile base name as the output file base, you can specify a different base name by using the [[#Output FastQ File Base Name (--outBase)|--outBase]] option. | | Instead of using the inputFile base name as the output file base, you can specify a different base name by using the [[#Output FastQ File Base Name (--outBase)|--outBase]] option. |
| | | |
| You can optionally directly specify the output fastq filenames using: | | You can optionally directly specify the output fastq filenames using: |
− | * --firstOut firstReadInAPair.fastq | + | * --firstOut firstReadInAPair.fastq (also used for the interleaved filename with [[#Generate 1 Paired-End Output File (--merge)|<code>--merge</code>]]) |
| * --secondOut secondReadInAPair.fastq | | * --secondOut secondReadInAPair.fastq |
| * --unpairedOut unpairedReads.fastq | | * --unpairedOut unpairedReads.fastq |
Line 59: |
Line 74: |
| | | |
| = Usage = | | = Usage = |
− | ./bam bam2FastQ --in <inputFile> [--readName] [--refFile <referenceFile>] [--outBase <outputFileBase>] [--firstOut <1stReadInPairOutFile>] [--secondOut <2ndReadInPairOutFile>] [--unpairedOut <unpairedOutFile>] [--firstRNExt <firstInPairReadNameExt>] [--secondRNExt <secondInPairReadNameExt>] [--rnPlus] [--noReverseComp] [--noeof] [--params] | + | ./bam bam2FastQ --in <inputFile> [--readName] [--refFile <referenceFile>] [--outBase <outputFileBase>] [--firstOut <1stReadInPairOutFile>] [--merge|--secondOut <2ndReadInPairOutFile>] [--unpairedOut <unpairedOutFile>] [--firstRNExt <firstInPairReadNameExt>] [--secondRNExt <secondInPairReadNameExt>] [--rnPlus] [--noReverseComp] [--noeof] [--params] |
| | | |
| = Parameters = | | = Parameters = |
Line 68: |
Line 83: |
| --readname : Process the BAM as readName sorted instead | | --readname : Process the BAM as readName sorted instead |
| of coordinate if the header does not indicate a sort order. | | of coordinate if the header does not indicate a sort order. |
| + | --merge : Generate 1 interleaved (merged) FASTQ for paired-ends (unpaired in a separate file) |
| + | use firstOut to override the filename of the interleaved file. |
| --refFile : Reference file for converting '=' in the sequence to the actual base | | --refFile : Reference file for converting '=' in the sequence to the actual base |
| if '=' are found and the refFile is not specified, 'N' is written to the FASTQ | | if '=' are found and the refFile is not specified, 'N' is written to the FASTQ |
− | --outBase : Base output name for generated output files
| |
− | --firstOut : Output name for the first in pair file
| |
− | over-rides setting of outBase
| |
− | --secondOut : Output name for the second in pair file
| |
− | over-rides setting of outBase
| |
− | --unpairedOut : Output name for unpaired reads
| |
− | over-rides setting of outBase
| |
| --firstRNExt : read name extension to use for first read in a pair | | --firstRNExt : read name extension to use for first read in a pair |
| default is "/1" | | default is "/1" |
Line 85: |
Line 95: |
| --noeof : Do not expect an EOF block on a bam file. | | --noeof : Do not expect an EOF block on a bam file. |
| --params : Print the parameter settings to stderr | | --params : Print the parameter settings to stderr |
| + | Optional OutputFile Names: |
| + | --outBase : Base output name for generated output files |
| + | --firstOut : Output name for the first in pair file |
| + | over-rides setting of outBase |
| + | --secondOut : Output name for the second in pair file |
| + | over-rides setting of outBase |
| + | --unpairedOut : Output name for unpaired reads |
| + | over-rides setting of outBase |
| </pre> | | </pre> |
| | | |
| + | == Required Parameters == |
| {{inBAMInputFile}} | | {{inBAMInputFile}} |
| | | |
− | == BAM File Is Sorted By Read Name (<code>--readname</code>) == | + | == Optional Parameters == |
| + | === BAM File Is Sorted By Read Name (<code>--readname</code>) === |
| | | |
| The bam2FastQ program by default checks the sort order in the SAM/BAM header when converting to FASTQ, and if that is not specified, assumes it is sorted by coordinate. | | The bam2FastQ program by default checks the sort order in the SAM/BAM header when converting to FASTQ, and if that is not specified, assumes it is sorted by coordinate. |
Line 97: |
Line 117: |
| The file does not need to be strictly sorted by read name. The only requirement is that matching read names are next to each other. | | The file does not need to be strictly sorted by read name. The only requirement is that matching read names are next to each other. |
| | | |
− | == Reference File for Converting '=' in the Sequence to Bases <code>--refFile</code>== | + | === Generate 1 Paired-End Output File (<code>--merge</code>) === |
| + | |
| + | Use the <code>--merge</code> option to generate 1 interleaved (merged) FASTQ for paired-ends instead of 2 files. Unpaired reads are still written to a separate file. |
| + | |
| + | The default extension for the interleaved output file is "_interleaved.fastq". |
| + | |
| + | Use [[#Output FastQ File Name For the First End of Paired End (--firstOut)|<code>--firstOut</code>]] to override the filename of the interleaved file. |
| + | |
| + | This parameter was added in version 1.0.10. |
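To make "interleaved" concrete, the following sketch builds the layout by hand from two tiny FASTQ files using standard tools. It is not how bam2FastQ produces its output; the file names, read name, and bases are made up, and the ordering (first end immediately followed by its second end) is the common interleaved convention, assumed here rather than confirmed by this page.

```shell
# Hand-built illustration of an interleaved FASTQ (hypothetical data).
tmp=$(mktemp -d)
printf '@r1/1\nACGT\n+\nIIII\n' > "$tmp/end1.fastq"   # first end of the pair
printf '@r1/2\nTTGA\n+\nIIII\n' > "$tmp/end2.fastq"   # second end of the pair

# Collapse each 4-line FASTQ record onto one tab-separated line.
paste - - - - < "$tmp/end1.fastq" > "$tmp/end1.tsv"
paste - - - - < "$tmp/end2.fastq" > "$tmp/end2.tsv"

# Put matching records side by side, then expand the tabs back into lines,
# so each first-end record is followed by its second-end record.
interleaved=$(paste "$tmp/end1.tsv" "$tmp/end2.tsv" | tr '\t' '\n')
printf '%s\n' "$interleaved"
rm -rf "$tmp"
```

The result is a single 8-line stream: the 4 lines of the "/1" record followed by the 4 lines of the "/2" record.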
| + | |
| + | === Reference File for Converting '=' in the Sequence to Bases (<code>--refFile</code>) === |
| If the SAM/BAM file contains '=' in the sequence instead of the actual bases, the bam2FastQ program needs to convert the '=' back to the bases. To do that it needs the reference. Specify the reference by using <code>--refFile</code> followed by the reference filename. | | If the SAM/BAM file contains '=' in the sequence instead of the actual bases, the bam2FastQ program needs to convert the '=' back to the bases. To do that it needs the reference. Specify the reference by using <code>--refFile</code> followed by the reference filename. |
| | | |
Line 103: |
Line 133: |
| ./bam bam2FastQ --in myFile.bam --refFile myPath/myRefFile.fa | | ./bam bam2FastQ --in myFile.bam --refFile myPath/myRefFile.fa |
| | | |
− | == Output FastQ File Base Name (<code>--outBase</code>) == | + | === First in Pair FastQ ReadName Extension (<code>--firstRNExt</code>) === |
| + | |
| + | <code>--firstRNExt</code> overrides the default "/1" that is appended to the Read Name of the first-end of a read pair with the specified value. |
| + | |
| + | === Second in Pair FastQ ReadName Extension (<code>--secondRNExt</code>) === |
| + | |
| + | <code>--secondRNExt</code> overrides the default "/2" that is appended to the Read Name of the second-end of a read pair with the specified value. |
| + | |
| + | === Include the Read Name on the "+" line of the FASTQ (<code>--rnPlus</code>) === |
| + | |
| + | By default the read name is not included on the "+" line of the FASTQ files. To include the read name and the extension for paired-end reads, specify <code>--rnPlus</code>. |
| + | |
| + | === Do Not Reverse Complement Reverse Strands (<code>--noReverseComp</code>) === |
| + | |
| + | By default, reads marked as reverse in the BAM file are reverse complemented prior to writing to the FASTQ files. <code>--noReverseComp</code> disables this feature, and skips the reverse complement step. |
| + | |
| + | For example, if a sequence is ACCGTG marked as reverse, the default FASTQ record will be written as: CACGGT |
| + | |
| + | Specifying <code>--noReverseComp</code> would result in a FASTQ sequence of ACCGTG |
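The ACCGTG example can be checked with standard command-line tools. This is only a sketch of the reverse-complement operation itself, not the program's implementation; it assumes uppercase A/C/G/T input and that the <code>rev</code> utility is available.

```shell
# Reverse-complement a sequence: reverse the string, then swap A<->T and C<->G.
seq="ACCGTG"
rc=$(printf '%s\n' "$seq" | rev | tr 'ACGT' 'TGCA')
printf '%s\n' "$rc"   # CACGGT, matching the default FASTQ output described above
```

With <code>--noReverseComp</code> the sequence would be written unchanged, as in the value of <code>seq</code>.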
| + | |
| + | {{noeofBGZFParameter}} |
| + | {{paramsParameter}} |
| + | |
| + | == Optional Output Filenames == |
| + | |
| + | === Output FastQ File Base Name (<code>--outBase</code>) === |
| | | |
| You can replace the default output base name by using the <code>--outBase</code> option. | | You can replace the default output base name by using the <code>--outBase</code> option. |
Line 113: |
Line 168: |
| The value specified by this parameter is overridden by <code>--firstOut</code>, <code>--secondOut</code>, and <code>--unpairedOut</code>, but is used for whichever output files are not specified. | | The value specified by this parameter is overridden by <code>--firstOut</code>, <code>--secondOut</code>, and <code>--unpairedOut</code>, but is used for whichever output files are not specified. |
| | | |
− | == Output FastQ File Name For the First End of Paired End (<code>--firstOut</code>) == | + | With the [[#Generate 1 Paired-End Output File (--merge)|<code>--merge</code>]] option, the resulting fastqs would instead be: |
| + | #myNewPath/myFastQBase.fastq |
| + | #myNewPath/myFastQBase_interleaved.fastq |
| + | |
| + | === Output FastQ File Name For the First End of Paired End (<code>--firstOut</code>) === |
| | | |
| This setting overrides the default and <code>--outBase</code> file name. | | This setting overrides the default and <code>--outBase</code> file name. |
Line 119: |
Line 178: |
| The entire filename and extension must be specified. | | The entire filename and extension must be specified. |
| | | |
− | Does not affect the filenames for the first end or for unpaired reads. | + | Does not affect the filenames for the second end or for unpaired reads. |
| | | |
| For example: | | For example: |
| ./bam bam2FastQ --in myFile.bam --firstOut myFileEnd1.fastq | | ./bam bam2FastQ --in myFile.bam --firstOut myFileEnd1.fastq |
| | | |
− | == Output FastQ File Name For the Second End of Paired End (<code>--secondOut</code>) == | + | === Output FastQ File Name For the Second End of Paired End (<code>--secondOut</code>) === |
| | | |
| This setting overrides the default and <code>--outBase</code> file name. | | This setting overrides the default and <code>--outBase</code> file name. |
Line 135: |
Line 194: |
| ./bam bam2FastQ --in myFile.bam --secondOut myFileEnd2.fastq | | ./bam bam2FastQ --in myFile.bam --secondOut myFileEnd2.fastq |
| | | |
− | == Output FastQ File Name For Unpaired Reads (<code>--unpairedOut</code>) == | + | === Output FastQ File Name For Unpaired Reads (<code>--unpairedOut</code>) === |
| | | |
| This setting overrides the default and <code>--outBase</code> file names. | | This setting overrides the default and <code>--outBase</code> file names. |
Line 141: |
Line 200: |
| The entire filename and extension must be specified. | | The entire filename and extension must be specified. |
| | | |
− | Does not affect the filenames for the two paired end fastq files. | + | Does not affect the filenames for the paired-end fastq files. |
| | | |
| For example: | | For example: |
| ./bam bam2FastQ --in myFile.bam --unpairedOut myFileUnpaired.fastq | | ./bam bam2FastQ --in myFile.bam --unpairedOut myFileUnpaired.fastq |
| | | |
− | == First in Pair FastQ ReadName Extension (<code>--firstRNExt</code>) ==
| + | {{PhoneHomeParameters}} |
− | | |
− | <code>--firstRNExt</code> overrides the default "/1" that is appended to the Read Name of the first-end of a read pair with the specified value.
| |
− | | |
− | == Second in Pair FastQ ReadName Extension (<code>--secondRNExt</code>) ==
| |
− | | |
− | <code>--secondRNExt</code> overrides the default "/2" that is appended to the Read Name of the second-end of a read pair with the specified value.
| |
− | | |
− | == Include the Read Name on the "+" line of the FASTQ (<code>--rnPlus</code>) ==
| |
− | | |
− | By default the read name is not included on the "+" line of the FASTQ files. To include the read name and the extension for paired-end reads, specify <code>--rnPlus</code>.
| |
− | | |
− | == Do Not Reverse Complement Reverse Strands (<code>--noReverseComp</code>) ==
| |
− | | |
− | By default, reads marked as reverse in the BAM file are reverse complemented prior to writing to the FASTQ files. <code>--noReverseComp</code> disables this feature, and skips the reverse complement step.
| |
− | | |
− | For example, if a sequence is ACCGTG marked as reverse, the default FASTQ record will be written as: CACGGT
| |
− | | |
− | Specifying <code>--noReverseComp</code> would result in a FASTQ sequence of ACCGTG
| |
− | | |
− | {{noeofBGZFParameter}} | |
− | {{paramsParameter}}
| |
− | | |
| | | |
| = Return Value = | | = Return Value = |
Line 174: |
Line 211: |
| Returns -1 if input parameters are invalid. | | Returns -1 if input parameters are invalid. |
| | | |
− | Returns the SamStatus for the reads/writes (0 on success). | + | Returns the SamStatus for the reads/writes (0 on success, non-0 on failure). |