Changes

From Genome Analysis Wiki
1,936 bytes added, 23:53, 5 March 2016
Line 81: Line 81:  
= Parameters =
<pre>
        Required Parameters:
                --in            : the SAM/BAM file to convert to FastQ
        Optional Parameters:
                --readname      : Process the BAM as readName sorted instead
                                  of coordinate if the header does not indicate a sort order.
                --splitRG       : Split into RG specific fastqs.
                --qualField     : Use the base quality from the specified tag
                                  rather than from the Quality field (default)
                --merge         : Generate 1 interleaved (merged) FASTQ for paired-ends (unpaired in a separate file)
                                  use firstOut to override the filename of the interleaved file.
                --refFile       : Reference file for converting '=' in the sequence to the actual base
                                  if '=' are found and the refFile is not specified, 'N' is written to the FASTQ
                --firstRNExt    : read name extension to use for first read in a pair
                                  default is "/1"
                --secondRNExt   : read name extension to use for second read in a pair
                                  default is "/2"
                --rnPlus        : Add the Read Name/extension to the '+' line of the fastq records
                --noReverseComp : Do not reverse complement reads marked as reverse
                --region        : Only convert reads containing the specified region/nucleotide.
                                  Position formatted as: chr:pos:base
                                  pos (0-based) & base are optional.
                --gzip          : Compress the output FASTQ files using gzip
                --noeof         : Do not expect an EOF block on a bam file.
                --params        : Print the parameter settings to stderr
        Optional OutputFile Names:
                --outBase       : Base output name for generated output files
                --firstOut      : Output name for the first in pair file
                                  over-rides setting of outBase
                --secondOut     : Output name for the second in pair file
                                  over-rides setting of outBase
                --unpairedOut   : Output name for unpaired reads
                                  over-rides setting of outBase
</pre>
   Line 119: Line 126:     
The file does not need to be strictly sorted by read name.  The only requirement is that matching read names are next to each other.

=== Split into RG Specific FASTQs (<code>--splitRG</code>) ===

Create a separate set of FASTQ files for each read group (RG).

<code>--splitRG</code> cannot be combined with <code>--firstOut</code>, <code>--secondOut</code>, or <code>--unpairedOut</code>, since each RG gets its own filenames. It also cannot write to stdout.

Output filenames will be <outBase>.<RG>_1.fastq, <outBase>.<RG>_2.fastq, and <outBase>.<RG>.fastq.  A FASTQ list file, <outBase>.list, is also created. Each entry contains MERGE_NAME (the RG tag's SM value, or outBase if that value is empty), FASTQ 1, FASTQ 2 (or . for a single-ended FASTQ), and the RG tag string.
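
The naming scheme above can be illustrated with a short Python sketch. This is not bam2FastQ's own code: the function names are hypothetical, and the tab delimiter used for the list-file row is an assumption, since the column separator is not stated here.

```python
def split_rg_filenames(out_base, rg_id):
    """Return the (first, second, unpaired) FASTQ names for one read group."""
    prefix = f"{out_base}.{rg_id}"
    return (f"{prefix}_1.fastq", f"{prefix}_2.fastq", f"{prefix}.fastq")


def merge_name(out_base, rg_fields):
    """MERGE_NAME is the RG's SM value, or out_base when SM is missing or empty."""
    return rg_fields.get("SM") or out_base


def list_row(name, fastq1, fastq2, rg_string):
    """Build one list-file row; '.' stands in for a missing second FASTQ.

    Assumption: columns are tab-separated.
    """
    return "\t".join([name, fastq1, fastq2 if fastq2 else ".", rg_string])


first, second, unpaired = split_rg_filenames("sample", "RG1")
print(first, second, unpaired)
```

For an RG with an empty SM value, <code>merge_name("sample", {"ID": "RG1", "SM": ""})</code> falls back to the output base name.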

=== Use the Base Quality from the Specified Tag (<code>--qualField</code>) ===

By default, the quality field is used for the Base Qualities in the FASTQ file.  Specify <code>--qualField <tagName></code> to use the base qualities from the specified tag instead of the quality field.
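
As a rough illustration of the choice between the quality field and a tag, the Python sketch below reads a tab-delimited SAM text record, where QUAL is the 11th column and optional tags follow as TAG:TYPE:VALUE. The function name is hypothetical, the <code>OQ</code> tag is only an example, and raising an error for a missing tag is this sketch's choice; how bam2FastQ handles a missing tag is not stated here.

```python
def base_qualities(sam_line, qual_field=None):
    """Return the quality string for one SAM text record.

    With qual_field=None the QUAL column is used; otherwise the value
    of the named optional tag is returned.
    """
    fields = sam_line.rstrip("\n").split("\t")
    if qual_field is None:
        return fields[10]  # QUAL is the 11th mandatory SAM column
    for tag in fields[11:]:
        name, _type, value = tag.split(":", 2)
        if name == qual_field:
            return value
    # Assumption of this sketch: a missing tag is treated as an error.
    raise KeyError(f"tag {qual_field!r} not found in record")


record = "r1\t0\tchr1\t100\t30\t4M\t*\t0\t0\tACGT\tIIII\tOQ:Z:FFFF"
print(base_qualities(record))        # qualities from the QUAL column
print(base_qualities(record, "OQ"))  # qualities from the OQ tag
```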

=== Generate 1 Paired-End Output File (<code>--merge</code>) ===
Line 155: Line 177:     
Specifying <code>--noReverseComp</code> would result in a FASTQ sequence of ACCGTG

=== Only Convert the Specified Region (<code>--region</code>) ===

Only convert reads containing the specified region/nucleotide.

The position is formatted as <code>chr:pos:base</code>, where <code>pos</code> is 0-based. Both <code>pos</code> and <code>base</code> are optional.
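
To make the <code>chr:pos:base</code> format concrete, here is a minimal Python sketch that splits such a string into its parts. The function name is hypothetical, and the sketch assumes chromosome names contain no colon. It only parses the argument and does not reproduce how bam2FastQ matches reads against the region.

```python
def parse_region(spec):
    """Split 'chr[:pos[:base]]' into (chrom, pos, base).

    pos is returned as a 0-based int; omitted parts come back as None.
    """
    parts = spec.split(":")
    if not 1 <= len(parts) <= 3 or not parts[0]:
        raise ValueError(f"expected chr[:pos[:base]], got {spec!r}")
    chrom = parts[0]
    pos = int(parts[1]) if len(parts) > 1 and parts[1] else None
    base = parts[2] if len(parts) > 2 and parts[2] else None
    return chrom, pos, base


print(parse_region("chr1:1000:A"))
print(parse_region("chr1:1000"))
print(parse_region("chr1"))
```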
    
{{noeofBGZFParameter}}
