Changes

From Genome Analysis Wiki
Line 33:
== Input Data:==  
 
*Raw Sequence (FASTQ) files  
   −
*Sequence Index file containing fastqs & RG info
+
*FASTQ List file mapping fastq pairs to sample (optional: Read Group information)
 
*Reference files  
 
*(Optional) Configuration file to override default options  
Line 41:
These are the FASTQ files that need to be mapped to BAM files.  
   −
These files are specified in the [[#Sequence Index File|Sequence Index File]].  
+
These files are specified in the [[#FASTQ List File|FASTQ List File]].  
   −
=== Sequence Index File ===  
+
=== FASTQ List File ===  
This file specifies the FASTQ files that need to be processed and the Read Group information for them.  
+
This file specifies the FASTQ files that need to be processed.  It maps each FASTQ pair to its associated Sample ID.  Optionally, Read Group information for the FASTQ pairs can be specified.  If the Read Group information is not specified, it is inferred.  
   −
This file is specified either via the command line parameter <code>--index_file</code> or via the configuration file setting <code>INDEX_FILE</code>.   
+
This file is specified either via the command line parameter <code>--list</code> or via the configuration file setting <code>FASTQ_LIST</code>.   
    
The command-line setting takes precedence over the configuration file setting.  
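
The precedence rule can be illustrated with a minimal sketch. The function name <code>resolve_fastq_list</code> and the dictionary-based configuration are assumptions made for illustration only; this is not the pipeline's actual code. Only the <code>--list</code> parameter and the <code>FASTQ_LIST</code> setting come from the text above.

```python
def resolve_fastq_list(cli_value, config):
    """Pick the FASTQ list path, preferring the command-line value.

    cli_value: value passed via --list, or None if the option was not given.
    config:    dict of configuration-file settings (hypothetical representation).
    """
    if cli_value is not None:
        # The command-line setting takes precedence over the configuration file.
        return cli_value
    # Fall back to the configuration file setting; None if it is not set either.
    return config.get("FASTQ_LIST")


# Both sources set: the command-line value is used.
chosen = resolve_fastq_list("cmdline.list", {"FASTQ_LIST": "config.list"})
# Only the configuration file sets it: that value is used.
fallback = resolve_fastq_list(None, {"FASTQ_LIST": "config.list"})
```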
   −
The Sequence Index is a tab delimited file that starts with a header line.  The columns may be in any order.  
+
The FASTQ list is a tab delimited file that starts with a header line.  The columns may be in any order.  
    
Following the header line, there is one line per single-end read and one line per paired-end read (only 1 line per pair).  
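
As a rough sketch of how a file in this layout could be read, the example below looks columns up by header name, so their order does not matter. The column names <code>MERGE_NAME</code>, <code>FASTQ1</code>, and <code>FASTQ2</code>, and the use of <code>.</code> (or an empty field) to mark a single-end read, are assumptions for illustration; this section does not define them.

```python
import csv
import io


def read_fastq_list(handle):
    """Parse a tab-delimited FASTQ list that starts with a header line.

    Column names here are assumed for illustration; columns are matched by
    header name, so they may appear in any order.
    """
    entries = []
    for row in csv.DictReader(handle, delimiter="\t"):
        mate2 = (row.get("FASTQ2") or "").strip()
        entries.append({
            "sample": row["MERGE_NAME"],
            "fastq1": row["FASTQ1"],
            # Assumption: "." or an empty field means a single-end read.
            "fastq2": mate2 if mate2 not in ("", ".") else None,
        })
    return entries


# One line per pair (first row) and one line per single-end read (second row).
example = (
    "FASTQ1\tMERGE_NAME\tFASTQ2\n"
    "s1_1.fastq\tSample1\ts1_2.fastq\n"
    "s2.fastq\tSample2\t.\n"
)
entries = read_fastq_list(io.StringIO(example))
```

Using <code>csv.DictReader</code> keeps the parsing independent of column order, which matches the statement that the columns may be in any order.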