Changes

From Genome Analysis Wiki
Line 60: Line 60:     
Required Column Names:  
   −
* MERGE_NAME - base name for the resulting BAM file for the sample (used to group multiple fastqs or fastq pairs into a single BAM)  
+
* MERGE_NAME - base name for the resulting BAM file for the sample (used to group multiple fastqs or fastq pairs into a single BAM)
 +
** The SAMPLE column can be specified instead of MERGE_NAME.  SAMPLE will be used for both the sample and the base name.
 
* FASTQ1 - name of the fastq or the first in the pair if paired-end.  (Only 1 line per pair)  
    
Optional Column Names:  
 
* FASTQ2 - name of the 2nd fastq in paired-end reads.  Specify '.' if the column exists, but this line is single-ended.  
   −
* RGID - Read Group ID for this entry  
+
* RGID - Read Group ID for this entry
 +
** If this field is not specified, the first line of the fastq will be used to determine the RG.
 +
*** If the first line does not match the expected format for determining RG, incrementing numbers per fastq file will be used.
 
* SAMPLE - Sample Name for this entry  
 +
** If SAMPLE is not specified, MERGE_NAME will be used for the sample name
 
* LIBRARY - Library for this entry  
 +
** If LIBRARY is not specified, the sample name will be used
 
* CENTER - Center Name for this entry  
 +
** If CENTER is not specified, it will default to "unknown"
 
* PLATFORM - Platform for this entry  
 +
** If PLATFORM is not specified, it will default to ILLUMINA
   −
The RGID, SAMPLE, LIBRARY, CENTER, and PLATFORM are used to populate the Read Group information for this entry.  These fields are optional.  Either leave the column header out of the file or specify '.' if the column header exists, but the data is N/A.  As long as the RGID field is specified non-N/A fields are added to the BAM file.
+
The RGID, SAMPLE, LIBRARY, CENTER, and PLATFORM are used to populate the Read Group information for this entry.   
    
  MERGE_NAME FASTQ1 FASTQ2 RGID SAMPLE LIBRARY CENTER PLATFORM  
Line 79: Line 86:  
  Sample2 fastq/S2/F2.fastq.gz . RGID2 SampleID2 Lib2 UM ILLUMINA  
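
The column defaults listed in the revised text can be sketched in Python. This is only an illustration, not the pipeline's own parser: the function name <code>parse_index</code> is hypothetical, the fields are assumed to be whitespace-separated as in the example rows, and '.' is assumed to mean N/A in every optional column. Deriving RGID from the first line of the fastq is not modeled, since the expected format is not given here, so a missing RGID is left as <code>None</code>.

```python
def parse_index(text):
    """Parse index text into per-entry dicts with the documented defaults applied.

    Hypothetical helper. Assumes whitespace-separated fields and '.' for N/A.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("index is empty")
    header = lines[0].split()
    if "FASTQ1" not in header:
        raise ValueError("FASTQ1 column is required")
    if "MERGE_NAME" not in header and "SAMPLE" not in header:
        raise ValueError("MERGE_NAME (or SAMPLE) column is required")

    entries = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != len(header):
            raise ValueError(f"expected {len(header)} fields, got {len(fields)}: {ln!r}")
        # Treat '.' as "not specified" (assumption, see lead-in).
        row = {name: (None if value == "." else value)
               for name, value in zip(header, fields)}
        # SAMPLE may stand in for MERGE_NAME, and MERGE_NAME for SAMPLE.
        merge_name = row.get("MERGE_NAME") or row.get("SAMPLE")
        if merge_name is None or row["FASTQ1"] is None:
            raise ValueError(f"missing MERGE_NAME/SAMPLE or FASTQ1: {ln!r}")
        sample = row.get("SAMPLE") or merge_name
        entries.append({
            "MERGE_NAME": merge_name,
            "FASTQ1": row["FASTQ1"],
            "FASTQ2": row.get("FASTQ2"),              # None means single-ended
            "RGID": row.get("RGID"),                  # fastq-based fallback not modeled
            "SAMPLE": sample,
            "LIBRARY": row.get("LIBRARY") or sample,  # defaults to the sample name
            "CENTER": row.get("CENTER") or "unknown",
            "PLATFORM": row.get("PLATFORM") or "ILLUMINA",
        })
    return entries


# The header and the Sample2 row are taken from the example above.
INDEX_TEXT = """\
MERGE_NAME FASTQ1 FASTQ2 RGID SAMPLE LIBRARY CENTER PLATFORM
Sample2 fastq/S2/F2.fastq.gz . RGID2 SampleID2 Lib2 UM ILLUMINA
"""
entries = parse_index(INDEX_TEXT)
```

With a minimal two-column index (only SAMPLE and FASTQ1), the same function fills MERGE_NAME and LIBRARY from SAMPLE, and sets CENTER and PLATFORM to their defaults.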
   −
The <code>--fastq</code>/<code>FASTQ</code> setting can be used to specify a prefix to the FASTQ1/FASTQ2 file paths that should be applied before using the files.  
+
The <code>--fastq_prefix</code>/<code>FASTQ_PREFIX</code> setting can be used to specify a prefix to the FASTQ1/FASTQ2 file paths that should be applied before using the files.
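
As a rough illustration of the prefix setting, the Python sketch below prepends a prefix to FASTQ1/FASTQ2 values. The function name <code>apply_prefix</code> and the directory <code>/data/run1</code> are hypothetical, and joining with a single "/" is an assumption: the text only says the prefix is applied before the files are used, not how the two parts are combined.

```python
def apply_prefix(path, prefix):
    """Return path with prefix prepended (hypothetical helper).

    Assumes the prefix is a directory joined with '/'. Entries that are
    missing (None) or marked '.' are returned unchanged, so a single-ended
    FASTQ2 stays single-ended.
    """
    if path is None or path == "." or not prefix:
        return path
    return prefix.rstrip("/") + "/" + path


# FASTQ1 and FASTQ2 values from the Sample2 row in the example table.
fastq1 = apply_prefix("fastq/S2/F2.fastq.gz", "/data/run1")
fastq2 = apply_prefix(".", "/data/run1")
```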
    
=== Reference Files ===  
