Difference between revisions of "BamUtil: mergeBam"

From Genome Analysis Wiki
Line 10:
  === Usage===
  <pre>
-  rgMergeBam [-v] [-l listFile] [-o outFile] [-L logFile] [inputBAMfile1] [inputBAMfile2] ...
-
- Required arguments:
-    -l listFile : File containing RG tags including [BAM] [ID] [SM] [LB]
-    -o outFile  : output BAM file name
- Optional arguments:
-    -L logFile  : log file name. default is listFile.log
-    -v : turn on verbose mode
+  rgMergeBam [-v] (options) --list=<RGAListFile> --out=<outBamFile>
+ rgMergeBam [-v] [-L logFile] [-l listFile] [-o outFile]
+ Required parameters :
+ --out/-o : Output BAM file (sorted)
+ --list/-l : RGAList File. Tab-delimited list consisting of the following columns (with headers):
+ BAM* : Input BAM file name to be merged
+ ID* : Unique read group identifier
+ SM* : Sample name
+ LB : Library name
+ DS : Description
+ PU : Platform unit
+ PI : Predicted median insert size
+ CN : Name of sequencing center producing the read
+ DT : Date the run was produced
+ PL : Platform/technology used to produce the read
+ * (Required fields)
+ Optional parameters :
+ --log/-L : Log file
+ --verbose/-v : Turn on verbose mode
  </pre>

Revision as of 18:29, 1 November 2010

RGMergeBAM : Merge multiple BAM files appending ReadGroup IDs

rgMergeBam merges multiple sorted BAM files into a single BAM file, like the 'samtools merge' command, but it also merges the BAM headers.

  • Checks that the @HD and @SQ header lines are identical across the input BAM files
  • Adds @RG header lines built from a tab-delimited list file that supplies the read group fields
  • Adds an RG:Z:[RGID] tag to each record, based on its source BAM file
  • Ensures that the headers are identical across the input files and that both the input and output BAM records are sorted
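
The checks and tagging listed above can be illustrated on SAM-formatted text. This is only a rough sketch of the idea, not rgMergeBam's actual implementation: the function names (`core_header`, `check_headers`, `tag_records`, `merge_sorted`) are hypothetical, records are plain tab-delimited SAM lines rather than BAM, and every record is assumed to be mapped to a reference listed in `ref_order`.

```python
import heapq


def core_header(header_lines):
    """Return only the @HD and @SQ lines, which must match across inputs."""
    return [ln for ln in header_lines if ln.startswith(("@HD", "@SQ"))]


def check_headers(headers):
    """Raise ValueError unless all inputs share identical @HD/@SQ lines."""
    reference = core_header(headers[0])
    for index, header in enumerate(headers[1:], start=1):
        if core_header(header) != reference:
            raise ValueError(f"input {index}: @HD/@SQ lines differ from input 0")
    return reference


def tag_records(records, rg_id):
    """Append an RG:Z:<rg_id> tag to each tab-delimited SAM record."""
    for record in records:
        yield f"{record}\tRG:Z:{rg_id}"


def merge_sorted(inputs, ref_order):
    """Merge coordinate-sorted inputs, given as (rg_id, records) pairs."""
    def position(record):
        # Assumes mapped records: field 3 is RNAME, field 4 is POS.
        fields = record.split("\t")
        return ref_order[fields[2]], int(fields[3])

    streams = [tag_records(records, rg_id) for rg_id, records in inputs]
    return list(heapq.merge(*streams, key=position))


# Hypothetical inputs: two single-record "files" with the same header.
header = ["@HD\tVN:1.0\tSO:coordinate", "@SQ\tSN:chr1\tLN:1000"]
check_headers([header, list(header)])
first = ["r1\t0\tchr1\t10\t30\t4M\t*\t0\t0\tACGT\tIIII"]
second = ["r2\t0\tchr1\t5\t30\t4M\t*\t0\t0\tACGT\tIIII"]
merged = merge_sorted([("RG1", first), ("RG2", second)], {"chr1": 0})
```

Because each input is already sorted, a streaming merge keeps the output sorted without loading every record into memory, and the read group tag records which source file each record came from.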


Usage

 rgMergeBam [-v] (options) --list=<RGAListFile> --out=<outBamFile>
 rgMergeBam [-v] [-L logFile] [-l listFile] [-o outFile]

Required parameters :
--out/-o : Output BAM file (sorted)
--list/-l : RGAList file. Tab-delimited list with a header row, consisting of the following columns:
	BAM* : Input BAM file name to be merged
	ID* : Unique read group identifier
	SM* : Sample name
	LB : Library name
	DS : Description
	PU : Platform unit
	PI : Predicted median insert size
	CN : Name of sequencing center producing the read
	DT : Date the run was produced
	PL : Platform/technology used to produce the read
	* (Required fields)
Optional parameters : 
--log/-L : Log file
--verbose/-v : Turn on verbose mode
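
To make the RGAList layout concrete, the following sketch reads a tab-delimited list with the columns described above, checks the required fields, and builds one @RG header line per row. It is an illustration under stated assumptions, not the tool's own parser: the helper names (`read_rga_list`, `rg_header_line`), the file names `s1.bam` and `s2.bam`, and the sample values are all hypothetical, and the order of the tags within the @RG line is a choice made here.

```python
import csv
import io

REQUIRED = ("BAM", "ID", "SM")
OPTIONAL = ("LB", "DS", "PU", "PI", "CN", "DT", "PL")


def read_rga_list(handle):
    """Parse a tab-delimited RGAList with a header row into a list of dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [col for col in REQUIRED if col not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required column(s): {', '.join(missing)}")
    rows, seen_ids = [], set()
    for row in reader:
        for col in REQUIRED:
            if not row[col]:
                raise ValueError(f"empty required field {col}")
        if row["ID"] in seen_ids:
            raise ValueError(f"duplicate read group ID {row['ID']}")
        seen_ids.add(row["ID"])
        rows.append(row)
    return rows


def rg_header_line(row):
    """Build an @RG line from one row; BAM is a file name, not an @RG tag."""
    tags = ("ID", "SM") + OPTIONAL
    fields = [f"{tag}:{row[tag]}" for tag in tags if row.get(tag)]
    return "@RG\t" + "\t".join(fields)


# Hypothetical list file with two inputs; the second row leaves LB empty.
example = (
    "BAM\tID\tSM\tLB\n"
    "s1.bam\tRG1\tNA12878\tlibA\n"
    "s2.bam\tRG2\tNA12878\t\n"
)
rows = read_rga_list(io.StringIO(example))
header_lines = [rg_header_line(row) for row in rows]
```

Optional columns that are absent or empty are simply left out of the @RG line, while a missing required column or a repeated ID is reported as an error before any merging would start.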