Difference between revisions of "BamUtil: mergeBam"

From Genome Analysis Wiki

Revision as of 14:51, 3 April 2012

Overview of the rgMergeBam function of bamUtil

The rgMergeBam option of the bamUtil executable merges multiple BAM files and appends ReadGroup IDs.

rgMergeBam merges multiple sorted BAM files into one BAM file, like the 'samtools merge' command, but it also merges the BAM headers:

  • Checks that the HD and SQ tags are identical across the BAM files
  • Adds @RG headers built from the read group fields given in a tab-delimited input file
  • Adds RG:Z:[RGID] tag for each record based on the source BAM file
  • Ensures that the headers are identical across the input files and that input/output BAM records are sorted
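
The steps above can be sketched in Python on toy data. This is only an illustration of the general technique (a k-way merge of already-sorted inputs, with each record tagged by its source), not bamUtil's actual implementation. The record layout (ref_index, pos, name), the dictionary-based headers, and the function names check_headers, rg_header_line, and merge_with_rg are all assumptions made for the example. Real coordinate-sorted BAM files also place unmapped reads at the end, which this sketch ignores.

```python
import heapq


def check_headers(headers):
    """Raise ValueError unless every input has the same HD and SQ header lines."""
    first = headers[0]
    for i, hdr in enumerate(headers[1:], start=1):
        if hdr["HD"] != first["HD"] or hdr["SQ"] != first["SQ"]:
            raise ValueError(f"input {i} has HD/SQ headers that differ from input 0")


def rg_header_line(fields):
    """Format one @RG header line from a dict of read group fields (ID, SM, ...)."""
    return "@RG\t" + "\t".join(f"{tag}:{value}" for tag, value in fields.items())


def tag_records(records, rg_id):
    """Append an RG:Z:<id> tag to every toy record from one input."""
    for ref_index, pos, name in records:
        yield (ref_index, pos, name, f"RG:Z:{rg_id}")


def merge_with_rg(inputs):
    """Merge sorted inputs given as (rg_id, header, records) triples.

    Records are hypothetical (ref_index, pos, name) tuples, already sorted
    by (ref_index, pos) within each input.
    """
    check_headers([header for _, header, _ in inputs])
    streams = [tag_records(records, rg_id) for rg_id, _, records in inputs]
    # heapq.merge keeps the output sorted without loading everything at once.
    return list(heapq.merge(*streams, key=lambda rec: (rec[0], rec[1])))


# Toy inputs: two "files" sharing one header, each sorted by position.
header = {"HD": "@HD\tVN:1.0\tSO:coordinate", "SQ": ["@SQ\tSN:1\tLN:1000"]}
run1 = [(0, 100, "readA"), (0, 300, "readC")]
run2 = [(0, 200, "readB"), (1, 50, "readD")]

merged = merge_with_rg([("RG1", header, run1), ("RG2", header, run2)])
for record in merged:
    print(record)
```

Each merged record carries the RG tag of the input it came from, so the source file can still be identified after merging.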


Usage

./bam rgMergeBam [-v] [--log logFile] --list <listFile> --out <outFile>

Parameters

Required parameters :
--out/-o : Output BAM file (sorted)
--list/-l : RGAList File. Tab-delimited list with a header row, consisting of the following columns:
	BAM* : Input BAM file name to be merged
	ID* : Unique read group identifier
	SM* : Sample name
	LB : Library name
	DS : Description
	PU : Platform unit
	PI : Predicted median insert size
	CN : Name of sequencing center producing the read
	DT : Date the run was produced
	PL : Platform/technology used to produce the read
	* (Required fields)
Optional parameters : 
--log/-L : Log file
--verbose/-v : Turn on verbose mode
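
As a rough illustration of the list file described above, the following Python sketch writes a tab-delimited file with a header row and checks the required columns (BAM, ID, SM) before writing. The file names, sample names, and the helper write_list_file are hypothetical. Writing only the columns that are actually used, in the order listed above, is an assumption that should be checked against the bamUtil documentation. The command is only assembled as an argument list here, not executed.

```python
import csv
import os
import tempfile

REQUIRED = ("BAM", "ID", "SM")
OPTIONAL = ("LB", "DS", "PU", "PI", "CN", "DT", "PL")


def write_list_file(path, rows):
    """Write read group rows (dicts keyed by column name) as a tab-delimited list."""
    for row in rows:
        missing = [col for col in REQUIRED if not row.get(col)]
        if missing:
            raise ValueError(f"row {row!r} is missing required fields: {missing}")
    ids = [row["ID"] for row in rows]
    if len(set(ids)) != len(ids):
        raise ValueError("read group IDs must be unique")
    # Assumption: only columns that appear in at least one row are written.
    columns = list(REQUIRED) + [c for c in OPTIONAL if any(c in r for r in rows)]
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=columns, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical inputs: two sorted BAM files from the same sample.
rows = [
    {"BAM": "run1.bam", "ID": "RG1", "SM": "sample1", "LB": "lib1", "PL": "ILLUMINA"},
    {"BAM": "run2.bam", "ID": "RG2", "SM": "sample1", "LB": "lib1", "PL": "ILLUMINA"},
]

list_path = os.path.join(tempfile.mkdtemp(), "rgmerge.list")
write_list_file(list_path, rows)

# Argument list following the usage line above; the output name is hypothetical.
command = ["./bam", "rgMergeBam", "--list", list_path, "--out", "merged.bam"]
print(" ".join(command))
```

The argument list can be passed to subprocess.run once the bam executable and the input files exist.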