BamUtil: mergeBam

From Genome Analysis Wiki
Revision as of 18:30, 1 November 2010 by Mktrost (talk | contribs)

RGMergeBAM : Merge multiple BAM files appending ReadGroup IDs

rgMergeBam merges multiple sorted BAM files into a single BAM file, like the 'samtools merge' command, but it also merges the BAM headers.

  • Checks that the HD and SQ tags are identical across the BAM files
  • Adds @RG headers from a tabular input file that supplies the read group fields
  • Adds RG:Z:[RGID] tag for each record based on the source BAM file
  • Ensures that the headers are identical across the input files and that input/output BAM records are sorted
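
The header check and the per-record tag described above can be sketched in a few lines of Python. This is only an illustration and not rgMergeBam's actual code: it works on SAM-style header text, and the helper names (header_lines, headers_compatible, rg_tag) are hypothetical. It also assumes that "identical" means the @HD and @SQ lines match exactly and in the same order, which this page does not spell out.

```python
def header_lines(header_text, record_types=("@HD", "@SQ")):
    """Return the @HD and @SQ lines of a SAM-style text header, in file order."""
    return [
        line
        for line in header_text.splitlines()
        if line.split("\t", 1)[0] in record_types
    ]


def headers_compatible(header_texts):
    """True if every header has the same @HD and @SQ lines as the first one.

    Assumption: exact string equality, order included. Other record types
    (for example @PG or @RG) are ignored by this check.
    """
    reference = header_lines(header_texts[0])
    return all(header_lines(text) == reference for text in header_texts[1:])


def rg_tag(rg_id):
    """Format the optional field added to each record from a given input file."""
    return f"RG:Z:{rg_id}"
```

With two headers that differ only in their @PG lines, headers_compatible returns True; if a single @SQ line differs, it returns False.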


Usage

rgMergeBam [-v] [--log logFile] --list listFile --out outFile

Required parameters :
--out/-o : Output BAM file (sorted)
--list/-l : RGAList File. Tab-delimited list consisting of the following columns (with headers):
	BAM* : Input BAM file name to be merged
	ID* : Unique read group identifier
	SM* : Sample name
	LB : Library name
	DS : Description
	PU : Platform unit
	PI : Predicted median insert size
	CN : Name of sequencing center producing the read
	DT : Date the run was produced
	PL : Platform/technology used to produce the read
	* (Required fields)
Optional parameters : 
--log/-L : Log file
--verbose/-v : Turn on verbose mode
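
As a rough illustration of the list file format, the Python sketch below writes a tab-delimited file with a header row and assembles the matching command line. The helper names (write_list_file, build_command) and the file names in the usage note are hypothetical. Writing all ten columns and leaving unused optional fields empty is an assumption; this page does not say how rgMergeBam treats empty or omitted optional columns.

```python
import csv

COLUMNS = ["BAM", "ID", "SM", "LB", "DS", "PU", "PI", "CN", "DT", "PL"]
REQUIRED = ("BAM", "ID", "SM")  # the fields marked * in the column list


def write_list_file(rows, handle):
    """Write rows (dicts keyed by column name) as a tab-delimited list file."""
    seen_ids = set()
    for row in rows:
        missing = [name for name in REQUIRED if not row.get(name)]
        if missing:
            raise ValueError("missing required field(s): " + ", ".join(missing))
        if row["ID"] in seen_ids:
            raise ValueError("duplicate read group ID: " + row["ID"])
        seen_ids.add(row["ID"])
    # Assumption: optional fields that are not supplied are written as empty.
    writer = csv.DictWriter(
        handle, fieldnames=COLUMNS, delimiter="\t", restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


def build_command(list_path, out_path, log_path=None, verbose=False):
    """Assemble the argument list following the usage line on this page."""
    cmd = ["rgMergeBam"]
    if verbose:
        cmd.append("-v")
    if log_path:
        cmd += ["--log", log_path]
    cmd += ["--list", list_path, "--out", out_path]
    return cmd
```

For example, after writing two rows to a file named list.txt, build_command("list.txt", "merged.bam", verbose=True) returns ["rgMergeBam", "-v", "--list", "list.txt", "--out", "merged.bam"], which can be passed to subprocess.run.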