BamUtil: mergeBam

From Genome Analysis Wiki

Overview of the mergeBam function of bamUtil

The mergeBam option on the bamUtil executable merges multiple BAM files, appending ReadGroup IDs if necessary.

As of version 1.0.7, this program was renamed from rgMergeBam to mergeBam.

mergeBam merges multiple sorted SAM/BAM files into one BAM file, like the 'samtools merge' command, but it also merges the BAM headers.

  • Checks that the non-RG header fields are identical across the BAM files
  • Checks that the input SAM/BAM records are sorted
  • If --list option is used:
    • Ensures that the headers are identical across the input files
    • Adds @RG headers from a tabular input file containing the fields' info
    • Adds RG:Z:[RGID] tag for each record based on the source BAM file
  • If --in is used:
    • Merges the RG headers from the files, checking that the RG IDs are unique or, where an ID is repeated, that the rest of its fields are the same
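
The RG merging rule for --in can be sketched in Python. This is only an illustration of the rule as described above, not bamUtil's actual code; the function name merge_read_groups, the dict representation of an @RG line, and the ignore_pi argument (modelled on --ignorePI) are all assumptions made for the example.

```python
def merge_read_groups(headers, ignore_pi=False):
    """Merge @RG records from several files, keyed by RG ID.

    `headers` holds one list per input file; each list contains dicts
    such as {"ID": "rg1", "SM": "s1"} (a hypothetical representation).
    """
    def comparable(rg):
        # Optionally leave PI out of the comparison, as --ignorePI does.
        return {k: v for k, v in rg.items() if not (ignore_pi and k == "PI")}

    merged = {}
    for file_index, read_groups in enumerate(headers):
        for rg in read_groups:
            earlier = merged.get(rg["ID"])
            if earlier is None:
                # First time this ID is seen: keep it as is.
                merged[rg["ID"]] = dict(rg)
            elif comparable(earlier) != comparable(rg):
                # Same ID but different fields: the headers conflict.
                raise ValueError(
                    f"file {file_index}: read group {rg['ID']!r} "
                    "conflicts with an earlier definition"
                )
    return list(merged.values())
```

A repeated ID with identical fields is kept once, and when ignore_pi is set the record from the first file that defined the ID is the one that survives.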


Usage

 mergeBam [-v] [--log logFile] [--ignorePI] --list <listFile> --out <outFile>

Parameters

Required parameters :
--out/-o : Output BAM file (sorted)
--in/-i  : BAM file to be input, must be more than one of these options.
            cannot be used with --list/-l
--list/-l : RGAList File. Tab-delimited list consisting of following columns (with headers):
	BAM* : Input BAM file name to be merged
	ID* : Unique read group identifier
	SM* : Sample name
	LB : Library name
	DS : Description
	PU : Platform unit
	PI : Predicted median insert size
	CN : Name of sequencing center producing the read
	DT : Date the run was produced
	PL : Platform/technology used to produce the read
	* (Required fields)
Optional parameters : 
--ignorePI/-I : Ignore the RG PI field when comparing headers
--log/-L : Log file
--verbose/-v : Turn on verbose mode
	PhoneHome:
		--noPhoneHome       : disable PhoneHome (default enabled)
		--phoneHomeThinning : adjust the PhoneHome thinning parameter (default 50)

Required Parameters

Input Files (--in, --list)

Use multiple --in parameters each followed by an input file name to specify the SAM/BAM input files (more than one --in must be specified).
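
As a rough illustration of the --in form, the following Python sketch assembles an argument list with one --in per input file. The executable name bam, the helper name build_merge_command, and the file names are assumptions for the example; only the --in and --out options come from this page.

```python
def build_merge_command(inputs, output, executable="bam"):
    """Build an argument list with one --in per input file.

    `executable` is an assumed name for the bamUtil binary; adjust it
    to match your installation.
    """
    if len(inputs) < 2:
        # More than one --in must be specified.
        raise ValueError("mergeBam needs more than one --in file")
    command = [executable, "mergeBam"]
    for path in inputs:
        command += ["--in", path]
    command += ["--out", output]
    return command


# Hypothetical file names, for illustration only.
command = build_merge_command(["a.bam", "b.bam"], "merged.bam")
```

Building a list rather than a single string avoids shell quoting problems when file names contain spaces.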

Alternatively, an RGAList file can be specified using --list. The file contains a header row with a tab-delimited list of the included columns, followed by one row per BAM file with a tab-delimited list of the values for each column.

The possible header column names are (* indicates required fields):

  • BAM* : Input BAM file name to be merged
  • ID* : Unique read group identifier
  • SM* : Sample name
  • LB : Library name
  • DS : Description
  • PU : Platform unit
  • PI : Predicted median insert size
  • CN : Name of sequencing center producing the read
  • DT : Date the run was produced
  • PL : Platform/technology used to produce the read
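
A minimal Python sketch of producing such an RGAList file is shown here. The helper name write_rga_list, the sample values, and the choice to require every row to carry the same columns are assumptions for the example; the column names and the required BAM, ID, and SM fields come from the list above.

```python
import csv
import io

REQUIRED = ("BAM", "ID", "SM")
OPTIONAL = ("LB", "DS", "PU", "PI", "CN", "DT", "PL")


def write_rga_list(rows, handle):
    """Write a header row plus one tab-delimited row per BAM file."""
    if not rows:
        raise ValueError("at least one row is needed")
    present = set(rows[0])
    if not present <= set(REQUIRED + OPTIONAL):
        raise ValueError("unknown column name")
    # Required columns first, then whichever optional ones are used.
    columns = [c for c in REQUIRED + OPTIONAL if c in present]
    seen_ids = set()
    for row in rows:
        missing = [c for c in REQUIRED if not row.get(c)]
        if missing:
            raise ValueError(f"missing required column(s): {missing}")
        if set(row) != present:
            raise ValueError("all rows must use the same columns")
        if row["ID"] in seen_ids:
            raise ValueError(f"duplicate read group ID {row['ID']!r}")
        seen_ids.add(row["ID"])
    writer = csv.DictWriter(
        handle, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical values; an in-memory buffer stands in for the list file.
buffer = io.StringIO()
write_rga_list(
    [
        {"BAM": "a.bam", "ID": "rg1", "SM": "s1", "LB": "lib1"},
        {"BAM": "b.bam", "ID": "rg2", "SM": "s1", "LB": "lib1"},
    ],
    buffer,
)
```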


The program automatically determines whether your input files are SAM, BAM, or uncompressed BAM.

Output File (--out)

Use --out followed by your file name to specify the SAM/BAM output file.

The file extension determines whether SAM, BAM, or uncompressed BAM is written. Use - in place of the file name to write to stdout, followed by the extension for the file type (no extension means SAM).

SAM to file                   --out yourFileName.sam
BAM to file                   --out yourFileName.bam
Uncompressed BAM to file      --out yourFileName.ubam
SAM to stdout                 --out -
BAM to stdout                 --out -.bam
Uncompressed BAM to stdout    --out -.ubam


Note: Uncompressed BAM is written using compression level 0 (so it is not an entirely uncompressed file). This matches the samtools implementation, so pipes between our tools and samtools are supported.
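
The extension rules above can be summarised in a small Python sketch. It covers only the six forms listed in this section and is not bamUtil's own logic; the function name output_format and the decision to reject any other value are assumptions for the example.

```python
SUFFIX_TO_FORMAT = {
    ".sam": "SAM",
    ".bam": "BAM",
    ".ubam": "uncompressed BAM",
}


def output_format(out_arg):
    """Return (destination, format) for a --out value."""
    if out_arg == "-":
        # A bare dash means stdout, and no extension means SAM.
        return ("stdout", "SAM")
    for suffix, file_format in SUFFIX_TO_FORMAT.items():
        if out_arg.endswith(suffix):
            # "-.bam" and "-.ubam" select stdout with that format.
            destination = "stdout" if out_arg == "-" + suffix else "file"
            return (destination, file_format)
    # Anything else is outside the documented examples.
    raise ValueError(f"extension of {out_arg!r} is not covered here")
```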

Optional Parameters

Ignore the RG PI Field (--ignorePI or -I)

Use --ignorePI to ignore the RG PI field when comparing headers. The field from the first header will be used in the output file.

This parameter was added in version 1.0.10.

Specify Log Filename (--log)

Use --log followed by a file name to specify the log file. The default is the output file's basename with a .log extension.

Verbose (--verbose)

Use --verbose to turn on verbose mode.


PhoneHome Parameters

See PhoneHome for more information on how PhoneHome works and what it does.

Turn off PhoneHome (--noPhoneHome)

Use the --noPhoneHome option to completely disable PhoneHome. PhoneHome is enabled by default based on the thinning parameter.

Adjust the Frequency of PhoneHome (--phoneHomeThinning)

Use --phoneHomeThinning to modify the percentage of the time that PhoneHome will run (0-100).

  • By default, --phoneHomeThinning is set to 50, running 50% of the time.
  • PhoneHome will only occur if the run's random number modulo 100 is less than the --phoneHomeThinning value.
  • N/A if --noPhoneHome is set.
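
The thinning rule can be sketched in Python as follows. How bamUtil actually draws the run's random number is not stated here, so the use of Python's random module, the range of the draw, and the function name should_phone_home are assumptions for the example.

```python
import random


def should_phone_home(thinning=50, no_phone_home=False, rng=random):
    """Decide whether PhoneHome would run for this invocation."""
    if no_phone_home:
        # --noPhoneHome disables PhoneHome regardless of thinning.
        return False
    if not 0 <= thinning <= 100:
        raise ValueError("thinning must be between 0 and 100")
    # Run only if the random number modulo 100 is below the threshold.
    return rng.randrange(2**31) % 100 < thinning
```

Under this rule a thinning value of 0 never triggers PhoneHome and a value of 100 always does.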


Return Value

Returns 0 on success, non-0 on failure.
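
A script that wraps mergeBam can test this return value before using the output. The Python sketch below shows one way to do so; the function name merge_succeeded is an assumption, and the two stand-in commands replace a real mergeBam invocation so that the example runs without bamUtil installed.

```python
import subprocess
import sys


def merge_succeeded(command):
    """Run a command and report whether it returned 0."""
    completed = subprocess.run(command)
    return completed.returncode == 0


# Stand-in commands that exit with 0 and with a non-zero status.
ok = [sys.executable, "-c", "raise SystemExit(0)"]
bad = [sys.executable, "-c", "raise SystemExit(3)"]
```

In real use, command would be the mergeBam argument list, and a False result would mean the output file should not be trusted.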