BamUtil
[[Category:BAM Software]]
The software reads the beginning of an input file to determine whether it is SAM or BAM. To determine the format of the output file, the software checks the output file's extension: if the extension is ".bam", it writes a BAM file; otherwise it writes a SAM file.
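The two checks described above can be sketched in Python. This is only an illustration, not the BamUtil implementation: it assumes that a BAM file is recognizable by the gzip magic bytes followed, after decompression, by the BAM magic string, and the function names <code>sniff_input_format</code> and <code>output_format</code> are hypothetical.

```python
import os
import zlib


def sniff_input_format(data: bytes) -> str:
    """Guess 'BAM' or 'SAM' from the first bytes of a file (assumed logic)."""
    # BAM files are BGZF (gzip-compatible) streams, so they start with 1f 8b.
    if data[:2] != b"\x1f\x8b":
        return "SAM"
    try:
        # wbits=31 tells zlib to expect a gzip header; read only 4 bytes.
        magic = zlib.decompressobj(wbits=31).decompress(data, 4)
    except zlib.error:
        return "SAM"
    # The decompressed BAM stream begins with the magic string "BAM\1".
    return "BAM" if magic == b"BAM\x01" else "SAM"


def output_format(path: str) -> str:
    """Pick the output format from the extension, as the text describes."""
    return "BAM" if os.path.splitext(path)[1] == ".bam" else "SAM"
```

For example, <code>output_format("result.bam")</code> yields "BAM", while any other extension falls back to "SAM".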
The bam executable has the following functions.
== validate ==
The <code>validate</code> option on the bam executable reads and validates a SAM/BAM file. This option is documented at: [[BamValidator]]
== convert ==
The <code>convert</code> option on the bam executable reads a SAM/BAM file and writes it as a SAM/BAM file.
The executable converts the input file into the format of the output file. To convert a BAM file to a SAM file, call the following from the pipeline/bam/ directory:
./bam convert --in <bamFile>.bam --out <newSamFile>.sam
Remember to supply the paths to the executable and to your files.
=== Parameters ===
Required Parameters:
--in : the SAM/BAM file to be read
--out : the SAM/BAM file to be written
Optional Parameters:
--noeof : do not expect an EOF block on a bam file.
--params : print the parameter settings
=== Usage ===
./bam convert --in <inputFile> --out <outputFile.sam/bam/ubam (ubam is uncompressed bam)> [--noeof] [--params]
=== Example Output ===
Number of records read = 10
Number of records written = 10
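When conversions are scripted, the Usage line above can be assembled programmatically. The following Python sketch only builds the argument list; the helper name <code>build_convert_command</code> and its keyword arguments are hypothetical, and the default executable path <code>./bam</code> is an assumption taken from the examples on this page.

```python
def build_convert_command(in_path, out_path, noeof=False, params=False,
                          bam="./bam"):
    """Return the argv list for 'bam convert' following the Usage line."""
    cmd = [bam, "convert", "--in", in_path, "--out", out_path]
    if noeof:
        # Optional flag: do not expect an EOF block on the BAM file.
        cmd.append("--noeof")
    if params:
        # Optional flag: print the parameter settings.
        cmd.append("--params")
    return cmd
```

The resulting list can be passed to <code>subprocess.run(cmd, check=True)</code> once the executable path and file paths are valid on your system.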
== dumpHeader ==
The <code>dumpHeader</code> option on the bam executable prints the header of the specified SAM/BAM file to cout.
=== Parameters ===
Required Parameters:
filename : the sam/bam filename whose header should be printed.
=== Usage ===
./bam dumpHeader <inputFile>
=== Return Value ===
=== Example Output ===
@SQ SN:1 LN:247249719
@SQ SN:2 LN:242951149
@SQ SN:3 LN:199501827
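Header output like the example above can be post-processed into a map from reference name to length. The Python sketch below is illustrative; the function name <code>parse_sq_lines</code> is hypothetical. It relies only on the SAM convention that <code>@SQ</code> lines carry <code>SN</code> (name) and <code>LN</code> (length) tags, and it splits on any whitespace so that both tab-separated and space-separated text are accepted.

```python
def parse_sq_lines(header_text):
    """Map each @SQ reference name (SN) to its length (LN)."""
    refs = {}
    for line in header_text.splitlines():
        fields = line.split()
        if not fields or fields[0] != "@SQ":
            continue  # skip non-@SQ header lines and blank lines
        # Each remaining field looks like TAG:VALUE.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        if "SN" in tags and "LN" in tags:
            refs[tags["SN"]] = int(tags["LN"])
    return refs
```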
== splitChromosome ==
The <code>splitChromosome</code> option on the bam executable splits an indexed BAM file into multiple files based on the Chromosome (Reference Name).
The output files all share the same base name, followed by _#, where # is the associated reference id from the BAM file.
=== Parameters ===
Required Parameters:
--in : the BAM file to be split
--out : the base filename for the output files
Optional Parameters:
--noeof : do not expect an EOF block on a bam file.
--bamIndex : the path/name of the bam index file
(if not specified, uses the --in value + ".bai")
--bamout : write the output files in BAM format (default).
--samout : write the output files in SAM format.
--params : print the parameter settings
=== Usage ===
./bam splitChromosome --in <inputFilename> --out <outputFileBaseName> [--bamIndex <bamIndexFile>] [--noeof] [--bamout|--samout] [--params]
=== Example Output ===
Reference ID -1 has 2 records
Reference ID 0 has 5 records
Number of records = 10
Returning: 0 (SUCCESS)
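If the per-reference counts are needed in a script, output shaped like the example above can be parsed as follows. This Python sketch assumes the line format shown in the example ("Reference ID N has M records"), which may differ between versions; the function name <code>parse_split_counts</code> is hypothetical.

```python
import re

# Pattern taken from the example output on this page (an assumption).
_COUNT_LINE = re.compile(r"^Reference ID (-?\d+) has (\d+) records$")


def parse_split_counts(output):
    """Return {reference_id: record_count} from splitChromosome-style text."""
    counts = {}
    for line in output.splitlines():
        match = _COUNT_LINE.match(line.strip())
        if match:
            counts[int(match.group(1))] = int(match.group(2))
    return counts
```

Lines that do not match the pattern, such as the total record count or the return status, are ignored.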
== writeRegion ==
The <code>writeRegion</code> option on the bam executable writes the alignments in the indexed BAM file that fall into the specified region (reference id and start/end position).
=== Parameters ===
Required Parameters:
--in : the BAM file to be read
--out : the file to be written
Optional Parameters:
--noeof : do not expect an EOF block on a bam file.
--bamIndex : the path/name of the bam index file
(if not specified, uses the --in value + ".bai")
--refName : the BAM reference Name to read (either this or refID can be specified)
--refID : the BAM reference ID to read (defaults to -1: unmapped)
--start : 0-based start position
--end : exclusive 0-based end position (defaults to -1, meaning until the end of the reference)
--params : print the parameter settings
=== Usage ===
./bam writeRegion --in <inputFilename> --out <outputFilename> [--bamIndex <bamIndexFile>] [--noeof] [--refName <reference Name> | --refID <reference ID>] [--start <0-based start pos>] [--end <0-based end position>] [--params]
=== Return Value ===
=== Example Output ===
Wrote t.sam with 2 records.
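Because the start and end positions are 0-based and the end is exclusive, coordinates taken from 1-based sources (such as the POS column of a SAM record) need converting first. The Python sketch below shows that conversion and a half-open interval test. It is illustrative only: the function names are hypothetical, treating a start of 0 as "from the beginning" is an assumption, and whether <code>writeRegion</code> selects alignments by overlap or by containment is not stated on this page, so the overlap test is also an assumption.

```python
def to_zero_based_half_open(start_1based, end_1based):
    """Convert a 1-based inclusive range to 0-based, end-exclusive."""
    return start_1based - 1, end_1based


def overlaps_region(aln_start, aln_end, region_start=0, region_end=-1):
    """Half-open overlap test; all positions 0-based, ends exclusive.

    region_end == -1 means 'until the end of the reference', mirroring
    the documented --end default.
    """
    if region_end == -1:
        return aln_end > region_start
    return aln_start < region_end and aln_end > region_start
```

For instance, the 1-based inclusive range 100 to 200 becomes <code>--start 99 --end 200</code>.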
== dumpRefInfo ==
The <code>dumpRefInfo</code> option on the bam executable prints the SAM/BAM file's reference information.
=== Parameters ===
Required Parameters:
--in : the SAM/BAM file to be read
Optional Parameters:
--noeof : do not expect an EOF block on a bam file.
--printRecordRefs : print the reference information for the records in the file (grouped by reference).
--params : print the parameter settings
=== Usage ===
./bam dumpRefInfo --in <inputFilename> [--noeof] [--printRecordRefs] [--params]
=== Return Value ===
== dumpIndex ==
The <code>dumpIndex</code> option on the bam executable prints a BAM index file in an easy-to-read format.
=== Parameters ===
Required Parameters:
--bamIndex : the path/name of the bam index file to display
Optional Parameters:
--refID : the reference ID to display
--summary : only print a summary - 1 line per reference.
--params : print the parameter settings
=== Usage ===
./bam dumpIndex --bamIndex <bamIndexFile> [--refID <ref#>] [--summary] [--params]
=== Return Value ===
== readIndexedBam ==
The <code>readIndexedBam</code> option on the bam executable reads an indexed BAM file one reference id at a time, from -1 up to the maximum reference id, and writes it out as a SAM/BAM file.
=== Parameters ===
Required Parameters:
inputFilename - path/name of the input BAM file
outputFile.sam/bam - path/name of the output file
bamIndexFile - path/name of the BAM index file
=== Usage ===
./bam readIndexedBam <inputFilename> <outputFile.sam/bam> <bamIndexFile>
=== Return Value ===
== filter ==
The <code>filter</code> option on the bam executable filters the reads in a SAM/BAM file. This option is documented at: [[Bam Executable: Filter]]
== readReference ==
The <code>readReference</code> option on the bam executable prints the specified region of the reference sequence in an easy-to-read format.
=== Parameters ===
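Required Parameters:
--refFile : the reference file
--refName : the reference name to read
--start : 0-based start position
--end : 0-based end position (either this or numBases must be specified)
--numBases : number of bases from start to display (either this or end must be specified)
Optional Parameters:
--params : print the parameter settings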
=== Usage ===
./bam readReference --refFile <referenceFilename> --refName <reference Name> --start <0 based start> --end <0 based end>|--numBases <number of bases> [--params]
=== Return Value ===
=== Example Output ===
open and prefetch reference genome /home/mktrost/data/human.g1k.v37.fa: done.
GGCAAAATGTATATAATTATGGCATGAGGTATGCAACTTTAGGCAAGGAAGCAAAAGCAGAAACCATGAAA
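The relationship between <code>--start</code>, <code>--end</code>, and <code>--numBases</code> can be sketched in Python. This is not the BamUtil implementation: the function names are hypothetical, and the sketch assumes the end position is exclusive, which this page states for <code>writeRegion</code> but not for <code>readReference</code>.

```python
def resolve_end(start, end=None, num_bases=None):
    """Return the 0-based end (assumed exclusive) from --end or --numBases."""
    if (end is None) == (num_bases is None):
        raise ValueError("specify exactly one of end or num_bases")
    return end if end is not None else start + num_bases


def fetch_bases(sequence, start, end=None, num_bases=None):
    """Slice a reference sequence held in memory as a plain string."""
    stop = resolve_end(start, end, num_bases)
    return sequence[start:stop]
```

Under that assumption, a start of 2 with 3 bases and a start of 2 with an end of 5 select the same bases.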