bamUtil is a repository that contains several programs that perform operations on SAM/BAM files. All of these programs are built into a single executable, bam.
Where to Find It
The bamUtil repository is available both via release downloads (coming soon) and via GitHub.
On GitHub, you can browse and download the latest version of the repository and explore its history of changes.
You can access the latest version with or without git.
If you download from GitHub or use git to keep up to date, you also need to download our library, libStatGen.
The releases will be available both with and without libStatGen included. If you download the version without libStatGen included, you will also need to download libStatGen separately. (The version without libStatGen is offered in case you already have a downloaded copy of libStatGen that you want to use.)
Release downloads are coming soon.
Using Git To Track the Current Development Version
Clone (get your own copy)
You can create your own git clone (copy) using:
git clone https://github.com/statgen/bamUtil.git
git clone git://github.com/statgen/bamUtil.git
Either of these commands creates a directory called bamUtil in the current directory.
Then cd bamUtil and compile.
Get the latest Updates (update your copy)
To update your copy to the latest version (a major advantage of using git):
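The update command itself is not shown here. As a minimal sketch, assuming a standard git workflow, git pull fetches and merges the latest changes, after which the code can be rebuilt with make all. The function name update_bamutil and its directory argument are hypothetical, used only for illustration.

```shell
# Hypothetical helper: update an existing bamUtil clone and rebuild it.
# Usage: update_bamutil path/to/bamUtil
update_bamutil() {
  # Run in a subshell so the caller's working directory is left unchanged.
  (
    cd "$1" || exit 1
    # Pull the latest changes, then rebuild only if the pull succeeded.
    git pull && make all
  )
}
```

Call it with the path to your clone, for example update_bamutil bamUtil from the directory where you ran git clone.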
Downloading From GitHub Without Git
Periodically download the latest copy from GitHub using the "Downloads" link on the webpage: https://github.com/statgen/bamUtil/archives/master.
The downloaded tar file is named "statgen-bamUtil-someHexNumber.tar.gz". The directory created when it is untarred shares the same base name. I recommend that you do not change the name of the directory. If you want one called bamUtil, create a link to this directory. The hex number in the directory name identifies the version of the repository that you downloaded, which makes it much easier to troubleshoot any issues you encounter. If you must rename the directory, be sure to record the hex number from the download for future reference.
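One way to follow this advice is with a symbolic link, sketched below. The directory name is the placeholder used in the text, not a real download, so substitute the name your archive actually produced. The sketch assumes the name follows the statgen-bamUtil-someHexNumber pattern described above.

```shell
# Placeholder name from the text; replace it with the directory your download created.
src_dir="statgen-bamUtil-someHexNumber"

# Under the naming pattern above, the version identifier is the part after the last hyphen.
version="${src_dir##*-}"
echo "bamUtil version: $version"

# Create a link named bamUtil that points at the versioned directory,
# leaving the original directory name (and its hex number) intact.
ln -s "$src_dir" bamUtil
```

Because the link only points at the versioned directory, the hex number remains visible via ls -l bamUtil whenever you need it for troubleshooting.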
After obtaining the bamUtil repository (either by download or from GitHub), compile the code using make all. This creates the executable, bam, in the bamUtil/bin/ directory and the debug executable in the bamUtil/bin/debug/ directory. It also builds a profiling executable.
The software reads the beginning of an input file to determine whether it is SAM or BAM. To determine the format of the output file, the software checks the output file's extension: if the extension is ".bam" it writes a BAM file; otherwise it writes a SAM file.
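The output rule can be mirrored in a small shell helper, for example to predict which format a given output file name will produce. This is a sketch of the rule as stated here, not the program's actual code. The function name output_format is hypothetical, and the match is case-sensitive because the text only mentions ".bam".

```shell
# Hypothetical helper: report the format implied by an output file name.
output_format() {
  case "$1" in
    *.bam) echo "BAM" ;;  # extension is ".bam": BAM output
    *)     echo "SAM" ;;  # anything else: SAM output
  esac
}

output_format sample.bam   # prints BAM
output_format sample.sam   # prints SAM
output_format sample.txt   # prints SAM
```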
The bam executable has the following functions.
- Rewrite SAM/BAM Files
  - convert - Read a SAM/BAM file and write it as a SAM/BAM file (optionally converting between '=' and bases in the sequence)
  - splitChromosome - Split a BAM file by chromosome
  - writeRegion - Write the alignments in the indexed BAM file that fall into the specified region and/or have the specified read name
  - findCigars - Output just the reads that contain any of the specified CIGAR operations
  - readIndexedBam - Read an indexed BAM file reference by reference, from reference id -1 to the max reference id, and write it out as a SAM/BAM file
- Modify & write SAM/BAM Files
  - filter - Filter reads by clipping ends with too high a mismatch percentage and by marking reads unmapped if the quality of mismatches is too high
  - revert - Revert a SAM/BAM file, replacing the specified fields with their previous values (if known) and removing the specified tags
  - squeeze - Reduce file size by dropping OQ fields, duplicates, and specified tags, using '=' when a base matches the reference, binning quality scores, and replacing read names with unique integers
- Informational Tools
  - Print Information in Readable Form:
This executable is built using the C++ library libStatGen.
Running ./bam with no arguments prints the usage information for the bam executable.