bamUtil is a repository that contains several programs that perform operations on SAM/BAM files. All of these programs are built into a single executable, <code>bam</code>.
== Where to Find It ==
The bamUtil repository is available both via release downloads (coming soon) and via github.
On github, you can both browse and download the latest version of the repository as well as explore the history of changes.
You can access the latest version with or without git.
Whether you download from github or use git to keep up to date, you also need to download our library: [[C++ Library: libStatGen|libStatGen]].
The releases will be available both with and without libStatGen included. If you download the version without libStatGen included, you will also need to download libStatGen separately. (It will be available without libStatGen in case you already have a downloaded version of libStatGen that you want to use.)
=== Releases ===
Release downloads are '''Coming Soon'''.
=== Using github ===
==== Using Git To Track the Current Development Version ====
===== Clone (get your own copy) =====
You can create your own git clone (copy) using:
 git clone https://github.com/statgen/bamUtil.git
or
 git clone git://github.com/statgen/bamUtil.git
Either of these commands creates a directory called <code>bamUtil</code> in the current directory.
Then just <code>cd bamUtil</code> and [[BamUtil#Building|compile]].
===== Get the latest Updates (update your copy) =====
To update your copy to the latest version (a major advantage of using git):
# <code>cd pathToYourCopy/bamUtil</code>
# <code>make clean</code>
# <code>git pull</code>
# <code>make all</code>
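The update steps above can be wrapped in a small shell function so that a failure in one step stops the rest. This is only a sketch: the function name <code>update_bamutil</code> is hypothetical and not part of bamUtil, and it assumes <code>make</code> and <code>git</code> are on your path.

```shell
# update_bamutil DIR: run the update steps inside DIR, stopping at the first failure.
# (Hypothetical helper, not shipped with bamUtil.)
update_bamutil() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "No such directory: $dir" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is left unchanged.
    ( cd "$dir" && make clean && git pull && make all )
}
```

For example, <code>update_bamutil pathToYourCopy/bamUtil</code> runs the same four steps as the list above.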
===== Git Refresher =====
If you decide to use git, but need a refresher, see [[How To Use Git]] or [https://statgen.sph.umich.edu/wiki/How_To_Use_Git Notes on how to use git] (if you have access).
==== Downloading From GitHub Without Git ====
Periodically download the latest copy from github from the "Downloads" link on the webpage: https://github.com/statgen/bamUtil/archives/master.
The downloaded tar file is named "statgen-bamUtil-someHexNumber.tar.gz". The directory created when it is untarred shares the same base name. I recommend that you do not change the name of the directory. If you want one called bamUtil, create a link to this directory. The hex number in the directory name identifies the version of the repository that you downloaded and is necessary to easily troubleshoot any issues you encounter. If you must rename the directory, be sure to record the hex number that was on the download for future reference.
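One way to create the link while keeping the hex number on record is sketched below. The function name <code>link_download</code> is hypothetical, and "someHexNumber" stands in for whatever hex number appears on your download.

```shell
# link_download DIR: create a symbolic link named bamUtil that points at DIR,
# then print the part of DIR after the last hyphen (the hex number).
# (Hypothetical helper, not shipped with bamUtil.)
link_download() {
    download_dir="$1"
    ln -s "$download_dir" bamUtil || return 1
    # Strip everything up to and including the last '-' in the directory name.
    echo "${download_dir##*-}"
}
```

Running <code>link_download statgen-bamUtil-someHexNumber</code> in the directory that holds the untarred download leaves the original directory name intact, so the version can always be read back from the link target.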
= Building =
After obtaining the bamUtil repository (either by download or from github), compile the code using <code>make all</code>. This creates the executable, <code>bam</code>, in the <code>bamUtil/bin/</code> directory, the debug executable in the <code>bamUtil/bin/debug/</code> directory, and the profiling executable in the <code>bamUtil/bin/profile/</code> directory.
= Programs =
The software reads the beginning of an input file to determine if it is SAM/BAM. To determine the format (SAM/BAM) of the output file, the software checks the output file's extension. If the extension is ".bam" it writes a BAM file, otherwise it writes a SAM file.
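The output-format rule described above can be illustrated with a short shell function. This is a sketch of the rule as stated here, not bamUtil's own code; the function name <code>output_format</code> is hypothetical.

```shell
# output_format FILE: print the format that would be written for FILE,
# following the rule stated above: ".bam" extension means BAM, anything else SAM.
# (Illustration only, not bamUtil's implementation.)
output_format() {
    case "$1" in
        *.bam) echo "BAM" ;;
        *)     echo "SAM" ;;
    esac
}
```

Under this rule, an output file named <code>out.bam</code> is written as BAM, while <code>out.sam</code> or <code>out.txt</code> is written as SAM.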
The bam executable has the following functions.
* Rewrite SAM/BAM Files
** [[BamUtil: convert|'''convert''' - Read a SAM/BAM file and write as a SAM/BAM file (optionally converts between '=' & bases in the sequence)]]
** [[BamUtil: splitChromosome|'''splitChromosome''' - Split BAM by Chromosome]]
** [[BamUtil: writeRegion|'''writeRegion''' - Write the alignments in the indexed BAM file that fall into the specified region and/or have the specified read name]]
** [[BamUtil: findCigars|'''findCigars''' - Output just the reads that contain any of the specified CIGAR operations]]
** [[BamUtil: readIndexedBam|'''readIndexedBam''' - Read an indexed BAM file reference by reference id -1 to the max reference id and write it out as a SAM/BAM file]]
* Modify & write SAM/BAM Files
** [[BamUtil: filter|'''filter''' - Filter reads by clipping ends with too high of a mismatch percentage and by marking reads unmapped if the quality of mismatches is too high]]
** [[BamUtil: revert|'''revert''' - Revert SAM/BAM replacing the specified fields with their previous values (if known) and removes specified tags]]
** [[BamUtil: squeeze|'''squeeze''' - Reduces file size by dropping OQ fields, duplicates, specified tags, using '=' when a base matches the reference, binning quality scores, and replacing readNames with unique integers]]
* Informational Tools
** [[BamUtil: validate|'''validate''' - Read and Validate a SAM/BAM file]]
** [[BamUtil: diff|'''diff''' - Print the diffs between 2 bams]]
** [[BamUtil: stats|'''stats''' - Generate statistics on a SAM/BAM file]]
* Print Information in Readable Form:
** [[BamUtil: dumpHeader|'''dumpHeader''' - Print SAM/BAM header]]
** [[BamUtil: dumpRefInfo|'''dumpRefInfo''' - Print SAM/BAM Reference Information]]
** [[BamUtil: dumpIndex|'''dumpIndex''' - Dump a BAM index file into an easy to read text version]]
** [[BamUtil: readReference|'''readReference''' - Print the reference string for the specified region]]
This executable is built using [[C++ Library: libStatGen]].
Running <code>./bam</code> with no arguments prints the usage information for the bam executable.