Difference between revisions of "BamUtil"

From Genome Analysis Wiki
(Releases)
Line 29: Line 29:
  
  
[[Media:BamUtilLibStatGen.1.0.7.tgz|BamUtilLibStatGen.1.0.7.tgz‎]] - Released 1/29/2013
+
[[Media:BamUtilLibStatGen.1.0.9.tgz|BamUtilLibStatGen.1.0.9.tgz‎]] - Released 7/7/2013
  
'''BamUtilLibStatGen.1.0.7 Release Notes'''
+
'''BamUtilLibStatGen.1.0.9 Release Notes'''
* Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.7]]  
+
* Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.9]]  
* Contains: [[#Release of just BamUtil (does not include libStatGen)|bamUtil version 1.0.7]]
+
* Contains: [[#Release of just BamUtil (does not include libStatGen)|bamUtil version 1.0.9]]
* Update to fix some compile issues on ubuntu 12.10
+
* Update to [[BamUtil: mergeBam|mergeBam]]
* Update use of SamRecord::getStringTag to expect the return of a const string pointer due to libStatGen v1.0.7 updates
+
** Update to ignore PG lines with duplicate IDs
* Update SamReferenceInfo usage due to libStatGen v1.0.7 updates
+
** Update to accept merges of matching RG lines
* Update to [[BamUtil: diff|diff]]
+
** Update to log to stderr if no log/out file is specified
**  Fix DIFF to test and properly handle running out of available records.  Previously no message was printed when this happened and there was a bug for which file it freed
 
* Update to [[BamUtil: clipOverlap|clipOverlap]]
 
** Update to facilitate adding other overlap handling functions
 
* Update to [[BamUtil: mergeBam|mergeBam]] (formerly RGMergeBam)
 
** Rename RGMergeBam to MergeBam
 
** Update to handle files that already have an RG
 
  
 
'''Older Releases'''
 
'''Older Releases'''
 +
* There is no version 1.0.8.  It was skipped to stay in line with libStatGen versions (libStatGen 1.0.8 added vcf support)
 +
*[[Media:BamUtilLibStatGen.1.0.7.tgz|BamUtilLibStatGen.1.0.7.tgz‎]] - Released 1/29/2013
 +
** Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.7]]
 +
** Contains: [[#Release of just BamUtil (does not include libStatGen)|bamUtil version 1.0.7]]
 +
** Update to fix some compile issues on ubuntu 12.10
 +
** Update use of SamRecord::getStringTag to expect the return of a const string pointer due to libStatGen v1.0.7 updates
 +
** Update SamReferenceInfo usage due to libStatGen v1.0.7 updates
 +
** Update to [[BamUtil: diff|diff]]
 +
***  Fix DIFF to test and properly handle running out of available records.  Previously no message was printed when this happened and there was a bug for which file it freed
 +
** Update to [[BamUtil: clipOverlap|clipOverlap]]
 +
*** Update to facilitate adding other overlap handling functions
 +
** Update to [[BamUtil: mergeBam|mergeBam]] (formerly RGMergeBam)
 +
*** Rename RGMergeBam to MergeBam
 +
*** Update to handle files that already have an RG
 +
 
*[[Media:BamUtilLibStatGen.1.0.6.tgz|BamUtilLibStatGen.1.0.6.tgz‎]] - Released 11/14/2012
 
*[[Media:BamUtilLibStatGen.1.0.6.tgz|BamUtilLibStatGen.1.0.6.tgz‎]] - Released 11/14/2012
 
** Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.6]]  
 
** Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.6]]  
Line 80: Line 89:
 
To install an official release, unpack the downloaded file (tar xvf), cd into the bamUtil_x.x.x directory and type make all.
 
To install an official release, unpack the downloaded file (tar xvf), cd into the bamUtil_x.x.x directory and type make all.
  
*[[Media:BamUtil.1.0.7.tgz|BamUtil.1.0.7.tgz‎]] - Released 1/29/2013
+
* [[Media:BamUtil.1.0.9.tgz|BamUtil.1.0.9.tgz‎]] - Released 7/7/2013
  
'''BamUtil.1.0.7 Release Notes'''
+
'''BamUtil.1.0.9 Release Notes'''
* Requires, but does not include: [[LibStatGen Download#Official Releases|libStatGen version 1.0.7]] or above
+
* Requires, but does not include: [[LibStatGen Download#Official Releases|libStatGen version 1.0.9]] (version 1.0.7 should also work)
* Update to fix some compile issues on ubuntu 12.10
+
* Update to [[BamUtil: mergeBam|mergeBam]]
* Update use of SamRecord::getStringTag to expect the return of a const string pointer due to libStatGen v1.0.7 updates
+
** Update to ignore PG lines with duplicate IDs
* Update SamReferenceInfo usage due to libStatGen v1.0.7 updates
+
** Update to accept merges of matching RG lines
* Update to [[BamUtil: diff|diff]]
+
** Update to log to stderr if no log/out file is specified
**  Fix DIFF to test and properly handle running out of available records.  Previously no message was printed when this happened and there was a bug for which file it freed
 
* Update to [[BamUtil: clipOverlap|clipOverlap]]
 
** Update to facilitate adding other overlap handling functions
 
* Update to [[BamUtil: mergeBam|mergeBam]] (formerly RGMergeBam)
 
** Rename RGMergeBam to MergeBam
 
** Update to handle files that already have an RG
 
  
  
 
'''Older Releases'''
 
'''Older Releases'''
 +
*[[Media:BamUtil.1.0.7.tgz|BamUtil.1.0.7.tgz‎]] - Released 1/29/2013
 +
** Requires, but does not include: [[LibStatGen Download#Official Releases|libStatGen version 1.0.7]] or above
 +
** Update to fix some compile issues on ubuntu 12.10
 +
** Update use of SamRecord::getStringTag to expect the return of a const string pointer due to libStatGen v1.0.7 updates
 +
** Update SamReferenceInfo usage due to libStatGen v1.0.7 updates
 +
** Update to [[BamUtil: diff|diff]]
 +
***  Fix DIFF to test and properly handle running out of available records.  Previously no message was printed when this happened and there was a bug for which file it freed
 +
** Update to [[BamUtil: clipOverlap|clipOverlap]]
 +
*** Update to facilitate adding other overlap handling functions
 +
** Update to [[BamUtil: mergeBam|mergeBam]] (formerly RGMergeBam)
 +
*** Rename RGMergeBam to MergeBam
 +
*** Update to handle files that already have an RG
 
*[[Media:BamUtil.1.0.6.tgz|BamUtil.1.0.6.tgz‎]] - Released 11/14/2012
 
*[[Media:BamUtil.1.0.6.tgz|BamUtil.1.0.6.tgz‎]] - Released 11/14/2012
 
** Update to [[BamUtil: trimBam|trimBam]]
 
** Update to [[BamUtil: trimBam|trimBam]]

Revision as of 12:30, 7 July 2013


bamUtil Overview

bamUtil is a repository that contains several programs that perform operations on SAM/BAM files. All of these programs are built into a single executable, bam.


Getting Help

If you have questions or want to recommend improvements to bamUtil, please use the bamUtil Google Group.

Alternatively, you can e-mail me, Mary Kate Wing, at mktrost@umich.edu.


Where to Find It

The bamUtil repository is available both via release downloads and via github.

On github, https://github.com/statgen/bamUtil, you can both browse and download the bamUtil source code as well as explore the history of changes.

You can obtain the source either with or without git.

The releases may be available both with and without libStatGen included.

If you do not use the release version that already contains libStatGen, you need to download the library: libStatGen.

If you try to compile bamUtil and it cannot find libStatGen, the build fails and prints instructions on what to do next:

  • if libStatGen is in a different location than expected
    • follow the directions to set the path to libStatGen
  • if libStatGen is not downloaded and you have git
    • make libStatGen will download via git and build libStatGen
  • if libStatGen is not downloaded and you don't have git

Using Git To Track the Current Development Version

Clone (get your own copy)

You can create your own git clone (copy) using:

git clone https://github.com/statgen/bamUtil.git

or

git clone git://github.com/statgen/bamUtil.git

Either of these commands creates a directory called bamUtil in the current directory.

Then just cd bamUtil and compile.

Get the latest Updates (update your copy)

To update your copy to the latest version (a major advantage of using git):

  1. cd pathToYourCopy/bamUtil
  2. make clean
  3. git pull
  4. make all
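
The clone and update steps above can be combined into one small script. This is only a sketch: `sync_bamutil`, `run`, and the `DRY_RUN` switch are hypothetical names introduced here, not part of bamUtil. It assumes `git` and `make` are on the PATH and that both accept `-C <dir>`.

```shell
#!/bin/sh
# Sketch: clone bamUtil on first use, otherwise refresh the existing copy.
# sync_bamutil, run, and DRY_RUN are illustrative names, not part of bamUtil.

REPO_URL="https://github.com/statgen/bamUtil.git"

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

sync_bamutil() {
    dir="${1:-bamUtil}"
    if [ -d "$dir/.git" ]; then
        # Existing clone: same order as the numbered steps above.
        run make -C "$dir" clean
        run git -C "$dir" pull
        run make -C "$dir" all
    else
        # No clone yet: fetch the repository, then build it.
        run git clone "$REPO_URL" "$dir"
        run make -C "$dir" all
    fi
}

# Preview the commands without touching the network or the build.
(DRY_RUN=1; sync_bamutil ./bamUtil)
```

Calling `sync_bamutil ./bamUtil` without `DRY_RUN` runs the commands for real. The build still needs libStatGen, as described earlier on this page.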

Git Refresher

If you decide to use git, but need a refresher, see How To Use Git or Notes on how to use git (if you have access)


Downloading From GitHub Without Git

If you download the latest code/version, make sure you periodically update it by downloading a newer version.

From github you can download:

  1. Latest Code (master branch)
    via Website
    1. Go to: https://github.com/statgen/bamUtil
    2. Click on the Download ZIP button on the right side panel.
    via Command Line
    wget https://github.com/statgen/bamUtil/archive/master.tar.gz
    or
    wget https://github.com/statgen/bamUtil/archive/master.zip
  2. Specific Release (via a tag)
    via Website
    1. Go to: https://github.com/statgen/bamUtil/releases to see the available releases
    2. Click zip or tar.gz for the desired version.
    via Command Line
    wget https://github.com/statgen/bamUtil/archive/<tagName>.tar.gz
    or
    wget https://github.com/statgen/bamUtil/archive/<tagName>.zip


After downloading the file, uncompress (unzip/untar) it. The directory created will be named bamUtil-<name of version you downloaded>.
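
As a rough sketch of the command-line route for a specific release, the helpers below build the archive URL for a tag, download it, and unpack it. `archive_url` and `fetch_release` are hypothetical names used only for illustration, and the tag is whatever you pass in. The script lists the archive to find the directory it creates instead of guessing the name.

```shell
#!/bin/sh
# Sketch: fetch and unpack one tagged bamUtil release from GitHub.
# archive_url and fetch_release are illustrative helpers, not bamUtil tools.

# Build the tar.gz URL for a tag, following the URL pattern shown above.
archive_url() {
    echo "https://github.com/statgen/bamUtil/archive/$1.tar.gz"
}

fetch_release() {
    tag="$1"
    tarball="bamUtil-$tag.tar.gz"
    wget -O "$tarball" "$(archive_url "$tag")" || return 1
    # List the archive to learn the top-level directory it will create.
    topdir=$(tar tzf "$tarball" | head -n 1 | cut -d/ -f1)
    tar xzf "$tarball" || return 1
    echo "Unpacked into $topdir"
}

# If a tag was given as the first argument, show the URL that would be used.
if [ -n "${1:-}" ]; then
    archive_url "$1"
fi
```

After `fetch_release <tagName>` finishes, cd into the reported directory and build as described in the next section.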

Building

After obtaining the bamUtil repository (either by download or from github), compile the code using:

make all  

Object (.o) files are compiled into the obj directory, with debug and profile subdirectories for the debugging and profiling objects.

The build creates the executable(s) in the bamUtil/bin/ directory, the debug executable(s) in the bamUtil/bin/debug/ directory, and the profiling executable(s) in the bamUtil/bin/profile/ directory.

make install installs the opt binary if you have permission.

make test compiles for opt, debug, and profile and runs the tests (found in the test subdirectory).

To see all make options, type make help.


If compilation fails due to warnings being treated as errors, please contact us so we can fix the warnings. As a work-around to get it to compile, you can disable the treatment of warnings as errors by editing libStatGen/general/Makefile to remove -Werror.
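
The -Werror work-around can be applied with `sed` instead of a manual edit. The following is a minimal sketch: `strip_werror` is a hypothetical helper, and it assumes the flag appears in the Makefile as a standalone `-Werror` token. A `.bak` copy of the original file is kept so the change is easy to undo.

```shell
#!/bin/sh
# Sketch: remove -Werror from a Makefile, keeping a .bak backup.
# strip_werror is an illustrative helper, not a bamUtil make target.

strip_werror() {
    makefile="$1"
    [ -f "$makefile" ] || { echo "not found: $makefile" >&2; return 1; }
    # Drop -Werror only when it stands alone as a flag
    # (flags such as -Werror=... are left untouched).
    sed -E -i.bak 's/(^|[[:space:]])-Werror([[:space:]]|$)/\1\2/g' "$makefile"
}

# Apply only if the file exists at the path named above
# (run this from the bamUtil directory).
if [ -f libStatGen/general/Makefile ]; then
    strip_werror libStatGen/general/Makefile
fi
```

Rerun make all afterwards. To restore the original behavior, move the `.bak` file back over the Makefile.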

Releases

If you prefer to run the last official release rather than the latest development version, you can download that here.

There are two versions of the release, one that includes libStatGen and one that does not. If you already have libStatGen installed and want to use your own copy, use the version that does not include libStatGen.

Full Release (includes libStatGen)

To install an official release, unpack the downloaded file (tar xvf), cd into the bamUtil_x.x.x directory and type make all.


BamUtilLibStatGen.1.0.9.tgz‎ - Released 7/7/2013

BamUtilLibStatGen.1.0.9 Release Notes

  • Contains: libStatGen version 1.0.9
  • Contains: bamUtil version 1.0.9
  • Update to mergeBam
    • Update to ignore PG lines with duplicate IDs
    • Update to accept merges of matching RG lines
    • Update to log to stderr if no log/out file is specified

Older Releases

  • There is no version 1.0.8. It was skipped to stay in line with libStatGen versions (libStatGen 1.0.8 added vcf support)
  • BamUtilLibStatGen.1.0.7.tgz‎ - Released 1/29/2013
    • Contains: libStatGen version 1.0.7
    • Contains: bamUtil version 1.0.7
    • Update to fix some compile issues on ubuntu 12.10
    • Update use of SamRecord::getStringTag to expect the return of a const string pointer due to libStatGen v1.0.7 updates
    • Update SamReferenceInfo usage due to libStatGen v1.0.7 updates
    • Update to diff
      • Fix DIFF to test and properly handle running out of available records. Previously no message was printed when this happened and there was a bug for which file it freed
    • Update to clipOverlap
      • Update to facilitate adding other overlap handling functions
    • Update to mergeBam (formerly RGMergeBam)
      • Rename RGMergeBam to MergeBam
      • Update to handle files that already have an RG


Release of just BamUtil (does not include libStatGen)

To install an official release, unpack the downloaded file (tar xvf), cd into the bamUtil_x.x.x directory and type make all.

BamUtil.1.0.9 Release Notes

  • Requires, but does not include: libStatGen version 1.0.9 (version 1.0.7 should also work)
  • Update to mergeBam
    • Update to ignore PG lines with duplicate IDs
    • Update to accept merges of matching RG lines
    • Update to log to stderr if no log/out file is specified


Older Releases

  • BamUtil.1.0.7.tgz‎ - Released 1/29/2013
    • Requires, but does not include: libStatGen version 1.0.7 or above
    • Update to fix some compile issues on ubuntu 12.10
    • Update use of SamRecord::getStringTag to expect the return of a const string pointer due to libStatGen v1.0.7 updates
    • Update SamReferenceInfo usage due to libStatGen v1.0.7 updates
    • Update to diff
      • Fix DIFF to test and properly handle running out of available records. Previously no message was printed when this happened and there was a bug for which file it freed
    • Update to clipOverlap
      • Update to facilitate adding other overlap handling functions
    • Update to mergeBam (formerly RGMergeBam)
      • Rename RGMergeBam to MergeBam
      • Update to handle files that already have an RG
  • BamUtil.1.0.6.tgz‎ - Released 11/14/2012
    • Update to trimBam
      • Update to allow trimming a different number of bases from each end of the read
  • BamUtil.1.0.5.tgz‎ - Released 10/24/2012
    • Update to dedup
      • Update logic for which pair to keep if they have the same quality
    • Update to polishBam
      • Update to print the number of successful header additions
    • Update to recab
      • Update to print the number of bases skipped due to the base quality
    • General Updates
      • Update to add compile option to compile without C++0x/C++11
  • BamUtil.1.0.4.tgz - Release skipped
  • BamUtil.1.0.3.tgz‎ - Released 09/19/2012
    • Adds: dedup, recab
    • General Updates
      • Update Logger to write to stderr if output is stdout
    • Update to stats
      • Add required/exclude flags
      • Exclude Clips if excluding unmapped
      • Add --withinRegion flag
      • Update phred/qual counts to be uint64_t instead of int to avoid overflow
    • Update to validate
      • Detect header failures
    • Update to diff
      • Update to specify chromosome/pos in ZP as a string rather than int so both can be shown
    • Update to readReference
      • Output error message if the reference name is not found
    • Update to splitChromosome
      • Update to actually split the chromosomes rather than being hard-coded to output chromosome IDs 0-22
    • Update Makefile to have cloneLib for cloning libStatGen
  • BamUtil.1.0.2.tgz‎ - Released 05/16/2012
  • BamUtil.1.0.1.tgz‎ - Released 05/04/2012
  • BamUtil.1.0.0.tgz‎ - Released 10/10/2011

Programs

The software reads the beginning of an input file to determine if it is SAM/BAM. To determine the format (SAM/BAM) of the output file, the software checks the output file's extension. If the extension is ".bam" it writes a BAM file, otherwise it writes a SAM file.
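
The format rules above can be approximated in a few lines of shell. This is a sketch under stated assumptions, not the code the bam executable uses: `output_format` and `input_format` are hypothetical helpers, the extension match is assumed to be case-sensitive, and the input check relies only on the fact that BAM files are BGZF-compressed and therefore start with the gzip magic bytes 1f 8b. A gzip-compressed text file would look the same to this check.

```shell
#!/bin/sh
# Sketch of the format rules described above. Both helpers are illustrative
# and are not part of the bam executable.

# Output: a ".bam" extension means BAM, anything else means SAM.
output_format() {
    case "$1" in
        *.bam) echo "BAM" ;;
        *)     echo "SAM" ;;
    esac
}

# Input: BAM is BGZF (gzip-compatible), so it begins with bytes 1f 8b.
# Assumption: any other content is treated as SAM text.
input_format() {
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \t\n')
    if [ "$magic" = "1f8b" ]; then
        echo "BAM"
    else
        echo "SAM"
    fi
}

output_format aligned.bam
output_format aligned.sam
```

The two calls at the end print BAM and then SAM. `input_format` takes the path of an existing file and can be used the same way.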

The bam executable has the following functions.


This executable is built using C++ Library: libStatGen.

Just running ./bam will print the Usage information for the bam executable.