Changes

From Genome Analysis Wiki
7,509 bytes added ,  17:14, 11 September 2021
no edit summary
Line 1: Line 1: −
[[Category:Software|Bam]]
+
[[Category:bamUtil]]
[[Category:libbam]]
+
[[Category:C++]]
= bam Executable =
+
[[Category:Software]]
When the pipeline is compiled, the SAM/BAM executable, "bam", is generated in the pipeline/bam/ directory.
     −
The software reads the beginning of an input file to determine if it is SAM/BAM.  To determine the format (SAM/BAM) of the output file, the software checks the output file's extension.  If the extension is ".bam" it writes a BAM file, otherwise it writes a SAM file.
+
= bamUtil Overview =
   −
The bam executable has the following functions.
+
bamUtil is a repository that contains several programs that perform operations on SAM/BAM files.  All of these programs are built into a single executable, <code>bam</code>.
* [[C++ Executable: bam#validate|validate - Read and Validate a SAM/BAM file]]
  −
* [[C++ Executable: bam#convert|convert - Read a SAM/BAM file and write as a SAM/BAM file]]
  −
* [[C++ Executable: bam#dumpHeader|dumpHeader - Print SAM/BAM header]]
  −
* [[C++ Executable: bam#splitChromosome|splitChromosome - Split BAM by Chromosome]]
  −
* [[C++ Executable: bam#writeRegion|writeRegion - Write the alignments in the indexed BAM file that fall into the specified region]]
  −
* [[C++ Executable: bam#dumpRefInfo|dumpRefInfo - Print SAM/BAM Reference Information]]
  −
* [[C++ Executable: bam#dumpIndex|dumpIndex - Dump a BAM index file into an easy to read text version]]
  −
* [[C++ Executable: bam#readIndexedBam|readIndexedBam - Read an indexed BAM file reference by reference id -1 to the max reference id and write it out as a SAM/BAM file]]
  −
* [[C++ Executable: bam#filter|filter - Filter reads by clipping ends with too high of a mismatch percentage and by marking reads unmapped if the quality of mismatches is too high]]
  −
* [[C++ Executable: bam#readReference|readReference - Print the reference string for the specified region]]
     −
This executable is built using the [[C++ Library: bam|bam library]].
     −
Just running ./bam will print the Usage information for the bam executable.
+
== Getting Help ==
    +
If you have any questions please use the [https://github.com/statgen/bamUtil bamUtil GitHub page] to raise an issue.
   −
== validate ==
+
See [[BamUtil: FAQ]] to see if your question has already been answered.
   −
The <code>validate</code> option on the bam executable reads and validates a SAM/BAM file.  This option is documented at: [[BamValidator]]
+
== Where to Find It ==
 +
{{ToolGitRepo|repoName=bamUtil}}
   −
== convert ==
+
== Releases ==
The <code>convert</code> option on the bam executable reads a SAM/BAM file and writes it as a SAM/BAM file.
     −
The executable converts the input file into the format of the output file.  So if you want to convert a BAM file to a SAM file, from the pipeline/bam/ directory you just call:
+
If you prefer to run the last official release rather than the latest development version, you can download that here.
./bam --in <bamFile>.bam --out <newSamFile>.sam
  −
Don't forget to put in the paths to the executable and your test files.
     −
=== Parameters ===
+
There are two versions of the release, one that includes libStatGen and one that does not.  If you already have libStatGen installed and want to use your own copy, use the version that does not include libStatGen.
<pre>
  −
    Required Parameters:
  −
--in : the SAM/BAM file to be read
  −
--out : the SAM/BAM file to be written
  −
    Optional Parameters:
  −
--noeof            : do not expect an EOF block on a bam file.
  −
</pre>
     −
=== Usage ===
+
=== Full Release (includes libStatGen) ===
./bam convert --in <inputFile> --out <outputFile.sam/bam/ubam (ubam is uncompressed bam)> [--noeof]
     −
=== Return Value ===
+
To install an official release, unpack the downloaded file (tar xvf), cd into the bamUtil_x.x.x directory and type make all.
Returns the SamStatus for the reads/writes.
     −
=== Example Output ===
+
For version 1.0.14 and later, please download libStatGen and bamUtil separately:
<pre>
  −
Number of records read = 10
  −
Number of records written = 10
  −
</pre>
        −
== dumpHeader ==
+
'''Version 1.0.14 - Released 7/8/2015'''
The <code>dumpHeader</code> option on the bam executable prints the header of the specified SAM/BAM file to cout.
+
*[[LibStatGen Download#Official Releases|libStatGen version 1.0.14]]
 +
*[[#Release of just BamUtil (does not include libStatGen)|bamUtil version 1.0.14]]
   −
=== Parameters ===
  −
<pre>
  −
    Required Parameters:
  −
filename : the sam/bam filename whose header should be printed.
  −
</pre>
     −
=== Usage ===
+
'''Older Releases'''
 +
* [[Media:BamUtilLibStatGen.1.0.13.tgz|BamUtilLibStatGen.1.0.13.tgz‎]] - Released 2/20/2015
 +
** Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.13]] - see link for version updates
 +
** Contains: [[#Release of just BamUtil (does not include libStatGen)|bamUtil version 1.0.13]] - see link for version updates
   −
./bam dumpHeader <inputFile>
     −
=== Return Value ===
+
* [[Media:BamUtilLibStatGen.1.0.12.tar.gz|BamUtilLibStatGen.1.0.12.tgz‎]] - Released 5/14/2014
*     0: the header was successfully read and printed.
+
** Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.12]] - see link for version updates
* non-0: the header was not successfully read or was not printed. (Returns the SamStatus.)
+
** Contains: [[#Release of just BamUtil (does not include libStatGen)|bamUtil version 1.0.12]] - see link for version updates
 +
** Adds regions to [[BamUtil: mergeBam|mergeBam]]
 +
** Accept ',' delimiters for the tags string input in [[BamUtil: squeeze|squeeze]], [[BamUtil: revert|revert]], & [[BamUtil: diff|diff]]
    +
*[[Media:BamUtilLibStatGen.1.0.11.tar.gz|BamUtilLibStatGen.1.0.11.tar.gz‎]] - Released 2/28/2014
 +
** Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.11]] - see link for version updates
 +
** Contains: [[#Release of just BamUtil (does not include libStatGen)|bamUtil version 1.0.11]] - see link for version updates
 +
** Now properly supports 'B' & 'f' tags
 +
** Cleanup - compile issues
   −
=== Example Output ===
+
*[[Media:BamUtilLibStatGen.1.0.10.tar.gz|BamUtilLibStatGen.1.0.10.tar.gz‎]] - Released 1/2/2014
<pre>
+
** Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.10]] - see link for version updates
@SQ SN:1 LN:247249719
+
** Contains: [[#Release of just BamUtil (does not include libStatGen)|bamUtil version 1.0.10]] - see link for version updates
@SQ SN:2 LN:242951149
+
** Adds PhoneHome/Version checking.
@SQ SN:3 LN:199501827
  −
</pre>
      +
*[[Media:BamUtilLibStatGen.1.0.9.tgz|BamUtilLibStatGen.1.0.9.tgz‎]] - Released 7/7/2013
 +
** Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.9]]
 +
** Contains: [[#Release of just BamUtil (does not include libStatGen)|bamUtil version 1.0.9]]
 +
** Update to [[BamUtil: mergeBam|mergeBam]]
 +
*** Update to ignore PG lines with duplicate IDs
 +
*** Update to accept merges of matching RG lines
 +
*** Update to log to stderr if no log/out file is specified
 +
* There is no version 1.0.8.  It was skipped to stay in line with libStatGen versions (libStatGen 1.0.8 added vcf support)
 +
*[[Media:BamUtilLibStatGen.1.0.7.tgz|BamUtilLibStatGen.1.0.7.tgz‎]] - Released 1/29/2013
 +
** Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.7]]
 +
** Contains: [[#Release of just BamUtil (does not include libStatGen)|bamUtil version 1.0.7]]
 +
** Update to fix some compile issues on ubuntu 12.10
 +
** Update use of SamRecord::getStringTag to expect the return of a const string pointer due to libStatGen v1.0.7 updates
 +
** Update SamReferenceInfo usage due to libStatGen v1.0.7 updates
 +
** Update to [[BamUtil: diff|diff]]
 +
***  Fix DIFF to test and properly handle running out of available records.  Previously no message was printed when this happened and there was a bug for which file it freed
 +
** Update to [[BamUtil: clipOverlap|clipOverlap]]
 +
*** Update to facilitate adding other overlap handling functions
 +
** Update to [[BamUtil: mergeBam|mergeBam]] (formerly RGMergeBam)
 +
*** Rename RGMergeBam to MergeBam
 +
*** Update to handle files that already have an RG
   −
== splitChromosome ==
+
*[[Media:BamUtilLibStatGen.1.0.6.tgz|BamUtilLibStatGen.1.0.6.tgz‎]] - Released 11/14/2012
 +
** Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.6]]
 +
** Contains: [[#Release of just BamUtil (does not include libStatGen)|bamUtil version 1.0.6]]
 +
** Update to [[BamUtil: trimBam|trimBam]]
 +
*** Update to allow trimming a different number of bases from each end of the read
 +
*[[Media:BamUtilLibStatGen.1.0.5.tgz|BamUtilLibStatGen.1.0.5.tgz‎]] - Released 10/24/2012
 +
** Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.5]]
 +
** Contains: [[#Release of just BamUtil (does not include libStatGen)|bamUtil version 1.0.5]]
 +
** Updates to: [[BamUtil: dedup|dedup]], [[BamUtil: polishBam|polishBam]], [[BamUtil: recab|recab]]
 +
** Update to add compile option to compile without C++0x/C++11
 +
** See [[#Release of just BamUtil (does not include libStatGen)|below]] for additional details on updates
 +
*BamUtilLibStatGen.1.0.4.tgz - Release skipped
 +
*[[Media:BamUtilLibStatGen.1.0.3.tgz|BamUtilLibStatGen.1.0.3.tgz‎]] - Released 09/19/2012
 +
** Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.3]]
 +
** Contains: [[#Release of just BamUtil (does not include libStatGen)|bamUtil version 1.0.3]]
 +
** Adds: [[BamUtil: dedup|dedup]] [[BamUtil: recab|recab]]
 +
*[[Media:BamUtilLibStatGen.1.0.2.tgz|BamUtilLibStatGen.1.0.2.tgz‎]] - Released 05/16/2012
 +
** Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.2]]
 +
** Adds: [[BamUtil: bam2FastQ|bam2FastQ]]
 +
*[[Media:BamUtilLibStatGen.1.0.1.tgz|BamUtilLibStatGen.1.0.1.tgz‎]] - Released 05/04/2012
 +
** Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.1]]
 +
** Adds: [[BamUtil: splitBam|splitBam]], [[BamUtil: clipOverlap|clipOverlap]],  [[BamUtil: trimBam|trimBam]], [[BamUtil: polishBam|polishBam]], [[BamUtil: rgMergeBam|rgMergeBam]], [[BamUtil: gapInfo|gapInfo]]
 +
** Adds additional functionality to [[BamUtil: stats|stats]]
 +
** Adds leftShifting to [[BamUtil: writeRegion|writeRegion]] and [[BamUtil: convert|convert]]
 +
** Adds more diff fields to [[BamUtil: diff|diff]]
 +
* [[Media:BamUtilLibStatGen.1.0.0.tgz|BamUtilLibStatGen.1.0.0.tgz‎]] - Released 10/10/2011
 +
**Initial release of bamUtil that includes libStatGen version 1.0.0.  It started from the tool found in the deprecated StatGen repository.
 +
**Contains: [[LibStatGen Download#Official Releases|libStatGen version 1.0.0]] [[BamUtil: validate|validate]], [[BamUtil: convert|convert]], [[BamUtil: dumpHeader|dumpHeader]], [[BamUtil: splitChromosome|splitChromosome]], [[BamUtil: writeRegion|writeRegion]], [[BamUtil: dumpRefInfo|dumpRefInfo]], [[BamUtil: dumpIndex|dumpIndex]], [[BamUtil: readIndexedBam|readIndexedBam]], [[BamUtil: filter|filter]], [[BamUtil: readReference|readReference]], [[BamUtil: revert|revert]], [[BamUtil: diff|diff]], [[BamUtil: squeeze|squeeze]], [[BamUtil: findCigars|findCigars]], [[BamUtil: stats|stats]]
   −
The <code>splitChromosome</code> option on the bam executable splits an indexed BAM file into multiple files based on the Chromosome (Reference Name).
+
=== Release of just BamUtil (does not include libStatGen) ===
   −
The files all have the same base name, but with an _# where # corresponds with the associated reference id from the BAM file.
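As a rough illustration of this naming scheme, the Python sketch below builds the expected output names for a given base name. The helper <code>split_output_names</code> is hypothetical, and the <code>.bam</code>/<code>.sam</code> extension appended after <code>_#</code> is an assumption: the description only states that <code>_#</code> is added to the base name.

```python
def split_output_names(base_name, reference_ids, bam_out=True):
    """Build one output file name per reference ID: <base>_<id><extension>.

    The extension (.bam or .sam) is an assumption made for illustration;
    the page only documents the "_#" suffix on the base name.
    """
    extension = ".bam" if bam_out else ".sam"
    return [f"{base_name}_{ref_id}{extension}" for ref_id in reference_ids]


if __name__ == "__main__":
    # Base name taken from the example run shown on this page.
    print(split_output_names("chromosome", [0, 1, 2]))
```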
+
To install an official release, unpack the downloaded file (tar xvf), cd into the bamUtil_x.x.x directory and type make all.
   −
=== Parameters ===
+
'''BamUtil.1.0.14 Release Notes'''
<pre>
+
* BamUtil Version 1.0.14 - Released 7/8/2015
    Required Parameters:
+
** https://github.com/statgen/bamUtil/archive/v1.0.14.tar.gz
--in      : the SAM/BAM file to be split
+
** Requires, but does not include: [[LibStatGen Download#Official Releases|libStatGen version 1.0.14]]
--out      : the base filename for the SAM/BAM files to write into. Does not include the extension.
+
** Update [[BamUtil: trimBam|trimBam]]
            _N will be appended to the basename where N indicates the Chromosome.
+
*** Add option to soft clip (-c) instead of trimming
    Optional Parameters:
+
** Update [[BamUtil: clipOverlap|clipOverlap]]
--noeof  : do not expect an EOF block on a bam file.
+
*** Add option to mark reads as unmapped if they are entirely clipped
--bamIndex : the path/name of the bam index file
+
** Update to [[BamUtil: bam2FastQ|bam2FastQ]]
            (if not specified, uses the --in value + ".bai")
+
*** Add option to gzip the output files
--bamout : write the output files in BAM format (default).
+
*** Add option to split Read Groups into separate fastq files
--samout : write the output files in SAM format.
+
*** Add option to get the quality from a tag
</pre>
+
** Update [[BamUtil: recab|recab]]
 +
*** Update to ignore ref 'N' when building the recalibration table
 +
*** Add ability to bin
 +
** Add Dedup_LowMem tool
   −
=== Usage ===
+
'''Older Releases'''
 +
* BamUtil Version 1.0.13 - Released 2/20/2015
 +
** https://github.com/statgen/bamUtil/archive/v1.0.13.tar.gz
 +
** Requires, but does not include: [[LibStatGen Download#Official Releases|libStatGen version 1.0.13]]
 +
** Makefile Updates
 +
*** Improve logic to determine actual path for the library
 +
*** Update to append to USER_COMPILE_VARS even if specified on the command line
 +
** Update [[BamUtil: writeRegion|writeRegion]]
 +
*** Add option to specify readnames to keep in a file
 +
*** Fixed bug that if a read overlapped 2 BED positions, it was printed twice
 +
** Update to [[BamUtil: bam2FastQ|bam2FastQ]]
 +
*** Update to skip non-primary reads
 +
** Update to [[BamUtil: polishBam|polishBam]]
 +
*** Update to handle '\t' string inputs and to add CO option
 +
*** Fix MD5sum calculation to convert fasta to uppercase prior to calculating
   −
./bam splitChromosome --in <inputFilename>  --out <outputFileBaseName> [--bamIndex <bamIndexFile>] [--noeof] [--bamout|--samout]
+
* [[Media:BamUtil.1.0.12.tgz|BamUtil.1.0.12.tgz‎]] - Released 5/14/2014
 +
** Requires, but does not include: [[LibStatGen Download#Official Releases|libStatGen version 1.0.12]]
 +
** Update [[BamUtil: mergeBam|mergeBam]]
 +
*** Add a regions option
 +
** Update to [[BamUtil: squeeze|squeeze]], [[BamUtil: revert|revert]], [[BamUtil: diff|diff]]
 +
*** Also accept ',' instead of just ';' as the delimiter in the input tags string.
    +
* [[Media:BamUtil.1.0.11.tgz|BamUtil.1.0.11.tgz‎]] - Released 2/28/2014
 +
** Requires, but does not include: [[LibStatGen Download#Official Releases|libStatGen version 1.0.11]]
 +
*** Adds support for 'B' & 'f' tags that did not work properly before.
 +
** Update [[BamUtil: splitBam|splitBam]] & [[BamUtil: polishBam|polishBam]]
 +
*** Update to work properly if log & output file are not specified (no longer creates '.log')
 +
** Update Main dummy/example tool to indicate the correct tool
 +
** Update to [[BamUtil: bam2FastQ|bam2FastQ]], [[BamUtil: clipOverlap|clipOverlap]], [[BamUtil: filter|filter]], [[BamUtil: mergeBam|mergeBam]], [[BamUtil: splitBam|splitBam]], [[BamUtil: squeeze|squeeze]], [[BamUtil: stats|stats]]
 +
*** Cleanup usage/parameter descriptions
 +
** Update [[BamUtil: revert|revert]]
 +
*** Update compatibility with libStatGen due to 'B' & 'f' tag handling updates
 +
** Add tests for 'B' & 'f' tags
   −
=== Return Value ===
+
* [[Media:BamUtil.1.0.10.tar.gz|BamUtil.1.0.10.tar.gz‎]] - Released 1/2/2014
*     0: all records are successfully read and written.
+
** Requires, but does not include: [[LibStatGen Download#Official Releases|libStatGen version 1.0.10]]
* non-0: at least one record was not successfully read or written.
+
** All
 +
*** Add PhoneHome/version checking
 +
*** Make sub-program names case independent
 +
*** Fix Logger.cpp compiler warning
 +
** Adds: [[BamUtil: explainFlags|explainFlags]] - describes the SAM/BAM flags based on the flag value
 +
** Update to [[BamUtil: stats|stats]]
 +
*** Fix Stats to not try to process a record after it is out of the loop (it would already have been processed or is invalid)
 +
** Update to [[BamUtil: splitBam|splitBam]]
 +
*** fix description of --noeof option
 +
** Update to [[BamUtil: writeRegion|writeRegion]]
 +
*** add exclude/required flags
 +
** Update to [[BamUtil: dedup|dedup]] & [[BamUtil: recab|recab]]
 +
*** Ignore secondary reads for dedup and making the recalibration table.
 +
*** skip QC Failures
 +
*** add excludeFlags parameters
 +
** Update to [[BamUtil: clipOverlap|clipOverlap]]
 +
*** add exclude flags
 +
*** fix bug for readName sorted when a read is filtered due to flags
 +
*** add sorting validation
 +
** Update to [[BamUtil: bam2FastQ|bam2FastQ]]
 +
*** add --merge option to generate interleaved files.
 +
*** update to open the input file before opening the output files, so if there is an error, the outputs aren't opened
 +
** Update to [[BamUtil: mergeBam|mergeBam]]
 +
*** add option to ignore the RG PI field when checking headers
 +
*** add more informative header merge error messages
   −
=== Example Output ===
+
* [[Media:BamUtil.1.0.9.tgz|BamUtil.1.0.9.tgz‎]] - Released 7/7/2013
<pre>
+
** Requires, but does not include: [[LibStatGen Download#Official Releases|libStatGen version 1.0.9]] (version 1.0.7 should also work)
The following parameters are in effect:
+
** Update to [[BamUtil: mergeBam|mergeBam]]
 +
*** Update to ignore PG lines with duplicate IDs
 +
*** Update to accept merges of matching RG lines
 +
*** Update to log to stderr if no log/out file is specified
   −
Input Parameters
+
*[[Media:BamUtil.1.0.7.tgz|BamUtil.1.0.7.tgz‎]] - Released 1/29/2013
  --in [test/testFiles/sortedBam.bam], --out [chromosome], --bamIndex [],
+
** Requires, but does not include: [[LibStatGen Download#Official Releases|libStatGen version 1.0.7]] or above
                --noeof
+
** Update to fix some compile issues on ubuntu 12.10
  Output Type : --bamout [ON], --samout
+
** Update use of SamRecord::getStringTag to expect the return of a const string pointer due to libStatGen v1.0.7 updates
 +
** Update SamReferenceInfo usage due to libStatGen v1.0.7 updates
 +
** Update to [[BamUtil: diff|diff]]
 +
***  Fix DIFF to test and properly handle running out of available records. Previously no message was printed when this happened and there was a bug for which file it freed
 +
** Update to [[BamUtil: clipOverlap|clipOverlap]]
 +
*** Update to facilitate adding other overlap handling functions
 +
** Update to [[BamUtil: mergeBam|mergeBam]] (formerly RGMergeBam)
 +
*** Rename RGMergeBam to MergeBam
 +
*** Update to handle files that already have an RG
 +
*[[Media:BamUtil.1.0.6.tgz|BamUtil.1.0.6.tgz‎]] - Released 11/14/2012
 +
** Update to [[BamUtil: trimBam|trimBam]]
 +
*** Update to allow trimming a different number of bases from each end of the read
 +
*[[Media:BamUtil.1.0.5.tgz|BamUtil.1.0.5.tgz‎]] - Released 10/24/2012
 +
** Update to [[BamUtil: dedup|dedup]]
 +
*** Update logic for which pair to keep if they have the same quality
 +
** Update to [[BamUtil: polishBam|polishBam]]
 +
*** Update to print the number of successful header additions
 +
** Update to [[BamUtil: recab|recab]]
 +
*** Update to print the number of bases skipped due to the base quality
 +
** General Updates
 +
*** Update to add compile option to compile without C++0x/C++11
 +
*BamUtil.1.0.4.tgz - Release skipped
 +
*[[Media:BamUtil.1.0.3.tgz|BamUtil.1.0.3.tgz‎]] - Released 09/19/2012
 +
** Adds: [[BamUtil: dedup|dedup]] [[BamUtil: recab|recab]]
 +
** General Updates
 +
*** Update Logger to write to stderr if output is stdout
 +
** Update to [[BamUtil: stats|stats]]
 +
*** Add required/exclude flags
 +
*** Exclude Clips if excluding unmapped
 +
*** Add --withinRegion flag
 +
*** Update phred/qual counts to be uint64_t instead of int to avoid overflow
 +
** Update to [[BamUtil: validate|validate]]
 +
*** Detect header failures
 +
** Update to [[BamUtil: diff|diff]]
 +
*** Update to specify chromosome/pos in ZP as a string rather than int so both can be shown
 +
** Update to [[BamUtil: readReference|readReference]]
 +
*** Output error message if the reference name is not found
 +
** Update to [[BamUtil: splitChromosome|splitChromosome]]
 +
*** Update to actually split the chromosomes and not just hard coded to output chromosomes ids 0-22
 +
** Update Makefile to have cloneLib for cloning libStatGen
 +
*[[Media:BamUtil.1.0.2.tgz|BamUtil.1.0.2.tgz‎]] - Released 05/16/2012
 +
** Adds: [[BamUtil: bam2FastQ|bam2FastQ]]
 +
*[[Media:BamUtil.1.0.1.tgz|BamUtil.1.0.1.tgz‎]] - Released 05/04/2012
 +
** Adds: [[BamUtil: splitBam|splitBam]], [[BamUtil: clipOverlap|clipOverlap]], [[BamUtil: trimBam|trimBam]], [[BamUtil: polishBam|polishBam]], [[BamUtil: rgMergeBam|rgMergeBam]], [[BamUtil: gapInfo|gapInfo]]
 +
** Adds additional functionality to [[BamUtil: stats|stats]]
 +
** Adds leftShifting to [[BamUtil: writeRegion|writeRegion]] and [[BamUtil: convert|convert]]
 +
** Adds more diff fields to [[BamUtil: diff|diff]]
 +
*[[Media:BamUtil.1.0.0.tgz|BamUtil.1.0.0.tgz‎]] - Released 10/10/2011
 +
**Initial release of just bamUtil.  It started from the tool found in the deprecated StatGen repository.
 +
**Contains: [[BamUtil: validate|validate]], [[BamUtil: convert|convert]], [[BamUtil: dumpHeader|dumpHeader]], [[BamUtil: splitChromosome|splitChromosome]], [[BamUtil: writeRegion|writeRegion]], [[BamUtil: dumpRefInfo|dumpRefInfo]], [[BamUtil: dumpIndex|dumpIndex]], [[BamUtil: readIndexedBam|readIndexedBam]], [[BamUtil: filter|filter]], [[BamUtil: readReference|readReference]], [[BamUtil: revert|revert]], [[BamUtil: diff|diff]], [[BamUtil: squeeze|squeeze]], [[BamUtil: findCigars|findCigars]], [[BamUtil: stats|stats]]
   −
Reference ID -1 has 2 records
+
== Citation ==
Reference ID 0 has 5 records
+
If you use BamUtil, please cite our publication on GotCloud which includes BamUtil:
Reference ID 1 has 2 records
+
[http://genome.cshlp.org/content/early/2015/04/14/gr.176552.114.abstract Jun, Goo, et al. "An efficient and scalable analysis framework for variant extraction and refinement from population scale DNA sequence data." Genome research (2015): gr-176552.]
Reference ID 2 has 1 records
  −
Reference ID 3 has 0 records
  −
Reference ID 4 has 0 records
  −
Reference ID 5 has 0 records
  −
Reference ID 6 has 0 records
  −
Reference ID 7 has 0 records
  −
Reference ID 8 has 0 records
  −
Reference ID 9 has 0 records
  −
Reference ID 10 has 0 records
  −
Reference ID 11 has 0 records
  −
Reference ID 12 has 0 records
  −
Reference ID 13 has 0 records
  −
Reference ID 14 has 0 records
  −
Reference ID 15 has 0 records
  −
Reference ID 16 has 0 records
  −
Reference ID 17 has 0 records
  −
Reference ID 18 has 0 records
  −
Reference ID 19 has 0 records
  −
Reference ID 20 has 0 records
  −
Reference ID 21 has 0 records
  −
Reference ID 22 has 0 records
  −
Number of records = 10
  −
Returning: 0 (SUCCESS)
  −
</pre>
        −
== writeRegion ==
+
= Programs =
   −
The <code>writeRegion</code> option on the bam executable writes the alignments in the indexed BAM file that fall into the specified region (reference id and start/end position).
+
The software reads the beginning of an input file to determine if it is SAM/BAM. To determine the format (SAM/BAM) of the output file, the software checks the output file's extension. If the extension is ".bam" it writes a BAM file, otherwise it writes a SAM file.
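The extension rule above can be sketched in a few lines of Python. This only illustrates the rule as stated here and is not bamUtil's own code: the function name <code>infer_output_format</code> is hypothetical, case-sensitive matching is an assumption, and only the two cases named in the paragraph are covered.

```python
def infer_output_format(output_path: str) -> str:
    """Return "BAM" if the output name ends in ".bam", otherwise "SAM".

    Mirrors the rule described in the text; case-sensitive matching
    is an assumption made for this sketch.
    """
    return "BAM" if output_path.endswith(".bam") else "SAM"


if __name__ == "__main__":
    for name in ("out.bam", "out.sam", "out.txt"):
        print(name, "->", infer_output_format(name))
```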
 
  −
=== Parameters ===
  −
<pre>
  −
Required Parameters:
  −
--in      : the BAM file to be read
  −
--out      : the SAM/BAM file to write to
  −
Optional Parameters:
  −
--noeof : do not expect an EOF block on a bam file.
  −
--bamIndex : the path/name of the bam index file
  −
            (if not specified, uses the --in value + ".bai")
  −
--refID    : the BAM reference ID to read (defaults to -1: unmapped)
  −
--start    : the 0-based start position (defaults to -1)
  −
--end      : the 0-based end position (defaults to -1: meaning til the end of the reference)
  −
</pre>
  −
 
  −
=== Usage ===
  −
 
  −
./bam writeRegion --in <inputFilename>  --out <outputFilename> [--bamIndex <bamIndexFile>] [--refID <ref#>] [--start <0-based start>] [--end <0-based end>] [--noeof]
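A call like this can also be assembled programmatically. The Python sketch below only builds the argument list from the parameters documented above (<code>--in</code>, <code>--out</code>, <code>--bamIndex</code>, <code>--refID</code>, <code>--start</code>, <code>--end</code>, <code>--noeof</code>) and does not run anything. The function name <code>write_region_args</code> and the <code>./bam</code> path are assumptions, and omitting an option is assumed to select its documented default.

```python
def write_region_args(in_file, out_file, ref_id=None, start=None, end=None,
                      bam_index=None, noeof=False, executable="./bam"):
    """Build the argument list for a writeRegion call.

    start and end are 0-based positions, as in the parameter list.
    Options left as None are omitted so the tool's defaults apply.
    """
    args = [executable, "writeRegion", "--in", in_file, "--out", out_file]
    if bam_index is not None:
        args += ["--bamIndex", bam_index]
    if ref_id is not None:
        args += ["--refID", str(ref_id)]
    if start is not None:
        args += ["--start", str(start)]
    if end is not None:
        args += ["--end", str(end)]
    if noeof:
        args.append("--noeof")
    return args


if __name__ == "__main__":
    # File names and region are placeholders for illustration.
    print(" ".join(write_region_args("sortedBam.bam", "t.sam",
                                     ref_id=0, start=1, end=100)))
```

The resulting list can be passed to <code>subprocess.run</code>; its <code>returncode</code> would then be 0 when all records were read and written, and non-zero otherwise.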
  −
 
  −
=== Return Value ===
  −
*    0: all records are successfully read and written.
  −
* non-0: at least one record was not successfully read or written.
  −
 
  −
=== Example Output ===
  −
<pre>
  −
The following parameters are in effect:
  −
 
  −
Input Parameters
  −
--in [test/testFiles/sortedBam.bam], --out [t.sam], --bamIndex [], --refID,
  −
--start [1], --end [100], --noeof
  −
 
  −
Wrote t.sam with 2 records.
  −
</pre>
  −
 
  −
 
  −
== dumpRefInfo ==
  −
The <code>dumpRefInfo</code> option on the bam executable prints the SAM/BAM file's reference information.
  −
 
  −
=== Parameters ===
  −
<pre>
  −
Required Parameters:
  −
--in              : the SAM/BAM file to be read
  −
Optional Parameters:
  −
--noeof            : do not expect an EOF block on a bam file.
  −
--printRecordRefs : print the reference information for the records in the file (grouped by reference).
  −
</pre>
  −
 
  −
=== Usage ===
  −
./bam dumpRefInfo --in <inputFilename> [--noeof] [--printRecordRefs]
  −
 
  −
=== Return Value ===
  −
*    0: the file was processed successfully.
  −
* non-0: the file was not processed successfully.
  −
 
  −
 
  −
== dumpIndex ==
  −
The <code>dumpIndex</code> option on the bam executable prints a BAM index file in an easy-to-read format.
  −
 
  −
=== Parameters ===
  −
<pre>
  −
Required Parameters:
  −
--bamIndex : the path/name of the bam index file to display
  −
Optional Parameters:
  −
--refID    : the reference ID to read, defaults to print all
  −
--summary : only print a summary - 1 line per reference.
  −
</pre>
  −
 
  −
=== Usage ===
  −
./bam dumpIndex --bamIndex <bamIndexFile> [--refID <ref#>] [--summary]
  −
 
  −
=== Return Value ===
  −
*    0: the BAM index file was processed successfully.
  −
* non-0: the BAM index file was not processed successfully.
  −
 
  −
 
  −
== readIndexedBam ==
  −
The <code>readIndexedBam</code> option on the bam executable reads an indexed BAM file one reference ID at a time, from -1 (unmapped) up to the maximum reference ID, and writes it out as a SAM/BAM file.
  −
 
  −
=== Parameters ===
  −
<pre>
  −
Required Parameters:
  −
inputFilename      - path/name of the input BAM file
  −
outputFile.sam/bam - path/name of the output file
  −
bamIndexFile      - path/name of the BAM index file
  −
Optional Parameters:
  −
ref# - the reference number to print (optional) defaults to print all
  −
</pre>
  −
 
  −
=== Usage ===
  −
./bam readIndexedBam <inputFilename> <outputFile.sam/bam> <bamIndexFile>
  −
 
  −
=== Return Value ===
  −
* 0
  −
 
  −
== filter ==
  −
 
  −
The <code>filter</code> option on the bam executable filters the reads in a SAM/BAM file.  This option is documented at: [[Bam Executable: Filter]]
  −
 
  −
== readReference ==
  −
The <code>readReference</code> option on the bam executable prints the specified region of the reference sequence in an easy to read format.
  −
 
  −
=== Parameters ===
  −
<pre>
  −
Required Parameters:
  −
--refFile : the path/name of the reference file
  −
Optional Parameters:
  −
--refID    : the reference ID to read, defaults to print all
  −
--summary : only print a summary - 1 line per reference.
  −
</pre>
  −
 
  −
=== Usage ===
  −
./bam readReference --refFile <referenceFilename> --start <0 based start> --end <0 based end>|--numBases <number of bases>
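The usage line offers <code>--end</code> and <code>--numBases</code> as alternatives. A minimal Python sketch of that choice follows; it only builds the argument list and does not run the tool. The function name <code>read_reference_args</code> is hypothetical, and reading the <code>|</code> as "exactly one of the two is required" is an assumption.

```python
def read_reference_args(ref_file, start, end=None, num_bases=None,
                        executable="./bam"):
    """Build the argument list for readReference.

    Exactly one of end (0-based) or num_bases must be given; treating
    the two as mutually exclusive is an assumption based on the usage line.
    """
    if (end is None) == (num_bases is None):
        raise ValueError("specify exactly one of end or num_bases")
    args = [executable, "readReference", "--refFile", ref_file,
            "--start", str(start)]
    if end is not None:
        args += ["--end", str(end)]
    else:
        args += ["--numBases", str(num_bases)]
    return args


if __name__ == "__main__":
    # "reference.fa" is a placeholder file name.
    print(" ".join(read_reference_args("reference.fa", 0, num_bases=50)))
```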
     −
=== Return Value ===
+
{{BamUtilPrograms}}
*    0: the reference file was successfully read.
  −
* non-0: the reference file was not successfully read.
 
