Difference between revisions of "BamUtil: splitChromosome"

From Genome Analysis Wiki
(Add splitChromosome)
 
(Updated to actually split all chromosomes)
Line 6: Line 6:
  The <code>splitChromosome</code> option on the [[bamUtil]] executable splits an indexed BAM file into multiple files based on the Chromosome (Reference Name).

- The files all have the same base name, but with an _# where # corresponds with the associated reference id from the BAM file.
+ The files all have the same base name, but with the chromosome name ".bam" or ".sam" appended.

  = Parameters =
Line 13: Line 13:
          --in      : the BAM file to be split
          --out      : the base filename for the SAM/BAM files to write into.  Does not include the extension.
-                     _N will be appended to the basename where N indicates the Chromosome.
+                     CHROM.bam or CHROM.sam will be appended to the basename where CHROM is the chromosome name.
      Optional Parameters:
          --noeof  : do not expect an EOF block on a bam file.
-         --bamIndex : the path/name of the bam index file
-                     (if not specified, uses the --in value + ".bai")
          --bamout : write the output files in BAM format (default).
          --samout : write the output files in SAM format.
Line 25: Line 23:
  = Usage =
-   ./bam splitChromosome --in <inputFilename>  --out <outputFileBaseName> [--bamIndex <bamIndexFile>] [--noeof] [--bamout|--samout] [--params]
+   ./bam splitChromosome --in <inputFilename>  --out <outputFileBaseName> [--noeof] [--bamout|--samout] [--params]

Line 34: Line 32:
  = Example Output =
  <pre>
- Reference ID -1 has 2 records
- Reference ID 0 has 5 records
- Reference ID 1 has 2 records
- Reference ID 2 has 1 records
- Reference ID 3 has 0 records
- Reference ID 4 has 0 records
- Reference ID 5 has 0 records
- Reference ID 6 has 0 records
- Reference ID 7 has 0 records
- Reference ID 8 has 0 records
- Reference ID 9 has 0 records
- Reference ID 10 has 0 records
- Reference ID 11 has 0 records
- Reference ID 12 has 0 records
- Reference ID 13 has 0 records
- Reference ID 14 has 0 records
- Reference ID 15 has 0 records
- Reference ID 16 has 0 records
- Reference ID 17 has 0 records
- Reference ID 18 has 0 records
- Reference ID 19 has 0 records
- Reference ID 20 has 0 records
- Reference ID 21 has 0 records
- Reference ID 22 has 0 records
+ Reference Name: 1 has 5 records
+ Reference Name: 2 has 2 records
+ Reference Name: 3 has 1 records
+ Reference Name: * has 2 records

  Number of records = 10
  Returning: 0 (SUCCESS)
  </pre>

Revision as of 17:07, 16 July 2012


Overview of the splitChromosome function of bamUtil

The splitChromosome option on the bamUtil executable splits an indexed BAM file into multiple files based on the Chromosome (Reference Name).

The output files all share the same base name, with the chromosome name and a ".bam" or ".sam" extension appended.
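
As a rough illustration of that naming rule, the following shell sketch predicts the file name for a given chromosome. The helper `expected_output_name` is hypothetical (it is not part of bamUtil), and it assumes the name is formed by plain concatenation of the base name, the chromosome name, and the extension, with no extra separator; the base name `sample.` is likewise only an example.

```shell
# Hypothetical helper (not part of bamUtil): predict the output file name
# for one chromosome, ASSUMING the name is <base><CHROM>.<ext> with no
# extra separator added by the tool.
expected_output_name() {
    base="$1"          # value that would be passed to --out
    chrom="$2"         # chromosome (reference) name
    ext="${3:-bam}"    # "bam" (the default output format) or "sam"
    printf '%s%s.%s\n' "$base" "$chrom" "$ext"
}

expected_output_name sample. 1        # prints: sample.1.bam
expected_output_name sample. 2 sam    # prints: sample.2.sam
```

Ending the base name with a separator such as `.` or `_` keeps the chromosome name visually distinct in the resulting file names.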

Parameters

    Required Parameters:
        --in       : the BAM file to be split
        --out      : the base filename for the SAM/BAM files to write into.  Does not include the extension.
                     CHROM.bam or CHROM.sam will be appended to the basename where CHROM is the chromosome name.
    Optional Parameters:
        --noeof  : do not expect an EOF block on a bam file.
        --bamout : write the output files in BAM format (default).
        --samout : write the output files in SAM format.
        --params : print the parameter settings

Usage

./bam splitChromosome --in <inputFilename>  --out <outputFileBaseName> [--noeof] [--bamout|--samout] [--params]
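
A small wrapper can make the invocation above easier to reuse in scripts. This is only a sketch: the function name `split_by_chromosome`, the `BAM_BIN` variable, and the file names in the usage note are assumptions for illustration; only the `splitChromosome`, `--in`, `--out`, `--bamout`, and `--samout` arguments come from the usage line.

```shell
# Path to the bamUtil executable; override BAM_BIN if it lives elsewhere.
BAM_BIN="${BAM_BIN:-./bam}"

# Hypothetical wrapper: split_by_chromosome <input.bam> <outBase> [bam|sam]
split_by_chromosome() {
    in_file="$1"
    out_base="$2"
    format="${3:-bam}"    # BAM is the documented default
    case "$format" in
        bam) fmt_flag="--bamout" ;;
        sam) fmt_flag="--samout" ;;
        *)   echo "unknown format: $format" >&2; return 2 ;;
    esac
    "$BAM_BIN" splitChromosome --in "$in_file" --out "$out_base" "$fmt_flag"
}
```

For example, `split_by_chromosome input.bam sample. sam` would run `./bam splitChromosome --in input.bam --out sample. --samout`.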


Return Value

  • 0: all records are successfully read and written.
  • non-0: at least one record was not successfully read or written.
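
Because success is signalled only through the exit status, a calling script should test it explicitly. The sketch below uses a hypothetical helper, `check_split`, that runs whatever command it is given and reports the result; the file names in the usage note are placeholders.

```shell
# Hypothetical helper: run a command and report success or failure
# based on its exit status (0 = every record was read and written).
check_split() {
    if "$@"; then
        echo "split succeeded"
    else
        status=$?    # non-zero: at least one record failed
        echo "split failed with status $status" >&2
        return "$status"
    fi
}
```

A call such as `check_split ./bam splitChromosome --in input.bam --out sample.` prints the outcome and passes the tool's exit status back to the caller.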

Example Output

Reference Name: 1 has 5 records
Reference Name: 2 has 2 records
Reference Name: 3 has 1 records
Reference Name: * has 2 records
Number of records = 10
Returning: 0 (SUCCESS)
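
The per-chromosome counts in this output should add up to the reported total (5 + 2 + 1 + 2 = 10). As a sanity check, a saved copy of the output can be verified with a short awk script. The function name `verify_record_total` and the log file name in the usage note are assumptions, and the sketch relies on the output lines having exactly the layout shown above.

```shell
# Hypothetical check: sum the "Reference Name: ... has N records" lines
# and compare the sum with the "Number of records = N" line.
# Reads the named file(s), or standard input if none are given.
verify_record_total() {
    awk '
        /^Reference Name: /     { sum += $(NF - 1) }   # count precedes "records"
        /^Number of records = / { total = $NF }        # reported total
        END {
            printf "sum=%d total=%d\n", sum, total
            exit (sum == total ? 0 : 1)
        }
    ' "$@"
}
```

For instance, after saving the tool's output with `./bam splitChromosome --in input.bam --out sample. > split.log`, running `verify_record_total split.log` returns 0 only when the two numbers agree.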