Changes

From Genome Analysis Wiki
1,292 bytes added, 17:43, 18 May 2012
no edit summary
Line 7: Line 7:  
= Usage =
 
= Usage =
 
<pre>
 
<pre>
./bam stats --in <inputFile> [--basic] [--qual] [--phred] [--pBaseQC <outputFileName>] [--cBaseQC <outputFileName>] [--baseSum] [--maxNumReads <maxNum>] [--unmapped] [--bamIndex <bamIndexFile>] [--regionList <regFileName>] [--minMapQual <minMapQ>] [--dbsnp <dbsnpFile>] [--noeof] [--params]
+
./bam stats --in <inputFile> [--basic] [--qual] [--phred] [--pBaseQC <outputFileName>] [--cBaseQC <outputFileName>] [--maxNumReads <maxNum>] [--unmapped] [--bamIndex <bamIndexFile>] [--regionList <regFileName>] [--requiredFlags <integerRequiredFlags>] [--excludeFlags <integerExcludeFlags>] [--noeof] [--params] [--withinRegion] [--baseSum] [--bufferSize <buffSize>] [--minMapQual <minMapQ>] [--dbsnp <dbsnpFile>]
 
</pre>
 
</pre>
   Line 15: Line 15:  
--in : the SAM/BAM file to calculate stats for
 
--in : the SAM/BAM file to calculate stats for
 
Types of Statistics that can be generated:
 
Types of Statistics that can be generated:
--basic       : Turn on basic statistic generation
+
--basic         : Turn on basic statistic generation
--qual       : Generate a count for each quality (displayed as non-phred quality)
+
--qual         : Generate a count for each quality (displayed as non-phred quality)
--phred       : Generate a count for each quality (displayed as phred quality)
+
--phred         : Generate a count for each quality (displayed as phred quality)
--pBaseQC     : Write per base statistics as Percentages to the specified file.
+
--pBaseQC       : Write per base statistics as Percentages to the specified file.
                pBaseQC & cBaseQC cannot both be specified.
+
                  pBaseQC & cBaseQC cannot both be specified.
--cBaseQC     : Write per base statistics as Counts to the specified file.
+
--cBaseQC       : Write per base statistics as Counts to the specified file.
                pBaseQC & cBaseQC cannot both be specified.
+
                  pBaseQC & cBaseQC cannot both be specified.
 
Optional Parameters:
 
Optional Parameters:
--maxNumReads : Maximum number of reads to process
+
--maxNumReads   : Maximum number of reads to process
                Defaults to -1 to indicate all reads.
+
                  Defaults to -1 to indicate all reads.
--unmapped   : Only process unmapped reads (requires a bamIndex file)
+
--unmapped     : Only process unmapped reads (requires a bamIndex file)
--bamIndex   : The path/name of the bam index file
+
--bamIndex     : The path/name of the bam index file
                (if required and not specified, uses the --in value + ".bai")
+
                  (if required and not specified, uses the --in value + ".bai")
--regionList : File containing the regions to be processed chr<tab>start_pos<tab>end_pos.
+
--regionList   : File containing the regions to be processed chr<tab>start_pos<tab>end_pos.
                Positions are 0 based and the end_pos is not included in the region.
+
                  Positions are 0 based and the end_pos is not included in the region.
                Uses bamIndex.
+
                  Uses bamIndex.
--minMapQual : The minimum mapping quality for filtering reads in the baseQC stats.
+
--excludeFlags : Skip any records with any of the specified flags set
--dbsnp      : The dbSnp file of positions to exclude from baseQC analysis.
+
                  (specify an integer representation of the flags)
--noeof       : Do not expect an EOF block on a bam file.
+
--requiredFlags : Only process records with all of the specified flags set
--params     : Print the parameter settings.
+
                  (specify an integer representation of the flags)
 +
--noeof         : Do not expect an EOF block on a bam file.
 +
--params       : Print the parameter settings.
 +
Optional phred/qual Only Parameters:
 +
--withinRegion  : Only count qualities if they fall within regions specified.
 +
                  Only applicable if regionList is also specified.
 
Optional BaseQC Only Parameters:
 
Optional BaseQC Only Parameters:
--baseSum     : Print an overall summary of the baseQC for the file to stderr.
+
--baseSum       : Print an overall summary of the baseQC for the file to stderr.
 +
--bufferSize    : Size of the pileup buffer for calculating the BaseQC parameters.
 +
                  Default: 1024
 +
--minMapQual    : The minimum mapping quality for filtering reads in the baseQC stats.
 +
--dbsnp        : The dbSnp file of positions to exclude from baseQC analysis.
 
</pre>  
 
</pre>  
 
For all types of statistics, the bam file used is specified by <code>--in</code>.  
 
For all types of statistics, the bam file used is specified by <code>--in</code>.  
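
As a rough illustration of how the options in the usage text combine, the following Python sketch assembles an argument list for <code>./bam stats</code>. The helper name <code>build_bam_stats_args</code> and its keyword arguments are hypothetical and not part of bamUtil; only the flag names and the rule that pBaseQC and cBaseQC cannot both be specified come from the usage text.

```python
def build_bam_stats_args(in_file, basic=False, qual=False, phred=False,
                         p_base_qc=None, c_base_qc=None,
                         unmapped=False, bam_index=None):
    """Hypothetical helper: build an argv list for `./bam stats`.

    Only a subset of the documented options is covered here.
    """
    # The usage text states that pBaseQC and cBaseQC cannot both be specified.
    if p_base_qc and c_base_qc:
        raise ValueError("pBaseQC and cBaseQC cannot both be specified")

    args = ["./bam", "stats", "--in", in_file]
    if basic:
        args.append("--basic")
    if qual:
        args.append("--qual")
    if phred:
        args.append("--phred")
    if p_base_qc:
        args += ["--pBaseQC", p_base_qc]
    if c_base_qc:
        args += ["--cBaseQC", c_base_qc]
    if unmapped:
        # Per the usage text, --unmapped needs a BAM index; when --bamIndex
        # is omitted the tool itself uses the --in value + ".bai".
        args.append("--unmapped")
    if bam_index:
        args += ["--bamIndex", bam_index]
    return args


if __name__ == "__main__":
    # "sample.bam" is a placeholder file name.
    print(" ".join(build_bam_stats_args("sample.bam", phred=True)))
```

The resulting list could be passed to <code>subprocess.run</code>; the sketch only builds the arguments and does not run the tool.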
Line 71: Line 80:  
*<code>phred</code> Displays Quality as phred integers [0-93]  
 
*<code>phred</code> Displays Quality as phred integers [0-93]  
 
*<code>qual</code> Displays Quality as non-phred integers (phred + 33) [33-126]
 
*<code>qual</code> Displays Quality as non-phred integers (phred + 33) [33-126]
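
The relationship between the two displays is a fixed offset of 33. A minimal Python sketch of the conversion follows; the function names are illustrative only, and the ranges are the ones given in the list above.

```python
PHRED_OFFSET = 33  # non-phred value = phred + 33


def phred_to_qual(phred):
    """Convert a phred integer [0-93] to the non-phred integer [33-126]."""
    if not 0 <= phred <= 93:
        raise ValueError("phred value out of range [0-93]")
    return phred + PHRED_OFFSET


def qual_to_phred(qual):
    """Convert a non-phred integer [33-126] back to a phred integer [0-93]."""
    if not 33 <= qual <= 126:
        raise ValueError("non-phred value out of range [33-126]")
    return qual - PHRED_OFFSET


def qual_char(phred):
    """Return the ASCII character whose code is the non-phred value."""
    return chr(phred_to_qual(phred))


if __name__ == "__main__":
    for p in (0, 30, 93):
        print(p, phred_to_qual(p), qual_char(p))
```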
 +
 +
By default, these counts include all qualities in the BAM file.
 +
 +
To exclude unmapped reads and soft clips, use --excludeFlags 4.
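
The integer passed to --excludeFlags or --requiredFlags is a bitwise combination of SAM flag bits. The Python sketch below shows how such an integer can be composed and how the documented rules (skip a record if any excluded flag is set; process it only if all required flags are set) would apply to a record's flag value. The bit values come from the SAM format; the function <code>keep_record</code> is an illustration, not bamUtil's code.

```python
# A few SAM flag bits (values as defined by the SAM format).
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
QC_FAIL = 0x200
DUPLICATE = 0x400


def keep_record(flag, required_flags=0, exclude_flags=0):
    """Illustrative filter mirroring the documented flag semantics."""
    # Skip the record if ANY of the excluded flags is set.
    if flag & exclude_flags:
        return False
    # Keep the record only if ALL of the required flags are set.
    return (flag & required_flags) == required_flags


if __name__ == "__main__":
    # Integer to pass to --excludeFlags to skip unmapped and duplicate reads.
    print(UNMAPPED | DUPLICATE)
```

For example, excluding both unmapped and duplicate records would mean passing the sum of the two bit values, 4 + 1024.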
 +
 +
To only include records that overlap a set of regions, use --regionList and specify a BED file with the regions. If a read overlaps a region, all of its qualities are counted, even for bases that do not fall in the region. To count only qualities that fall within the region, also specify --withinRegion. Unless unmapped reads are excluded, the counts include soft clips that overlap the region.
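
The difference between the default region behaviour and --withinRegion can be sketched in Python as follows. This is a simplified illustration, not bamUtil's implementation: it assumes every base of the read is aligned contiguously starting at <code>read_start</code> (no insertions, deletions, or clips), and the function name is hypothetical. Regions are 0-based with the end position excluded, as described for --regionList.

```python
def qualities_to_count(read_start, quals, region_start, region_end,
                       within_region=False):
    """Return the qualities of one read that would be counted for a region.

    Coordinates are 0-based; region_end is not included in the region.
    Assumes a gapless alignment (one reference position per base).
    """
    read_end = read_start + len(quals)  # exclusive end of the read

    # A read that does not overlap the half-open region contributes nothing.
    if not (read_start < region_end and read_end > region_start):
        return []

    if not within_region:
        # Default: every quality of an overlapping read is counted.
        return list(quals)

    # --withinRegion: only qualities at positions inside the region.
    lo = max(read_start, region_start) - read_start
    hi = min(read_end, region_end) - read_start
    return list(quals[lo:hi])


if __name__ == "__main__":
    quals = [30, 31, 32, 33, 34, 35, 36, 37, 38, 39]
    # Read covers positions 95-104; region covers positions 100-199.
    print(qualities_to_count(95, quals, 100, 200))
    print(qualities_to_count(95, quals, 100, 200, within_region=True))
```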
    
<br>  
 
<br>  
