From Genome Analysis Wiki
= Overview of the <code>stats</code> function of <code>bamUtil</code> =
The <code>stats</code> option on the [[BamUtil]] executable generates the specified statistics on a SAM/BAM file.
== Troubleshooting ==
See [[BamUtil:_FAQ#BamUtil:_stats|BamUtil: FAQ -> BamUtil: stats]] for troubleshooting help.
= Usage =
<pre>
./bam stats --in <inputFile> [--basic] [--qual] [--phred] [--pBaseQC <outputFileName>] [--cBaseQC <outputFileName>] [--maxNumReads <maxNum>] [--unmapped] [--bamIndex <bamIndexFile>] [--regionList <regFileName>] [--requiredFlags <integerRequiredFlags>] [--excludeFlags <integerExcludeFlags>] [--noeof] [--params] [--withinRegion] [--baseSum] [--bufferSize <buffSize>] [--minMapQual <minMapQ>] [--dbsnp <dbsnpFile>]</pre>
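As an illustration of how the options combine, the following is a minimal sketch that assembles a <code>stats</code> command line as an argument list. The helper name <code>build_stats_command</code>, its keyword arguments, and the file names are hypothetical; only the option names come from this page, and the sketch only builds the list without running the executable.

```python
def build_stats_command(in_file, basic=False, phred=False, qual=False,
                        p_base_qc=None, c_base_qc=None, base_sum=False,
                        min_map_qual=None, bam="./bam"):
    """Build an argument list for `bam stats` (hypothetical helper).

    Only options documented on this page are used. The default path
    "./bam" is an assumption about where the executable lives.
    """
    cmd = [bam, "stats", "--in", in_file]
    if basic:
        cmd.append("--basic")
    if phred:
        cmd.append("--phred")
    if qual:
        cmd.append("--qual")
    if p_base_qc is not None:
        cmd += ["--pBaseQC", p_base_qc]
    if c_base_qc is not None:
        cmd += ["--cBaseQC", c_base_qc]
    if base_sum:
        cmd.append("--baseSum")
    if min_map_qual is not None:
        cmd += ["--minMapQual", str(min_map_qual)]
    return cmd


# Example: basic statistics plus a per-position summary.
print(" ".join(build_stats_command("sample.bam", basic=True, base_sum=True)))
```

The resulting list could be passed to <code>subprocess.run</code> if the executable is available locally.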
= Parameters =
<pre>
</pre>
=== Required Flags (<code>--requiredFlags</code>) ===
Use <code>--requiredFlags</code> followed by an integer representation of the flags to only process records with all of the specified flags set.
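The integer passed to <code>--requiredFlags</code> is the bitwise OR of the individual SAM flag bits. The sketch below shows one way to compute it and to check whether a record has all required flags set. The bit values follow the SAM specification; the helper names are hypothetical and are not part of bamUtil.

```python
# SAM flag bits as defined in the SAM specification.
SAM_FLAGS = {
    "paired": 0x1,
    "proper_pair": 0x2,
    "unmapped": 0x4,
    "mate_unmapped": 0x8,
    "reverse": 0x10,
    "mate_reverse": 0x20,
    "first_in_pair": 0x40,
    "second_in_pair": 0x80,
    "secondary": 0x100,
    "qc_fail": 0x200,
    "duplicate": 0x400,
}


def flags_to_int(*names):
    """Combine named flags into the integer form used on the command line."""
    value = 0
    for name in names:
        value |= SAM_FLAGS[name]
    return value


def has_all_required(record_flag, required):
    """True only if every bit in `required` is set in `record_flag`."""
    return (record_flag & required) == required


# Paired AND proper pair: 0x1 | 0x2
required = flags_to_int("paired", "proper_pair")
print(required)  # 3
```

With this value, <code>--requiredFlags 3</code> would restrict processing to records flagged as both paired and properly paired.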
== Types of Statistics ==
=== Basic (<code>--basic</code>) ===
Prints summary statistics for the file:
*BasesInMappedReads - # of bases in reads marked mapped in the flag
=== Qual/Phred (<code>--phred</code> and <code>--qual</code>) ===
Prints a count of the number of times each quality value appears in the file to stderr.
*<code>phred</code> Displays Quality as phred integers [0-93]
*<code>qual</code> Displays Quality as non-phred integers (phred + 33) [33-126]
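The relationship between the two displays is a fixed offset of 33. The following minimal sketch converts between them and counts the values in one quality string; the function names are hypothetical, and the counting is only an illustration of the idea, not bamUtil's implementation.

```python
from collections import Counter

PHRED_OFFSET = 33  # qual = phred + 33


def phred_to_qual(phred):
    """Convert a phred integer [0-93] to its non-phred integer [33-126]."""
    if not 0 <= phred <= 93:
        raise ValueError("phred value out of range [0-93]")
    return phred + PHRED_OFFSET


def qual_to_phred(qual):
    """Convert a non-phred integer [33-126] back to a phred integer [0-93]."""
    if not 33 <= qual <= 126:
        raise ValueError("qual value out of range [33-126]")
    return qual - PHRED_OFFSET


def count_phred(quality_string):
    """Count how often each phred value occurs in one quality string."""
    return Counter(qual_to_phred(ord(ch)) for ch in quality_string)


print(count_phred("II5I"))
```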
The <code>baseSum</code> option can be used with either <code>pBaseQC</code> or <code>cBaseQC</code>, or on its own. <code>baseSum</code> generates a summary of the per position statistics and writes it to stderr. It calculates the per position base statistics even if they will not be written anywhere (that is, when neither <code>pBaseQC</code> nor <code>cBaseQC</code> is specified).
==== Percentage-Based Output Format (<code>--pBaseQC</code>) ====
Order/Descriptions:
This output does not include a MapQual255 count.
===== Sample Output =====
<pre>
1 10024 10025 14 12 85.714 39 30 76.923 51.282 25.641 51.282 84.615 38.462 15.385 15.385 11.000 21
</pre>
==== Count-Based Output Format (<code>--cBaseQC</code>) ====
Order/Descriptions:
{|border=1
|}
==== Summary of per Position Statistics (<code>--baseSum</code>) ====
Use <code>--baseSum</code> to print an overall summary of the baseQC for the file to stderr.
This option can be used with or without <code>--pBaseQC</code> and <code>--cBaseQC</code>.
The values are tab-delimited. First there is a header line describing the summary. The next line has the Means, and the last line has the Standard Deviations.
{|border=1
! Field !! Description !!style="width: 80px"| Excludes Duplicates, QC Failures !!style="width: 80px"| Excludes Unmapped !!style="width: 80px"| Excludes MapQual = 255 !!style="width: 80px"| Excludes Below Min MapQual !!style="width: 80px"| Excludes CIGAR Deletions, Skips
|-
| TotalReads || # of reads that span this position || || || || ||
|-
| Dups || # of reads marked duplicate in the flag || || || || ||
|-
| QCFail || # of reads marked QC failure in the flag || || || || ||
|-
| Mapped || # of reads marked mapped in the flag || align="center"|X || align="center"|X || || ||
|-
| Paired || # of reads marked paired in the flag || align="center"|X || align="center"|X || || ||
|-
| ProperPaired || # of reads marked paired AND proper paired in the flag || align="center"|X || align="center"|X || || ||
|-
| ZeroMapQual || # of reads that have a Mapping Quality of 0 || align="center"|X || align="center"|X || || ||
|-
| MapQual<10 || # of reads that have a Mapping Quality < 10 || align="center"|X || align="center"|X || || ||
|-
| MapQual255 || # of reads that have a Mapping Quality = 255 || align="center"|X || align="center"|X || || ||
|-
| PassMapQual || # of reads that have a Mapping Quality >= a minimum Mapping Quality || align="center"|X || align="center"|X || || ||
|-
| AverageMapQuality || sum of included mapping qualities / AverageMapQualCount || align="center"|X || align="center"|X || align="center"|X || ||
|-
| AverageMapQualCount || # of mapping qualities in AverageMapQuality || align="center"|X || align="center"|X || align="center"|X || ||
|-
| Depth || # of reads that are mapped with acceptable Mapping Quality, and are not duplicates or QC failures || align="center"|X || align="center"|X || align="center"|X || align="center"|X || align="center"|X
|-
| Q20Bases || # of bases at this position with a base quality (from the read) of Q20 or higher || align="center"|X || align="center"|X || align="center"|X || align="center"|X || align="center"|X
|-
|}
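The exclusion columns in the table above can be read as a chain of filters. The following is a rough sketch of the read-level checks implied by the <code>Depth</code> row, assuming a simplified <code>Read</code> record with only a flag and a mapping quality. The class and function names are hypothetical, and the per-position exclusion of CIGAR deletions and skips is not modeled here.

```python
from dataclasses import dataclass

# SAM flag bits (from the SAM specification).
UNMAPPED = 0x4
QC_FAIL = 0x200
DUPLICATE = 0x400


@dataclass
class Read:
    """Simplified stand-in for a SAM/BAM record (hypothetical)."""
    flag: int
    mapq: int


def counts_toward_depth(read, min_map_qual):
    """Apply the exclusions marked for Depth in the table, in order."""
    if read.flag & (DUPLICATE | QC_FAIL):
        return False  # excludes duplicates and QC failures
    if read.flag & UNMAPPED:
        return False  # excludes unmapped reads
    if read.mapq == 255:
        return False  # excludes MapQual = 255
    if read.mapq < min_map_qual:
        return False  # excludes reads below the minimum mapping quality
    return True


reads = [Read(99, 30), Read(99 | DUPLICATE, 30), Read(99, 255), Read(99, 5)]
print(sum(counts_toward_depth(r, min_map_qual=20) for r in reads))
```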
===== Sample Output =====
<pre>
Summary of Pileup Stats (1st Row is Mean, 2nd Row is Standard Deviation)
TotalReads Dups QCFail Mapped Paired ProperPaired ZeroMapQual MapQual<10 MapQual255 PassMapQual AverageMapQuality AverageMapQualCount Depth Q20Bases
14.307692 1.846154 1.846154 8.769231 7.846154 0.923077 2.923077 5.846154 0.000000 2.923077 11.000000 8.769231 2.076923 1.153846
17.670053 2.882307 2.882307 9.038380 7.603137 1.441153 3.012793 6.025586 0.000000 3.012793 0.000000 9.038380 2.841993 1.993579
</pre>
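For downstream processing, the summary can be read into a mapping from field name to mean and standard deviation. The sketch below assumes the summary has been captured from stderr as text and that the column header is a single line beginning with <code>TotalReads</code>. The function name <code>parse_base_sum</code> is hypothetical.

```python
def parse_base_sum(text):
    """Parse --baseSum output into {field: {"mean": ..., "sd": ...}}.

    Assumes a column-header line starting with "TotalReads", followed
    by one line of means and one line of standard deviations.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    for i, line in enumerate(lines):
        if line.startswith("TotalReads"):
            header = line.split()
            means = [float(v) for v in lines[i + 1].split()]
            sds = [float(v) for v in lines[i + 2].split()]
            break
    else:
        raise ValueError("no column header found")
    if not (len(header) == len(means) == len(sds)):
        raise ValueError("header and value rows differ in length")
    return {name: {"mean": m, "sd": s}
            for name, m, s in zip(header, means, sds)}


# Values copied from the sample output above.
sample = (
    "Summary of Pileup Stats (1st Row is Mean, 2nd Row is Standard Deviation)\n"
    "TotalReads\tDups\tQCFail\tMapped\tPaired\tProperPaired\tZeroMapQual\t"
    "MapQual<10\tMapQual255\tPassMapQual\tAverageMapQuality\t"
    "AverageMapQualCount\tDepth\tQ20Bases\n"
    "14.307692\t1.846154\t1.846154\t8.769231\t7.846154\t0.923077\t2.923077\t"
    "5.846154\t0.000000\t2.923077\t11.000000\t8.769231\t2.076923\t1.153846\n"
    "17.670053\t2.882307\t2.882307\t9.038380\t7.603137\t1.441153\t3.012793\t"
    "6.025586\t0.000000\t3.012793\t0.000000\t9.038380\t2.841993\t1.993579\n"
)
summary = parse_base_sum(sample)
print(summary["Depth"])
```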
==== Optional BaseQC Only Parameters ====
===== Pileup Buffer Size (<code>--bufferSize</code>) =====
Use the <code>--bufferSize</code> option followed by the size of the pileup buffer to use for [[BaseQC (--pBaseQC and --cBaseQC and --baseSum)|baseQC]] stats.
===== Minimum Mapping Quality (<code>--minMapQual</code>) =====
Use the <code>--minMapQual</code> option followed by the minimum mapping quality for filtering reads in the [[BaseQC (--pBaseQC and --cBaseQC and --baseSum)|baseQC]] stats.
===== DBSNP File (<code>--dbsnp</code>) =====
Use the <code>--dbsnp</code> option followed by the name of the dbsnp file to specify the positions to exclude from [[BaseQC (--pBaseQC and --cBaseQC and --baseSum)|baseQC]] analysis.
{{PhoneHomeParameters}}
= Return Value =
0 on Success, non-0 on failure
[[Category:BamUtil|stats]] [[Category:BAM_Software]] [[Category:Software]]