From Genome Analysis Wiki

BamUtil

2,136 bytes added, 16:53, 29 August 2011
[[Category:BAM Software]]
= bamUtil Overview =
bamUtil is a repository that contains several programs that perform operations on SAM/BAM files. All of these programs are built into a single executable, <code>bam</code>.
== Programs ==
The software reads the beginning of an input file to determine whether it is SAM or BAM. To choose the output format, it checks the output file's extension: if the extension is ".bam", it writes a BAM file; otherwise, it writes a SAM file.
</pre>
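The output-format rule described above can be sketched in a few lines of Python. This is only an illustration and not bamUtil's actual code; the function name <code>output_format</code> is hypothetical, and the case-sensitive suffix match is an assumption.

```python
def output_format(path: str) -> str:
    """Return "BAM" if the output path ends in ".bam", otherwise "SAM"."""
    # Assumption: an exact, case-sensitive suffix match on the file name.
    return "BAM" if path.endswith(".bam") else "SAM"


if __name__ == "__main__":
    for name in ("out.bam", "out.sam", "out.txt"):
        print(name, "->", output_format(name))
```

Under this rule, any extension other than ".bam" (including no extension at all) results in SAM output.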
 
== stats ==
The <code>stats</code> option of the <code>bam</code> executable generates the specified statistics for a SAM/BAM file.
 
=== Parameters ===
 
=== Notes ===
==== BaseQC ====
'''This capability is coming soon; these notes may change before it is completed.'''
 
Open question: do we print stats for positions where the reference base is 'N'? Is any special note needed for those? (Qplot would not count them in the depth.)
 
The <code>baseQC</code> option generates the following statistics:
 
For each position, the following counts are incremented for every read that meets these criteria:
# the read spans the reference position (it starts at or before this reference position and ends at or after it)
# duplicate, QC failure, unmapped, and mapping quality status are all ignored
# the CIGAR operation at this position is ignored, except that clips at the beginning/end of the read are not counted (deletions and skips are counted)
*TotalReads(e6) - # of reads that span this position.
*DupRate(%) - # of reads marked duplicate in the flag / TotalReads
*QCFailRate(%) - # of reads marked QC failure in the flag / TotalReads
*PairedReads(%) - # of reads marked paired in the flag / TotalReads
*ProperPaired(%) - # of reads marked paired AND proper paired in the flag / TotalReads
*MappedBases(e9) - # of reads marked mapped in the flag
*MappingRate(%) - # of reads marked mapped in the flag / TotalReads
*ZeroMapQual(%) - # of reads marked mapped in the flag AND have a Mapping Quality of 0 / TotalReads
*MapQual<10(%) - # of reads marked mapped in the flag AND have a Mapping Quality < 10 / TotalReads
*MapRate_MQpass(%) - # of reads marked mapped in the flag AND have a Mapping Quality >= a minimum Mapping Quality / TotalReads
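As a rough illustration of the flag-based rates listed above, the following Python sketch computes them for one position. It is not bamUtil's implementation. The flag bit values come from the SAM format specification; the function name <code>position_rates</code>, the <code>(start, end, flag, mapq)</code> tuple layout, and the assumption that <code>start</code>/<code>end</code> already exclude clipped bases are all hypothetical. Raw counts are returned without the e6 scaling, and MappedBases is omitted.

```python
# SAM flag bits, as defined in the SAM format specification.
PAIRED, PROPER_PAIR, UNMAPPED = 0x1, 0x2, 0x4
QC_FAIL, DUPLICATE = 0x200, 0x400


def position_rates(reads, pos, min_mapq):
    """Summarize the reads that span reference position pos.

    reads: iterable of (start, end, flag, mapq) tuples. This layout is
    hypothetical; start/end are assumed to exclude clipped bases.
    """
    # A read spans pos if it starts at or before pos and ends at or after it.
    spanning = [(flag, mapq) for start, end, flag, mapq in reads
                if start <= pos <= end]
    total = len(spanning)
    if total == 0:
        return None  # no read spans this position, so no rate is defined

    def pct(predicate):
        # Percentage of spanning reads for which predicate(flag, mapq) holds.
        return 100.0 * sum(1 for f, q in spanning if predicate(f, q)) / total

    def mapped(f):
        return not f & UNMAPPED

    return {
        "TotalReads": total,
        "DupRate(%)": pct(lambda f, q: f & DUPLICATE),
        "QCFailRate(%)": pct(lambda f, q: f & QC_FAIL),
        "PairedReads(%)": pct(lambda f, q: f & PAIRED),
        "ProperPaired(%)": pct(lambda f, q: (f & PAIRED) and (f & PROPER_PAIR)),
        "MappingRate(%)": pct(lambda f, q: mapped(f)),
        "ZeroMapQual(%)": pct(lambda f, q: mapped(f) and q == 0),
        "MapQual<10(%)": pct(lambda f, q: mapped(f) and q < 10),
        "MapRate_MQpass(%)": pct(lambda f, q: mapped(f) and q >= min_mapq),
    }
```

Every rate uses TotalReads as its denominator, so duplicates, QC failures, and unmapped reads all remain in the denominator even when the numerator excludes them.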
 
 
For each position, the following counts are incremented if:
# a read spans the reference position (starts before or at this reference position and ends at or after this position)
# the read is NOT a duplicate, a QC failure, unmapped, or mapped with a mapping quality less than the minimum
# the CIGAR for this position is a M/=/X (match/mismatch)
TBD - should it count if the read has a base of 'N'?
*Depth - # of reads that meet the criteria above.
*Q20Bases(e9) - TBD
*Q20BasesPct(%) - TBD
*EPS_MSE - TBD
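The three Depth criteria above can be expressed as a single filter. The Python sketch below is an illustration under stated assumptions, not bamUtil's code: the function name <code>counts_toward_depth</code> and its arguments are hypothetical, the caller is assumed to have already checked that the read spans the position and to supply the CIGAR operation at that position, and the still-open question about 'N' bases is not addressed.

```python
# SAM flag bits, as defined in the SAM format specification.
UNMAPPED, QC_FAIL, DUPLICATE = 0x4, 0x200, 0x400

# CIGAR operations treated as match/mismatch at a position.
ALIGNED_OPS = {"M", "=", "X"}


def counts_toward_depth(flag, mapq, cigar_op, min_mapq):
    """Return True if a spanning read should increment Depth at a position.

    cigar_op is the single CIGAR operation covering the position
    (hypothetical argument; the caller is assumed to look it up).
    """
    # Exclude duplicates, QC failures, and unmapped reads.
    if flag & (UNMAPPED | QC_FAIL | DUPLICATE):
        return False
    # Exclude reads mapped with a quality below the minimum.
    if mapq < min_mapq:
        return False
    # Count only match/mismatch operations; deletions, skips, etc. are excluded.
    return cigar_op in ALIGNED_OPS
```

For example, with a minimum mapping quality of 20, a non-duplicate mapped read with mapping quality 30 and an <code>M</code> operation at the position is counted, while the same read with a <code>D</code> operation there is not.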
