From Genome Analysis Wiki
1,530 bytes added, 13:04, 23 September 2014
:# Run <code>'''make clean; make'''</code> in both libStatGen and bamUtil
:# The optimized executable is <code>'''bamUtil/bin/bam'''</code>
= BamUtil: stats =
==Overflow on the pileup buffer: specifiedPosition = ####, pileup buffer start position: ####, pileup buffer end position: ####==
<pre>
Exiting due to ERROR:
Overflow on the pileup buffer: specifiedPosition = 93252, pileup buffer start position: 92227, pileup buffer end position: 93251
</pre>
:By default, <code>bam stats</code> assumes that a single read covers fewer than 1024 reference bases.
:This error appears when a read spans more reference bases than that. It is most likely to happen if your CIGARs contain large skipped regions ('N' operations).
:To avoid it, increase the size of the pileup buffer so that it covers the largest number of reference bases spanned by any single read.

:To fix this problem, use the <code>--bufferSize 3000</code> parameter, replacing 3000 with a number large enough to handle the largest stretch of the reference covered by a read. You can set this to a large number; it will just take up more memory.

:Future versions of bamUtil will print an additional error message including the recordName and CIGAR of the record that failed. You can use the CIGAR to come up with a better setting for <code>--bufferSize</code>, although keep in mind that later records could cover an even larger number of bases than the failing record.

:To print all of the CIGARs in the file that contain an 'N' (skip), you can run the following, replacing test.bam with your BAM file:
  ../bin/bam findCigars --noph --cskip --in test.bam --out - |grep -v "^@" |cut -f 6
:This may help to determine a setting for <code>--bufferSize</code>.
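:As a rough aid for reading that list, the sketch below sums the reference-consuming operations (M, D, N, =, X, as defined in the SAM specification) of each CIGAR and prints the largest total. It is not part of bamUtil: the script name <code>max_span.py</code> and the function names are made up for illustration, and it assumes one CIGAR string per input line.

```python
import re
import sys

# CIGAR operations that consume reference bases, per the SAM specification.
REF_CONSUMING = set("MDN=X")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_span(cigar):
    """Return the number of reference bases covered by one CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)


def max_reference_span(cigars):
    """Return the largest span among the CIGARs; '*' and blank lines count as 0."""
    return max((reference_span(c.strip()) for c in cigars), default=0)


if __name__ == "__main__":
    # Hypothetical usage: pipe the CIGAR column into this script, one per line.
    print(max_reference_span(sys.stdin))
```

:Appending <code>| python max_span.py</code> to the findCigars pipeline above would print a single number. A <code>--bufferSize</code> comfortably larger than that number should cover the reads in that file, assuming the buffer only needs to hold one read's reference span.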
